GPT-4o released with improved text, audio and vision capabilities

May 14, 2024

GPT-4o (“o” for “omni”) is OpenAI’s latest multimodal large language model (LLM). It brings major advancements in text, voice, and image content generation to offer more natural interaction between users and AI.

OpenAI claims its new AI model can respond to audio inputs in as little as 232 milliseconds and is significantly faster at responding to non-English text prompts, with support for over 50 languages. You can also interrupt the model with new questions or clarifications while it is talking.

GPT-4o also features a more capable, human-sounding voice assistant that responds…

Source: GSMArena.com