WATCH: OpenAI unveils GPT-4o, a new ChatGPT that listens and talks
GPT-4o is capable of realistic voice conversation and can interact across text and vision.
OpenAI researchers showcased the new features at a livestream event in San Francisco on Monday. Picture: Getty Images
ChatGPT just got smarter, with maker OpenAI announcing it will release a new artificial intelligence (AI) model called GPT-4o that is capable of realistic voice conversation and able to interact across text and vision.
OpenAI researchers showcased the new features at a livestream event in San Francisco, California, US on Monday.
The new model will bring the faster and more accurate capabilities of GPT-4 to free users; these were previously reserved for paid customers.
Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN
— OpenAI (@OpenAI) May 13, 2024
Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks.
AI from the movies
The new model GPT-4o (the "o" stands for "omni") will be rolled out in OpenAI's products over the next few weeks.
The new audio capabilities enable users to speak to ChatGPT and receive real-time responses with no delay, as well as interrupt ChatGPT while it is speaking: both hallmarks of realistic conversation that AI voice chatbots have lacked until now.
OpenAI CEO Sam Altman said the first key part of the company’s mission is to put “very capable AI tools in the hands of people for free (or at a great price).”
“Second, the new voice (and video) mode is the best computer interface I’ve ever used. It feels like AI from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change.”
Altman said the original ChatGPT showed a hint of what was possible with language interfaces.
“This new thing feels viscerally different. It is fast, smart, fun, natural, and helpful. Talking to a computer has never felt really natural for me; now it does.
“As we add (optional) personalisation, access to your information, the ability to take actions on your behalf, and more, I can really see an exciting future where we are able to use computers to do much more than ever before,” Altman said.
Powers of GPT-4o
Chief technology officer Mira Murati and engineers from OpenAI demonstrated the new powers of GPT-4o at the virtual event, posing challenges to the ChatGPT chatbot.
In a demonstration of the model, OpenAI researchers and Murati held a conversation with the new ChatGPT using just their voices, showing that the tool can reply with an audio response in milliseconds, allowing for a more fluid conversation.
The demo mainly featured OpenAI staff members asking questions to the voiced ChatGPT, which responded with jokes and human-like banter, AFP reported.
The ChatGPT bot also served as an interpreter between English and Italian, read facial expressions and walked one user through a difficult algebra problem.