OpenAI has been showing some customers a new multimodal AI model that can both hold a conversation and recognize objects, according to a new report from The Information. The outlet cited anonymous sources who have seen the model, saying it could be part of what the company plans to show on Monday.
The new model reportedly offers faster, more accurate interpretation of images and audio than OpenAI's existing separate transcription and text-to-speech models. It could apparently help customer service agents "better understand the intonation in a caller's voice and whether they are being sarcastic," and "in theory" could help students with math or translate real-world signs, according to The Information.
The outlet's sources say the model can outperform GPT-4 Turbo at "answering certain questions," but it can still be confidently wrong.
OpenAI may also be readying a new built-in ChatGPT feature for making phone calls, according to developer Ananay Arora, who posted the above screenshot of call-related code. Arora also found evidence that OpenAI had provisioned servers intended for real-time audio and video communication.
Whatever is announced next week, it won't be GPT-5: CEO Sam Altman has explicitly denied that the upcoming announcement has anything to do with that model, which is expected to be "materially better" than GPT-4. The Information writes that GPT-5 could be publicly released by the end of the year.
https://www.theverge.com/2024/5/11/24154307/openai-multimodal-digital-assistant-chatgpt-phone-calls