in

OpenAI rolls out highly anticipated advanced Voice Mode, but there’s a catch

OpenAI

When OpenAI held its Spring Launch event in May, one of the biggest standouts was its demo of the new Voice Mode on ChatGPT, supercharged with GPT-4o’s new video and audio capabilities. The highly anticipated new Voice Mode is finally here (kind of). 

Also: The best AI chatbots of 2024: ChatGPT, Copilot, and worthy alternatives

On Tuesday, OpenAI announced via an X post that the startup was rolling out Voice Mode in alpha to a small group of ChatGPT Plus users, offering them a smarter voice assistant that can be interrupted and respond to their emotions. 

If you participate in the alpha, you will receive an email with instructions and a message in the mobile app, as shown in the video above. If you haven’t received a notification just yet, no worries. OpenAI shared that it will continue to add users on a rolling basis, with the plan for all ChatGPT Plus subscribers to access it in the fall.

In the original demo at the launch event, shown below, the company showcased Voice Mode’s multimodal capabilities, including assisting with content on users’ screens and using the user’s phone camera as context for a response. 

Unfortunately, the Voice Mode alpha will not have these features. OpenAI shared that “video and screen sharing capabilities will launch at a later date.” The startup also said that since originally demoing the technology, it has improved the quality and safety of voice conversations. 

OpenAI tested the voice capabilities with 100+ external red teamers across 45 languages, according to the X thread. The startup also trained the model to speak only in the four preset voices, block outputs that deviate from those designated voices, and implement guardrails to block requests. 

The startup also said that it will take into account user feedback to improve the model further, and it will share a detailed report regarding GPT-4o’s performance, including limitations and safety evaluations, in August. 

Also: Google’s new gen AI tools help hyper-target your ad campaigns

You can become a ChatGPT Plus subscriber for $20 per month. Other membership perks include advanced data analysis features, image generation, and priority access to GPT-4o. 

One week after OpenAI unveiled this feature, Google unveiled a similar feature called Gemini Live, which is not yet available to users. That may change soon at the Made by Google event coming up in a few weeks.


Source: Robotics - zdnet.com

Microsoft 365 went down – again

Apple Vision Pro can be controlled by thoughts now, thanks to BCI integration