Points
- OpenAI launches the highly anticipated Advanced Voice Mode (AVM) for ChatGPT in alpha to select users.
- AVM allows real-time conversations with ChatGPT using text-to-speech synthesis.
- The feature aims to mimic human-to-human discussions with natural cadence.
- OpenAI emphasizes safety, implementing guardrails to prevent misuse.
OpenAI has introduced its much-anticipated “Advanced Voice Mode” (AVM) for ChatGPT, now available in alpha to select users. After several delays for safety and fine-tuning, AVM allows users to engage in real-time conversations with ChatGPT through a text-to-speech synthesis module.
This feature is reminiscent of Google’s 2018 Duplex AI service, which aimed to make AI robust enough to handle casual conversations and confirm information accurately. Although Google’s Duplex project was eventually shuttered, its legacy lives on in OpenAI’s ChatGPT.
AVM features real-time communication that attempts to mimic human-to-human discussions. ChatGPT responds in a human-like voice with natural cadence, and users can interrupt the chatbot mid-sentence, with the AI keeping track of the conversation. This level of interaction aims to create a more immersive and realistic user experience.
To ensure safety and prevent misuse, OpenAI has implemented several measures. The model only speaks in four preset voices, and systems are in place to block outputs that deviate from these voices. Additionally, guardrails are designed to block requests for violent or copyrighted content.
The feature’s launch in limited alpha allows OpenAI to continue evaluating its capabilities and safety implications. While the initial demos in May were impressive, there were some glitches, and potential misuse scenarios remain a concern. OpenAI tested the voice capabilities with 100 external red teamers across 45 languages to address privacy and safety issues.
The rollout of AVM has begun on a timed basis, with more users expected to be added “on a rolling basis.” OpenAI plans to make the feature available to all Plus subscribers by fall, expanding access to this innovative technology.
解説
- Advanced Voice Mode (AVM) represents a significant advancement in AI-human interaction, offering real-time, natural-sounding conversations with ChatGPT.
- The feature’s safety measures, including preset voices and content guardrails, are crucial for preventing misuse and ensuring user privacy.
- OpenAI’s cautious rollout strategy reflects a commitment to refining AVM’s capabilities and addressing potential safety concerns before wider release.
- The success of AVM could pave the way for more advanced AI applications, enhancing user experiences across various industries, from customer service to personal assistants.
- By incorporating feedback from early users, OpenAI aims to continuously improve AVM, ensuring it meets high standards of functionality and safety.