Amazon has officially introduced Nova Sonic, a next-generation generative AI voice model. The model is designed to deliver natural-sounding speech and real-time voice interaction at what Amazon calls industry-leading speed, and it is aimed at competing with OpenAI’s and Google’s latest AI voice technologies.
Nova Sonic is aimed at enterprise AI applications: it is integrated into Amazon’s Bedrock developer platform and is accessible through a bi-directional streaming API. Amazon positions the model as the most cost-efficient option among frontier voice models, claiming that it is not only faster than OpenAI’s GPT-4o but also 80% less expensive.
“Available via a new API in Amazon Bedrock, the model simplifies the development of voice applications, such as customer service call automation and AI agents across a broad range of industries, including travel, education, health care, entertainment, and more,” the company said in a blog post.
According to a TechCrunch report citing Amazon officials, Nova Sonic is more accurate than other voice models. The report added that the model scored well on a benchmark measuring speech recognition across languages and dialects.
Amazon’s upgraded digital assistant, Alexa+, is also powered by Nova Sonic, which improves its ability to handle natural conversations, decipher mumbled or noisy speech, and respond with human-like timing. The model also produces real-time transcripts of user speech, which developers can build on for productivity, accessibility, and customer service applications.
Meanwhile, Amazon’s long-term plan is to introduce more models that can understand and interact across multiple modalities, including voice, vision, and sensory data. However, the exact details of the company’s upcoming AI plans remain unknown at the moment.