Microsoft is stepping up its AI game with the launch of two homegrown models: MAI-Voice-1 and MAI-1-preview. These new AI models show the company’s ambition to create its own AI technology instead of relying solely on OpenAI’s models.
According to Microsoft, MAI-Voice-1 is a “lightning-fast speech generation model” said to produce a minute of audio in less than a second on a single GPU. The model already powers some of Microsoft’s features, including Copilot Daily and Copilot Podcasts.
Users can try out MAI-Voice-1 in Copilot and Copilot Labs. “Voice is the interface of the future for AI companions and MAI-Voice-1 delivers high-fidelity, expressive audio across both single and multi-speaker scenarios,” the company claims in a blog post.
Alongside MAI-Voice-1, Microsoft introduced MAI-1-preview. This AI model was pre-trained and post-trained on roughly 15,000 Nvidia H100 GPUs and is designed to follow instructions and provide helpful responses to everyday queries. The company has begun public testing of MAI-1-preview on LMArena, a popular platform for community model evaluation, and will also roll it out for certain text use cases within Copilot over the coming weeks to learn from usage and improve the model.
“We have big ambitions for where we go next,” Microsoft said. “Not only will we pursue further advances here, but we believe that orchestrating a range of specialized models serving different user intents and use cases will unlock immense value.”
With these new models, Microsoft is signaling that it wants to expand its AI ecosystem with models built in-house. While Copilot currently depends on OpenAI’s large language models, the company’s new offerings could eventually reduce that dependency.