Microsoft released three new internal AI models for text, speech, and graphics generation

10.04.2026 17 hardware

Microsoft AI launches three new multimodal models

In an effort to strengthen its position in artificial intelligence (AI), Microsoft AI’s research division announced the release of three proprietary models capable of generating text, audio, and images. This move was a response to competition from leading AI labs.

Model	Purpose	Key metrics
MAI‑Transcribe‑1	Converts speech to text	25 languages, 2.5× faster than Azure Fast
MAI‑Voice‑1	Synthesizes speech audio	One minute of audio generated in one second; voice tuning
MAI‑Image‑2	Generates images from text

The models were developed by the MAI Superintelligence team, a division focused on fundamental research into advanced AI systems. The team has been led by Microsoft AI chief executive Mustafa Suleyman since November 2025.

Cost efficiency

The developers placed special emphasis on reducing compute costs relative to comparable models from Google and OpenAI:

Service	Price
Speech transcription	$0.36 per hour of audio
Speech synthesis	$22 per 1 million characters
Image processing	$5 per 1 million input tokens; $33 for generating 1 million output tokens
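
To make the price list concrete, here is a minimal Python sketch that turns the rates in the table into cost estimates. The rates are copied from the table; the function and constant names are hypothetical and are not part of any Microsoft API, and real billing may involve rounding, minimums, or tiers that the article does not describe.

```python
# Rates in USD, copied from the price table in the article.
# All names below are illustrative and do not come from a Microsoft SDK.
TRANSCRIPTION_USD_PER_AUDIO_HOUR = 0.36
SPEECH_USD_PER_MILLION_CHARS = 22.0
IMAGE_USD_PER_MILLION_INPUT_TOKENS = 5.0
IMAGE_USD_PER_MILLION_OUTPUT_TOKENS = 33.0

MILLION = 1_000_000


def transcription_cost(audio_hours: float) -> float:
    """Estimated cost of transcribing the given number of audio hours."""
    return audio_hours * TRANSCRIPTION_USD_PER_AUDIO_HOUR


def speech_cost(characters: int) -> float:
    """Estimated cost of synthesizing speech for the given character count."""
    return characters / MILLION * SPEECH_USD_PER_MILLION_CHARS


def image_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of an image workload, billed per input and output token."""
    return (
        input_tokens / MILLION * IMAGE_USD_PER_MILLION_INPUT_TOKENS
        + output_tokens / MILLION * IMAGE_USD_PER_MILLION_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    # Example workloads chosen purely for illustration.
    print(f"100 h of audio transcribed: ${transcription_cost(100):.2f}")
    print(f"500,000 characters spoken:  ${speech_cost(500_000):.2f}")
    print(f"2M input + 1M output tokens: ${image_cost(2_000_000, 1_000_000):.2f}")
```

At these rates, transcribing 100 hours of audio costs about $36, and synthesizing 500,000 characters of speech costs about $11.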

The models are already deployed on the Microsoft Foundry platform. Transcription and speech synthesis are available in MAI Playground.

Partnership with OpenAI

Although Microsoft is actively developing its own models, Mustafa Suleyman confirmed that it remains committed to collaborating with OpenAI, in which it has already invested more than $13 billion. The company will continue to use OpenAI models in its products under a long‑term contract, diversifying its model suppliers much as it diversifies its chip suppliers.

Thus, Microsoft AI is strengthening its market position by offering fast and cost‑effective multimodal solutions while maintaining close ties with key partners.
