Simon Willison · Jun 2

Microsoft's new MAI models

Microsoft is splitting its model strategy across two specialized releases: MAI-Thinking-1, a 35B-parameter model that targets reasoning workloads for enterprise partners, and MAI-Code-1-Flash, a 5B model that ships directly into GitHub Copilot's IDE integration. This dual-track approach signals a pivot away from monolithic foundation models toward task-specific efficiency, mirroring OpenAI's o1/GPT-4o split. The Code variant's immediate rollout to individual developers matters more than the reasoning model's gated access, because it puts a smaller, cheaper-to-serve model directly into the most-used AI development surface.

Modelwire context

Analyst take

The more pointed question is what this means for JetBrains and other IDE vendors: Microsoft is not just shipping a model, it is using GitHub Copilot's distribution scale to make a 5B specialized coding model the default inference layer for millions of developers before competitors can respond.

This lands one day after Microsoft's Build positioning story, where we noted the conference was designed to reassert developer mindshare against OpenAI's own ecosystem ambitions. MAI-Code-1-Flash is the concrete product that follows that signal. It also sharpens the competitive picture around JetBrains' Mellum2 release (covered June 1): JetBrains built a 12B MoE model specifically to control latency and cost inside its own IDE, and Microsoft is now doing the same thing at far greater distribution scale. The asymmetry matters. JetBrains controls enterprise developer loyalty; Microsoft controls the surface where most of that work actually runs.

Watch whether JetBrains accelerates Mellum2's public release timeline or announces a direct Copilot integration within the next 60 days. Either response would confirm that MAI-Code-1-Flash is being read as a direct threat to IDE-native model strategies, not just another cloud model announcement.

Coverage we drew on

Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains · Hugging Face

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: Microsoft · MAI-Thinking-1 · MAI-Code-1-Flash · GitHub Copilot · Visual Studio Code

Read the full story at Simon Willison (simonwillison.net)


