Thinking Machines Lab ships its first model and argues interactivity is what OpenAI gets wrong about voice

Mira Murati's Thinking Machines Lab has released its first model, positioning real-time interactivity as a core differentiator against OpenAI's and Google's voice offerings. The system processes audio, video, and text simultaneously in 200-millisecond windows, a design the company presents as a move beyond the turn-based Q&A paradigm that constrains current voice AI. This architectural choice targets a genuine friction point in conversational AI: latency and the artificial pause-response rhythm users experience. The launch signals renewed competition in the voice-first AI space and tests whether parallel streaming processing can deliver meaningfully smoother interaction than existing alternatives.
Modelwire context
Analyst take
The more consequential detail here is organizational, not architectural: Murati is the most senior OpenAI alumna to ship a direct product competitor, and the specific framing of OpenAI's voice approach as a design flaw (rather than a resource or timing gap) reads as a deliberate positioning choice meant to attract both users and talent who share that critique.
Modelwire has no prior coverage to anchor this to directly, so context has to come from the broader competitive landscape. The voice AI space has been contested primarily by OpenAI's GPT Realtime offerings and Google Gemini Live, both of which Thinking Machines Lab names explicitly as the comparison set. That framing is a calculated move: it skips the usual new-entrant humility and invites direct benchmarking before the product has any track record. Whether the 200-millisecond processing window holds up under real-world network variance and multilingual input is the question that framing immediately raises.
Watch whether independent developers report consistent sub-300ms response times in non-English languages within the next 60 days. If latency degrades significantly outside English, the architectural claim is narrower than the launch positioning suggests.
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions
Thinking Machines Lab · Mira Murati · OpenAI · GPT Realtime 2 · Google Gemini Live
Modelwire Editorial
Modelwire summarizes, we don’t republish. The full content lives on the-decoder.com.