Build Hour: GPT-Realtime-2
OpenAI is advancing its realtime voice infrastructure with GPT-Realtime-2, a model designed for sub-100ms latency voice interactions that combines translation, speech-to-text, and agentic reasoning. The capability set, including 128K context windows, parallel tool calling, and controllable voice expressiveness, signals a shift toward voice as a primary interface for application control and information retrieval. This positions realtime voice agents as a competitive frontier where latency and naturalness become differentiators for enterprise workflows spanning customer service, analytics dashboards, and commerce. The public Build Hour format underscores OpenAI's intent to seed developer adoption early.
Modelwire context
Modelwire has no prior coverage to anchor this to directly, so context has to come from the broader space. OpenAI has been running Build Hours as a low-friction launch vehicle for several months, and GPT-Realtime-2 follows the same pattern as earlier realtime API previews: announce a capability, show a demo, let developers find the edges. The competitive pressure is real, with Google and ElevenLabs both shipping voice infrastructure, but the specific claims in this session have not been independently validated. Until developers report latency figures from actual API calls rather than a controlled demo environment, the sub-100ms headline should be treated as aspirational.
Skeptical read
The Build Hour format is doing real work here: it is not a research release or a product GA announcement; it is a structured demo designed to generate developer momentum before the API is widely stress-tested in production. The sub-100ms latency figure is a marketing target, not a published benchmark under realistic network conditions.
Watch whether independent developers post reproducible latency measurements from the GPT-Realtime-2 API within the next four to six weeks. If real-world p95 latency consistently clears 150ms rather than the stated sub-100ms, the enterprise workflow pitch loses its core technical premise.
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions
OpenAI · GPT-Realtime-2 · GPT-Realtime-Translate · GPT-Realtime-Whisper · Teri Yu · Erika Kettleson
Modelwire Editorial
This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.
Modelwire summarizes; we don’t republish. The full content lives on youtube.com. If you’re a publisher and want a different summarization policy for your work, see our takedown page.