Speeding up agentic workflows with WebSockets in the Responses API

OpenAI detailed how WebSockets and connection-scoped caching cut API overhead and latency in the Codex agent loop, offering a technical blueprint for faster agentic workflows. The optimization targets a core bottleneck in agent-based systems where repeated API calls compound latency.
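To see why repeated calls compound, here is a minimal back-of-the-envelope sketch, not OpenAI's measurement or method. It assumes each fresh request pays a fixed connection-setup cost that a persistent connection pays only once. The function name and every number are hypothetical, chosen only for illustration, and the model ignores the caching effect the post also describes.

```python
def total_overhead_ms(turns: int, setup_ms: float, per_turn_ms: float,
                      persistent: bool) -> float:
    """Modeled transport overhead for an agent loop of `turns` API calls.

    setup_ms    -- hypothetical cost of opening a connection (handshakes etc.)
    per_turn_ms -- hypothetical per-call overhead that remains either way
    persistent  -- True if one connection is reused for every turn
    """
    if turns <= 0:
        return 0.0
    # A persistent connection pays setup once; otherwise every turn pays it.
    setups = 1 if persistent else turns
    return setups * setup_ms + turns * per_turn_ms


# Hypothetical figures, not taken from the OpenAI post.
TURNS = 20
SETUP_MS = 100.0
PER_TURN_MS = 30.0

per_request = total_overhead_ms(TURNS, SETUP_MS, PER_TURN_MS, persistent=False)
reused = total_overhead_ms(TURNS, SETUP_MS, PER_TURN_MS, persistent=True)

print(f"new connection per call: {per_request:.0f} ms")
print(f"one reused connection:   {reused:.0f} ms")
print(f"modeled saving:          {per_request - reused:.0f} ms")
```

Under these assumed figures the setup cost scales with the number of turns in the first case and stays constant in the second, which is why the gap widens as agent loops get longer.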

Modelwire context

Analyst take

The real story here is not WebSockets as a technology. It is that OpenAI is now publishing internal engineering blueprints for Codex's agent loop, which signals a deliberate effort to pull third-party developers into building on its agentic stack rather than a competitor's.

Six days before this post, OpenAI shipped a major Codex upgrade targeting Anthropic's Claude Code directly, as covered in both TechCrunch and The Verge on April 16. That release added computer control, memory, and plugin support. This WebSockets post is the infrastructure layer underneath that product push: faster agent loops make the expanded capabilities in that April 16 update actually usable at scale, where repeated round-trips would otherwise compound into noticeable delays. The sequencing matters. OpenAI is not just shipping features; it is publishing the performance rationale alongside them, which is a developer-relations move as much as a technical one.

Watch whether Anthropic publishes a comparable latency or connection-persistence optimization for the Claude Code API within the next 60 days. If it does, that confirms this is now a baseline expectation for agentic coding tools, not a differentiator.

Coverage we drew on

OpenAI’s big Codex update is a direct shot at Claude Code · The Verge — AI

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: OpenAI · Codex · WebSockets · Responses API

Read the full story at OpenAI (openai.com)
