Models & Releases · Business & Funding · The Decoder · Apr 25

GPT-5.5 tops benchmarks but still hallucinates frequently at a 20 percent higher API cost

OpenAI's GPT-5.5 reclaims top benchmark performance but costs 20 percent more per API call and continues to produce hallucinations at elevated rates, raising questions about whether capability gains justify the pricing increase for production users.

Modelwire context

Skeptical read

The 20 percent cost increase lands on top of a known reliability problem rather than paying for a fix to one. For production teams, paying more for a model that still hallucinates at elevated rates is not a straightforward trade-off; it is a regression in cost per reliable output, a metric that benchmarks don't capture.
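To make the cost-per-reliable-output idea concrete, here is a minimal Python sketch. It assumes the metric is defined as price per call divided by the share of outputs that are free of hallucinations, which is our illustrative definition, not one taken from OpenAI or The Decoder. The 20 percent price multiplier comes from the summary above; the hallucination rates are hypothetical placeholders, since the summary gives no figures.

```python
def cost_per_reliable_output(cost_per_call: float, hallucination_rate: float) -> float:
    """Expected spend per hallucination-free output (illustrative definition)."""
    if not 0.0 <= hallucination_rate < 1.0:
        raise ValueError("hallucination_rate must be in [0, 1)")
    return cost_per_call / (1.0 - hallucination_rate)


def break_even_hallucination_rate(old_rate: float, price_multiplier: float) -> float:
    """Hallucination rate the pricier model would need to keep the metric flat.

    A negative result means no achievable rate offsets the price increase.
    """
    return 1.0 - price_multiplier * (1.0 - old_rate)


# The multiplier is from the summary; every other number is a hypothetical placeholder.
PRICE_MULTIPLIER = 1.20
baseline_cost = 1.00      # arbitrary unit price per call
hypothetical_rate = 0.30  # placeholder, NOT a reported GPT-5 or GPT-5.5 figure

before = cost_per_reliable_output(baseline_cost, hypothetical_rate)
after = cost_per_reliable_output(baseline_cost * PRICE_MULTIPLIER, hypothetical_rate)

# With an unchanged hallucination rate, the metric rises by the full price increase.
ratio = after / before

# Rate the new model would need for the metric to stay flat under these placeholders.
needed_rate = break_even_hallucination_rate(hypothetical_rate, PRICE_MULTIPLIER)

print(f"cost per reliable output grows by a factor of {ratio:.2f}")
print(f"break-even hallucination rate: {needed_rate:.2f}")
```

Under this definition, a price increase with no reliability improvement passes straight through to the metric, and the break-even rate shows how much reliability gain would be needed to offset it. Teams can substitute their own measured rates and prices.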

This release is directly entangled with the Codex consolidation story we covered from The Decoder on April 26, where OpenAI folded its dedicated coding model into GPT-5.5 and claimed improved agentic coding performance with reduced token consumption. That framing made GPT-5.5 sound like an efficiency win. The hallucination data complicates that narrative considerably: if the model is less reliable on factual outputs, the token-efficiency gains for coding agents may not translate to the broader production use cases OpenAI is pitching. The two stories together suggest a model that is being positioned as a consolidation win while carrying reliability debt that neither announcement leads with.

Watch whether enterprise customers on existing GPT-5 contracts report renegotiating or delaying upgrades over the next 60 days. If adoption among high-volume API users stalls despite the benchmark gains, that's a signal the hallucination rate is the real ceiling here, not pricing.

Coverage we drew on

OpenAI kills its dedicated coding model Codex again, folding it into GPT-5.5 · The Decoder

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: OpenAI · GPT-5.5

Read the full story at The Decoder (the-decoder.com).


