Models & Releases · Products & Apps · Simon Willison · Apr 21

Where's the raccoon with the ham radio? (ChatGPT Images 2.0)

OpenAI shipped ChatGPT Images 2.0, with Sam Altman claiming the leap matches GPT-3 to GPT-5 in magnitude. Simon Willison tested the model against its predecessor using a Where's Waldo-style prompt to compare real-world output quality.

Modelwire context

Skeptical read

The GPT-3-to-GPT-5 comparison is doing a lot of work in Altman's framing: that prior jump involved reasoning, coding, and instruction-following across thousands of tasks, not image fidelity on a novelty benchmark. Willison's raccoon-with-ham-radio test is a useful sanity check, but it is one data point, and the headline claim is far broader than any single prompt can validate.

OpenAI has been shipping across multiple fronts in rapid succession. The Codex upgrades covered by The Verge and TechCrunch on April 16 added image generation as one of several new agentic capabilities, suggesting image quality is now a competitive input across OpenAI's product surface, not just a standalone feature. The acquisitions and domain-specific launches noted in the April 17 tokenmaxxing coverage (TechCrunch) reinforce that OpenAI is moving fast enough that individual capability claims are hard to evaluate in isolation. Google's Gemini update from April 16, which added personalized image generation via Photos, is the more direct competitive parallel here, though neither story benchmarks against the other.

If independent evaluators running structured image-quality benchmarks (not single prompts) show consistent gains over GPT-Image-1 within the next four to six weeks, Altman's framing gets partial support. If the gains are narrow or prompt-sensitive, the GPT-3-to-GPT-5 analogy collapses on contact with systematic testing.

Coverage we drew on

OpenAI’s big Codex update is a direct shot at Claude Code · The Verge — AI

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: OpenAI · ChatGPT Images 2.0 · Sam Altman · Simon Willison · GPT-Image-1 · GPT-Image-2

Read full story at Simon Willison → (simonwillison.net)

