Google's Gemini 3.5 Flash follows Anthropic and OpenAI in making newer AI models significantly pricier

Google's Gemini 3.5 Flash represents a capability leap that comes with a steep cost penalty: 5.5x higher inference pricing than its predecessor, and 75% more expensive than the flagship Gemini 3.1 Pro on agent workloads due to increased interaction steps. This pricing trajectory mirrors moves by Anthropic and OpenAI, signaling an industry-wide shift where frontier model improvements now demand substantially higher operational budgets. For enterprises and API consumers, the tradeoff between performance gains and per-token economics is becoming a critical procurement decision rather than an afterthought.
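The inversion described above, where a model with a lower per-token price ends up costing more on agent workloads, comes down to simple arithmetic: the cost of a task scales with the number of interaction steps as well as the token price. The sketch below illustrates that relationship only. The `AgentProfile` class, its fields, and every number in it are hypothetical placeholders chosen to reproduce a 75% gap. They are not published prices or measured step counts for any Gemini model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical cost profile for one model on one agent task."""
    price_per_million_tokens: float  # blended price, arbitrary currency units
    steps_per_task: int              # interaction steps the agent takes
    tokens_per_step: int             # average tokens billed per step


def task_cost(profile: AgentProfile) -> float:
    """Cost of one task = steps x tokens per step x price per token."""
    total_tokens = profile.steps_per_task * profile.tokens_per_step
    return total_tokens * profile.price_per_million_tokens / 1_000_000


def cost_ratio(a: AgentProfile, b: AgentProfile) -> float:
    """How many times more one task costs on profile a than on profile b."""
    return task_cost(a) / task_cost(b)


# Placeholder values for illustration only (NOT real pricing or step counts):
# the "efficiency" profile is half the per-token price but takes 3.5x the steps.
flagship = AgentProfile(price_per_million_tokens=10.0,
                        steps_per_task=10, tokens_per_step=4_000)
efficiency = AgentProfile(price_per_million_tokens=5.0,
                          steps_per_task=35, tokens_per_step=4_000)

ratio = cost_ratio(efficiency, flagship)
print(f"cost ratio: {ratio:.2f}")  # about 1.75, i.e. 75% more per task
```

Under these assumed inputs, halving the per-token price does not offset a 3.5x increase in steps. That is why a per-task budget can be a more useful procurement measure than the per-token list price.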
Modelwire context
Analyst take
The more pointed detail here is that Gemini 3.5 Flash, a model positioned in the efficiency tier, now costs more on agent workloads than the previous flagship Pro model. That inversion suggests the pricing pressure isn't just about raw capability jumps but about how multi-step agentic usage patterns are becoming the new billing surface labs are optimizing around.
This is largely disconnected from recent activity in our archive, as we have no prior coverage to anchor it to. That said, it belongs to a broader pattern worth naming: all three major labs have now moved in the same direction on inference pricing within a compressed timeframe, which is less likely to be coincidence than a shared read on enterprise willingness to pay. When Anthropic and OpenAI raised prices on their respective newer models, the framing was capability justification. Google is now using the same framing, which normalizes the trajectory rather than challenging it.
Watch whether enterprise API consumption data (from cloud resellers or earnings calls in Q3 2026) shows volume compression at the new price points. If usage growth flattens despite capability gains, the labs will face a real test of whether this pricing holds or triggers a race back down.
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: Google · Gemini 3.5 Flash · Gemini 3.1 Pro · Anthropic · OpenAI
Modelwire Editorial
This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.
Modelwire summarizes, we don’t republish. The full content lives on the-decoder.com.