Research Policy & Regulation · The Decoder · Jun 25

Most major AI chatbots still lean left on political questions, and even "anti-woke" models are no exception

A Washington Post audit of political bias in major LLMs reveals a persistent leftward skew across the industry, even among models explicitly positioned as alternatives to perceived "woke" alignment. GPT-5.5 presented exclusively left-leaning arguments in 80 percent of responses, while Grok, despite Elon Musk's anti-woke branding, still tilted left more often than right. Google's Gemini 3.1 Pro emerged as the outlier, achieving balanced coverage 93 percent of the time. The finding underscores how training data, RLHF choices, and constitutional AI frameworks embed subtle political orientations into model outputs, raising questions about whether neutrality is achievable or even desirable in systems trained on internet-scale text.

Modelwire context

Skeptical read

The audit's most underreported detail is what 'balanced' actually means in this context: Gemini scoring 93 percent balanced could reflect genuine neutrality, or it could reflect a trained reluctance to engage with political content at all, which is a very different outcome that the headline framing obscures.

This story is largely disconnected from recent activity in our archive, as Modelwire has no prior coverage to anchor it to. It belongs to a longer-running debate in AI policy and alignment circles about whether RLHF and constitutional AI methods inevitably encode the political priors of the teams applying them. That debate has been live since at least the early public critiques of InstructGPT, and this audit is one more data point in it rather than a resolution. The Grok finding is worth isolating: a model built partly as a corrective to perceived left-leaning alignment still tilts the same direction, which suggests the skew originates upstream in training data more than in fine-tuning choices.

Watch whether the Washington Post releases its full question set and scoring rubric publicly. Without that, the 80 percent and 93 percent figures cannot be independently replicated, and the audit's conclusions remain difficult to evaluate on their own terms.

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: OpenAI · GPT-5.5 · Elon Musk · Grok · Google · Gemini 3.1 Pro

Read full story at The Decoder (the-decoder.com)

Modelwire Editorial

This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.
