Research Opinion & Analysis · The Decoder · 1d ago

Microsoft researcher builds a working neural network out of goats in Age of Empires II to critique AI science

A Microsoft researcher constructed a functional neural network using Age of Empires II game mechanics, goats, and environmental objects to expose methodological flaws in contemporary AI research. Analysis of 315 papers revealed that over half presuppose language models possess human-like cognition before testing begins. The project illustrates how substrate-agnostic mathematical systems acquire anthropomorphic qualities through interface design alone, challenging the scientific rigor of claims about model sentience and reasoning capabilities.

Modelwire context

Explainer

The more pointed finding isn't the goat-based neural network itself, which is a demonstration tool, but the paper audit: more than half of surveyed AI research embeds cognitive assumptions into experimental framing before any measurement begins, meaning the conclusions are partly artifacts of how questions were asked.

We have no prior coverage in our archive to anchor this story to. It belongs to a longer-running methodological debate inside AI research about whether benchmark design and interface framing quietly predetermine results. That debate has intensified alongside the proliferation of capability claims for large language models, where the gap between what a model outputs and what a model 'understands' is routinely collapsed in both papers and press coverage. The goat demonstration makes an abstract critique concrete: if the same mathematical operations produce 'reasoning' in a medieval strategy game, the label 'reasoning' is doing work that the math itself is not.
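To make the substrate-agnostic point tangible, here is a minimal sketch in Python. It is not a reconstruction of the researcher's Age of Empires II build, whose mechanics the summary above does not describe. The function names, the XOR task, and the "pen of goats" encoding are all hypothetical, chosen only to show one tiny fixed-weight threshold network evaluated two ways: with ordinary arithmetic and by shuffling tokens between lists.

```python
# Hypothetical illustration only: this does NOT model the actual
# Age of Empires II construction. It shows one tiny threshold network
# evaluated on two different "substrates" with identical results.

def step(x):
    """Threshold activation: output 1 when the input is positive, else 0."""
    return 1 if x > 0 else 0

def xor_arithmetic(a, b):
    """XOR as a two-layer threshold network using integer arithmetic."""
    h_or = step(a + b)         # fires if at least one input is on
    h_and = step(a + b - 1)    # fires only if both inputs are on
    return step(h_or - h_and)  # on when OR fires but AND does not

def xor_goats(a, b):
    """The same network, with every quantity held as a list of goat tokens."""
    pen = ["goat"] * (a + b)   # herd both inputs into one pen
    h_or = pen[:1]             # one goat passes if any goat is present
    h_and = pen[1:2]           # a second goat passes only if two are present
    out = h_or[len(h_and):]    # each AND goat blocks one OR goat
    return len(out)            # count the goats that reach the output

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_arithmetic(a, b), xor_goats(a, b))
```

Both functions agree on every input, yet neither version is more or less a case of "reasoning" than the other. Under these assumptions, the sketch suggests that the label comes from how the output is presented, not from the operations themselves.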

Watch whether the 315-paper audit gets a formal peer-reviewed publication, and whether any of the flagged research teams respond with revised experimental designs. If the methodology holds up to scrutiny and prompts visible corrections, it signals the critique has traction beyond a clever demo.

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: Microsoft · Age of Empires II · Language models

Read full story at The Decoder → (the-decoder.com)
