Modelwire

Trust, Lies, and Long Memories: Emergent Social Dynamics and Reputation in Multi-Round Avalon with LLM Agents

Researchers observed LLM agents playing repeated rounds of The Resistance: Avalon develop persistent reputation systems across games, with agents explicitly referencing past behavior and role-conditional trust patterns emerging organically. High-reputation players saw 46% higher team inclusion rates, suggesting agents internalize social memory to inform strategic decisions.

Modelwire context

Explainer

The more striking detail buried in the methodology is that these reputation systems were not designed in: there was no explicit memory module and no reputation score passed between rounds. Agents reconstructed social history from context alone, which suggests the behavior comes from the models themselves rather than from the scaffolding.

This sits in direct conversation with CoopEval (covered April 16), which found that LLM agents in social dilemmas such as the prisoner's dilemma default to defection rather than cooperation. That benchmark tested whether external mechanisms like reputation systems could restore cooperative equilibria. The Avalon study offers a partial answer: given a sufficiently rich social context and repeated play, reputation-like structure can emerge without being explicitly engineered. The gap worth noting is that Avalon is an adversarial deduction game, not a pure cooperation problem, so its trust dynamics are more complex than those in CoopEval's setup and the two sets of results don't map cleanly onto each other. Still, together they suggest the field is converging on repeated interaction as a key variable in how LLM agents develop social strategy.

The critical test is whether these reputation effects persist when agents are swapped to models with shorter effective context windows or when game history is deliberately truncated. If the 46% inclusion differential collapses under those conditions, the finding is about context length, not anything resembling genuine social memory.
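To make that test concrete, here is a minimal sketch of how the inclusion differential could be measured and how game history could be truncated before it reaches the agent. Everything in it is an assumption for illustration: the `Proposal` schema, the function names, and the reading of "46% higher" as a relative difference (rather than percentage points) are ours, not the paper's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """One team proposal (hypothetical schema, not taken from the paper)."""
    candidates: frozenset  # players who could have been picked
    team: frozenset        # players who were actually picked


def inclusion_rate(proposals, players):
    """Share of pick opportunities in which a player from `players` made the team."""
    opportunities = 0
    picks = 0
    for proposal in proposals:
        for name in players & proposal.candidates:
            opportunities += 1
            picks += name in proposal.team
    return picks / opportunities if opportunities else 0.0


def relative_lift(proposals, high_rep, low_rep):
    """Relative inclusion difference; 0.46 would correspond to '46% higher'."""
    low = inclusion_rate(proposals, low_rep)
    if low == 0:
        raise ValueError("low-reputation inclusion rate is zero; lift is undefined")
    return inclusion_rate(proposals, high_rep) / low - 1.0


def truncate_history(history, keep_last):
    """Keep only the most recent `keep_last` game summaries for the agent's context."""
    # history[-0:] would return the whole list, so handle zero explicitly.
    return history[-keep_last:] if keep_last > 0 else []
```

Running the same games with `truncate_history` at several values of `keep_last` and comparing `relative_lift` across those runs would show whether the differential shrinks as history is removed.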

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: The Resistance: Avalon · LLM agents

Modelwire Editorial

This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.

Modelwire summarizes; we don’t republish. The full content lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.
