Mapping how LLMs debate societal issues when shadowing human personality traits, sociodemographics and social media behavior

Researchers have constructed a 190,000-record synthetic dataset that reveals how 19 different LLMs shift their stances and reasoning when prompted to adopt specific human personas versus neutral AI roles. The Cognitive Digital Shadows corpus maps LLM outputs across four polarizing topics, encoding sociodemographic and psychological attributes alongside generated text. This work directly addresses a critical blind spot in AI governance: the degree to which language models amplify or moderate societal divisions based on contextual framing. For practitioners building dialogue systems or content moderation tools, the dataset exposes how persona-conditioning can systematically alter model behavior in ways that may not be obvious from standard benchmarks alone.

Modelwire context

Explainer

The dataset's real contribution isn't just scale (190,000 records across 19 models) but the deliberate encoding of psychological and sociodemographic attributes alongside outputs, which makes it possible to audit whether a model's stance shifts track demographic proxies in systematic ways rather than randomly.
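A minimal sketch of what such an audit might look like, assuming each record carries a model name, a role flag (persona versus neutral AI), one sociodemographic attribute, and a numeric stance score. The field names, the -1 to 1 stance scale, and the sample values are all hypothetical and are not taken from the corpus schema.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: field names, values, and the -1..1 stance scale are
# assumptions for illustration, not the corpus's actual schema.
records = [
    {"model": "model_a", "role": "neutral", "age_group": None, "stance": 0.0},
    {"model": "model_a", "role": "persona", "age_group": "18-29", "stance": 0.5},
    {"model": "model_a", "role": "persona", "age_group": "18-29", "stance": 0.25},
    {"model": "model_a", "role": "persona", "age_group": "60+", "stance": -0.5},
    {"model": "model_a", "role": "persona", "age_group": "60+", "stance": -0.25},
]

def stance_shift_by_attribute(records, attribute):
    """Mean persona stance minus the same model's neutral-role mean,
    keyed by (model, attribute value)."""
    baseline = defaultdict(list)
    grouped = defaultdict(list)
    for r in records:
        if r["role"] == "neutral":
            baseline[r["model"]].append(r["stance"])
        else:
            grouped[(r["model"], r[attribute])].append(r["stance"])

    shifts = {}
    for (model, value), stances in grouped.items():
        if model not in baseline:
            continue  # no neutral baseline to compare against
        shifts[(model, value)] = mean(stances) - mean(baseline[model])
    return shifts

shifts = stance_shift_by_attribute(records, "age_group")
for key, shift in sorted(shifts.items()):
    print(key, shift)
```

If shifts like these lined up consistently with an attribute across many models and topics, that would point to systematic tracking of demographic proxies. Shifts scattered around zero would point the other way.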

This sits in direct tension with 'Stable Behavior, Limited Variation: Persona Validity in LLM Agents for Urban Sentiment Perception,' published the same day, which found that persona prompting produces reliable within-persona behavior but minimal variation across personas. If that result also holds for polarizing societal topics, the stance shifts recorded in the Cognitive Digital Shadows corpus may reflect noise rather than meaningful divergence. The 'DPN-LE' work on dual personality neurons adds another wrinkle: if persona-driven behavior shifts are partly artifacts of overlapping neuron functions rather than clean semantic conditioning, the dataset's attributions may be harder to interpret than they appear.
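One rough way to frame the "stable within, little variation across" question is to compare the spread of stance scores inside each persona with the spread of the persona means. The sketch below assumes numeric stance scores grouped by persona. The persona labels and values are invented for illustration, and this is not the method used in either paper.

```python
from statistics import mean, pvariance

# Hypothetical stance scores per persona; labels and values are illustrative only.
scores_by_persona = {
    "persona_a": [0.5, 0.5, 0.75, 0.25],
    "persona_b": [0.5, 0.25, 0.75, 0.5],
}

def variance_split(scores_by_persona):
    """Return (average within-persona variance, variance of the persona means)."""
    within = mean([pvariance(s) for s in scores_by_persona.values()])
    between = pvariance([mean(s) for s in scores_by_persona.values()])
    return within, between

within, between = variance_split(scores_by_persona)
print("within:", within, "between:", between)
```

In this toy data the persona means are identical, so the between-persona variance is zero while some within-persona spread remains. A pattern like that in real data would suggest persona labels explain little of the observed stance variation.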

Watch whether independent researchers replicate stance-shift patterns from this corpus using a held-out model not in the original 19, since confirmation there would suggest the findings generalize beyond the specific models tested rather than reflecting idiosyncratic training choices.

Coverage we drew on

Stable Behavior, Limited Variation: Persona Validity in LLM Agents for Urban Sentiment Perception · arXiv cs.CL

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: Cognitive Digital Shadows · LLMs

Read full story at arXiv cs.CL (arxiv.org)
