Mapping how LLMs debate societal issues when shadowing human personality traits, sociodemographics and social media behavior

Researchers have constructed a 190,000-record synthetic dataset that reveals how 19 different LLMs shift their stances and reasoning when prompted to adopt specific human personas versus neutral AI roles. The Cognitive Digital Shadows corpus maps LLM outputs across four polarizing topics, encoding sociodemographic and psychological attributes alongside generated text. This work directly addresses a critical blind spot in AI governance: the degree to which language models amplify or moderate societal divisions based on contextual framing. For practitioners building dialogue systems or content moderation tools, the dataset exposes how persona-conditioning can systematically alter model behavior in ways that may not be obvious from standard benchmarks alone.
Modelwire context
Explainer: The dataset's real contribution isn't just scale (190,000 records across 19 models) but the deliberate encoding of psychological and sociodemographic attributes alongside outputs, which makes it possible to audit whether a model's stance shifts track demographic proxies in systematic ways rather than randomly.
This sits in direct tension with the finding from 'Stable Behavior, Limited Variation: Persona Validity in LLM Agents for Urban Sentiment Perception,' published the same day, which found that persona prompting produces reliable within-persona behavior but minimal variation across personas. If that result holds for polarizing societal topics too, the Cognitive Digital Shadows corpus may be capturing noise rather than meaningful divergence. The 'DPN-LE' work on dual personality neurons adds another wrinkle: if persona-driven behavior shifts are partly artifacts of overlapping neuron functions rather than clean semantic conditioning, the dataset's attributions may be harder to interpret than they appear.
Watch whether independent researchers replicate stance-shift patterns from this corpus using a held-out model not in the original 19, since confirmation there would suggest the findings generalize beyond the specific models tested rather than reflecting idiosyncratic training choices.
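To make the audit idea above concrete, here is a minimal sketch of one way to check whether stance differences between persona groups are larger than chance would produce. It is not the authors' method. The record layout, the field names `age_group` and `stance`, the numeric stance scale, and the toy values are all assumptions for illustration; the corpus's actual schema is not described in this summary. The check is a plain permutation test using only the Python standard library.

```python
import random
from statistics import mean

# Hypothetical toy records: one persona attribute and a numeric stance score.
# These values are invented for illustration and are NOT from the corpus.
records = [
    {"age_group": "young", "stance": s} for s in (0.8, 0.9, 0.7, 0.85, 0.75, 0.9)
] + [
    {"age_group": "old", "stance": s} for s in (-0.6, -0.7, -0.5, -0.8, -0.65, -0.55)
]


def group_gap(rows, attribute):
    """Return the largest difference in mean stance between attribute groups."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row["stance"])
    means = [mean(values) for values in groups.values()]
    return max(means) - min(means)


def permutation_p_value(rows, attribute, n_permutations=1000, seed=0):
    """Estimate how often shuffled persona labels give a gap at least as large."""
    rng = random.Random(seed)
    observed = group_gap(rows, attribute)
    labels = [row[attribute] for row in rows]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break any real link between persona and stance
        shuffled = [
            {attribute: label, "stance": row["stance"]}
            for label, row in zip(labels, rows)
        ]
        if group_gap(shuffled, attribute) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    print("observed gap:", group_gap(records, "age_group"))
    print("permutation p-value:", permutation_p_value(records, "age_group"))
```

A small p-value would suggest the stance gap tracks the persona attribute rather than arising at random; a large one would be consistent with the "limited variation" reading. Running the same check per model, including a held-out model, is one way the replication question could be approached.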
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: Cognitive Digital Shadows · LLMs
Modelwire Editorial
Modelwire summarizes, we don’t republish. The full content lives on arxiv.org.