Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms

Researchers introduced RedirectQA, a dataset that uses Wikipedia redirects to test how 13 LLMs handle factual recall across entity name variants, abbreviations, and misspellings. The work reveals that model outputs shift significantly with surface form, suggesting that memorization is fragile and name-dependent rather than a reflection of robust factual knowledge.

Modelwire context

Explainer

The deeper implication is not just that models are brittle under name variation. This brittleness undermines a common assumption in evaluation design: that a model which answers a factual question correctly has encoded the underlying fact, rather than a specific string pattern associated with it. RedirectQA stress-tests that assumption systematically across 13 models, a breadth that is less common than it sounds.
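To make the assumption concrete, here is a minimal sketch of a surface-form consistency check. It is not RedirectQA's actual protocol or scoring: the function name `surface_form_report`, the lenient substring match, the toy model, and the example name variants are all illustrative assumptions.

```python
from typing import Callable, Dict, List


def surface_form_report(
    ask: Callable[[str], str],
    template: str,
    surface_forms: List[str],
    gold: str,
) -> Dict[str, object]:
    """Ask the same factual question under each surface form and compare.

    `ask` is any callable mapping a prompt to a model answer (hypothetical;
    swap in a real model client). Scoring is a lenient substring match,
    which is an assumption, not the paper's metric.
    """
    if not surface_forms:
        raise ValueError("surface_forms must not be empty")

    per_form: Dict[str, bool] = {}
    for form in surface_forms:
        answer = ask(template.format(entity=form))
        # Correct if the gold string appears in the answer, case-insensitive.
        per_form[form] = gold.lower() in answer.lower()

    return {
        "per_form": per_form,
        "accuracy": sum(per_form.values()) / len(per_form),
        # True only if every variant received the same verdict.
        "consistent": len(set(per_form.values())) == 1,
    }


def toy_model(prompt: str) -> str:
    """Toy stand-in for a model that only 'knows' the canonical name."""
    return "Paris" if "France" in prompt else "I am not sure."


# Illustrative variants: canonical name, alternate name, misspelling.
report = surface_form_report(
    toy_model,
    "What is the capital of {entity}?",
    ["France", "French Republic", "Frnace"],
    gold="Paris",
)
print(report)
```

A model that had encoded the fact itself would be expected to score the same on every variant. The toy model answers correctly only for the canonical string, which is the kind of name-dependent recall that a single-form benchmark would count as knowledge.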

This connects directly to the hallucination and evaluation threads running through recent coverage. The HalluScope paper (covered the same day, 'When Prompts Override Vision') found that textual priors in prompts drive false outputs in vision-language models. RedirectQA is essentially the text-only analogue: surface form is a textual prior, and it shapes recall in ways that look like knowledge but are not. Both papers, arriving the same week, push toward the same uncomfortable conclusion: benchmark scores for factual accuracy may be measuring prompt pattern matching as much as genuine retrieval. The MathDuels work from the same batch grapples with the same problem, trying to separate what models actually know from what static benchmarks reward.

Watch whether any of the 13 tested models show significantly lower surface-form sensitivity after instruction tuning or RLHF updates in the next two release cycles. If the gap persists post-fine-tuning, that would suggest the fragility is baked into pretraining rather than fixable at alignment time.

Coverage we drew on

When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs · arXiv cs.CL

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.



Modelwire Editorial



Modelwire summarizes, we don’t republish. The full content lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.