SERE: Structural Example Retrieval for Enhancing LLMs in Event Causality Identification
Researchers propose SERE, a retrieval-augmented framework that addresses a critical failure mode in LLM reasoning: causal hallucination, where models overpredict causal relationships between events. The work combines few-shot learning with structural metrics drawn from ConceptNet and syntactic analysis, grounding event causality identification in concrete examples rather than learned biases. This tackles a fundamental problem in how LLMs reason about temporal and causal dependencies, with implications for information extraction, question answering, and knowledge graph construction pipelines that depend on accurate causal signals.
Modelwire context
Explainer
SERE's core insight is that causal hallucination stems not from factual gaps but from learned statistical biases that override actual event relationships. The framework doesn't add external knowledge; it uses structural metrics to weight which in-context examples the model attends to, making the reasoning process auditable rather than opaque.
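The idea of using structural metrics to decide which in-context examples reach the model can be illustrated with a minimal sketch. This is not SERE's implementation: the two features (`concept_dist`, `syntax_dist`), the Euclidean distance, the example pool, and every function name are assumptions made for illustration, standing in for whatever ConceptNet-based and syntactic metrics the paper actually defines.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Example:
    text: str
    concept_dist: float   # hypothetical: distance between the two events in a ConceptNet-like graph
    syntax_dist: float    # hypothetical: dependency-path length between the two event mentions
    label: str = "unknown"


def structural_distance(a: Example, b: Example) -> float:
    # Assumed metric: plain Euclidean distance over the two structural features.
    return hypot(a.concept_dist - b.concept_dist, a.syntax_dist - b.syntax_dist)


def retrieve(query: Example, pool: list[Example], k: int = 2) -> list[Example]:
    # Keep the k labelled examples that are structurally closest to the query.
    return sorted(pool, key=lambda ex: structural_distance(query, ex))[:k]


def build_prompt(query: Example, shots: list[Example]) -> str:
    # Few-shot prompt: retrieved examples first, the unlabelled query last.
    blocks = [f"Text: {ex.text}\nAnswer: {ex.label}" for ex in shots]
    blocks.append(f"Text: {query.text}\nAnswer:")
    return "\n\n".join(blocks)


# Toy pool with made-up feature values, for illustration only.
POOL = [
    Example("The storm hit. Flights were cancelled.", 1.0, 2.0, "causal"),
    Example("She ate lunch. The market closed.", 5.0, 6.0, "non-causal"),
    Example("The earthquake struck; buildings collapsed.", 1.0, 3.0, "causal"),
    Example("He read a book while it rained.", 4.0, 2.0, "non-causal"),
]
QUERY = Example("The dam failed. The valley flooded.", 1.0, 2.2)

SHOTS = retrieve(QUERY, POOL)
PROMPT = build_prompt(QUERY, SHOTS)
print(PROMPT)
```

Because the retrieved examples and their distances are plain data, the selection step can be inspected directly, which is the sense in which the article calls the process auditable.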
This connects directly to the hallucination detection work (LaaB, May 5), which treated neural uncertainty and self-reasoning as coupled signals. SERE takes a different angle: instead of detecting hallucination after the fact, it aims to prevent causal overconfidence by grounding predictions in concrete structural examples before inference. It also echoes the procedural faithfulness finding from May 1, which showed that LLMs lose track of constraints in multi-step tasks. Here, the constraint is causal plausibility, enforced via retrieval rather than training.
If SERE's performance gains hold on out-of-domain event datasets (domains not well represented in ConceptNet), that would suggest the approach generalizes. If the gains collapse on those splits, the method is likely inheriting ConceptNet's existing biases rather than fixing the underlying reasoning problem.
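The out-of-domain check described above amounts to comparing scores across two splits. Below is a minimal sketch, assuming binary labels and two hypothetical helpers, `prf` and `f1_gap`; the label lists are toy values for illustration, not results from the paper. Precision is reported alongside F1 because overpredicting causal links shows up as false positives.

```python
def prf(gold, pred, positive="causal"):
    """Precision, recall and F1 for the positive (causal) class."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def f1_gap(in_domain, out_of_domain):
    # Positive gap: the method scores worse on the out-of-domain split.
    return prf(*in_domain)[2] - prf(*out_of_domain)[2]


# Toy (gold, predicted) label lists, invented purely to exercise the functions.
IN_DOMAIN = (
    ["causal", "non-causal", "causal", "non-causal"],
    ["causal", "non-causal", "causal", "non-causal"],
)
OUT_OF_DOMAIN = (
    ["causal", "non-causal", "causal", "non-causal"],
    ["causal", "causal", "causal", "non-causal"],  # one overpredicted causal link
)

GAP = f1_gap(IN_DOMAIN, OUT_OF_DOMAIN)
print(prf(*OUT_OF_DOMAIN), GAP)
```

A gap near zero on a real out-of-domain split would be consistent with the generalization reading; a large gap would point toward the inherited-bias reading.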
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: SERE · ConceptNet · Large Language Models
Modelwire summarizes, we don’t republish. The full content lives on arxiv.org.