Research Models & Releases·arXiv cs.CL·5d ago

CorPipe at CRAC 2026: Empty Nodes and Cross-Lingual Transfer in Multilingual Coreference Resolution

CorPipe 26 advances multilingual coreference resolution by unifying empty node prediction with mention detection and coreference linking in a single model, achieving substantial gains over both LLM and unconstrained baselines at CRAC 2026. The system's 9.5-percentage-point margin over competing approaches signals that specialized architectures remain competitive against generative models on structured linguistic tasks, even as the shared task expands with 5 new datasets and 2 new languages. Cross-lingual zero-shot results suggest the approach generalizes across language families, which matters for teams building production NLP systems that must handle underrepresented languages without task-specific fine-tuning.

Modelwire context

Explainer

CorPipe 26's key contribution is collapsing three traditionally separate subtasks (empty node prediction, mention detection, and coreference linking) into a single end-to-end model. Empty nodes stand in for words that are omitted from the surface text, most often the dropped pronouns of pro-drop languages. Prior work handled them in a preprocessing step or a separate module; predicting them jointly with mentions and links changes how information flows during training and inference.
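To make the three outputs concrete, here is a minimal Python sketch of a single analysis object that holds empty nodes, mention spans, and antecedent links together. It is an illustration only, not CorPipe's implementation: the names `Node`, `Analysis`, `insert_empty_nodes`, and `clusters` are hypothetical, and the "predictions" are written by hand for a short Spanish example in which the subject of the second sentence is dropped.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    form: str
    is_empty: bool = False  # True for an inserted placeholder (e.g. a dropped pronoun)


@dataclass
class Analysis:
    nodes: list                                     # words plus inserted empty nodes
    mentions: list = field(default_factory=list)    # (start, end) node spans, inclusive
    antecedent: dict = field(default_factory=dict)  # mention index -> earlier mention index


def insert_empty_nodes(words, before):
    """Insert a placeholder node in front of each word index listed in `before`."""
    nodes = []
    for i, word in enumerate(words):
        if i in before:
            nodes.append(Node("_", is_empty=True))
        nodes.append(Node(word))
    return nodes


def clusters(analysis):
    """Follow antecedent links to group mention indices into coreference clusters."""
    root = {}
    for m in range(len(analysis.mentions)):
        # Antecedents always precede the mention, so their root is already known.
        root[m] = root[analysis.antecedent[m]] if m in analysis.antecedent else m
    groups = {}
    for m, r in root.items():
        groups.setdefault(r, []).append(m)
    return list(groups.values())


# Spanish: "María salió. Llegó tarde." (the subject of "Llegó" is dropped).
words = ["María", "salió", ".", "Llegó", "tarde", "."]
nodes = insert_empty_nodes(words, before={3})  # hand-picked; stands in for a model prediction
analysis = Analysis(
    nodes=nodes,
    mentions=[(0, 0), (3, 3)],  # "María" and the empty node
    antecedent={1: 0},          # the empty node refers back to "María"
)
print([n.form for n in analysis.nodes])
print(clusters(analysis))
```

The point of the sketch is that the empty node is an ordinary node in the sequence, so a mention span can cover it and a link can attach to it like any other mention. In a pipeline setup the node list would be fixed before mention detection runs; in a unified model all three fields would come from the same forward pass.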

This connects directly to the GRUFF benchmark from late May, which exposed how existing evaluations miss morphological complexity in non-English languages. CorPipe 26 addresses a related blind spot: coreference systems built on English assumptions often fail on languages where pronouns are routinely omitted. The zero-shot cross-lingual results here suggest that when you build for linguistic diversity from the ground up (rather than retrofitting English-first architectures), transfer becomes more reliable. That architectural principle echoes the multimodal work on carrier sensitivity from the same period, where surface-level alignment masks deeper structural mismatches.

If CorPipe 26 maintains its 9.5-point margin when evaluated on the two newly added languages at CRAC 2027 (expected early 2027), that would indicate the cross-lingual gains are genuine and not an artifact of the five existing datasets. If the margin drops below 5 points on the new languages, the zero-shot claim collapses and the model is effectively English-plus-transfer rather than truly multilingual.
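The two thresholds in that test can be written down as a small decision rule. The sketch below is a reading of the paragraph above, not a published evaluation protocol: the function name `interpret_margin` is hypothetical, and treating margins between 5 and 9.5 points as "inconclusive" is an assumption, since the text does not say how to read that range.

```python
def interpret_margin(margin_pp, hold=9.5, collapse=5.0):
    """Classify a margin (percentage points over the best competing system)."""
    if margin_pp >= hold:
        return "gains hold"
    if margin_pp < collapse:
        return "zero-shot claim in doubt"
    # Assumption: the source gives no reading for margins in between.
    return "inconclusive"


# Hypothetical margins, for illustration only.
for margin in (10.2, 7.0, 3.1):
    print(margin, "->", interpret_margin(margin))
```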

Coverage we drew on

GRUFF: LLM Pronoun Fidelity, Reasoning, and Biases in German · arXiv cs.CL

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: CorPipe 26 · CRAC 2026 · Multilingual Coreference Resolution
