Research Models & Releases · arXiv cs.CL · May 20

Do LLMs Know What Luxembourgish Borrows? Probing Lexical Neology in Low-Resource Multilingual Models

Researchers have exposed a significant gap in multilingual LLM performance on a task that matters for real-world deployment: distinguishing native words from borrowings in low-resource languages. The new LexNeo-Bench benchmark, built from Luxembourgish news data, reveals that state-of-the-art models perform barely above random chance at classifying lexical borrowings without external context. This finding challenges the assumption that multilingual models understand linguistic community norms around word adoption and neology, raising questions about their reliability for writing assistance in minority languages where lexical precision carries cultural weight.

Modelwire context

Explainer

The finding isn't just that models fail on borrowings, but that they fail precisely when forced to rely on linguistic intuition rather than surface statistics. Models trained on mixed-language corpora don't internalize the cultural and linguistic norms that native speakers use to judge whether a word 'belongs' in a language.

This connects directly to the May shared task on multilingual coreference resolution, which expanded to 19 languages with a focus on long-range entity chains. Both efforts expose the same underlying problem: multilingual models excel at pattern-matching within local windows but struggle with language-specific structural reasoning that requires deeper linguistic knowledge. The borrowing classification task is narrower but more pointed, showing that even word-level decisions (not just discourse-level ones) reveal gaps in how models represent multilingual competence.

If the LexNeo-Bench authors release results showing that fine-tuning on Luxembourgish morphological or etymological features lifts accuracy to 80% or higher, that would suggest the issue is representational rather than architectural. If performance stays near random even with such fine-tuning, the problem likely runs deeper, into how multilingual models partition their embedding space.
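To make "barely above random chance" and "80%+ accuracy" concrete, here is a minimal sketch of how one might check whether a classifier's accuracy is distinguishable from guessing. It is not the LexNeo-Bench evaluation protocol: the two-way native/borrowed label set, the 0.5 chance rate, and the function names `accuracy` and `p_value_above_chance` are all assumptions made for illustration.

```python
from math import exp, lgamma, log


def accuracy(gold, pred):
    """Fraction of items where the predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def p_value_above_chance(n_correct, n_total, chance=0.5):
    """One-sided exact binomial test: P(X >= n_correct) if every answer were a guess.

    `chance` is the assumed probability of guessing one item correctly.
    0.5 matches a balanced two-way split, which is an assumption here.
    """
    if not 0 < chance < 1:
        raise ValueError("chance must be strictly between 0 and 1")
    if not 0 <= n_correct <= n_total:
        raise ValueError("n_correct must be between 0 and n_total")
    total = 0.0
    for k in range(n_correct, n_total + 1):
        # Binomial probability of exactly k correct, computed in log space
        # so that large test sets do not overflow.
        log_pmf = (
            lgamma(n_total + 1) - lgamma(k + 1) - lgamma(n_total - k + 1)
            + k * log(chance) + (n_total - k) * log(1 - chance)
        )
        total += exp(log_pmf)
    return min(total, 1.0)  # guard against tiny floating-point overshoot


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the benchmark.
    gold = ["native", "borrowed", "native", "borrowed"]
    pred = ["native", "native", "native", "borrowed"]
    print(accuracy(gold, pred))
    # 52 correct out of 100 versus 80 correct out of 100, under a 0.5 chance rate.
    print(p_value_above_chance(52, 100))
    print(p_value_above_chance(80, 100))
```

Under these assumptions, a score like 52 out of 100 is statistically indistinguishable from guessing, while 80 out of 100 is not. If the real label distribution is skewed, the `chance` argument should be set to the majority-class rate instead.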

Coverage we drew on

Findings of the Fifth Shared Task on Multilingual Coreference Resolution: Expanding Datasets for Long-Range Entities · arXiv cs.CL

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: LexNeo-Bench · LuxBorrow · Luxembourgish · multilingual LLMs

