Research · arXiv cs.CL · 1d ago

Rethinking the Idiomaticity Decomposability Hypothesis: Evidence from Distributional Learning

Researchers challenge the idiomaticity decomposability hypothesis, a foundational linguistic theory about idioms, using contextualised language models as experimental instruments. The hypothesis holds that decomposability (how much each word contributes to an idiom's meaning) predicts syntactic flexibility. The study finds instead that distributional learning patterns explain idiom behavior better. The work matters because it shows how neural language models internalize figurative language during pretraining, offering a new lens on what drives model robustness with non-literal expressions. It bridges cognitive linguistics and modern NLP, suggesting that raw exposure frequency shapes how systems handle idiomatic variation more than compositional structure does.

Modelwire context

Explainer

The study doesn't just say decomposability fails to predict idiom flexibility. It proposes a specific alternative mechanism: that raw frequency patterns during pretraining, not compositional structure, determine how robustly models handle idiomatic variation. This shifts the explanatory burden from linguistic structure to data statistics.

This connects to the multi-domain RL work from early June, which revealed that overlapping computational pathways in language models can reinforce or sabotage each other depending on parameter direction. Here, we see a similar insight: idiom handling isn't determined by surface-level linguistic properties but by how the model's internal representations were shaped during training. Both papers suggest that model behavior emerges from learning dynamics rather than from the structure of the input itself. The idiom finding also echoes the FRANZ audit framework's emphasis on how models frame outputs based on learned associations rather than explicit rules.

If researchers can show that artificially skewing pretraining frequency distributions (e.g., oversampling certain idiom variants) predictably changes model flexibility on held-out idioms, that would strongly support the distributional learning hypothesis. If, on the other hand, decomposability resurfaces as a predictor in models trained on frequency-balanced data, the decomposability hypothesis survives.
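The frequency-skewing intervention described above can be sketched in a few lines. This is a minimal illustration of the data-preparation step only, not the paper's method: the function name `oversample_variants`, its parameters, and the three-sentence corpus are all hypothetical, and simple substring matching stands in for whatever idiom detection a real study would use.

```python
import random
from collections import Counter


def oversample_variants(corpus, variants, factor, seed=0):
    """Return a shuffled copy of `corpus` in which every sentence that
    contains one of `variants` appears `factor` times instead of once.

    Hypothetical helper: matching is a case-insensitive substring check.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    targets = [v.lower() for v in variants]
    skewed = []
    for sentence in corpus:
        text = sentence.lower()
        copies = factor if any(t in text for t in targets) else 1
        skewed.extend([sentence] * copies)
    # Shuffle with a fixed seed so repeated copies are not adjacent
    # and the result is reproducible.
    random.Random(seed).shuffle(skewed)
    return skewed


# Toy corpus invented for illustration; it is not data from the study.
corpus = [
    "She spilled the beans about the surprise party.",
    "The beans were spilled before anyone could stop him.",
    "They planted beans in the garden.",
]

# Oversample the passivised variant three-fold.
skewed = oversample_variants(corpus, ["the beans were spilled"], factor=3)
counts = Counter(skewed)
```

A real test of the hypothesis would then pretrain one model on the original corpus and one on the skewed corpus and compare their flexibility on held-out idioms. That step is outside the scope of this sketch.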

Coverage we drew on

A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL · arXiv cs.CL

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: Contextualised language models · Idiomaticity decomposability hypothesis

