Modelwire

Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus

Researchers tested whether semantic relationships between text embeddings survive machine translation, using 2,800+ political manifestos across 28 languages translated via EU eTranslation. By measuring inter-model disagreement as a calibration baseline, they identified which languages preserve embedding structure through translation and which degrade it. The finding matters for practitioners deploying multilingual NLP systems: translation fidelity varies sharply by language pair and embedding model, suggesting that cross-lingual semantic search and similarity tasks require language-specific validation rather than assuming invariance.

Modelwire context

Explainer

The paper's key contribution isn't just that translation degrades embeddings (known), but that it does so inconsistently by language pair and model. This means you can't apply a single cross-lingual strategy; you need language-specific validation before deploying multilingual semantic search.
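What "language-specific validation" can look like in practice: embed the same documents before and after translation, then check whether the pairwise similarity structure survives. Below is a minimal numpy sketch of that idea, not the paper's actual metric; the function names and the toy data are illustrative assumptions.

```python
import numpy as np

def pairwise_cosine(E):
    # Row-normalize, then the Gram matrix gives cosine similarities.
    U = E / np.clip(np.linalg.norm(E, axis=1, keepdims=True), 1e-12, None)
    return U @ U.T

def invariance_score(E_orig, E_trans):
    # Correlate the off-diagonal similarity structure of the two
    # embedding sets; 1.0 means the relational structure between
    # documents is perfectly preserved through translation.
    iu = np.triu_indices(len(E_orig), k=1)
    s1 = pairwise_cosine(E_orig)[iu]
    s2 = pairwise_cosine(E_trans)[iu]
    return float(np.corrcoef(s1, s2)[0, 1])

# Sanity check with synthetic "embeddings": an orthogonal rotation of
# the space leaves all cosine similarities intact (score ~1.0), while
# additive noise -- a stand-in for translation drift -- lowers it.
rng = np.random.default_rng(0)
E = rng.normal(size=(8, 16))
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # random orthogonal matrix
print(round(invariance_score(E, E @ Q), 6))     # ~1.0
print(invariance_score(E, E + rng.normal(size=E.shape)) < 1.0)  # True
```

Running the same score per language pair and per embedding model, and comparing it against the inter-model disagreement baseline the paper describes, is one way to decide which pairs are safe to deploy.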

This connects directly to the ML-Bench&Guard work from May 1st, which flagged that existing multilingual systems rely on machine translation without validating whether semantic meaning survives the conversion. That paper focused on safety guardrails; this one measures the underlying embedding fidelity problem that makes cross-lingual safety enforcement unreliable in the first place. Together they suggest that practitioners building systems across language borders face a two-layer validation burden: first confirming that embeddings preserve meaning, then confirming that safety semantics transfer correctly.

If EU eTranslation or a major embedding provider (OpenAI, Cohere, Mistral) publishes language-pair-specific performance cards or recommends language-specific retraining by Q4 2026, that signals the industry is operationalizing this finding. If they don't, multilingual deployments will continue treating translation as a transparent layer, and failures will accumulate quietly in production.

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: EU eTranslation · Manifesto Corpus


Modelwire Editorial

This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.

Modelwire summarizes; we don’t republish. The full content lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.

Related

Beyond Decodability: Reconstructing Language Model Representations with an Encoding Probe

arXiv cs.CL

ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models

arXiv cs.CL

MIT study explains why scaling language models works so reliably

The Decoder