A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

Researchers have identified why multi-domain reinforcement learning on language models causes performance collapse in untrained domains, even when gradient conflicts appear minimal. The work shows that different domains share overlapping computational pathways, where small parameter updates can either reinforce or sabotage each other depending on the direction of the update. This finding challenges the prevailing catastrophic forgetting narrative and opens new avenues for training LLMs across multiple capabilities without trade-offs. Those trade-offs are a persistent bottleneck in post-training, affecting reasoning, coding, and creative tasks simultaneously.

Modelwire context

Explainer

The key contribution isn't just diagnosing interference but formalizing it mathematically: the paper models cross-domain effects as local perturbations, which means practitioners could eventually predict, before training, which domain combinations are likely to sabotage each other rather than discovering the damage after the fact.
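To make the idea of a local perturbation concrete, here is a minimal toy sketch in plain Python. It is not the paper's formulation: the quadratic losses, the matrices `H_A` and `H_B`, the centers, and the learning rate `lr` are all hypothetical stand-ins chosen for illustration. The sketch takes one gradient step on a "domain A" loss and expands the resulting change in a "domain B" loss to second order. It shows one generic way interference could arise even when the two gradients do not conflict at all: the first-order term vanishes, but curvature in domain B along the update direction still raises its loss.

```python
# Toy illustration only: quadratic losses stand in for two RL domains.
# Every name and number below is hypothetical, not taken from the paper.

def matvec(H, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def loss(H, center, theta):
    """Quadratic loss 0.5 * (theta - center)^T H (theta - center)."""
    d = [t - c for t, c in zip(theta, center)]
    return 0.5 * dot(d, matvec(H, d))

def grad(H, center, theta):
    """Gradient of the quadratic loss: H (theta - center)."""
    d = [t - c for t, c in zip(theta, center)]
    return matvec(H, d)

def predicted_change(g_a, g_b, H_b, lr):
    """Taylor terms for the change in loss B after a step of -lr * g_a."""
    first = -lr * dot(g_b, g_a)                          # gradient alignment
    second = 0.5 * lr ** 2 * dot(g_a, matvec(H_b, g_a))  # curvature of B
    return first, second

theta = [0.0, 0.0]
H_A, c_A = [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0]   # domain A (trained)
H_B, c_B = [[10.0, 0.0], [0.0, 1.0]], [0.0, 1.0]  # domain B (untrained)
lr = 0.1

g_a = grad(H_A, c_A, theta)
g_b = grad(H_B, c_B, theta)
first, second = predicted_change(g_a, g_b, H_B, lr)

# Apply the domain-A update and measure what actually happens to domain B.
theta_new = [t - lr * g for t, g in zip(theta, g_a)]
actual = loss(H_B, c_B, theta_new) - loss(H_B, c_B, theta)

print("gradient dot product:", dot(g_a, g_b))
print("first-order term:", first)
print("second-order term:", second)
print("actual change in loss B:", actual)
```

In this toy setup the gradients are exactly orthogonal, so a check based only on gradient conflict would report no problem, yet the domain B loss rises by roughly 0.05 after the step. Because the losses are quadratic, the second-order expansion matches the measured change; real models would only satisfy this approximately and only for small steps.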

This connects directly to two threads running through recent coverage. CRAM, published the same day, tackles the same underlying problem in multimodal continual learning by routing task-specific patterns into isolated expert modules to prevent shared-parameter collapse. The perturbation theory here offers a potential theoretical foundation for why routing-based approaches like CRAM work when they do. Similarly, AgentCL's concern about whether agents genuinely accumulate knowledge without task interference maps onto the same failure surface, just evaluated behaviorally rather than mechanistically. Together, these three papers suggest the field is converging on interference as the central unsolved problem in post-training, approaching it from architecture, evaluation, and now theory simultaneously.

If follow-up work applies this perturbation framework to predict interference before training and those predictions hold across at least two independent multi-domain RL setups, the theory has practical traction. If it only explains outcomes retroactively, it remains a diagnostic tool rather than a design one.

Coverage we drew on

CRAM: Centroid-Routing and Adaptive MoE for Multimodal Continual Instruction Tuning · arXiv cs.CL

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: Large Language Models · Reinforcement Learning · Multi-Domain Training

Read the full story at arXiv cs.CL (arxiv.org)
