A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

Researchers have identified why multi-domain reinforcement learning on language models causes performance collapse in untrained domains, even when gradient conflicts appear minimal. The work reveals that different domains share overlapping computational pathways, where small parameter updates can either reinforce or sabotage each other depending on their direction. This finding challenges the prevailing catastrophic forgetting narrative and opens new avenues for training LLMs across multiple capabilities without trade-offs. Those trade-offs are a persistent bottleneck in post-training, affecting reasoning, coding, and creative tasks simultaneously.
Modelwire context
Explainer
The key contribution isn't just diagnosing interference but formalizing it mathematically: the paper models cross-domain effects as local perturbations, which means practitioners could eventually predict, before training, which domain combinations are likely to sabotage each other rather than discovering the damage after the fact.
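To make the "local perturbation" idea concrete, here is a minimal toy sketch of how an update can hurt another domain even when a gradient-conflict check reports nothing. It uses a generic second-order Taylor estimate of the loss change, not the paper's actual formulation, and every name and number in it (`g_B`, `H_B`, `delta_A`, `predicted_change`) is a hypothetical chosen for illustration.

```python
# Illustrative toy only: g_B, H_B and delta_A are made-up values,
# not quantities taken from the paper.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, v) for row in M]

def predicted_change(g, H, delta):
    """Second-order Taylor estimate of a loss change after a step delta.

    Returns the first-order term g.delta and the curvature term
    0.5 * delta.H.delta separately.
    """
    first_order = dot(g, delta)
    second_order = 0.5 * dot(delta, matvec(H, delta))
    return first_order, second_order

# Hypothetical domain-B gradient and curvature at the current parameters.
g_B = [1.0, 0.0]
H_B = [[1.0, 0.0],
       [0.0, 50.0]]  # sharp curvature along the second axis

# Hypothetical update from training on domain A. It is orthogonal to g_B,
# so a dot-product (or cosine) conflict check sees no conflict at all.
delta_A = [0.0, 0.2]

first, second = predicted_change(g_B, H_B, delta_A)
total = first + second

print("first-order term:", first)
print("curvature term:  ", second)
print("estimated change in domain-B loss:", total)
```

In this toy setting the first-order term vanishes while the curvature term is positive, so the estimated domain-B loss rises. Whether the paper attributes interference to curvature, to shared pathways in some other sense, or to both is not something the summary above settles.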
This connects directly to two threads running through recent coverage. CRAM (published the same day) tackles the same underlying problem in multimodal continual learning, routing task-specific patterns into isolated expert modules to prevent shared-parameter collapse. The perturbation theory here offers a potential theoretical foundation for why routing-based approaches like CRAM work when they do. Similarly, AgentCL's concern about whether agents genuinely accumulate knowledge without task interference maps onto the same failure surface, just evaluated behaviorally rather than mechanistically. Together, these three papers suggest the field is converging on interference as the central unsolved problem in post-training, approaching it from architecture, evaluation, and now theory simultaneously.
If follow-up work applies this perturbation framework to predict interference before training and those predictions hold across at least two independent multi-domain RL setups, the theory has practical traction. If it only explains outcomes retroactively, it remains a diagnostic tool rather than a design one.
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: Large Language Models · Reinforcement Learning · Multi-Domain Training
Modelwire Editorial
This synthesis and analysis was prepared by the Modelwire editorial team.
Modelwire summarizes, we don’t republish. The full content lives on arxiv.org.