Forgetting That Sticks: Quantization-Permanent Unlearning via Circuit Attribution

A new research finding exposes a critical gap between unlearning claims and deployed reality: quantized models routinely recover supposedly forgotten information. The work identifies a fundamental mismatch between gradient-based forgetting techniques and the compression applied to virtually every production LLM: the per-parameter updates those techniques make are orders of magnitude smaller than quantization bin widths, so round-to-nearest compression simply erases them. This sparsity-permanence tradeoff, in which the small, sparse updates unlearning methods favor are exactly the ones quantization discards, means current unlearning evaluations are unreliable guides to real-world behavior, forcing the field to rethink both its evaluation protocols and how to build forgetting that survives compression.
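To make the bin-width argument concrete, here is a minimal NumPy sketch of the failure mode (ours, not the paper's code or MANSU). It applies sparse unlearning-style updates of an assumed magnitude, roughly 1e-4, to a toy weight vector, then quantizes with symmetric round-to-nearest int4 as a simple stand-in for production schemes like NF4. All magnitudes are illustrative assumptions, not figures from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weight row at roughly the scale of a trained LLM layer (assumed).
w = rng.normal(0.0, 0.02, size=4096).astype(np.float32)

# Simulated unlearning: sparse per-parameter updates (assumed ~1e-4),
# far smaller than one 4-bit quantization bin.
delta = np.zeros_like(w)
idx = rng.choice(w.size, size=64, replace=False)
delta[idx] = rng.normal(0.0, 1e-4, size=64).astype(np.float32)

# Symmetric absmax int4: codes in -7..7, so bin width = absmax / 7.
scale = float(np.abs(w).max()) / 7.0
print(f"bin width ~ {scale:.1e}, median |update| ~ "
      f"{np.median(np.abs(delta[idx])):.1e}")

def int4_codes(x: np.ndarray, s: float) -> np.ndarray:
    """Round-to-nearest signed 4-bit codes under a fixed absmax scale."""
    return np.clip(np.round(x / s), -7, 7).astype(np.int8)

# Updates much narrower than a bin round away: the "unlearned" weights
# map back to (almost exactly) the original codes after quantization.
same = np.mean(int4_codes(w, scale) == int4_codes(w + delta, scale))
print(f"fraction of int4 codes unchanged by unlearning: {same:.4f}")
```

With these assumed magnitudes the bin width lands near 1e-2, roughly two orders of magnitude above the updates, so essentially every quantized weight is bit-identical before and after "forgetting": the regime the summary above describes.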
Modelwire context
Explainer
The buried implication here is legal and regulatory, not just technical: organizations citing machine unlearning to satisfy data deletion requests under GDPR or similar frameworks may be relying on guarantees that evaporate the moment a model is compressed for deployment. The gap between a research prototype and a shipped product is precisely where this failure lives.
This story is largely disconnected from recent activity in our archive: we have no prior coverage of machine unlearning research or quantization methods to anchor it to. It belongs to a cluster of work questioning whether safety and compliance properties demonstrated on full-precision models survive the engineering decisions made before those models reach users. Whether alignment and safety interventions are robust to post-training modifications has been a recurring tension in the field, but we have not yet covered it directly. This paper sharpens that tension considerably by offering a mechanistic account rather than a merely empirical observation.
Watch whether any of the major quantization library maintainers (bitsandbytes, llama.cpp, or similar) respond with unlearning-aware quantization schemes within the next two quarters. If MANSU or a comparable method gets integrated into a production unlearning audit, that would signal the research has crossed into compliance tooling.
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: MANSU · NF4 quantization · post-training quantization
Modelwire Editorial
This synthesis and analysis were prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.
Modelwire summarizes; we don’t republish. The full content lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.