
Forgetting That Sticks: Quantization-Permanent Unlearning via Circuit Attribution
A new research finding exposes a critical gap between unlearning claims and deployed reality: quantized models routinely recover supposedly forgotten information. The work identifies a fundamental mismatch between gradient-based forgetting techniques and the compression methods applied to every production LLM, showing that per-parameter updates are orders of magnitude smaller than quantization bin widths. This sparsity-permanence tradeoff means current unlearning evaluations are misleading benchmarks for real-world systems, forcing the field to rethink both evaluation protocols and forgetting methods that survive compression.62

























