Neural Garbage Collection: Learning to Forget while Learning to Reason

Researchers propose Neural Garbage Collection, a technique that trains language models to selectively discard KV cache entries during chain-of-thought reasoning rather than relying on hand-crafted pruning rules. Because the eviction behavior is learned end to end, the approach could unlock longer reasoning chains by easing the memory bottleneck that currently limits how far such chains can be extended.

Modelwire context

Explainer

The key distinction here is the word 'learned'. Most existing KV cache pruning schemes use fixed heuristics such as attention score thresholds or recency windows. This work instead trains the model itself to decide what to discard, which makes forgetting a first-class part of the reasoning process rather than a post-hoc optimization.
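To make the contrast concrete, here is a minimal Python sketch of the two fixed heuristics named above next to a stand-in for a learned policy. It is not the paper's method: the summary does not describe the architecture or training procedure, so the logistic gate, the per-entry features, and every function name and number below are assumptions chosen for illustration, with hand-set weights standing in for parameters that training would produce.

```python
import math
from typing import List, Sequence


def recency_window_keep(num_entries: int, window: int) -> List[int]:
    """Fixed heuristic: keep only the most recent `window` cache positions."""
    start = max(0, num_entries - window)
    return list(range(start, num_entries))


def attention_threshold_keep(attn_scores: Sequence[float], threshold: float) -> List[int]:
    """Fixed heuristic: keep positions whose attention score meets a threshold."""
    return [i for i, score in enumerate(attn_scores) if score >= threshold]


def learned_gate_keep(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float,
) -> List[int]:
    """Hypothetical stand-in for a learned policy: a logistic gate per entry.

    In a trained system the weights and bias would come from optimization;
    here they are hand-set, which is an assumption made for illustration.
    """
    keep = []
    for i, feats in enumerate(features):
        logit = sum(w * x for w, x in zip(weights, feats)) + bias
        keep_prob = 1.0 / (1.0 + math.exp(-logit))
        if keep_prob >= 0.5:
            keep.append(i)
    return keep


# Toy cache of six entries with made-up attention scores.
scores = [0.40, 0.02, 0.05, 0.30, 0.03, 0.20]
# Hypothetical features per entry: [attention score, 1.0 if among the last three].
features = [[s, 1.0 if i >= 3 else 0.0] for i, s in enumerate(scores)]

print(recency_window_keep(len(scores), window=3))          # [3, 4, 5]
print(attention_threshold_keep(scores, threshold=0.1))     # [0, 3, 5]
print(learned_gate_keep(features, weights=[4.0, 2.0], bias=-1.5))  # [0, 3, 4, 5]
```

The two heuristics each apply one rule to every entry, while the gate weighs several signals at once and its parameters can be adjusted by training, which is the property the 'learned' label refers to.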

The closest thread in recent coverage is IG-Search (arXiv cs.CL, April 16), which also tries to make intermediate reasoning steps more efficient, in that case by rewarding only search queries that genuinely improve answer confidence. Both papers work against the same underlying constraint: chain-of-thought reasoning is expensive per step, and the cost compounds as the chain grows. K-Token Merging, from the same week, approaches a related pressure point from the compression side, collapsing token sequences in latent space before they ever enter the model's full attention stack. Neural Garbage Collection is the runtime complement to that static compression idea: rather than compressing inputs upfront, it prunes the working memory mid-reasoning. Together, these papers sketch a broader effort to make long-horizon reasoning tractable without simply scaling hardware.
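A back-of-the-envelope calculation shows why the cost compounds. A standard KV cache stores one key vector and one value vector per layer and per key-value head for every token, so its size grows linearly with the length of the reasoning chain. The model dimensions and the keep fraction in the sketch below are hypothetical and are not taken from the paper.

```python
def kv_cache_bytes(
    seq_len: int,
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    bytes_per_value: int = 2,  # 2 bytes per value assumes fp16 or bf16 storage
) -> int:
    """Size of a standard KV cache for one sequence, in bytes."""
    # Factor of 2: each layer stores both a key and a value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only to make the arithmetic concrete.
config = dict(num_layers=32, num_kv_heads=8, head_dim=128)

per_token = kv_cache_bytes(1, **config)
full_chain = kv_cache_bytes(32_768, **config)
# Assumed keep fraction of 25% after pruning; the paper's actual savings are not stated here.
pruned_chain = kv_cache_bytes(32_768 // 4, **config)

print(per_token // 1024, "KiB per token")           # 128 KiB per token
print(full_chain // 1024**3, "GiB at 32,768 tokens")  # 4 GiB at 32,768 tokens
print(pruned_chain // 1024**3, "GiB if 25% of entries are kept")  # 1 GiB
```

Under these assumptions every additional reasoning token adds 128 KiB per sequence, and each new token also attends over everything still in the cache, so evicting entries reduces both memory and per-step attention work.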

The credibility test is whether the learned gating generalizes across reasoning benchmarks beyond the paper's own evaluations. If an independent group reproduces the memory savings on MATH or ARC-AGI without accuracy regression, the end-to-end training claim holds; if accuracy degrades on harder problems, the model is likely learning to forget things that look redundant but aren't.

Coverage we drew on

IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning · arXiv cs.CL

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Read the full story at arXiv cs.LG (arxiv.org)
