Stability and Generalization in Looped Transformers

Researchers introduce a fixed-point framework for analyzing looped transformers, which scale test-time compute by iterating a shared block. The work proves that loop architectures without recall cannot achieve strong input dependence, while recall combined with outer normalization yields stable, reachable fixed points that support meaningful, input-dependent predictions.
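The summary doesn't spell out the architecture, so the following is a minimal sketch under stated assumptions: "recall" is read as re-injecting the original input at every loop iteration, and "outer normalization" as rescaling the iterate after each pass of the shared block. The names `LoopedBlock` and `run_to_fixed_point` are hypothetical, not from the paper.

```python
import torch
import torch.nn as nn

class LoopedBlock(nn.Module):
    """One shared transformer-style block applied repeatedly (looped).
    Assumption: a linear map stands in for attention to keep the sketch short."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.mixer = nn.Linear(dim, dim)  # stand-in for the attention sublayer
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, h: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # "Recall" (assumed reading): the original input x is re-injected at
        # every iteration, so the fixed point stays a function of x instead
        # of collapsing to an input-independent attractor.
        h = h + self.mixer(h) + x
        h = h + self.mlp(h)
        # "Outer normalization" (assumed reading): rescale after the block to
        # keep iterates bounded, encouraging contraction toward a stable
        # fixed point.
        return h / h.norm(dim=-1, keepdim=True).clamp(min=1.0)

def run_to_fixed_point(block: LoopedBlock, x: torch.Tensor,
                       max_iters: int = 64, tol: float = 1e-5):
    """Iterate the shared block until h stops changing (approximate fixed point)."""
    h = torch.zeros_like(x)
    for t in range(max_iters):
        h_next = block(h, x)
        if (h_next - h).norm() < tol:  # converged within tolerance
            return h_next, t + 1
        h = h_next
    return h, max_iters

# Usage: more loop iterations = more test-time compute on the same weights.
x = torch.randn(1, 64)
h_star, iters = run_to_fixed_point(LoopedBlock(64), x)
print(f"reached approximate fixed point in {iters} iterations")
```

In this reading, the normalization supplies the contraction that makes the fixed point stable and reachable, while recall is what keeps that fixed point dependent on the input x, matching the paper's claimed division of labor between the two components.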
Mentions: Looped Transformers · Fixed-point iteration
Read full story at arXiv cs.LG → (arxiv.org)
Modelwire summarizes — we don’t republish. The full article lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.