Baidu scales OCR to dozens of pages by rethinking attention memory

Baidu has cracked a fundamental scaling problem in document processing by redesigning how transformer models manage context memory. Unlimited OCR processes 3x more pages per inference pass than prior systems while maintaining a constant memory footprint, which it achieves through a modified attention mechanism inspired by human memory decay. The breakthrough addresses a critical bottleneck for enterprise document workflows and signals that architectural innovation, not just scale, remains a lever for capability gains in specialized domains.
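To make the constant-memory idea concrete, here is a minimal sketch of a bounded store with decay-based eviction. It is not Baidu's published method, and the summary above gives no implementation details. The class `DecayingMemory`, the helper `peak_memory`, the `capacity` and `decay` parameters, and the reinforce-on-attention rule are all hypothetical choices made for illustration. The sketch shows only the general property described: the number of retained entries stops growing once a cap is reached, however many tokens are processed.

```python
from dataclasses import dataclass, field


@dataclass
class DecayingMemory:
    """Fixed-capacity store whose entries fade unless they are reinforced.

    Illustrative only: every name and parameter here is an assumption,
    not a description of Unlimited OCR's actual mechanism.
    """
    capacity: int          # hypothetical cap on retained entries
    decay: float = 0.9     # hypothetical per-step retention multiplier
    scores: dict = field(default_factory=dict)  # entry id -> retention score

    def step(self, new_id, attended_ids=()):
        # Fade every stored entry by the decay factor.
        for key in self.scores:
            self.scores[key] *= self.decay
        # Reinforce entries that the current step attended to.
        for key in attended_ids:
            if key in self.scores:
                self.scores[key] += 1.0
        # Admit the new entry at full strength.
        self.scores[new_id] = 1.0
        # Evict the weakest entries until the store fits the cap again.
        while len(self.scores) > self.capacity:
            weakest = min(self.scores, key=self.scores.get)
            del self.scores[weakest]


def peak_memory(num_tokens, capacity):
    """Feed num_tokens entries through the store and report its peak size."""
    memory = DecayingMemory(capacity=capacity)
    peak = 0
    for token_id in range(num_tokens):
        # Demo assumption: entry 0 (say, a page header) is attended at every step.
        memory.step(token_id, attended_ids=(0,))
        peak = max(peak, len(memory.scores))
    return peak, memory


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        peak, _ = peak_memory(n, capacity=64)
        print(f"{n} tokens processed, peak entries held: {peak}")
```

In this toy model an unbounded cache would hold one entry per token, while the decaying store levels off at `capacity`. Entries that keep being attended to survive, and stale ones are dropped first, which is the intuition the "forgetting" framing points at.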
Modelwire context
Analyst take: The framing around 'human memory decay' is doing a lot of work here. What actually matters is that Baidu is demonstrating a production-viable path to longer document context without proportional compute cost, which is the constraint that has kept most enterprise OCR pipelines chunked and fragile.
This connects directly to the reading order inference work covered on July 1st, which identified a different but adjacent bottleneck: even when OCR produces accurate text, structural ambiguity in complex layouts limits downstream usability. Baidu's throughput gains only compound in value if the text being processed is correctly ordered and segmented. Together, these two pieces sketch a fuller picture of where document digitization still breaks. The clinical NLP production paper from the same week is also relevant: it showed that multi-stage LLM pipelines in regulated domains hit real scaling walls when failure modes fragment, a warning that applies directly to enterprise document workflows that would adopt Unlimited OCR at volume.
Watch whether any major cloud document processing vendor, such as Adobe, AWS Textract, or Google Document AI, announces a comparable constant-memory attention variant within the next two quarters. If none responds, that suggests Baidu's architectural approach is harder to replicate than the announcement implies.
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: Baidu · Unlimited OCR · The Decoder
Modelwire Editorial
This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.
Modelwire summarizes, we don’t republish. The Decoder originally reported this story as “Baidu's "Unlimited OCR" processes dozens of document pages in one pass by treating memory like human forgetting”. The full content lives on the-decoder.com.