New review paper argues code is how AI agents think and act, not just what they produce

A new review paper relocates the AI agent bottleneck from model capability to the software infrastructure surrounding the model. Tools, memory systems, testing frameworks, and permission boundaries are what turn a stateless language model into a functional autonomous agent. DeepSeek's creation of a dedicated Beijing-based harness team puts this thesis into practice and suggests the industry is shifting focus from raw model performance to the orchestration layer that makes agents reliable and deployable. If that reading holds, competitive advantage is moving from model weights to systems engineering.
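To make the harness idea concrete, here is a minimal sketch of how tools, memory, and a permission boundary might wrap a stateless model. Everything in it is an illustrative assumption rather than anything taken from the review paper or from any lab's code: `stub_model`, the `Harness` class, the `CALL`/`FINAL` reply convention, and the `add` tool are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a stateless language model.

    It keeps no state between calls: all context arrives in the prompt.
    """
    if "result:" in prompt:
        return "FINAL done"
    return "CALL add 2 3"


@dataclass
class Harness:
    """Assumed minimal harness: tools + memory + permission boundary."""
    model: Callable[[str], str]
    tools: Dict[str, Callable[..., str]]
    allowed: Set[str]  # permission boundary: tool names the agent may call
    memory: List[str] = field(default_factory=list)  # state lives here, not in the model

    def run(self, task: str, max_steps: int = 5) -> str:
        for _ in range(max_steps):
            # Rebuild the full prompt each step, since the model remembers nothing.
            prompt = "\n".join([task, *self.memory])
            reply = self.model(prompt)
            if reply.startswith("FINAL"):
                return reply[len("FINAL"):].strip()
            _, name, *args = reply.split()
            if name not in self.allowed:
                # Refuse the call and record the refusal for the next step.
                self.memory.append(f"denied: {name}")
                continue
            self.memory.append(f"result: {self.tools[name](*args)}")
        return "step limit reached"


def add(a: str, b: str) -> str:
    """Hypothetical tool: adds two integers passed as strings."""
    return str(int(a) + int(b))


harness = Harness(model=stub_model, tools={"add": add}, allowed={"add"})
answer = harness.run("add two numbers")
```

In this sketch the model function is identical whether or not the tool is permitted; only the harness configuration changes what the agent can actually do, which is the sense in which the surrounding infrastructure, not the model, determines behavior.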
Modelwire context
Analyst take
The review paper's most underreported implication is that the orchestration layer is now a moat-building surface, meaning companies that nail harness engineering can neutralize model capability gaps from competitors. DeepSeek's dedicated Beijing harness team is the clearest public signal yet that a frontier lab is treating infrastructure as a first-class product rather than a support function.
This is largely disconnected from the CDT dark-patterns research covered here on May 29th, which focuses on chatbot UX manipulation rather than agent architecture. The more relevant thread is one Modelwire hasn't yet covered directly: the quiet accumulation of scaffolding-layer investment across labs. The orchestration framing does, however, share a structural concern with the dark-patterns story. Both point to the gap between what a model can do and what the surrounding system actually delivers to users, whether that gap is exploited through manipulative design or closed through disciplined engineering.
Watch whether other frontier labs, particularly those competing directly with DeepSeek on cost efficiency, announce dedicated harness or agent-infrastructure teams within the next two quarters. If they do, the orchestration layer has become a staffing and org-design battleground, not just a research topic.
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: DeepSeek · The Decoder
Modelwire Editorial
Modelwire summarizes; we don’t republish. The full content lives on the-decoder.com.