500 investment bankers review AI outputs and find none ready for client delivery

A benchmark testing GPT-5.4 and Claude Opus 4.6 on investment banking tasks found that none of the outputs were client-ready, though more than half of the 500 bankers surveyed said they would use the AI drafts as starting points. The gap between capability and production-grade reliability remains stark even for frontier models.
Modelwire context
Analyst take
The more telling number isn't the zero client-ready outputs; it's that over half of 500 working bankers said they'd use these drafts as a starting point anyway. That's not a rejection of AI; it's a description of how AI actually enters professional workflows: through the back door, as a labor-saving first draft rather than a finished product.
This is largely disconnected from recent activity in our archive, as we have no prior coverage to anchor it to. But it belongs to a broader pattern visible across professional services: the adoption curve for AI in high-liability work tends to follow a 'draft layer' model, where the tool earns its place by reducing time-to-first-draft rather than by replacing final judgment. Investment banking is a particularly sharp test case because the cost of a client-facing error is reputational and regulatory, not just operational.
Watch whether either OpenAI or Anthropic responds to this benchmark by publishing its own task-specific fine-tuning results for financial document work within the next two quarters. If they do, it signals they see professional services reliability as a competitive front worth fighting on publicly; if they don't, the 'draft layer' framing likely becomes the default positioning.
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: GPT-5.4 · Claude Opus 4.6 · The Decoder
Modelwire summarizes; we don’t republish. The full article lives on the-decoder.com.