Can AI tell if your script will make a hit film?

Quilty, an AI startup claiming to predict film box-office performance from scripts alone, has faced skepticism from industry practitioners testing its product. The gap between the startup's initial promise and real-world validation highlights a recurring pattern in AI applications: overconfidence in predictive models when applied to inherently complex, subjective domains like entertainment. This case underscores how even data-rich scenarios can expose the limits of current AI systems when facing cultural and creative variables that resist quantification.
Modelwire context
Skeptical read
The real question isn't whether Quilty's model works, but what 'works' means in a domain where box-office outcomes depend on marketing spend, release timing, star power, and cultural moment as much as script quality. The startup's claim sidesteps this by isolating one variable.
This echoes the Amazon leaderboard incident from early June, where internal AI ranking systems collapsed under gaming and misaligned incentives. Quilty faces a similar trap: if the model trains on historical box-office data, it learns correlations baked into past studio decisions (which scripts got greenlit, which got marketing budgets) rather than script quality itself. The Sutton piece on evaluation architecture also applies here. Without a built-in feedback loop that isolates script signal from production confounds, Quilty can't distinguish predicting what studios will fund from predicting what audiences will watch.
If Quilty publishes holdout test results showing the model outperforms industry greenlight decisions on scripts that were rejected by studios but later succeeded under different conditions (different studios, different eras, different marketing), that would be meaningful evidence that it reads script signal. If the results only show correlation with scripts that were actually produced and funded, the model is just learning studio taste, not script quality.
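The confound described above can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not a description of Quilty's model or data: `quality`, `taste`, the coefficients, and the noise levels are all hypothetical. A score that only tracks studio taste correlates well with box office when marketing follows taste (the historical setting), but the correlation should largely vanish once marketing is assigned independently of taste (the "different conditions" setting).

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific imports."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Toy world: every number here is an illustrative assumption."""
    rng = random.Random(seed)

    # Latent script quality and "studio taste" (how much a script resembles
    # what studios have historically funded), drawn independently.
    quality = [rng.gauss(0, 1) for _ in range(n)]
    taste = [rng.gauss(0, 1) for _ in range(n)]

    # Hypothetical model score that has picked up studio taste only.
    score = [t + rng.gauss(0, 0.3) for t in taste]

    def box_office(marketing):
        # Box office is driven mostly by marketing, a little by script quality.
        return [0.3 * q + m + rng.gauss(0, 0.5) for q, m in zip(quality, marketing)]

    # Historical setting: marketing budgets follow studio taste.
    marketing_hist = [t + rng.gauss(0, 0.5) for t in taste]
    # Counterfactual setting: marketing is assigned independently of taste.
    marketing_cf = [rng.gauss(0, 1) for _ in range(n)]

    return {
        "historical": pearson(score, box_office(marketing_hist)),
        "counterfactual": pearson(score, box_office(marketing_cf)),
        "score_vs_quality": pearson(score, quality),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.2f}")
```

In this toy setup the score looks predictive on historical data while carrying almost no information about script quality, which is why an evaluation restricted to produced and funded films cannot separate the two explanations.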
This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.
Mentions: Quilty
Modelwire Editorial
This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.
Modelwire summarizes, we don’t republish. The full content lives on theverge.com. If you’re a publisher and want a different summarization policy for your work, see our takedown page.