Policy & Regulation Research·The Decoder·3d ago

Meta secretly tested ChatGPT, Gemini, and Character.AI with thousands of minor-perspective crisis prompts

Meta conducted undisclosed adversarial testing of competing AI systems, deploying hundreds of contractors to impersonate minors and submit more than 45,000 crisis-related prompts to systems from OpenAI, Google, and Character.AI without those companies' knowledge. The operation reveals a widening gap between public safety commitments and private competitive intelligence gathering in the LLM space. It raises questions about industry norms for red-teaming disclosure and consent, and about whether covert safety testing of rival systems constitutes a new form of competitive pressure that could reshape how companies approach child safety benchmarking and model evaluation.

Modelwire context

Analyst take

The buried detail here is not that Meta tested rivals, but that it did so at scale using contractor labor specifically recruited to impersonate minors, which moves this from informal probing into something closer to an organized covert operation with real liability surface area for all parties involved.

This story has little connection to recent activity in our archive, but it fits a broader pattern that has been building across the industry: the gap between public safety commitments and what companies actually do when competitive pressure is high. Child safety has become a visible battleground in AI policy circles, with Character.AI in particular facing regulatory and legal scrutiny over minor-facing interactions. Meta's decision to target that exact vulnerability in rivals, without disclosure, suggests safety benchmarking is now being used as a competitive instrument rather than treated as a shared infrastructure problem. That shift should make readers skeptical of any future cross-company safety collaboration announcements.

Watch whether OpenAI, Google, or Character.AI file formal complaints with regulators or pursue legal action against Meta in the next 90 days. A lawsuit or FTC referral would confirm that covert adversarial testing has crossed from gray-area research practice into actionable harm, which would force the entire industry to define consent norms for third-party red-teaming.

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: Meta · OpenAI · Google · Character.AI · ChatGPT · Gemini

Read full story at The Decoder (the-decoder.com)


