Modelwire

Humanoid data


Companies are recruiting humans to generate training data for robotics AI by paying them to perform mundane tasks on camera or remotely operate robotic arms. The practice raises questions about data sourcing economics and labor practices in the AI supply chain.

Modelwire context

Analyst take

The buried issue here is not that humans are doing this work, but that the economics of physical training data are structurally different from text or image scraping: you cannot crawl the internet for robot manipulation data, which means labor costs are a permanent line item rather than a one-time acquisition expense.

This connects directly to the resistance movement covered in MIT Technology Review on April 21 ('Resistance'), which catalogued job displacement as one of the concrete harms driving public backlash against AI. The irony is sharp: the same automation pipeline that threatens downstream jobs currently depends on low-wage human labor to function at all. That tension is not acknowledged in most robotics coverage. The Shenzhen dateline also rhymes with the open-source dynamics covered in 'China's open-source bet' from the same day, where Chinese labs are structurally undercutting Western API economics. If Chinese firms can source physical training data more cheaply through domestic labor markets, that is a durable cost advantage, not just a model architecture story.

Watch whether any major robotics company (Figure, Physical Intelligence, or a Chinese peer) discloses labor sourcing practices or per-hour data generation costs in the next six months. Disclosure pressure, or the absence of it, will signal whether this supply chain gets the same scrutiny that crowdsourced annotation work eventually received.

Coverage we drew on

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: MIT Technology Review · Shenzhen

Modelwire summarizes — we don’t republish. The full article lives on technologyreview.com. If you’re a publisher and want a different summarization policy for your work, see our takedown page.
