Those Feeding AI in China’s Valleys: Inside the Inland “Labeling Bases” Powering Big Tech
A hidden workforce, brought into view
China’s AI isn’t only trained in gleaming coastal campuses. It is also fed in the country’s interior, where data-annotation “bases” have been built inside poverty-relocation communities. According to fieldwork originally published in Xinrui Weekly (信睿周报) and summarized by Huxiu (虎嗅), researchers embedded for years in these sites found a tightly coupled system: coastal model teams, centralized task platforms, and county-edge QA rooms stitched together to turn images, audio, and text into machine-readable fuel. The twist? Much of the hands-on labor comes from local women who clock in, don headsets, and click through tasks that teach models what counts as a person, a pothole, or a polite answer.
Why inland, not offshore?
Western readers know the standard script: AI labs in the Global North, annotation outsourced to the Global South. China diverges. The study describes “inland-sourcing” (内陆化): rather than sending sensitive work abroad, tasks move from Beijing, Hangzhou, and Shenzhen to supervised facilities in provinces and regions such as Shanxi, Shaanxi, Gansu, Xinjiang, Guizhou, Chongqing, and Henan. Why not go offshore? Secrecy. Engineers worry that “what gets labeled” reveals a firm’s roadmap, and China’s tightening data-governance regime and U.S.–China tech frictions further encourage onshore handling. Local governments sweeten the deal with rent holidays, subsidized utilities, and labor pools in relocation communities; in some cases, the community party secretary reportedly serves as the base’s legal representative. Such self-run sites are reported to reach 97–98% labeling accuracy, higher than crowd platforms, while keeping churn and leaks in check.
Women’s work, framed as virtue
These are not “ghost workers” in the cloud; they badge in and sit in shared rooms. Yet the structure remains gendered. Field accounts describe “mom workers” who sprint home at lunch to cook, return to annotate, then leave again at 4:30 to collect children, sometimes bringing them back to the floor. Managers’ attempts to enforce strict hours collide with kinship hierarchies and caregiving norms. Local officials, aiming to “keep people settled” in the new communities, push bases to hire women first and award “Women’s Workshop” (巾帼车间) plaques; some sites even run a “4:30 classroom” where children can wait until their mothers finish shifts. The result? Dignity and proximity to home, but also a polite inequality: when bases reassign “richer” tasks to elite groups to safeguard deadlines, many mothers accept lower-yield queues, framed as their own choice to prioritize care.
The broader stakes for China’s AI race
For China’s tech industry—including giants such as Baidu (百度), Alibaba (阿里巴巴), and Tencent (腾讯)—annotated data remains a strategic asset as firms vie to ship safer, faster foundation models under regulatory scrutiny. Inland bases translate messy reality into stable labels, turning local policy incentives and community networks into a production advantage. But they also reveal who writes social norms into models: not just star engineers, but precarious workers in mountain valleys. As Huxiu (虎嗅) notes, this last mile of AI is stubbornly human. And it raises a sharp question: can the sector’s drive for confidentiality, cost control, and compliance also deliver fairer terms for those quietly teaching machines how the world works?
