arXiv · 2026-04-20

Large-scale study maps what people ask Microsoft Copilot about health

What the paper did

Researchers published an arXiv preprint (arXiv:2604.15331) analyzing more than 500,000 de‑identified, health‑related conversations held with Microsoft’s Copilot in January 2026, to characterize how people use conversational AI for health. The team developed a hierarchical intent taxonomy of 12 primary categories using a privacy‑preserving, LLM‑based classification pipeline, and validated that labeling against expert human annotation. The study is a rare, large‑scale empirical look at real user queries to a major consumer AI assistant. The paper is available on arXiv at https://arxiv.org/abs/2604.15331; as a preprint, it has not yet been peer‑reviewed.
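Validating machine labels against expert annotation is typically done with a chance‑corrected agreement statistic. The paper does not specify its metric, so as an illustration, here is a minimal sketch of Cohen's kappa applied to hypothetical labels from an LLM pipeline and a human expert (the label names and data are invented for the example):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items where the two label sets match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels: LLM pipeline vs. one expert annotator.
llm    = ["symptom", "medication", "symptom", "admin", "support", "symptom"]
expert = ["symptom", "medication", "admin",   "admin", "support", "symptom"]
print(round(cohens_kappa(llm, expert), 3))  # → 0.769
```

A kappa near 1.0 indicates near-perfect agreement beyond chance; values above roughly 0.6 are conventionally read as substantial agreement, which is the kind of bar a scalable labeling pipeline would need to clear.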

Key patterns and methods

The authors say the taxonomy captures a broad range of intents — from clinical symptom questions and medication queries to administrative tasks and emotional support — providing a framework to quantify both utility and risk. The methodology emphasizes privacy: conversations were de‑identified and processed with an LLM classification step designed to avoid exposing sensitive content while enabling scalable annotation. Why does that matter? Because access to large, real‑world datasets is rare in health AI research; this paper demonstrates one way to analyze such data while limiting exposure of sensitive content.
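The shape of such a pipeline — de‑identify first, then classify — can be sketched in a few lines. Everything below is illustrative: the category names, the regex‑based redaction, and the keyword classifier (a stand‑in for the paper's LLM labeling call) are assumptions, not the authors' implementation:

```python
import re

# Illustrative primary categories; the paper's taxonomy has 12 (names invented here).
PRIMARY = ["symptoms", "medication", "administrative", "emotional_support"]

def deidentify(text: str) -> str:
    """Redact common identifiers before text leaves the secure boundary."""
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[EMAIL]", text)
    text = re.sub(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "[PHONE]", text)
    text = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "[DATE]", text)
    return text

def classify_intent(text: str) -> str:
    """Stand-in for the LLM labeling step: simple keyword rules instead of a model."""
    t = text.lower()
    if any(w in t for w in ("pill", "dose", "prescription", " mg")):
        return "medication"
    if any(w in t for w in ("appointment", "insurance", "bill")):
        return "administrative"
    if any(w in t for w in ("lonely", "anxious", "scared")):
        return "emotional_support"
    return "symptoms"

def label_conversation(text: str) -> dict:
    """De-identify first, then classify, so raw identifiers never reach the labeler."""
    clean = deidentify(text)
    return {"text": clean, "intent": classify_intent(clean)}

print(label_conversation("Call me at 555-123-4567 about my 20 mg dose"))
```

The ordering is the point: redaction happens before classification, so the labeling component (here a rule set, in the paper an LLM) only ever sees scrubbed text.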

Why it matters — and the regulatory backdrop

The findings are relevant beyond Microsoft. Similar efforts are underway at Chinese firms such as Baidu (百度) and Alibaba (阿里巴巴), and regulators from the U.S. to the EU — and in China — are tightening scrutiny of AI in health. What responsibilities follow when millions of users rely on chatbots for medical information? The study underscores persistent tensions: conversational agents can increase access and triage capacity, but they also raise questions about accuracy, liability, and user privacy.

The paper offers a practical taxonomy that researchers, clinicians, and policymakers can use to assess risk and design guardrails. The rapid deployment of consumer AI health features has reportedly accelerated regulatory attention; this work gives regulators and companies a data‑driven starting point for evaluating what people actually ask and where harm is most likely to arise.
