Today's AI is already so powerful that directors have to deliberately leave bugs… Qianwen's "AI taxi" shows agents moving from clicks to real-world fulfilment

Agents, not remotes: a new capability frontier

Qianwen (千问), Alibaba's (阿里巴巴) large-model initiative, reportedly launched an "AI taxi" skill quietly at the end of March, and Phoenix Technology (凤凰网科技) says the beta reveals a qualitative jump. For months, much of generative AI has behaved like a smarter remote control: parse a single command, call a function, simulate clicks inside an app. Qianwen’s agent, by contrast, reportedly decomposes fuzzy user intents, plans multi-step fulfilment and trades off routes, costs and user comfort in real time. In short: this is execution, not emulation.

Why hailing a car is harder than ordering food

Why is calling a ride so different from ordering takeout? Because food and tickets sit inside highly structured, tolerant systems. You can switch restaurants or reschedule a movie; errors are forgivable. A ride is high-frequency, low-tolerance and tied to physical actors and constraints: drivers, traffic, immediate safety and payments. Qianwen can reportedly replan when a user says "I get carsick" and choose a longer, less bumpy route; that requires real-time dispatch, fare calculation and on-the-ground data, not just simulated UI clicks.
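To make the trade-off concrete, here is a minimal Python sketch of how a stated preference could shift a route choice. It is an illustration only, not Qianwen's actual method: the `Route` fields, the weights, the keyword check that stands in for intent parsing, and the two sample routes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    minutes: float    # estimated travel time
    fare: float       # estimated fare, arbitrary currency units
    bumpiness: float  # hypothetical score: 0.0 (smooth) to 1.0 (very rough)


@dataclass(frozen=True)
class Weights:
    minutes: float = 1.0
    fare: float = 1.0
    bumpiness: float = 5.0


def weights_for(utterance: str) -> Weights:
    """Toy intent parsing: a keyword match stands in for a language model."""
    if "carsick" in utterance.lower():
        # Assumed policy: comfort dominates once the rider mentions carsickness.
        return Weights(bumpiness=100.0)
    return Weights()


def cost(route: Route, w: Weights) -> float:
    """Weighted sum of time, fare and discomfort; lower is better."""
    return (w.minutes * route.minutes
            + w.fare * route.fare
            + w.bumpiness * route.bumpiness)


def choose_route(routes: list[Route], utterance: str) -> Route:
    """Pick the cheapest route under weights derived from the utterance."""
    w = weights_for(utterance)
    return min(routes, key=lambda r: cost(r, w))


# Made-up candidates: a short, rough route and a longer, smoother one.
ROUTES = [
    Route("direct", minutes=20, fare=30, bumpiness=0.8),
    Route("ring_road", minutes=28, fare=36, bumpiness=0.2),
]
```

With these invented numbers, `choose_route(ROUTES, "Take me to the airport")` selects the direct route, while `choose_route(ROUTES, "I get carsick")` selects the longer ring-road route. A production system would also need live dispatch, traffic and pricing data, which this sketch leaves out.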

Ecosystem integration is the competitive moat

Qianwen’s advantage reportedly comes from deep integration with Alibaba’s ecosystem: Taobao, Fliggy and Taopiaopiao, plus the logistics and payment rails that already close commercial loops. Western rivals such as Google’s Gemini can automate an app session (opening Uber in a virtual window, simulating touches), but that remains a sandboxed remote-control pattern, often requiring final human confirmation. Who is accountable if the car never arrives or the route is wrong? That question underscores the difference between raw algorithmic capability and a commercial fulfilment loop tied to drivers, billing and complaint resolution.

Implications: incumbents, vertical apps and geopolitics

If Phoenix’s findings hold, the rollout marks a tipping point: general agents that can own multi-hop life workflows (book a ticket, order a ride, schedule a return) threaten the relevance of single-purpose apps. We’ve already seen market reactions when agents encroach on vertical tools elsewhere. There is also a geopolitical dimension: platform access, data sovereignty and differing regulatory regimes make deep domestic integrations easier for China-based models than for many Western providers, which are constrained by export controls, limits on cross-border data flows and sandboxed integrations. The result? The AI competition is shifting from conversational benchmarks to the messy business of who can actually deliver outcomes and be held responsible for them.