DeepSeek (深寻) boosts API throughput and scales default concurrency to 500 connections

Product upgrade: faster output, bigger pipe

DeepSeek (深寻) has reportedly completed an output-acceleration and capacity upgrade to its API, delivering higher response throughput and improved stability. The API now supports 500 concurrent connections by default, a meaningful uplift for enterprises building real-time applications on Chinese AI stacks. Enterprise customers that need more concurrency can reportedly apply for higher limits.
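For teams sizing a client against that default, a minimal sketch of a client-side cap may help. It assumes the reported 500-connection figure applies per account and uses a hypothetical `fake_api_call` stand-in rather than any real DeepSeek SDK call; the pattern (an `asyncio.Semaphore` guarding each request) is generic and not something the company has documented.

```python
import asyncio

# Assumption: the reported default of 500 concurrent connections.
MAX_CONCURRENCY = 500


async def fake_api_call(i: int) -> str:
    """Hypothetical stand-in for a real API request; sleeps instead of using the network."""
    await asyncio.sleep(0.001)
    return f"response-{i}"


async def run_capped(n_requests: int, limit: int = MAX_CONCURRENCY):
    """Run n_requests calls with at most `limit` in flight.

    Returns (results in request order, peak number of calls observed in flight).
    """
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(i: int) -> str:
        nonlocal in_flight, peak
        async with sem:  # waits here once `limit` calls are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_api_call(i)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(i) for i in range(n_requests)))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_capped(2000))
    print(len(results), "responses; peak in flight:", peak)
```

Capping on the client side keeps a burst of traffic from exceeding the limit and being rejected; a customer granted a higher limit would only need to change `MAX_CONCURRENCY`.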

Pricing move for V4‑Pro

DeepSeek has also signalled a pricing change for its flagship model. The promotional discount on the DeepSeek‑V4‑Pro API, which reportedly prices the service at 25% of list (a "2.5折" discount in Chinese pricing convention, not a 2.5‑fold reduction), will end on May 31, 2026; after that date the price will reportedly be adjusted to one‑quarter of the original list price. Customers will want to weigh the new concurrency baseline against the upcoming price shift when planning deployments.
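The arithmetic behind those figures can be sketched as follows. This assumes the discount follows the Chinese 2.5折 convention (pay 25% of list), and the list price used is a hypothetical placeholder, not a published DeepSeek rate.

```python
def effective_price(list_price: float, fraction_of_list: float = 0.25) -> float:
    """Price actually paid when the service is billed at a fraction of list price."""
    return list_price * fraction_of_list


# Hypothetical list price per million output tokens; NOT a published figure.
LIST_PRICE = 8.0

promo_price = effective_price(LIST_PRICE)           # promotional rate: 25% of list
adjusted_price = effective_price(LIST_PRICE, 1 / 4)  # reported post-May 31 rate: one-quarter of list
saving = 1 - promo_price / LIST_PRICE                # share of list price saved

print(promo_price, adjusted_price, saving)
```

Under that reading, the promotional rate and the adjusted rate come out to the same amount, so budgets built on the promotional price would carry over unchanged.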

Why this matters — and the wider context

Why should Western readers care? Concurrency and throughput determine whether AI services can handle production traffic from chatbots, search, recommendation and other latency‑sensitive systems. The move is part of a broader trend in China, where model providers are scaling service capacity to meet enterprise demand while optimizing for domestic infrastructure. Geopolitics also plays a role: with export controls restricting access to advanced chips from some Western vendors, Chinese companies are increasingly focused on software-level optimization and local capacity to sustain large‑scale AI offerings. Expect competition on both performance and cost as the market matures.