Phoenix Tech (凤凰科技) · 2026-03-17

Jensen Huang GTC Exclusive: Low‑Latency Inference Will Be the Next Growth Engine of the AI Economy, and Tight Supply–Demand for Power Chips Will Persist Long‑Term

Key takeaway: inference, not just training

At NVIDIA (英伟达) GTC, CEO Jensen Huang said low‑latency inference — running AI models live on devices and in the cloud with minimal delay — will be the next major growth engine of the AI economy. Models that respond in real time unlock new consumer, enterprise and robotics use cases. Want immersive mixed‑reality apps or instant AI assistants? Low latency is the gatekeeper. Huang argued that the economics of AI will shift from occasional large‑scale training runs to constant, widely distributed inference workloads.

Why demand will stay intense

Huang warned that demand for high‑performance, power‑hungry AI accelerators will remain tight. It has been reported that capacity constraints, complex chip designs and long qualification cycles for data‑center hardware mean supply cannot quickly catch up with a surge in inference deployments. The implication: a higher bar for system integrators and cloud providers, and continued premium pricing for the top‑tier chips that power low‑latency workloads.

Geopolitics and supply‑chain friction

This technical story is also geopolitical. U.S. export controls on advanced chips and manufacturing equipment have reshaped where and how specialised silicon is produced, and reportedly extend the duration of supply tightness. China is accelerating investment in local fabs such as SMIC (中芯国际) and in domestic AI stacks, but bridging the gap in high‑end process nodes and EDA tools will take time. What does that mean for Western cloud giants and Chinese device makers? Expect more regionalisation, longer lead times, and strategic stockpiles.

Market consequences and the winner’s circle

For investors and product teams the message is simple: the AI value chain is broadening. Companies that can deliver low‑latency inference at scale — through chips, optimized software stacks or edge deployments — will capture recurring, high‑margin revenue. It also means incumbents with established manufacturing relationships and advanced packaging capabilities will retain their advantage. Reportedly, this dynamic will sustain a tight market for “power chips” long after the next hardware refresh cycle ends.
