凤凰科技 (Phoenix Tech) · 2026-03-17

NVIDIA unveils Vera Rubin supercomputer, pairs GPUs with LPUs and pushes open-source Nemotron at GTC

Big infrastructure, bigger claim

Jensen Huang (黄仁勋) used a nearly two-hour GTC keynote to argue that the next wave of AI will be decided by integrated infrastructure, not just chips. NVIDIA unveiled the Vera Rubin architecture: a rack-scale NVL72 system combining 72 Rubin GPUs, 36 Vera CPUs and NVLink 6 interconnects, plus an adjacent LPX rack populated with Groq 3 Language Processing Units (LPUs). NVIDIA says NVL72 cuts the GPU count needed to train large mixture-of-experts models to a quarter of the prior generation's, and delivers order-of-magnitude improvements in inference efficiency and per-token cost.

New hardware meets special‑purpose processors

The company positioned LPUs as the answer to the two-stage nature of large-model inference: a parallel, compute-heavy "prefill" pass over the prompt, followed by a bandwidth-hungry, strictly sequential "decode" phase that generates one token at a time. Groq 3 LPUs are presented as SRAM-centric accelerators built for the decode phase; it has been reported that NVIDIA acquired Groq's core assets for about $20 billion at the end of 2025. NVIDIA's pitch: Rubin GPUs handle prefill compute, Groq LPUs handle ultra-low-latency token generation, and together they enable real-time agents at scale.
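The prefill/decode split the keynote leans on can be illustrated with a toy sketch. Everything here is a stand-in: real systems run a transformer with a KV cache, where prefill processes all prompt tokens in one parallel pass and decode then appends one token per step, each step depending on the last.

```python
# Toy illustration of two-stage inference: parallel prefill, sequential decode.
# The "model" below is purely hypothetical (next token = sum of context mod 100);
# it only exists to show the control-flow difference between the two phases.

def toy_next_token(context: list[int]) -> int:
    # Placeholder for a real model's next-token computation.
    return sum(context) % 100

def prefill(prompt: list[int]) -> list[int]:
    # Prefill: the whole prompt is ingested at once. In a real transformer
    # this is the parallel, compute-bound pass that builds the KV cache;
    # here the "cache" is just the context itself.
    return list(prompt)

def decode(cache: list[int], num_tokens: int) -> list[int]:
    # Decode: one token at a time, each step reading the full cache and
    # depending on the previous token -- inherently sequential, and in
    # practice memory-bandwidth-bound rather than compute-bound.
    out = []
    for _ in range(num_tokens):
        tok = toy_next_token(cache)
        cache.append(tok)
        out.append(tok)
    return out

cache = prefill([3, 5, 7])
print(decode(cache, 4))  # -> [15, 30, 60, 20]
```

The serial dependency in `decode` is why a bandwidth-optimized, SRAM-heavy part could plausibly be paired with GPUs that keep the parallel prefill work.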

Open models, closed loop? Nemotron and NemoClaw

On software, Huang announced the Nemotron open-model alliance, a roster that includes Mistral AI, Perplexity, Cursor, LangChain and others, and a distribution called NemoClaw that packages OpenClaw with NVIDIA's Agent Toolkit and OpenShell safety sandboxes. The plan is to train a foundation model on NVIDIA's DGX Cloud and open-source the result. Why give it away? Because a thriving open-source model ecosystem still needs massive infrastructure to train and serve those models, and that infrastructure is NVIDIA's market. NemoClaw's hybrid scheduler lets privacy-sensitive workloads run locally (even on GeForce RTX laptops) while routing complex tasks to the cloud.
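The local-versus-cloud routing idea behind such a hybrid scheduler can be sketched in a few lines. Nothing below is NemoClaw's actual API; the job fields, the token budget, and the `route` function are all invented for illustration.

```python
# Hypothetical sketch of hybrid scheduling: privacy-tagged jobs never leave
# the local machine; oversized jobs are sent to a cloud endpoint. Field names
# and the capacity threshold are assumptions, not NemoClaw's real interface.

from dataclasses import dataclass

@dataclass
class Job:
    prompt: str
    privacy_sensitive: bool
    est_tokens: int  # rough estimate of total context + generation length

LOCAL_TOKEN_BUDGET = 4096  # assumed capacity of the local GPU

def route(job: Job) -> str:
    if job.privacy_sensitive:
        return "local"   # privacy wins: keep the data on-device regardless of size
    if job.est_tokens > LOCAL_TOKEN_BUDGET:
        return "cloud"   # too large for local hardware; offload
    return "local"       # small and non-sensitive: local is cheapest

print(route(Job("summarize my tax records", True, 200)))            # -> local
print(route(Job("plan a multi-step research task", False, 20000)))  # -> cloud
```

The ordering of the checks is the design point: a privacy flag overrides the capacity heuristic, which matches the article's claim that sensitive workloads stay on-device even when the cloud path would be faster.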

Autos, partnerships and the geopolitical caveat

Huang also showcased partnerships in autonomous driving — BYD (比亚迪), Geely (吉利), Nissan (日产) and Isuzu (五十铃) are listed as adopting NVIDIA’s DRIVE Hyperion platform — and an expanded deal with Uber to deploy NVIDIA‑stack autonomous fleets in 28 cities by 2028. But will all of this be globally deployable? US export controls and broader trade frictions complicate access to the most advanced AI silicon for Chinese customers and other markets, raising questions about where NVIDIA’s top‑end systems will ultimately sit. Who wins if the software is open but the hardware is gated? That was the unsaid question hovering over the GTC stage.
