Old Huang's old schemes, DeepSeek's deep calculations
NVIDIA (英伟达) has thrown down the gauntlet across the AI stack. Ahead of GTC 2026 it released Nemotron 3 Super, a nearly fully transparent open-source model with published weights, datasets and a full training recipe. In openness, the move reportedly eclipses China's DeepSeek (a Chinese open-source model company), which so far has opened only model weights. Jensen Huang (黄仁勋) framed the play in an internal post calling AI a "five-layer cake" of energy, chips, infrastructure, models and applications, and argued that open models activate demand all the way down the stack. Who benefits when a model is truly open? NVIDIA thinks the answer is its GPU ecosystem.
What NVIDIA disclosed
NVIDIA says Nemotron 3 Super is a 120-billion-parameter model with a Mixture-of-Experts design that activates only about 12 billion parameters per token, and offers million-token context windows for long-memory agent work. It reportedly uses a hybrid Mamba-Transformer backbone, trains largely on a curated corpus of 10 trillion tokens plus reinforcement-learning rollouts and millions of supervised samples, and leans on a new 4-bit floating-point format (NVFP4), optimized for Blackwell hardware, to cut memory use and speed up inference. NVIDIA also published the full pretraining, fine-tuning and RL evaluation pipeline, including interactive environments, so developers can reproduce or adapt the entire lifecycle, not just download weights.
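To make the 4-bit idea concrete, here is a toy sketch of block-scaled 4-bit float quantization in the spirit of NVFP4. This is not NVIDIA's implementation; it only illustrates the commonly described structure of the format, an E2M1 value grid (eight representable magnitudes) with a shared per-block scale factor, which is how 4 bits per weight can still cover a wide dynamic range.

```python
# Illustrative sketch only (NOT NVIDIA's NVFP4 kernel): quantize a block of
# weights to a 4-bit E2M1 float grid with one shared per-block scale.
import numpy as np

# Magnitudes representable by an E2M1 4-bit float
# (1 sign bit, 2 exponent bits, 1 mantissa bit)
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_block(block: np.ndarray) -> tuple[np.ndarray, float]:
    """Scale the block so its largest magnitude maps to 6.0 (the grid max),
    then snap every scaled value to the nearest representable magnitude."""
    scale = max(np.abs(block).max() / E2M1_GRID[-1], 1e-12)
    scaled = block / scale
    # Nearest grid magnitude for each value; sign is kept separately
    idx = np.abs(np.abs(scaled)[:, None] - E2M1_GRID[None, :]).argmin(axis=1)
    return np.sign(scaled) * E2M1_GRID[idx], scale

def dequantize_block(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate full-precision values from 4-bit codes + scale."""
    return q * scale

weights = np.array([0.7, -2.4, 0.05, 1.9])
q, s = quantize_block(weights)
recovered = dequantize_block(q, s)  # close to `weights`, within one grid step
```

The point of the sketch is the trade-off: each weight needs only 4 bits plus a small amortized cost for the block scale, at the price of a bounded rounding error, which is why such formats cut memory and bandwidth so sharply at inference time.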
Why openness is a strategic lever
This is not just about model quality. NVIDIA’s core business is selling compute — GPUs, interconnects and datacenter systems — and open‑sourcing an end‑to‑end recipe is a powerful demand engine for that hardware. By publishing a stack that favors NVFP4 and Blackwell, the company creates an open‑source path that is naturally optimized for its chips. The transparency also draws researchers and tool vendors into NVIDIA’s orbit, accelerating papers, toolchains and cloud integrations that further entrench its platform. Open models have become a marketing channel for hardware — but also a new kind of soft power in the global AI ecosystem.
Stakes for China and DeepSeek
China’s DeepSeek, by contrast, has prioritized opening weights to accelerate local deployment and ecosystem growth; that strategy channels demand toward domestic compute and gives Chinese developers a path to local, controllable inference. That matters: each application built on an open Chinese model can direct workloads toward domestic suppliers like Huawei’s Ascend (华为昇腾), Hygon (海光), Cambricon (寒武纪), Moore Threads (摩尔线程) and Suiyuan (燧原). In a world of sanctions, export controls and geopolitically shaped supply chains, models are no longer only technical artifacts; they are instruments of industrial policy. So which wins: the company that builds the best single model, or the ecosystem that locks in compute, tools and customers? The answer will shape the next phase of the AI platform race.
