NVIDIA unveiled Nemotron 3 Ultra at Computex on June 1, 2026 — a 550-billion-parameter open-weights model, now the most capable US-origin open model ever released. The move turns the world’s dominant AI chipmaker into a direct model provider, competing with its own customers.
Open weights mean anyone can download, fine-tune, and deploy Nemotron 3 Ultra without licensing fees. For enterprises wary of per-token API costs, those economics are the headline.
How it stacks up against the open field
Nemotron 3 Ultra enters a crowded open-weights market with a specific positioning: the most capable model that originates in the United States.
| Model | Parameters | Origin |
|---|---|---|
| Nemotron 3 Ultra | 550B | USA (NVIDIA) |
| Moonshot Kimi K2.6 | 1T | China |
| Meta Llama 4 Behemoth | 288B | USA (Meta) |
Kimi K2.6 is larger at one trillion parameters, but Nemotron 3 Ultra is positioned as the leading US open option — a distinction that matters to enterprises and governments with sovereignty requirements.
Built for agentic enterprise work
NVIDIA designed Nemotron 3 Ultra for enterprise agentic workflows, reasoning, and code generation — the use cases where open-weights deployment inside a private network is most valuable. Companies that cannot send data to a third-party API can run the full model on their own NVIDIA hardware.
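To give a sense of what "run the full model on their own hardware" implies, here is a rough back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes 550B is the total parameter count and uses the standard byte widths for each numeric format. The 80 GB per-GPU figure and the helper names `weight_gb` and `min_gpus` are illustrative assumptions, not NVIDIA specifications, and the estimate ignores KV cache, activations, and runtime overhead, so real deployments would need more.

```python
import math

PARAMS = 550e9  # total parameter count quoted in the article

# Bytes per parameter for common weight formats.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_gb(params: float, fmt: str) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return params * BYTES_PER_PARAM[fmt] / 1e9


def min_gpus(params: float, fmt: str, gpu_mem_gb: float = 80.0) -> int:
    """Lower bound on GPU count; gpu_mem_gb is an assumed per-GPU capacity.

    Ignores KV cache, activations, and framework overhead.
    """
    return math.ceil(weight_gb(params, fmt) / gpu_mem_gb)


if __name__ == "__main__":
    for fmt in BYTES_PER_PARAM:
        gb = weight_gb(PARAMS, fmt)
        print(f"{fmt}: {gb:.0f} GB of weights, at least {min_gpus(PARAMS, fmt)} GPUs")
```

Even under these optimistic assumptions, the weights alone span multiple accelerators, which is why self-hosting a model of this size is a multi-GPU, and often multi-node, undertaking.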
NVIDIA is now competing with its best customers
The strategic tension is obvious. NVIDIA sells GPUs to OpenAI, Anthropic, and Google, and now ships a model that competes with theirs. Releasing open weights is a hedge: it expands the market for NVIDIA compute regardless of which lab wins, because fine-tunes and deployments of the model are most likely to run on NVIDIA silicon.
It also pressures the closed-model economics that labs like Anthropic — recently valued at $965 billion — depend on. A free, capable, US-origin open model raises the bar that paid APIs must clear to justify their price.
What open-vs-closed comes down to now
The release sharpens a question every AI buyer faces: pay for a managed frontier API, or self-host a capable open model and own the stack. With a 550B open-weights option from the company that makes the chips, the self-host path just got more credible.
For technical teams, the next step is a direct cost comparison: the amortized cost of Nemotron 3 Ultra inference on owned hardware versus per-token frontier API pricing for the same workload.
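One way to frame that comparison is as a break-even calculation. The sketch below is a simplified model, not a pricing tool: every name (`SelfHosted`, `break_even_tokens`, and so on) is hypothetical, it uses a single blended per-token API price rather than separate input and output rates, and any figures you plug in must come from your own hardware quotes and measured throughput.

```python
from dataclasses import dataclass

SECONDS_PER_MONTH = 30 * 24 * 3600  # simplifying assumption: 30-day month


@dataclass
class SelfHosted:
    hardware_cost_usd: float    # purchase price of the inference cluster
    amortization_months: int    # period over which the hardware is written off
    monthly_opex_usd: float     # power, hosting, and staff share per month
    tokens_per_second: float    # sustained cluster throughput (measured)
    utilization: float          # fraction of time serving traffic, 0..1

    def monthly_cost(self) -> float:
        """Amortized hardware cost plus operating cost per month."""
        return self.hardware_cost_usd / self.amortization_months + self.monthly_opex_usd

    def monthly_capacity_tokens(self) -> float:
        """Tokens the cluster can serve per month at the given utilization."""
        return self.tokens_per_second * self.utilization * SECONDS_PER_MONTH


def api_monthly_cost(tokens: float, usd_per_million_tokens: float) -> float:
    """API bill for a monthly token volume at a blended per-token price."""
    return tokens / 1e6 * usd_per_million_tokens


def break_even_tokens(host: SelfHosted, usd_per_million_tokens: float) -> float:
    """Monthly token volume at which the API bill equals the self-hosting cost."""
    return host.monthly_cost() / usd_per_million_tokens * 1e6


def self_hosting_pays_off(host: SelfHosted, usd_per_million_tokens: float) -> bool:
    """True only if the cluster can actually serve the break-even volume."""
    return break_even_tokens(host, usd_per_million_tokens) <= host.monthly_capacity_tokens()


if __name__ == "__main__":
    # Placeholder numbers for illustration only; not real prices or benchmarks.
    cluster = SelfHosted(3_600_000, 36, 20_000, 5_000, 0.5)
    price = 10.0  # hypothetical blended USD per million tokens
    print(f"monthly cost: ${cluster.monthly_cost():,.0f}")
    print(f"break-even volume: {break_even_tokens(cluster, price):,.0f} tokens/month")
    print(f"cluster capacity: {cluster.monthly_capacity_tokens():,.0f} tokens/month")
    print(f"self-hosting pays off: {self_hosting_pays_off(cluster, price)}")
```

The capacity check matters: a break-even volume that exceeds what the cluster can serve means self-hosting never wins on cost at that API price, whatever the workload, and the decision then rests on non-cost factors such as data residency.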