Nvidia unveiled Nemotron 3 Super, a 120B-parameter open LLM that delivers up to a fivefold speed boost for AI agents.
New Nvidia Model – Nemotron 3 Super
Nvidia announced Nemotron 3 Super, an open AI model of the Mixture‑of‑Experts (MoE) type.
* 120 billion total parameters, of which 12 billion are active.
* Designed for agent-based AI—systems where multiple “agents” interact with each other and with the external world.
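The "120 billion total, 12 billion active" split comes from MoE routing: each token is sent to only a few expert sub-networks, so most parameters sit idle on any given forward pass. The toy router below illustrates the idea; the expert count, top-k, and scoring here are illustrative assumptions, not Nemotron 3's actual configuration.

```python
import random

# Toy Mixture-of-Experts routing sketch (illustrative only; the real
# Nemotron 3 router, expert count, and top-k are not described in detail).
NUM_EXPERTS = 10   # assumption: 10 experts makes the 12B/120B = 1/10 ratio easy to see
TOP_K = 1          # assumption: each token is routed to a single expert

def route(token_hidden: list[float], num_experts: int = NUM_EXPERTS, k: int = TOP_K) -> list[int]:
    """Pick the k experts with the highest (toy, random-weight) router scores."""
    scores = [sum(h * random.random() for h in token_hidden) for _ in range(num_experts)]
    ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
    return ranked[:k]

# Per token, only k of num_experts expert blocks execute, so roughly
# k/num_experts of the expert parameters are "active" -- the same idea
# behind 12B active parameters out of 120B total.
active_fraction = TOP_K / NUM_EXPERTS
```

Because only the routed experts run, inference cost scales with the active parameters (12B), not the total (120B).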
Architecture
The model uses a hybrid Mamba‑Transformer approach (combining Mamba layers and Transformer elements).
Nemotron 3 Super is the first model to combine the LatentMoE paradigm, Multi‑Token Prediction layers, and pretraining in the NVFP4 4‑bit format. According to Nvidia, this stack improves accuracy and speeds up inference.
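The motivation for mixing Mamba layers into a Transformer is memory behavior on long sequences: a Mamba-style layer carries a fixed-size recurrent state per token, while attention must retain keys and values for every previous token. The sketch below shows the recurrence shape only; real Mamba layers use input-dependent (selective) state-space parameters and multi-dimensional states, whereas this toy uses a single fixed decay.

```python
# Toy sketch of why Mamba-style layers help long contexts: each token updates
# a constant-size state instead of appending to a growing KV cache.
# (Real Mamba uses input-dependent selective state-space parameters; the
# fixed decay below is a simplification for illustration.)
def mamba_like_scan(xs: list[float], decay: float = 0.9) -> list[float]:
    h = 0.0                                  # state size is constant, regardless of sequence length
    out = []
    for x in xs:
        h = decay * h + (1.0 - decay) * x    # O(1) memory and compute per step
        out.append(h)
    return out

# A pure-attention layer, by contrast, keeps keys/values for all previous
# tokens, so per-token memory grows linearly with context length.
```

At a 1M-token context, that constant-per-token state is exactly what keeps the memory footprint manageable.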
Performance
* Throughput – up to 5× faster than the previous Nemotron Super version.
* Accuracy – up to 2× higher.
* Support for a context window of 1 million tokens allows agents to retain the full state of their workflow, reducing the risk of drifting from goals.
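In practice, an agent framework still needs to check that the accumulated workflow history fits the window before each call. A minimal sketch, assuming a crude 4-characters-per-token heuristic (a real deployment would use the model's actual tokenizer):

```python
# Rough check that an agent's full workflow history fits in a 1M-token window.
# The 4-chars-per-token estimate is a crude assumption for illustration;
# use the model's real tokenizer for production budgeting.
CONTEXT_LIMIT = 1_000_000

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_in_context(history: list[str], limit: int = CONTEXT_LIMIT) -> bool:
    return sum(estimate_tokens(step) for step in history) <= limit

fits_in_context(["plan the task", "tool call: search(...)", "observation: ..."])  # True for a short trace
```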
Practical Applications
Nemotron 3 Super is well suited for complex tasks within multi‑agent systems:
| Task | Example Use |
|---|---|
| Code generation and debugging without splitting a codebase across prompts | Automatic writing and testing of large programs |
| Financial analysis | Incorporating thousands of report pages into the model’s memory |
Training
The model was trained on synthetic data generated with reasoning models. Nvidia discloses the full methodology:
* over 10 trillion tokens of pretraining and post-training data;
* 15 environments for reinforcement learning;
* evaluation recipes.
Researchers can use the Nvidia NeMo platform to fine‑tune or create their own versions of the model.
Technical Details
* NVFP4 support on the Nvidia Blackwell architecture.
* Reduced memory requirements and fourfold inference speed improvement compared with FP8 on Nvidia Hopper, without loss of accuracy.
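The memory savings come from storing weights as 4-bit floats with a shared scale per small block of values. The sketch below captures that block-scaled idea with the E2M1 magnitude set representable in 4 bits; the block size of 4 used in the example and the full-precision scale are simplifications (the actual NVFP4 format uses small fixed-size blocks with compactly encoded scales, per Nvidia's documentation).

```python
# Toy sketch of block-scaled 4-bit float quantization in the spirit of NVFP4.
# E2M1 (2 exponent bits, 1 mantissa bit) can represent these magnitudes;
# a shared per-block scale stretches them to cover each block's range.
E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(block: list[float]) -> tuple[float, list[float]]:
    """Return (scale, 4-bit-representable codes) for one block of weights."""
    scale = max(abs(v) for v in block) / 6.0 or 1.0   # map the block max onto 6.0
    codes = []
    for v in block:
        mag = min(E2M1, key=lambda e: abs(abs(v) / scale - e))  # nearest representable magnitude
        codes.append(mag if v >= 0 else -mag)
    return scale, codes

def dequantize_block(scale: float, codes: list[float]) -> list[float]:
    return [scale * c for c in codes]

scale, codes = quantize_block([0.1, -0.7, 2.4, 0.0])
restored = dequantize_block(scale, codes)
```

Each weight then needs only 4 bits plus a small amortized share of the block scale, versus 8 bits for FP8, which is where the memory reduction comes from.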
Availability
The model is already available:
* Via build.nvidia.com, Hugging Face, OpenRouter, and Perplexity.
* Cloud partners: Google Cloud Vertex AI, Oracle Cloud Infrastructure, CoreWeave, Together AI, Baseten, Cloudflare, DeepInfra, Fireworks AI, Modal.
* As an Nvidia NIM microservice, enabling local or cloud deployment.
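NIM microservices expose an OpenAI-compatible HTTP API, so calling a deployed model is a standard chat-completions request. The sketch below only builds the request body; the model id `nvidia/nemotron-3-super` and the local endpoint URL are assumptions — look up the real identifiers on build.nvidia.com.

```python
import json

# Sketch of a chat request for an OpenAI-compatible endpoint such as a
# locally deployed NIM microservice. Model id and URL are assumptions;
# check build.nvidia.com for the actual values.
ENDPOINT = "http://localhost:8000/v1/chat/completions"  # typical local NIM port

def build_chat_request(prompt: str, model: str = "nvidia/nemotron-3-super") -> str:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return json.dumps(payload)

body = build_chat_request("Summarize this repository's open issues.")
# Send with any HTTP client, e.g.:
#   requests.post(ENDPOINT, data=body, headers={"Content-Type": "application/json"})
```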
Nemotron 3 Super opens new possibilities for agent-based AI, combining high accuracy, scalability, and customization flexibility.