Nvidia unveiled Nemotron 3 Super 120B, an open-weight LLM that delivers up to a fivefold speed boost for AI agents.

New Nvidia Model – Nemotron 3 Super

Nvidia announced the launch of Nemotron 3 Super, an open Mixture-of-Experts (MoE) AI model.

* 120 billion total parameters, of which 12 billion are active.
* Designed for agent-based AI—systems where multiple “agents” interact with each other and with the external world.
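
The 12B-active / 120B-total split is the defining MoE property: a router sends each token to a small subset of experts, so only a fraction of the weights run per forward pass. Below is a minimal, scaled-down sketch of top-1 expert routing (toy sizes, random weights; not Nvidia's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: 10 experts, the router picks the top-1 expert per token.
# Scaled-down illustration only; Nemotron's actual routing is not shown here.
n_experts, d_model = 10, 16
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route each token through its single best-scoring expert."""
    logits = x @ router_w                  # (tokens, n_experts) router scores
    choice = logits.argmax(axis=-1)        # winning expert per token
    out = np.empty_like(x)
    for i, e in enumerate(choice):
        out[i] = x[i] @ experts[e]         # only 1 of 10 experts computes
    return out, choice

tokens = rng.standard_normal((4, d_model))
out, choice = moe_forward(tokens)

# Each token touches one expert's weights: 10% of the expert parameters,
# mirroring the 12B-active-of-120B ratio in spirit.
active_frac = 1 / n_experts
print(f"active fraction per token: {active_frac:.0%}")
```

The same idea scales up: capacity grows with total parameters, while per-token compute tracks only the active ones.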

Architecture
The model uses a hybrid Mamba‑Transformer approach (combining Mamba layers and Transformer elements).

Nemotron 3 Super is the first model to combine the LatentMoE paradigm, Multi‑Token Prediction layers, and pretraining in the NVFP4 numeric format. According to Nvidia, this stack improves accuracy and speeds up inference.
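
Multi-Token Prediction (MTP) means the model predicts several upcoming tokens from one hidden state instead of just the next one. Here is a hedged toy sketch of the idea with three lookahead heads (toy dimensions and random weights; the real MTP layers are considerably more involved):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy multi-token prediction: besides the usual next-token head, extra heads
# predict the tokens 2 and 3 steps ahead from the same hidden state.
# Purely illustrative; not the actual Nemotron architecture.
d_model, vocab, n_heads = 8, 32, 3
heads = [rng.standard_normal((d_model, vocab)) for _ in range(n_heads)]

def predict_multi(hidden):
    """Return one predicted token id per lookahead offset (+1, +2, +3)."""
    return [int((hidden @ w).argmax()) for w in heads]

hidden_state = rng.standard_normal(d_model)
preds = predict_multi(hidden_state)
print(preds)  # three token ids, one per lookahead head
```

Predicting multiple tokens per forward pass is one way such layers can raise decoding throughput, in the spirit of speculative decoding.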

Performance
* Throughput – up to 5× faster than the previous Nemotron Super version.
* Accuracy – up to 2× higher.
* Support for a context window of 1 million tokens allows agents to retain the full state of their workflow, reducing the risk of drifting from goals.

Practical Applications
Nemotron 3 Super is well suited for complex tasks within multi‑agent systems:

| Task | Example Use |
| --- | --- |
| Code generation and debugging without document fragmentation | Automatic writing and testing of large programs |
| Financial analysis | Incorporating thousands of report pages into the model's memory |

Training
The model was trained on synthetic data generated by reasoning models. Nvidia discloses the full methodology:

* over 10 trillion tokens of pre‑ and post‑training data;
* 15 environments for reinforcement learning;
* evaluation recipes.

Researchers can use the Nvidia NeMo platform to fine‑tune or create their own versions of the model.

Technical Details
* NVFP4 support on the Nvidia Blackwell architecture.
* Reduced memory requirements and fourfold inference speed improvement compared with FP8 on Nvidia Hopper, without loss of accuracy.
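
The memory saving follows from the bit widths: 4 bits per weight plus a shared scale per small block, versus 8 bits per weight in FP8. The sketch below simulates block-scaled 4-bit quantization to show the arithmetic (a simplification: real NVFP4 uses an e2m1 floating-point code, not the integer grid used here):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simplified block-scaled 4-bit quantization, illustrating why NVFP4-style
# formats roughly halve memory versus FP8: 4 bits per weight plus one
# shared scale per block. (Real NVFP4 is an e2m1 float code, not this grid.)
BLOCK = 16

def quantize_4bit(w):
    blocks = w.reshape(-1, BLOCK)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 7  # map to [-7, 7]
    q = np.clip(np.round(blocks / scales), -7, 7).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    return (q * scales).reshape(-1)

w = rng.standard_normal(256).astype(np.float32)
q, scales = quantize_4bit(w)
w_hat = dequantize(q, scales)

bits_fp8 = w.size * 8
bits_4 = w.size * 4 + scales.size * 8   # one 8-bit scale per 16-value block
print(f"FP8: {bits_fp8} bits, 4-bit + scales: {bits_4} bits")
```

The per-block scale keeps the rounding error proportional to each block's largest value, which is why the format can cut memory without a large accuracy hit.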

Availability
The model is already available:

* Via build.nvidia.com, Hugging Face, OpenRouter, and Perplexity.
* Cloud partners: Google Cloud Vertex AI, Oracle Cloud Infrastructure, CoreWeave, Together AI, Baseten, Cloudflare, DeepInfra, Fireworks AI, Modal.
* As an Nvidia NIM microservice, enabling local or cloud deployment.
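
Nvidia's hosted models on build.nvidia.com are served through an OpenAI-compatible chat-completions endpoint. The sketch below builds such a request with only the standard library; the model identifier is an assumption (check build.nvidia.com for the exact id), and the call is only sent if an API key is present:

```python
import json
import os
import urllib.request

# Hedged sketch of querying the hosted model via build.nvidia.com's
# OpenAI-compatible endpoint. MODEL_ID is a guess, not a confirmed id.
API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL_ID = "nvidia/nemotron-3-super"  # hypothetical identifier

payload = {
    "model": MODEL_ID,
    "messages": [{"role": "user", "content": "Plan the steps to refactor this module."}],
    "max_tokens": 256,
}

def send(payload, api_key):
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if os.environ.get("NVIDIA_API_KEY"):   # only call out when a key is configured
    print(send(payload, os.environ["NVIDIA_API_KEY"]))
```

The same payload shape works against any of the OpenAI-compatible cloud partners listed above, with only the base URL and credentials changing.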

Nemotron 3 Super opens new possibilities for agent-based AI, combining high accuracy, scalability, and customization flexibility.
