Nvidia unveiled Nemotron 3 Super, a 120B-parameter open LLM that delivers up to a fivefold speed boost for AI agents.
New Nvidia Model – Nemotron 3 Super
Nvidia announced Nemotron 3 Super, an open AI model of the Mixture‑of‑Experts (MoE) type.
* 120 billion total parameters, of which 12 billion are active.
* Designed for agent-based AI—systems where multiple “agents” interact with each other and with the external world.
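The "120 billion total, 12 billion active" split comes from MoE routing: each token is sent to only a few expert sub-networks, so most parameters sit idle on any given forward pass. The toy router below illustrates the idea; the expert count, top-k, and scoring here are illustrative assumptions, not Nemotron 3's actual configuration.

```python
import random

# Toy Mixture-of-Experts routing sketch (illustrative only; the real
# Nemotron 3 router, expert count, and top-k are not described in detail).
NUM_EXPERTS = 10   # assumption: 10 experts makes the 12B/120B = 1/10 ratio easy to see
TOP_K = 1          # assumption: each token is routed to a single expert

def route(token_hidden: list[float], num_experts: int = NUM_EXPERTS, k: int = TOP_K) -> list[int]:
    """Pick the k experts with the highest (toy, random-weight) router scores."""
    scores = [sum(h * random.random() for h in token_hidden) for _ in range(num_experts)]
    ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
    return ranked[:k]

# Per token, only k of num_experts expert blocks execute, so roughly
# k/num_experts of the expert parameters are "active" -- the same idea
# behind 12B active parameters out of 120B total.
active_fraction = TOP_K / NUM_EXPERTS
```

Because only the routed experts run, inference cost scales with the active parameters (12B), not the total (120B).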
Architecture
The model uses a hybrid Mamba‑Transformer approach (combining Mamba layers and Transformer elements).
Nemotron 3 Super is the first model to combine the LatentMoE paradigm, Multi‑Token Prediction layers, and pretraining in the NVFP4 4‑bit format. According to Nvidia, this stack improves accuracy and speeds up inference.
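The motivation for mixing Mamba layers into a Transformer is memory behavior on long sequences: a Mamba-style layer carries a fixed-size recurrent state per token, while attention must retain keys and values for every previous token. The sketch below shows the recurrence shape only; real Mamba layers use input-dependent (selective) state-space parameters and multi-dimensional states, whereas this toy uses a single fixed decay.

```python
# Toy sketch of why Mamba-style layers help long contexts: each token updates
# a constant-size state instead of appending to a growing KV cache.
# (Real Mamba uses input-dependent selective state-space parameters; the
# fixed decay below is a simplification for illustration.)
def mamba_like_scan(xs: list[float], decay: float = 0.9) -> list[float]:
    h = 0.0                                  # state size is constant, regardless of sequence length
    out = []
    for x in xs:
        h = decay * h + (1.0 - decay) * x    # O(1) memory and compute per step
        out.append(h)
    return out

# A pure-attention layer, by contrast, keeps keys/values for all previous
# tokens, so per-token memory grows linearly with context length.
```

At a 1M-token context, that constant-per-token state is exactly what keeps the memory footprint manageable.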
Performance
* Throughput – up to 5× faster than the previous Nemotron Super version.
* Accuracy – up to 2× higher.
* Support for a context window of 1 million tokens allows agents to retain the full state of their workflow, reducing the risk of drifting from goals.
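In practice, an agent framework still needs to check that the accumulated workflow history fits the window before each call. A minimal sketch, assuming a crude 4-characters-per-token heuristic (a real deployment would use the model's actual tokenizer):

```python
# Rough check that an agent's full workflow history fits in a 1M-token window.
# The 4-chars-per-token estimate is a crude assumption for illustration;
# use the model's real tokenizer for production budgeting.
CONTEXT_LIMIT = 1_000_000

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_in_context(history: list[str], limit: int = CONTEXT_LIMIT) -> bool:
    return sum(estimate_tokens(step) for step in history) <= limit

fits_in_context(["plan the task", "tool call: search(...)", "observation: ..."])  # True for a short trace
```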
Practical Applications
Nemotron 3 Super is well suited for complex tasks within multi‑agent systems:
| Task | Example Use |
|---|---|
| Code generation and debugging without splitting a codebase across prompts | Automatic writing and testing of large programs |
| Financial analysis | Incorporating thousands of report pages into the model’s memory |
Training
The model was trained on synthetic data generated with reasoning models. Nvidia discloses the full methodology:
* over 10 trillion tokens of pretraining and post-training data;
* 15 environments for reinforcement learning;
* evaluation recipes.
Researchers can use the Nvidia NeMo platform to fine‑tune or create their own versions of the model.
Technical Details
* NVFP4 support on the Nvidia Blackwell architecture.
* Reduced memory requirements and fourfold inference speed improvement compared with FP8 on Nvidia Hopper, without loss of accuracy.
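The memory savings come from storing weights as 4-bit floats with a shared scale per small block of values. The sketch below captures that block-scaled idea with the E2M1 magnitude set representable in 4 bits; the block size of 4 used in the example and the full-precision scale are simplifications (the actual NVFP4 format uses small fixed-size blocks with compactly encoded scales, per Nvidia's documentation).

```python
# Toy sketch of block-scaled 4-bit float quantization in the spirit of NVFP4.
# E2M1 (2 exponent bits, 1 mantissa bit) can represent these magnitudes;
# a shared per-block scale stretches them to cover each block's range.
E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(block: list[float]) -> tuple[float, list[float]]:
    """Return (scale, 4-bit-representable codes) for one block of weights."""
    scale = max(abs(v) for v in block) / 6.0 or 1.0   # map the block max onto 6.0
    codes = []
    for v in block:
        mag = min(E2M1, key=lambda e: abs(abs(v) / scale - e))  # nearest representable magnitude
        codes.append(mag if v >= 0 else -mag)
    return scale, codes

def dequantize_block(scale: float, codes: list[float]) -> list[float]:
    return [scale * c for c in codes]

scale, codes = quantize_block([0.1, -0.7, 2.4, 0.0])
restored = dequantize_block(scale, codes)
```

Each weight then needs only 4 bits plus a small amortized share of the block scale, versus 8 bits for FP8, which is where the memory reduction comes from.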
Availability
The model is already available:
* Via build.nvidia.com, Hugging Face, OpenRouter, and Perplexity.
* Cloud partners: Google Cloud Vertex AI, Oracle Cloud Infrastructure, CoreWeave, Together AI, Baseten, Cloudflare, DeepInfra, Fireworks AI, Modal.
* As an Nvidia NIM microservice, enabling local or cloud deployment.
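NIM microservices expose an OpenAI-compatible HTTP API, so calling a deployed model is a standard chat-completions request. The sketch below only builds the request body; the model id `nvidia/nemotron-3-super` and the local endpoint URL are assumptions — look up the real identifiers on build.nvidia.com.

```python
import json

# Sketch of a chat request for an OpenAI-compatible endpoint such as a
# locally deployed NIM microservice. Model id and URL are assumptions;
# check build.nvidia.com for the actual values.
ENDPOINT = "http://localhost:8000/v1/chat/completions"  # typical local NIM port

def build_chat_request(prompt: str, model: str = "nvidia/nemotron-3-super") -> str:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return json.dumps(payload)

body = build_chat_request("Summarize this repository's open issues.")
# Send with any HTTP client, e.g.:
#   requests.post(ENDPOINT, data=body, headers={"Content-Type": "application/json"})
```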
Nemotron 3 Super opens new possibilities for agent-based AI, combining high accuracy, scalability, and customization flexibility.