Nvidia has unveiled the Groq 3 LPU chip, an inference accelerator that sharply raises the speed at which AI models deliver tokens.

Nvidia Reveals New Capabilities for the Vera Rubin Platform

At this year’s GTC conference, Nvidia CEO Jensen Huang announced an expansion of the Vera Rubin platform. The new capabilities are built on intellectual property acquired from Groq, and the platform now includes the *Groq 3 LPU* chip, an inference accelerator designed to deliver tokens at high speed with low latency.

What Already Exists in Vera Rubin
The platform consists of six key components that Nvidia assembles into rack‑mounted systems and scales up to large AI factories:

| Component | Description |
| --- | --- |
| Rubin GPU | Graphics processor with 288 GB of HBM4 |
| Vera CPU | Central processor |
| NVLink 6 | Intra‑system (scale‑up) interconnect |
| ConnectX‑9 | Smart network adapter |
| BlueField‑4 | Data processing unit |
| Spectrum‑X | Inter‑system (scale‑out) switch with integrated optics |

The Groq 3 LPU joins this lineup as a new building block for large-scale deployments.

Why the Groq 3 LPU Stands Out
The main difference is memory architecture. While most accelerators use HBM as working memory, each Groq 3 LPU contains 500 MB of SRAM. Comparison:

| Parameter | Rubin GPU (HBM4) | Groq 3 LPU (SRAM) |
| --- | --- | --- |
| Capacity | 288 GB | 0.5 GB |
| Bandwidth | ~22 TB/s | up to 150 TB/s |

For bandwidth‑sensitive inference tasks, the advantage of SRAM is clear: during autoregressive decoding, speed is limited by how fast weights can be streamed from memory rather than by raw compute. That is why Nvidia included Groq 3 in Rubin: to increase token delivery speed.
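As a rough illustration of that point (not from the article), a simple roofline estimate shows how memory bandwidth caps decode speed. The model size and byte-per-weight figure below are illustrative assumptions; the bandwidth numbers come from the comparison table above.

```python
# Back-of-envelope roofline estimate: in bandwidth-bound decoding, every
# generated token must stream (roughly) all model weights from memory,
# so the token rate is capped by bandwidth / bytes-per-token.

def max_tokens_per_s(bandwidth_tb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed for a bandwidth-bound model."""
    bytes_per_token = model_size_gb * 1e9    # weights read once per token
    bandwidth_bytes = bandwidth_tb_s * 1e12  # TB/s -> bytes/s
    return bandwidth_bytes / bytes_per_token

MODEL_GB = 70  # assumption: a 70B-parameter model at 1 byte per weight

print(f"HBM4 (~22 TB/s):  {max_tokens_per_s(22, MODEL_GB):,.0f} tokens/s ceiling")
print(f"SRAM (150 TB/s): {max_tokens_per_s(150, MODEL_GB):,.0f} tokens/s ceiling")
# -> ~314 vs ~2,143 tokens/s: for single-stream decoding, the bandwidth
#    gap translates almost directly into the token-rate gap.
```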

Groq 3 LPX Rack
The rack contains 256 Groq 3 LPU chips, providing (see the sanity check after this list):

- 128 GB of SRAM
- 40 PB/s total bandwidth
- 640 TB/s intra‑system interface
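The first two rack-level figures follow directly from the per-chip specs quoted earlier; a quick arithmetic check (per-chip numbers from the article, the 640 TB/s interface figure is quoted as-is):

```python
# Sanity-check the rack totals against the per-chip specs in the article.
CHIPS_PER_RACK = 256
SRAM_GB_PER_CHIP = 0.5   # 500 MB of SRAM per Groq 3 LPU
BW_TB_S_PER_CHIP = 150   # up to 150 TB/s per chip

total_sram_gb = CHIPS_PER_RACK * SRAM_GB_PER_CHIP
total_bw_pb_s = CHIPS_PER_RACK * BW_TB_S_PER_CHIP / 1000  # TB/s -> PB/s

print(f"Total SRAM:      {total_sram_gb:.0f} GB")    # 128 GB, as stated
print(f"Total bandwidth: {total_bw_pb_s:.1f} PB/s")  # 38.4 PB/s, i.e. ~40 PB/s
```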

Ian Buck, Nvidia’s Vice President of Hyperscale and HPC, called this rack a coprocessor for Rubin, emphasizing its role in boosting decoding performance at every model layer and for every token.

Impact on Multi‑Agent Systems
Buck noted that the Groq 3 LPX will be a key element for the coming wave of the AI market: multi‑agent systems. When agents exchange data directly with one another rather than with a human through a chatbot, response requirements change from roughly 100 tokens/s to 1,500 tokens/s and beyond.
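A hypothetical illustration of why that threshold jumps (the pipeline depth and token budget below are assumptions, not from the article):

```python
# Hypothetical agent pipeline: each agent must consume the previous agent's
# full output before producing its own, so end-to-end latency scales with
# (pipeline depth * tokens per step) / (tokens per second). Machine-to-machine
# loops therefore need far higher token rates than a human chat session.
AGENTS = 5             # assumed pipeline depth
TOKENS_PER_STEP = 300  # assumed output length per agent

for rate in (100, 1500):  # tokens/s: human-chat pace vs agent-to-agent target
    latency_s = AGENTS * TOKENS_PER_STEP / rate
    print(f"{rate:>5} tokens/s -> {latency_s:5.1f} s end-to-end")
# ->   100 tokens/s -> 15.0 s (too slow for an interactive agent loop)
#     1500 tokens/s ->  1.0 s
```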

Competitors and Outlook
Nvidia is not alone here: competitor Cerebras uses a Wafer‑Scale Engine (WSE) with massive on‑chip SRAM for low‑latency inference, and OpenAI has already deployed Cerebras hardware for its cutting‑edge models thanks to that favorable latency.

Buck also noted that the introduction of the Groq 3 LPU could reduce dependence on the Rubin CPX accelerator. While Nvidia focuses on integrating the Groq 3 LPX rack into the platform, both chips are aimed at boosting inference: the CPX leans on large amounts of GDDR7 memory, while the LPU does without it, relying on fast on‑chip SRAM instead.

Conclusion

The new Groq 3 LPU chip and the LPX rack built around it strengthen Vera Rubin in the low‑latency inference segment, paving the way for faster multi‑agent AI systems and positioning Nvidia against players such as Cerebras.
