Here’s a Look at One of the World’s Most Complex AI Systems, the NVIDIA Vera Rubin, Integrating a Million Components

Feb 25, 2026 at 11:20am EST

NVIDIA's next-gen Vera Rubin is now in full production, and the company has provided an extensive overview of the rack architecture, diving into individual components.

NVIDIA is set to deliver major rack-level upgrades with Vera Rubin, which we'll discuss in depth. A recent CNBC video diving into the Vera Rubin architecture offered an extensive look at multiple components, ranging from the main compute node to networking and cooling elements. More importantly, NVIDIA's Senior Director of Infrastructure, Dion Harris, calls Vera Rubin one of the "world's most complex AI systems," arguing that what NVIDIA does is unique and difficult to execute.


Given that Rubin is expected to see customer commitments soon, it is worth looking at what an NVL72 rack actually contains. One of the most essential elements of the rack is, of course, the Vera Rubin SuperChip itself. We have already covered the Rubin GPU and Vera CPU configuration from a technical perspective, but one important point is that major performance improvements come from NVIDIA integrating HBM4 with the GPU, along with dedicated SOCAMM modules. Altogether, memory bandwidth reaches a whopping 1.2 TB/s.

Vera Rubin also brings a major upgrade in cooling, since Team Green plans to integrate modular liquid-cooling designs that cover SuperChip elements such as the Rubin GPU and Vera CPU through dedicated cold plates. NVIDIA's executives argue that Rubin deployments will convince hyperscalers to switch to upgraded liquid-cooling systems, and, interestingly, the current implementation reduces water use, another benefit touted by NVIDIA.

NVLink is an important aspect of Vera Rubin NVL72, and with the 6th-generation interconnect fabric, often called the "NVLink Spine", NVIDIA plans to deliver a total aggregate bandwidth of 260 TB/s per rack. Harris says that with the latest NVLink generation, the company has taken modularity to a whole new level, and NVIDIA claims the NVLink 6 spine supports zero-downtime maintenance and rack-level RAS services.
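
To put the 260 TB/s figure in perspective, here is a rough back-of-the-envelope sketch of what it works out to per GPU. It assumes that "NVL72" means 72 Rubin GPUs share the rack's NVLink fabric and that the aggregate is split evenly between them; the function and constant names are purely illustrative.

```python
# Assumption: "NVL72" denotes 72 Rubin GPUs sharing the rack's NVLink fabric.
GPUS_PER_RACK = 72
# Aggregate NVLink bandwidth per rack, in TB/s, as quoted in the article.
AGGREGATE_NVLINK_TBPS = 260.0


def per_gpu_bandwidth_tbps(aggregate_tbps: float, gpu_count: int) -> float:
    """Evenly divide a rack's aggregate bandwidth across its GPUs."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return aggregate_tbps / gpu_count


per_gpu = per_gpu_bandwidth_tbps(AGGREGATE_NVLINK_TBPS, GPUS_PER_RACK)
print(f"{per_gpu:.2f} TB/s per GPU")  # 3.61 TB/s per GPU
```

Under those assumptions, each GPU gets roughly 3.6 TB/s of NVLink bandwidth.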

While estimates suggest that Vera Rubin will debut with a decent price hike, NVIDIA says that the architecture brings a 10x reduction in inference token cost and a 4x reduction in the number of GPUs needed to train MoE models versus Blackwell GB200, which means that the "more you buy, the more you save" rule by NVIDIA's CEO is still intact.
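
As a rough illustration of what those claimed multipliers would mean in practice, the sketch below applies them to a made-up Blackwell baseline. The 10x and 4x factors are NVIDIA's claims as reported above; the baseline GPU count and token cost are hypothetical placeholders, not real figures, and the 72-GPU rack size is an assumption based on the NVL72 name.

```python
import math

# NVIDIA's claimed improvement factors vs Blackwell GB200 (from the article).
INFERENCE_COST_REDUCTION = 10
TRAINING_GPU_REDUCTION = 4
# Assumption: an NVL72 rack holds 72 GPUs.
GPUS_PER_RACK = 72


def rubin_token_cost(blackwell_cost: float) -> float:
    """Token cost on Rubin if the claimed 10x reduction holds."""
    return blackwell_cost / INFERENCE_COST_REDUCTION


def rubin_gpus_needed(blackwell_gpus: int) -> int:
    """GPUs needed on Rubin if the claimed 4x reduction holds."""
    return math.ceil(blackwell_gpus / TRAINING_GPU_REDUCTION)


def racks_needed(gpus: int) -> int:
    """Whole racks required to host the given number of GPUs."""
    return math.ceil(gpus / GPUS_PER_RACK)


# Hypothetical baseline: a MoE training job on 1,152 Blackwell GPUs (16 racks)
# and an arbitrary token cost of 2.00 units per million tokens.
baseline_gpus = 1152
baseline_cost = 2.00

rubin_gpus = rubin_gpus_needed(baseline_gpus)
print(rubin_gpus, "GPUs in", racks_needed(rubin_gpus), "racks")  # 288 GPUs in 4 racks
print(f"{rubin_token_cost(baseline_cost):.2f} units per million tokens")  # 0.20
```

Whether that translates into actual savings depends on the rack price, which the claimed multipliers do not account for.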

About the author: Muhammad Zuhair is a hardware and technology reporter for Wccftech, specializing in the semiconductor industry and the complex interplay between technology, manufacturing, and geopolitics. His coverage focuses on the corporate strategies and technological roadmaps of industry giants like TSMC, NVIDIA, Samsung, and Intel. Zuhair's expertise lies in deconstructing complex topics such as fabrication nodes (e.g., 2nm process), the economic impact of policies like the CHIPS Act, and the strategic development of AI infrastructure from NVIDIA, AMD and Intel.
