Meta has shared the building blocks of its Catalina AI system, which is based on NVIDIA's GB200 NVL72 solution with Open Rack v3 & Liquid Cooling.
Meta's Custom NVIDIA GB200 NVL72 Blackwell Platform, The Catalina Pod, is Liquid Cooling-Ready & Open Rack v3 Compliant
Back in 2022, Meta mainly focused on clusters of around 6,000 GPUs. These were designed mostly for traditional ranking and recommendation models, running workloads that spanned 128-512 GPUs.

A year later, thanks to the advent of GenAI and LLMs, clusters grew to 16-24K GPUs (a 4x increase), and just last year, Meta was running 100,000 GPUs and continuing to add more. Meta is also a software enabler with models such as Llama, and it anticipates a 10x increase in cluster sizes over the next few years.

Meta states that it started on the Catalina project very early with NVIDIA, using the NVL72 GPU solution as the baseline. While the name is the same, Meta's version switches to an NVL36x2 configuration. Meta also worked with NVIDIA to customize the system to meet its needs, and both contributed the reference designs for MGX and NVL72 to open source, with Catalina available on the Open Compute website.

Catalina is what Meta is deploying in its data centers. Meta calls each system a pod, and it essentially copies and pastes the pod to scale.
One difference between the standard NVL72 and Meta's custom version is that Catalina uses two IT racks that together form a single 72-GPU scale-up domain. Each of these IT racks has the same configuration: 18 compute trays split between the top and the bottom of the rack, and nine NVSwitches. The two IT racks sit on the left and the right, and between them is a big, thick bundle of cables.

This bundle allows all of the GPUs across the two racks to be combined, connecting through the NVSwitches to create a single 72-GPU scale-up domain. On the far left and right of the racks are large ALCs, or air-assisted liquid cooling devices. These allow Meta to deploy liquid-cooled, high-power-density racks into its existing data centers, which are deployed all over the US and the world.
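
The rack and tray counts above imply a simple layout, sketched below as arithmetic. The figures for the domain size, rack count, compute trays, and NVSwitches come from the article; the assumption that GPUs are spread evenly across racks and trays is ours, and the function name `pod_layout` is purely illustrative.

```python
# Figures quoted in the article.
GPUS_PER_DOMAIN = 72         # single scale-up domain
RACKS_PER_POD = 2            # two IT racks per Catalina pod
COMPUTE_TRAYS_PER_RACK = 18
NVSWITCHES_PER_RACK = 9


def pod_layout():
    """Derive per-rack and per-tray counts, assuming an even split (our assumption)."""
    gpus_per_rack = GPUS_PER_DOMAIN // RACKS_PER_POD
    gpus_per_tray = gpus_per_rack // COMPUTE_TRAYS_PER_RACK
    return {
        "gpus_per_rack": gpus_per_rack,
        "gpus_per_tray": gpus_per_tray,
        "compute_trays_per_pod": COMPUTE_TRAYS_PER_RACK * RACKS_PER_POD,
        "nvswitches_per_pod": NVSWITCHES_PER_RACK * RACKS_PER_POD,
    }


if __name__ == "__main__":
    for name, value in pod_layout().items():
        print(f"{name}: {value}")
```

Under that even-split assumption, each rack holds 36 GPUs, which lines up with the NVL36x2 name.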

Meta states that with two racks, it can increase the number of CPUs and the total amount of memory, going from 17 TB to 34 TB of LPDDR memory. That helps it reach up to 48 TB of total cache-coherent memory shared between the GPUs and the CPUs. The PSU takes 480 volts or 277 volts single-phase and converts it to 48 volts DC, which is distributed through the busbar in the back. That powers all of the individual server blades, NVSwitches, and networking devices within the rack.
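
As a rough check on the memory figures above, the sketch below works out how much of the 48 TB would be GPU memory. It assumes that the total is simply LPDDR plus GPU memory, that the quoted numbers are rounded, and that 1 TB is 1,000 GB; none of these assumptions is confirmed by Meta, and the function name is illustrative.

```python
# Figures quoted in the article (all rounded).
LPDDR_BASELINE_TB = 17   # before moving to two racks
LPDDR_CATALINA_TB = 34   # with two racks
TOTAL_COHERENT_TB = 48   # total cache-coherent memory, GPUs plus CPUs
GPUS = 72


def implied_gpu_memory():
    """Return (total GPU memory in TB, per-GPU memory in GB).

    Assumes total = LPDDR + GPU memory and decimal units (our assumptions).
    """
    gpu_total_tb = TOTAL_COHERENT_TB - LPDDR_CATALINA_TB
    per_gpu_gb = gpu_total_tb * 1000 / GPUS
    return gpu_total_tb, per_gpu_gb


if __name__ == "__main__":
    gpu_total_tb, per_gpu_gb = implied_gpu_memory()
    print(f"LPDDR increase: {LPDDR_CATALINA_TB / LPDDR_BASELINE_TB:.0f}x")
    print(f"Implied GPU memory: {gpu_total_tb} TB, about {per_gpu_gb:.0f} GB per GPU")
```

With the rounded inputs, this leaves about 14 TB for the GPUs, or a little under 200 GB each.
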
At the top of each rack there is one power supply shelf, with two more at the bottom. Meta also has its own fiber patch panel, which is where all of the in-rack fiber cabling for the back-end network is connected. From there, it goes out to the data center to connect to the networking switches that sit at the end of the row for the scale-up domain. There's also the rack management controller, a Wedge 400, which is a front-end network switch, and several IT and switch trays.

To support all of this, Meta requires a range of new technologies, some of which are already part of the NVIDIA GB200 NVL72 Blackwell system. A few are unique to Meta. One is the high-power version of its Open Rack, essentially higher power supplies and CPUs. Another is liquid cooling, specifically the air-assisted liquid cooling needed to support those racks in traditional data centers. Then there is the rack management controller, which is basically a safety and orchestration device that helps enable and disable cooling and also monitors for leaks in the racks. Finally, there is Meta's network topology, the disaggregated scheduled fabric, which is what allows it to connect multiple pods to make larger clusters.
This is also the first deployment of Meta's high-power rack version of Open Rack v3. This allows Meta to increase the amount of power for each rack up to 94 kW for the busbar (600A). It also supports newer buildings that have facility liquid cooling, which lets you run liquid straight to the rack. To manage liquid, Meta is using the RMC, or Rack Management Controller. It sits within the rack and constantly monitors a number of different components for leaks. It is placed safely at the top of the rack so that if there is a leak, the liquid doesn't drip onto it and shut it off. The RMC connects to the ALCs and helps shut them off, or connects to the valve train at the facility level, which shuts off the valves for the liquid coming in from the building.
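
The leak-handling behavior described above can be sketched as a small decision function. This is a minimal illustration only: the sensor names, the `RackState` type, and the action strings are all hypothetical, and Meta's actual RMC logic is not public in this article.

```python
from dataclasses import dataclass
from enum import Enum


class CoolingSource(Enum):
    ALC = "air-assisted liquid cooling unit"
    FACILITY = "facility liquid loop"


@dataclass
class RackState:
    """Hypothetical snapshot of what an RMC might observe."""
    cooling_source: CoolingSource
    leak_sensors: dict  # sensor name -> True if a leak is detected


def respond_to_leaks(rack: RackState) -> list:
    """Return the shutoff actions to take for the given rack state.

    Mirrors the article's description: on a leak, either shut off the
    ALCs or close the facility-level valve train.
    """
    leaking = [name for name, wet in rack.leak_sensors.items() if wet]
    if not leaking:
        return []  # nothing detected, keep cooling running
    if rack.cooling_source is CoolingSource.ALC:
        return ["shut off ALC"]
    return ["close facility valve train"]


if __name__ == "__main__":
    rack = RackState(CoolingSource.ALC, {"tray_3": True, "manifold": False})
    print(respond_to_leaks(rack))
```

A real controller would also need debouncing, alerting, and coordination with the compute trays, none of which is modeled here.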

Meta is also using its own disaggregated scheduled fabric for Catalina. This allows it to connect multiple pods together within a single data center building or suite, to connect multiple buildings together, and possibly to go even larger than that to provide very large-scale clusters. The fabric is tuned for AI and helps provide flexibility and speed. This is essentially how all the GPUs talk to each other.