NVIDIA Unveils Its Newest ‘Rubin CPX’ AI GPUs, Featuring 128 GB GDDR7 Memory & Targeted Towards High-Value Inference Workloads

Muhammad Zuhair
NVIDIA Rubin CPX | Image Credits: NVIDIA

In a surprise move, NVIDIA has unveiled a 'new class' of AI GPUs with the Rubin CPX, an AI chip that offers immense inferencing power when deployed in a rack-scale cluster.

NVIDIA's Rubin CPX GPU Will Be Available In a Rack-Scale Configuration, Scaling To New Performance Levels

Team Green has realized that AI inferencing is likely the next area to focus on when it comes to computing capabilities, and the firm has now announced a new class of AI chips under the 'CPX' lineup, debuting with the Rubin series. At the AI Infra Summit, NVIDIA unveiled the Rubin CPX GPU, which is targeted towards long-context AI and, more importantly, will work alongside Rubin GPUs and Vera CPUs. NVIDIA claims that the chip will bring a 'revolution' in performing AI inference efficiently.

In terms of specifications, the Rubin CPX offers 30 petaFLOPs of NVFP4 compute and 128 GB of GDDR7 memory. It will feature in the 'exclusive' NVIDIA Vera Rubin NVL144 CPX rack, which integrates 144 Rubin CPX GPUs, 144 Rubin GPUs, and 36 Vera CPUs to deliver eight exaFLOPs of NVFP4 compute. This figure alone is 7.5x higher than Blackwell Ultra, and with technologies such as Spectrum-X Ethernet, NVIDIA plans to support million-token context AI inference workloads, scaling to new levels of performance.
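To put the quoted figures in perspective, here is a rough back-of-envelope sketch in Python. It uses only the numbers mentioned above; the constant and function names are made up for illustration, and it assumes the eight-exaFLOP rack figure is a simple sum of per-chip compute, which NVIDIA has not broken down here.

```python
# Back-of-envelope arithmetic using only the figures quoted in the article.
# All names are illustrative; none of them are NVIDIA terminology.

CPX_GPUS_PER_RACK = 144           # Rubin CPX GPUs in one Vera Rubin NVL144 CPX rack
CPX_NVFP4_PFLOPS = 30             # petaFLOPs of NVFP4 compute per Rubin CPX
CPX_GDDR7_GB = 128                # GB of GDDR7 memory per Rubin CPX
RACK_NVFP4_PFLOPS = 8_000         # eight exaFLOPs, expressed in petaFLOPs
UPLIFT_VS_BLACKWELL_ULTRA = 7.5   # rack-level uplift quoted against Blackwell Ultra


def rack_summary() -> dict:
    """Derive aggregate figures for one rack from the quoted per-chip numbers."""
    cpx_pflops = CPX_GPUS_PER_RACK * CPX_NVFP4_PFLOPS
    return {
        # Combined NVFP4 compute of all Rubin CPX GPUs in the rack.
        "cpx_pflops": cpx_pflops,
        # Assumption: whatever is left of the rack total comes from the Rubin GPUs.
        "non_cpx_pflops": RACK_NVFP4_PFLOPS - cpx_pflops,
        # Total GDDR7 capacity across the Rubin CPX GPUs only.
        "cpx_gddr7_gb": CPX_GPUS_PER_RACK * CPX_GDDR7_GB,
        # Baseline implied by dividing the rack total by the quoted 7.5x uplift.
        "implied_baseline_pflops": RACK_NVFP4_PFLOPS / UPLIFT_VS_BLACKWELL_ULTRA,
    }


if __name__ == "__main__":
    for name, value in rack_summary().items():
        print(f"{name}: {value:,.0f}")
```

On these numbers, the Rubin CPX GPUs would account for 4,320 petaFLOPs, a little over half of the rack total, and would carry 18,432 GB of GDDR7 between them. The implied Blackwell Ultra baseline works out to roughly 1.07 exaFLOPs, though that is only what the quoted ratio suggests and not a figure stated in the announcement.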

The platform is claimed to deliver "30x to 50x return on investment", and NVIDIA says the Vera Rubin NVL144 CPX rack will break the computing barriers present in "building the next generation of generative AI applications". Rubin CPX will also be available in other configurations, but they are yet to be announced. The chip is seen as a relatively low-cost solution, given that it uses GDDR7 memory rather than HBM.

Team Green is covering all corners of the AI industry, leaving competitors little room to outpace it. NVIDIA has now swiftly shifted its focus towards inferencing, and with the next-gen Rubin AI lineup launching next year, we could see a huge leap in computing capabilities.

About the author: Muhammad Zuhair is a hardware and technology reporter for Wccftech, specializing in the semiconductor industry and the complex interplay between technology, manufacturing, and geopolitics. His coverage focuses on the corporate strategies and technological roadmaps of industry giants like TSMC, NVIDIA, Samsung, and Intel. Zuhair's expertise lies in deconstructing complex topics such as fabrication nodes (e.g., 2nm process), the economic impact of policies like the CHIPS Act, and the strategic development of AI infrastructure from NVIDIA, AMD and Intel.
