NVIDIA Rubin CPX GPU Is Designed For Super AI Tasks Including Million-Token Coding & GenAI, Up To 128 GB GDDR7 Memory, 30 PFLOPs of FP4

Hassan Mujtaba
NVIDIA Rubin CPX | Image Credits: NVIDIA

NVIDIA is unveiling new details of its next-gen Rubin AI platform, which will feature Vera CPUs alongside a new Rubin CPX chip with up to 128 GB GDDR7 memory.

NVIDIA Rubin AI Platform Doubles Down On AI With Groundbreaking Speed & Efficiency, Rubin CPX GPUs Offer Up To 128 GB GDDR7 Memory

NVIDIA has already shared a lot of information about its next-gen Rubin AI platforms and even teased its next-next-gen Feynman platform. Today, NVIDIA is providing additional information on its Rubin GPUs and the respective platform, which will feature a range of new technologies such as Vera CPUs and the ConnectX-9 SuperNICs.

NVIDIA today announced NVIDIA Rubin CPX, a new class of GPU purpose-built for massive-context processing. This enables AI systems to handle million-token software coding and generative video with groundbreaking speed and efficiency.

Rubin CPX works hand in hand with NVIDIA Vera CPUs and Rubin GPUs inside the new NVIDIA Vera Rubin NVL144 CPX platform. This integrated NVIDIA MGX system packs 8 exaflops of AI compute to provide 7.5x more AI performance than NVIDIA GB300 NVL72 systems, as well as 100TB of fast memory and 1.7 petabytes per second of memory bandwidth in a single rack. A dedicated Rubin CPX compute tray will also be offered for customers looking to reuse existing Vera Rubin 144 systems.

NVIDIA Rubin CPX enables the highest performance and token revenue for long-context processing — far beyond what today’s systems were designed to handle. This transforms AI coding assistants from simple code-generation tools into sophisticated systems that can comprehend and optimize large-scale software projects.

To process video, AI models can take up to 1 million tokens for an hour of content, pushing the limits of traditional GPU compute. Rubin CPX integrates video decoder and encoders, as well as long-context inference processing, in a single chip for unprecedented capabilities in long-format applications such as video search and high-quality generative video.

Built on the NVIDIA Rubin architecture, the Rubin CPX GPU uses a cost‑efficient, monolithic die design packed with powerful NVFP4 computing resources and is optimized to deliver extremely high performance and energy efficiency for AI inference tasks.

via NVIDIA

The brand-new addition to the Rubin family is a new class of GPU purpose-built for AI tasks such as million-token software coding and GenAI. These new GPUs are said to deliver "groundbreaking" speed and efficiency.

The NVIDIA Rubin CPX chips will sit alongside NVIDIA's next-gen Vera CPUs, the successor to the Grace CPU, inside the Vera Rubin NVL144 CPX platform. This is an MGX system that offers up to 8 Exaflops of AI compute, a 7.5x uplift over the Grace Blackwell GB300 NVL72 platform. The system will also offer 100 TB of fast memory and 1.7 PB/s of memory bandwidth, along with 3x higher Attention performance than GB300 NVL72.

The difference between the Vera Rubin NVL144 and Vera Rubin NVL144 CPX platforms is the addition of the CPX chips. The non-CPX platform features four Rubin GPUs, 2 Vera CPUs, and offers 3.6 Exaflops of NVFP4 compute, 1.4 PB/s of HBM4 bandwidth, 75 TB of capacity, and is planned for availability in 2H 2026.

So, to compare the CPX and Non-CPX platforms:

  • 8.0 Exaflops vs 3.6 Exaflops NVFP4
  • 1.7 PB/s vs 1.4 PB/s Memory Bandwidth
  • 100 TB vs 75 TB of Memory Capacity
  • End 2026 vs 2H 2026 Availability
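
To put the comparison above in relative terms, here is a minimal Python sketch that divides each CPX figure by its non-CPX counterpart. The numbers are the ones quoted in this article; the `specs` dictionary and the `uplift` helper are illustrative names, not anything published by NVIDIA.

```python
# Figures as quoted in the article: (CPX platform, non-CPX platform).
specs = {
    "NVFP4 compute (exaflops)": (8.0, 3.6),
    "Memory bandwidth (PB/s)": (1.7, 1.4),
    "Memory capacity (TB)": (100, 75),
}


def uplift(cpx, base):
    """Return how many times larger the CPX figure is than the non-CPX one."""
    return cpx / base


# Ratio of CPX to non-CPX for each quoted metric.
ratios = {name: uplift(cpx, base) for name, (cpx, base) in specs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

On these headline numbers, the CPX chips add far more compute (a bit over 2x) than bandwidth or capacity, which fits NVIDIA's positioning of CPX as a compute-heavy part for long-context work.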

Some features of the NVIDIA Vera Rubin CPX platform versus the Grace Blackwell platform:

  • 7.5x higher AI compute (8 Exaflops NVFP4)
  • 3.0x higher bandwidth (1.7 PB/s bandwidth)
  • 4.0x higher memory (100 TB of fast memory)
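
The multipliers above also imply a rough baseline for the Grace Blackwell system. The sketch below simply divides the quoted CPX figures by the quoted uplifts for compute and bandwidth. The results are back-of-the-envelope values derived from this article's rounded numbers, not specifications published by NVIDIA, and the variable names are illustrative.

```python
# Quoted Vera Rubin NVL144 CPX figures and their stated uplift over GB300 NVL72.
cpx_compute_ef = 8.0      # exaflops, NVFP4
compute_uplift = 7.5
cpx_bandwidth_pbs = 1.7   # petabytes per second
bandwidth_uplift = 3.0

# Implied baseline = CPX figure divided by the stated multiplier (assumes exact multipliers).
implied_gb300_compute_ef = cpx_compute_ef / compute_uplift
implied_gb300_bandwidth_pbs = cpx_bandwidth_pbs / bandwidth_uplift

print(f"Implied GB300 NVL72 compute: {implied_gb300_compute_ef:.2f} exaflops")
print(f"Implied GB300 NVL72 bandwidth: {implied_gb300_bandwidth_pbs:.2f} PB/s")
```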

As for the chip itself, the NVIDIA Rubin CPX GPU will offer 30 PFLOPs of NVFP4 AI compute and pack up to 128 GB of GDDR7 memory. GDDR7 on a data center platform is an interesting choice; NVIDIA says it chose GDDR7 over HBM for Rubin CPX because it is more cost-efficient. The chips also come with 4x the NVENC and NVDEC capabilities, and that expanded video hardware should help considerably in GenAI tasks.
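
As a sanity check, the per-chip figure can be tied back to the platform totals quoted earlier. The sketch below assumes that the entire compute gap between the CPX and non-CPX platforms (8.0 vs 3.6 Exaflops) comes from Rubin CPX chips at 30 PFLOPs each. That is an assumption made here for illustration, not a breakdown NVIDIA has confirmed in this announcement, and the headline figures are rounded.

```python
# Figures quoted in the article.
cpx_platform_ef = 8.0        # Vera Rubin NVL144 CPX, exaflops NVFP4
non_cpx_platform_ef = 3.6    # Vera Rubin NVL144, exaflops NVFP4
pflops_per_cpx_chip = 30     # Rubin CPX, petaflops NVFP4

PFLOPS_PER_EXAFLOP = 1000

# Assumption: the extra compute on the CPX platform comes only from CPX chips.
extra_pflops = (cpx_platform_ef - non_cpx_platform_ef) * PFLOPS_PER_EXAFLOP
implied_cpx_chips = extra_pflops / pflops_per_cpx_chip

print(f"Extra compute from CPX: {extra_pflops:.0f} PFLOPs")
print(f"Implied CPX chips per rack: {implied_cpx_chips:.1f}")
```

Under that assumption the result lands in the mid-140s, close to the 144 in the platform's name, so the quoted per-chip and per-rack numbers are at least consistent with each other.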

Interestingly, while the Rubin platform will feature 2-reticle-sized GPUs and Rubin Ultra will feature 4-reticle-sized GPUs, the CPX chip will use a single die in a monolithic configuration. The process technology remains unknown, but we can expect either TSMC N3 or N2 for Rubin AI chips.

Also, the chip seems to be an early teaser of what consumer Rubin chips for "GeForce" and "PRO" platforms might look like. The CPX could very well be a variation of, or the same as, the chip that will eventually replace the Blackwell GB202. This GR20X chip features the same 192 SMs with a max 512-bit bus configuration (8x64-bit IMCs), with support for up to 4 Gb ICs. More details on this chip will be available soon.

NVIDIA expects the availability of the first Rubin CPX systems by the end of 2026, while Vera Rubin itself is expected to enter production soon, with a proper unveiling planned by GTC 2026.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
