NVIDIA's Rubin CPX chip was surprisingly absent from GTC, and according to a new update, the solution appears to have been delayed and repositioned for the Feynman generation.
NVIDIA's CPX Chip Is Now Slated for Feynman, As Groq Fills the Inference Gap
For those unaware, NVIDIA has been trying to crack the inference segment by releasing dedicated solutions since ASICs gained traction around Q3 of last year, and one of those launches was the Rubin CPX chip. It was one of the first rack-focused solutions to feature GDDR7 memory on board, and the idea was to target prefill workloads in inference. However, at this year's GTC, Rubin CPX wasn't present at all when Jensen showcased the Rubin lineup, suggesting the solution might have been canceled or delayed. But NVIDIA's VP Ian Buck has an update (via ComputerBase).
While discussing NVIDIA's roadmap, Buck revealed that Rubin CPX has been pushed back, but the idea hasn't been dropped. Instead, we can expect a similar solution to debut with Feynman, which is scheduled for a few years from now. It appears that a CPX chip is currently impractical for NVIDIA, given that workload demands have shifted from long-context processing to prioritizing time to first token (TTFT). As a result, the Rubin LPX tray, featuring Groq's LPUs, has gained greater significance, as it focuses on the decode stage of an inference workload.
NVIDIA has been hyper-focused on what the Groq partnership has brought to the table, and based on what we are seeing, the firm loves the idea of achieving impressive inference throughput. Since LPUs feature an SRAM-based design, per-chip bandwidth scales up to 150 TB/s, and the rack as a whole delivers 640 TB/s of scale-up bandwidth, which is why NVIDIA decided to stick with the LPX tray rather than CPX. There were also reports that Team Green was revising the CPX design, looking to replace GDDR7 with HBM, suggesting that the Feynman CPX won't be exactly like what we saw with Rubin.
Jensen calls NVIDIA the "inference king", and the Groq solution is one way the company intends to maintain this lead. As for Rubin CPX being shelved, it does free up GDDR7 capacity that would've gone to an AI chip, so there's some joy for gamers out there, too.
