NVIDIA Blackwell GB202 Gaming GPUs To Utilize TSMC 4NP Node, Significant Improvement To Cache & SM Throughput

Mar 19, 2024 at 04:40am EDT

NVIDIA just announced its Blackwell GPUs for AI, and all eyes are now on its gaming parts, which are rumored to use the same TSMC 4NP node.

NVIDIA Blackwell AI Tensor Core & Gaming GPUs Might Share The Same TSMC 4NP Process Node, Big Cache & Throughput Improvements Expected

It was previously expected that NVIDIA would use TSMC's 3nm process node for its gaming chips, but that plan has seemingly changed: Kopite7kimi now states that both the Blackwell AI Tensor Core and gaming GPUs will be fabricated on a very similar process node. Just a few hours ago, we learned that NVIDIA will be using TSMC's 4NP node, a variation of the 5nm node that was already used for Ada Lovelace and Hopper GPUs.


It is stated that the new process node will allow a 30% increase in transistor density, which can translate into higher performance, but the actual efficiency advantages are yet to be explained. TSMC doesn't explicitly list a 4NP process node anywhere on its webpage. It only mentions N4P, which it describes as an extension of the N5 platform with an 11% performance boost over N5 and a 6% boost over N4.
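The two N4P figures quoted above also imply a rough gap between N4 and N5. The sketch below simply divides the two ratios; it assumes both percentages were measured on the same basis (same design, same power), which the article does not confirm, and the variable names are illustrative only.

```python
# Performance ratios for N4P as quoted in the article.
n4p_over_n5 = 1.11  # N4P is 11% faster than N5
n4p_over_n4 = 1.06  # N4P is 6% faster than N4

# If both figures share a baseline, N4 over N5 is the quotient of the two.
n4_over_n5 = n4p_over_n5 / n4p_over_n4

print(f"Implied N4 gain over N5: {n4_over_n5 - 1:.1%}")
```

Under that assumption the implied N4-over-N5 gain is a little under 5%, which suggests the N5-family variants are incremental steps rather than a full node jump.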

We know that the previously used 4N process node for Ada GPUs was simply N5 (5nm) in disguise with some NVIDIA-exclusive optimizations. NVIDIA also revealed that it has worked with TSMC and Synopsys to leverage its cuLitho technology so that production of these new-gen Blackwell AI Tensor Core and gaming GPUs goes smoothly and the chips can be delivered to customers on time.

https://twitter.com/kopite7kimi/status/1769898435367620933

Other than the process node, NVIDIA is also expected to deliver some big gains on the L1 cache side. It is stated that GB202, the flagship Blackwell gaming GPU, will see significant improvements here versus AD102 and GA102, which should allow an increase in SM throughput. Kopite7kimi also shed some light on the configuration of the Blackwell GB202 gaming GPU earlier.

He stated that the chip is going to offer 12 GPCs, each with 8 TPCs, for a total of 96 TPCs. If we take the Ada structure into account (2 SMs per TPC), we can expect up to 192 SMs, or 24,576 CUDA cores assuming 128 FP32 cores per SM. That's going to be 33% more CUDA cores than the full AD102 GPU, which so far hasn't been released.
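The arithmetic behind these figures can be checked with a short sketch. It assumes, as the paragraph above does, that Blackwell keeps Ada's layout of 2 SMs per TPC and 128 FP32 cores per SM; neither is confirmed for GB202, and the function name is purely illustrative.

```python
def cuda_cores(gpcs: int, tpcs_per_gpc: int,
               sms_per_tpc: int = 2, fp32_per_sm: int = 128) -> tuple[int, int]:
    """Return (total SMs, total CUDA cores) for a given GPU layout."""
    sms = gpcs * tpcs_per_gpc * sms_per_tpc
    return sms, sms * fp32_per_sm

# Rumored GB202 layout vs. the full AD102 layout from the article.
gb202_sms, gb202_cores = cuda_cores(gpcs=12, tpcs_per_gpc=8)
ad102_sms, ad102_cores = cuda_cores(gpcs=12, tpcs_per_gpc=6)

# Relative increase in CUDA cores over the full AD102 die.
uplift = gb202_cores / ad102_cores - 1

print(f"GB202 (rumored): {gb202_sms} SMs, {gb202_cores:,} CUDA cores")
print(f"AD102 (full):    {ad102_sms} SMs, {ad102_cores:,} CUDA cores")
print(f"Uplift: {uplift:.1%}")
```

With these inputs the rumored layout works out to 192 SMs and 24,576 cores, about a third more than the 18,432 cores of the full AD102.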

NVIDIA Blackwell 'GB202' GPU Specs 'Preliminary':

GPU Name | GB202 | AD102
GPC | 12 (Per GPU)? | 12 (Per GPU)
TPC | 8 (Per GPC)? | 6 (Per GPC)
SM | 2 (Per TPC)? | 2 (Per TPC)
Total SMs | 192? | 144
Sub-Core | TBD | 4 (Per SM)
FP32 | 128 (Per SM)? | 128 (Per SM)
FP32+INT32 | TBD | 128 (Per SM)
CUDA Cores | 24,576? | 18,432
Warps | TBD | 64 (Per SM)
Threads | TBD | 2048 (Per SM)
L1 Cache | TBD | 192 KB (Per SM)
L2 Cache | TBD | 96 MB (Per GPU)
ROPs | TBD | 32 (Per GPC)
Memory Standard | GDDR7 | GDDR6X
Max Memory Bus | 512-bit | 384-bit
Max Memory Cap | 32 GB | 24 GB
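
The bus width and capacity rows in the table line up if each memory device has a 32-bit interface and a 16 Gb (2 GB) density. Both values are assumptions for this sketch: they match common GDDR6X parts, but the GDDR7 configuration for GB202 is not confirmed, and the helper name is hypothetical.

```python
def memory_config(bus_width_bits: int,
                  device_width_bits: int = 32,      # assumed per-device interface width
                  device_density_gbit: int = 16     # assumed per-device density
                  ) -> tuple[int, int]:
    """Return (number of memory devices, total capacity in GB)."""
    devices = bus_width_bits // device_width_bits
    capacity_gb = devices * device_density_gbit // 8  # 8 Gb per GB
    return devices, capacity_gb

gb202_devices, gb202_gb = memory_config(512)
ad102_devices, ad102_gb = memory_config(384)

print(f"512-bit bus: {gb202_devices} devices, {gb202_gb} GB")
print(f"384-bit bus: {ad102_devices} devices, {ad102_gb} GB")
```

Under those assumptions a 512-bit bus carries 16 devices for 32 GB, and a 384-bit bus carries 12 devices for 24 GB, matching the capacities listed above.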

He also mentions that the GB203 GPU, the next in the Blackwell gaming GPU lineup, will be half of the GB202, similar to the relationship between the AD102 and AD103 GPUs. This will lead to a huge disparity in performance if NVIDIA equips the next 90-series cards with GB202 and the 80-series cards with GB203. The biggest question is whether NVIDIA will utilize MCM (Multi-Chip-Module) packaging for its Blackwell gaming GPUs or keep them monolithic for now. Given the increasing costs and yield issues associated with GPU development, the chiplet route is likely the way of the future, and AMD's Radeon division has already embraced it.
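
To put a number on that disparity, the sketch below halves the rumored GB202 SM count and compares the result with the Ada pair. The 80-SM figure for a full AD103 is not from this article and is included as an outside reference point; the 128 FP32 cores per SM is the same Ada-based assumption used earlier, and GB203's real configuration is unconfirmed.

```python
FP32_PER_SM = 128  # assumed, carried over from Ada

gb202_sms = 192             # rumored full GB202
gb203_sms = gb202_sms // 2  # "half of GB202" per the rumor

ad102_sms = 144  # full AD102
ad103_sms = 80   # full AD103 (outside figure, not from the article)

gb203_cores = gb203_sms * FP32_PER_SM
blackwell_ratio = gb203_sms / gb202_sms
ada_ratio = ad103_sms / ad102_sms

print(f"GB203 (if half of GB202): {gb203_sms} SMs, {gb203_cores:,} CUDA cores")
print(f"GB203/GB202 SM ratio: {blackwell_ratio:.2f}")
print(f"AD103/AD102 SM ratio: {ada_ratio:.2f}")
```

If the rumor holds, GB203 would land at 96 SMs and 12,288 cores, leaving the second-tier die at half the flagship, slightly below the AD103-to-AD102 ratio.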

NVIDIA's Blackwell Gaming GPUs will launch under the GeForce RTX 50 series family with the support of next-gen technologies such as GDDR7 memory, DisplayPort 2.1, and more. We can expect to hear more about them later this year.

NVIDIA GeForce GPU SKUs:

Generation | Blackwell | Ada Lovelace | Ampere | Turing | Pascal
Process Node | TSMC 4NP? | TSMC 5nm | Samsung 8nm | TSMC 12nm | TSMC 16nm
Launch Year | 2024 | 2022 | 2020 | 2018 | 2016
Ultra-Enthusiast SKU | GB202 | AD102 | GA102 | TU102 | GP102
Enthusiast SKU | GB203 | AD103 | GA102 | TU104 | GP104
High-End SKU | GB205 | AD104 | GA104 | TU106 | GP104
Mainstream SKU | GB206 | AD106 | GA106 | TU106 | GP106
Entry-Level SKU | GB207 | AD107 | GA107 | TU116/117 | GP107

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
