The GA102 is by far NVIDIA's largest Ampere GPU for the gaming and consumer segment. It powers the top-of-the-line GeForce RTX 3090 and GeForce RTX 3080 graphics cards, and more GA102-based products are planned for the months ahead in both the GeForce and Quadro segments. So far, we have seen the physical GPU itself, but aside from the renders provided by NVIDIA, we haven't seen a proper die shot of the chip.
NVIDIA's Flagship Ampere Gaming GPU, The GA102, Gets First Die-Shot - Powers GeForce RTX 3090, GeForce RTX 3080 & Upcoming Quadro Line
The first die shots now come from none other than Fritzchens Fritz, who is well known for taking high-resolution pictures of various CPUs and GPUs. Fritz's most recent photography gave us a better look at the Zen 3 die featured on the Ryzen 5000 CPUs, and today we have the first high-res picture of what lies beneath the hood of the flagship Ampere-based GeForce RTX 30 series GPUs.
To get these high-resolution pictures, a GeForce RTX 3090 graphics card was used. The cooler was removed, and since modern GPUs don't feature an IHS (integrated heat spreader) like older ones, the die is pretty much exposed. With special equipment, Fritz was able to get a detailed picture of the full Ampere GA102 GPU die.
NVIDIA GA102 GPU Specifications Recap
To recap everything we know about the NVIDIA GA102 Ampere GPU: the chip has not yet appeared in a fully enabled SKU on the market. The full GA102 GPU is made up of 7 graphics processing clusters (GPCs) with 12 SM units in each cluster. That makes 84 SM units for a total of 10,752 cores in a 28.3 billion transistor package measuring 628.4 mm². There is also 10 MB of L1 cache, 6 MB of L2 cache, and several ROPs, TMUs, memory controllers, and an NVLink high-speed I/O hub.
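The core count above follows directly from the cluster layout. As a quick sanity check, here is a minimal Python sketch of that arithmetic; the function and constant names are made up for illustration, and the 128 FP32 cores per SM figure is taken from the SM breakdown in this recap.

```python
# Figures quoted in the recap; names are illustrative, not NVIDIA terminology.
GPCS = 7                  # graphics processing clusters on the full die
SMS_PER_GPC = 12          # SM units per cluster
FP32_CORES_PER_SM = 128   # FP32 cores per Ampere SM


def full_ga102_totals(gpcs=GPCS, sms_per_gpc=SMS_PER_GPC,
                      cores_per_sm=FP32_CORES_PER_SM):
    """Return (total SMs, total FP32 cores) for a fully enabled die."""
    total_sms = gpcs * sms_per_gpc
    total_cores = total_sms * cores_per_sm
    return total_sms, total_cores


total_sms, total_cores = full_ga102_totals()
print(f"{total_sms} SMs, {total_cores} FP32 cores")  # 84 SMs, 10752 FP32 cores
```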
The Ampere SM is partitioned into four processing blocks, each with one Tensor Core, one warp scheduler, one dispatch unit, and two datapaths: one offers 16 FP32 execution units, while the other offers either 16 FP32 or 16 INT32 execution units. Each block can therefore run up to 32 FP32 operations per clock, or 16 FP32 alongside 16 INT32. This adds up to 128 FP32 Cores, 64 INT32 Cores, 4 Tensor Cores, 4 Warp Schedulers, and 4 Dispatch Units on a single Ampere SM. Each block also includes a new L0 instruction cache and a 64 KB register file, for a total of 256 KB of register file per SM.
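The per-SM totals are just the per-block figures multiplied by four. The short Python sketch below works through that math; the `ProcessingBlock` class and its field names are hypothetical and exist only to illustrate the numbers quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingBlock:
    """One of the four SM partitions (illustrative model, not an NVIDIA API)."""
    fp32_only_lanes: int = 16          # datapath that handles FP32 only
    shared_fp32_int32_lanes: int = 16  # datapath that handles FP32 or INT32
    tensor_cores: int = 1
    warp_schedulers: int = 1
    dispatch_units: int = 1
    register_file_kb: int = 64

    @property
    def max_fp32(self):
        # Both datapaths can issue FP32 work at the same time.
        return self.fp32_only_lanes + self.shared_fp32_int32_lanes

    @property
    def max_int32(self):
        # Only the shared datapath can issue INT32 work.
        return self.shared_fp32_int32_lanes


def sm_totals(block=ProcessingBlock(), blocks_per_sm=4):
    """Scale the per-block resources up to a whole SM."""
    return {
        "fp32_cores": block.max_fp32 * blocks_per_sm,
        "int32_cores": block.max_int32 * blocks_per_sm,
        "tensor_cores": block.tensor_cores * blocks_per_sm,
        "warp_schedulers": block.warp_schedulers * blocks_per_sm,
        "dispatch_units": block.dispatch_units * blocks_per_sm,
        "register_file_kb": block.register_file_kb * blocks_per_sm,
    }


print(sm_totals())
```

Note that the FP32 and INT32 peaks cannot be reached at the same time, since the INT32 units share a datapath with half of the FP32 units.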
More insight into the specific parts of the GPU die comes from Locuza, who regularly shares his analysis on his Twitter feed and has annotated various GPU and CPU die shots in the past. This time he has mapped out the Ampere GA102 GPU, labeling each block and area of the chip. His work can be seen in the picture below:
Once again, this is impressive material from both Fritz and Locuza, and we can't wait for them to do the same with AMD's Big Navi (Navi 21) GPU, which is featured on the Radeon RX 6900 and RX 6800 series graphics cards.