
NVIDIA Ampere GA100 GPU Rumored Specifications Detailed – 8192 CUDA Cores, Up To 48 GB HBM2e Memory, Up To 2.2 GHz Clocks & 300W TDP

Hassan Mujtaba • Mar 6, 2020 at 02:32pm EST


As NVIDIA's GTC 2020 closes in, new specifications of the Ampere GA100 GPU have leaked, once again suggesting that the next-gen GPU architecture from the green team is going to be an absolute beast of a compute powerhouse.

NVIDIA Ampere GPU Specifications Rumored To Include 8192 CUDA Cores, Up To 48 GB HBM2e Memory & Core Clocks Beyond 2 GHz On The Flagship GA100 Chip

The latest specifications come from the Stage1 Chinese forums, where a user who is known to have posted leaks before has listed key details for the flagship Ampere GPU, the GA100. NVIDIA's Ampere GPU family has been rumored for a while now, but NVIDIA has yet to reveal it to the public. Several GPUs of the Ampere family, including the GA100 itself, have appeared in various leaks, but there hasn't been any conclusive evidence that Ampere is the name of the GPU family which NVIDIA is going to introduce next for the HPC / Data Center segment.

According to the forum member, the flagship Ampere GPU would be the GA100 and as expected, the full configuration would feature 128 streaming multi-processor units or 8192 CUDA cores. It is not known which process node NVIDIA is using but 7nm has been highlighted in previous reports.

Utilizing the new process and GPU architecture, the chip is rumored to feature a maximum boost clock of up to 2.2 GHz on the GPU core. This is a huge bump in clock speed which, if true, is about 35% faster than the GV100 GPU featured on the Quadro GV100 graphics card. The Quadro GV100 features the fastest clock for the GV100 GPU at 1627 MHz and delivers 16.6 TFLOPs of FP32 compute performance.

Based on the number of cores and the boost clock of the GA100 GPU, we are looking at a massive 36 TFLOPs of FP32 compute performance. That's more than a 2x increase in FP32 compute over the Quadro GV100, and if these numbers are legit and the chip keeps Volta's 1:2 FP64-to-FP32 rate, we would be looking at 18 TFLOPs of FP64 compute horsepower, which is far ahead of any FP64 numbers that modern GPUs can crunch out.
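The figures above follow from simple peak-throughput arithmetic. The snippet below is a minimal sketch of that calculation, assuming each CUDA core retires one fused multiply-add (2 FLOPs) per clock and that FP64 runs at half the FP32 rate as on Volta; the function and variable names are illustrative only, and the GA100 inputs are the rumored values, not confirmed specifications.

```python
def peak_tflops(cuda_cores: int, clock_ghz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak throughput in TFLOPs (assumes one FMA = 2 FLOPs per core per clock)."""
    return cuda_cores * flops_per_clock * clock_ghz / 1000.0


# Rumored GA100: 8192 CUDA cores at up to 2.2 GHz.
ga100_fp32 = peak_tflops(8192, 2.2)
# Quadro GV100: 5120 CUDA cores at 1627 MHz.
gv100_fp32 = peak_tflops(5120, 1.627)

# Assumption: FP64 at half the FP32 rate, as on Volta GV100.
ga100_fp64 = ga100_fp32 / 2

clock_uplift_pct = (2.2 / 1.627 - 1) * 100
fp32_ratio = ga100_fp32 / gv100_fp32

print(f"GA100 FP32: {ga100_fp32:.1f} TFLOPs, FP64: {ga100_fp64:.1f} TFLOPs")
print(f"GV100 FP32: {gv100_fp32:.1f} TFLOPs")
print(f"Clock uplift: {clock_uplift_pct:.1f}%, FP32 ratio: {fp32_ratio:.2f}x")
```

Under these assumptions the rumored chip lands at roughly 36 TFLOPs FP32 and 18 TFLOPs FP64, in line with the numbers quoted in the rumor.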

The GPU is stated to feature a 300W TDP and HBM2e memory, and would come in two flavors, a 24 GB and a 48 GB model. These memory configurations could be for the top variant only, as we have also seen other variants with 32 GB of HBM2e memory. NVIDIA is also rumored to double its tensor cores on the new Ampere GPUs. The current 5120 CUDA core Volta GV100 GPU features 640 Tensor cores, so at the same ratio an Ampere GPU with 8192 CUDA cores would feature 1024 cores for tensor operations. But since the rumor states that NVIDIA is likely to increase the tensor core count by 2x, we would be looking at 2048 tensor cores for an 8192 CUDA core chip. The specs for the other variants, which leaked last week, are listed below:
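The tensor core estimate is a straight scaling from Volta's ratio. A minimal sketch of that reasoning, assuming the GV100 ratio of CUDA cores to tensor cores carries over and that the rumored 2x applies on top of it (the helper name is hypothetical):

```python
GV100_CUDA_CORES = 5120
GV100_TENSOR_CORES = 640


def estimated_tensor_cores(cuda_cores: int, multiplier: int = 1) -> int:
    """Scale tensor cores from the GV100 ratio, optionally applying a rumored multiplier."""
    cuda_per_tensor = GV100_CUDA_CORES // GV100_TENSOR_CORES  # 8 CUDA cores per tensor core
    return cuda_cores // cuda_per_tensor * multiplier


same_ratio = estimated_tensor_cores(8192)        # Volta ratio carried over
doubled = estimated_tensor_cores(8192, 2)        # with the rumored 2x increase

print(same_ratio, doubled)
```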

NVIDIA's Next-Gen GPU #1 Specifications & Performance

This first GPU features a total SM count of 124 which equals 7936 CUDA cores since NVIDIA's professional GPU architecture comes with a 64 CUDA Core design per streaming multiprocessor. This is also a 55% increase in CUDA cores over the Tesla V100's 5120 Cores. The GPU has a maximum clock speed of 1.1 GHz and at this unfinalized clock, it should deliver around 17.5 - 18 TFLOPs of FP32 horsepower.
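The core counts and percentage increases quoted for these parts can be reproduced from the SM counts alone. The following is a rough sketch, assuming 64 CUDA cores per SM as stated above and 2 FLOPs per core per clock; the names are illustrative and the clock is the preliminary value from the leak.

```python
CORES_PER_SM = 64          # professional GPU design, per the article
TESLA_V100_CORES = 5120


def cuda_cores(sm_count: int) -> int:
    """CUDA cores for a given SM count at 64 cores per SM."""
    return sm_count * CORES_PER_SM


def increase_over_v100_pct(cores: int) -> float:
    """Percentage increase in CUDA cores over the Tesla V100."""
    return (cores / TESLA_V100_CORES - 1) * 100


def fp32_tflops(cores: int, clock_ghz: float) -> float:
    """Peak FP32 TFLOPs, assuming one FMA (2 FLOPs) per core per clock."""
    return cores * 2 * clock_ghz / 1000.0


cores_124 = cuda_cores(124)
print(cores_124, f"{increase_over_v100_pct(cores_124):.1f}%",
      f"{fp32_tflops(cores_124, 1.1):.2f} TFLOPs")
```

The same helpers can be applied to the 118 SM and 108 SM parts described in the following sections.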

It carries 32 GB of HBM2e memory clocked at 1200 MHz and running across a 4096-bit bus interface. The reason I mention HBM2e is that it is the latest standard, and NVIDIA has been known to utilize the most advanced memory standards available on its HPC parts at the time of their launch.
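For a sense of what those memory figures imply, peak bandwidth can be estimated from the bus width and memory clock. This is a back-of-the-envelope sketch that assumes double-data-rate signaling (two transfers per clock), as HBM2 uses; the function name is illustrative.

```python
def peak_bandwidth_gbs(bus_width_bits: int, memory_clock_mhz: float,
                       transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/s (assumes double data rate by default)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * memory_clock_mhz * transfers_per_clock / 1000.0


next_gen_bw = peak_bandwidth_gbs(4096, 1200)   # rumored part: 4096-bit at 1200 MHz
tesla_v100_bw = peak_bandwidth_gbs(4096, 878)  # Tesla V100: 4096-bit at 878 MHz

print(f"{next_gen_bw:.1f} GB/s vs {tesla_v100_bw:.1f} GB/s")
```

Under that assumption, a 4096-bit bus at 1200 MHz works out to roughly 1.2 TB/s, compared with about 900 GB/s for the Tesla V100.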

In addition to the core and memory specifications, the GPU packs a 32 MB L2 cache which is a 5.33x increase over the Volta GV100 GPU which packs an L2 cache of just 6 MB in comparison. Given the massive amount of cache, we can expect some huge performance uplifts and a huge architectural change on NVIDIA's next-generation GPU which has been years in development.

As far as performance is concerned, the GPU scores 222377 points in the OpenCL benchmark (CUDA) on Geekbench 5. The platform is running CUDA 8.0 and it is highly likely that the GPU was not fully optimized for it at the time of testing. With that said, the specifications of this card look extremely impressive, so let's get on with the other two variants.

NVIDIA's Next-Gen GPU #2 Specifications & Performance

The second GPU features a total of 118 SMs or 7552 CUDA cores, along with a total of 24 MB of L2 cache. This is a 47.5% increase in CUDA cores over the Tesla V100 with its 5120 CUDA cores packed in 80 SMs. This GPU is also clocked at a maximum speed of 1.10 GHz and features 24 GB of HBM2e memory running along a 3072-bit bus at a 1200 MHz clock speed. At these speeds, this chip should deliver a total theoretical compute horsepower of around 16.7 TFLOPs, but once again, the clock speeds definitely don't look final and could end up higher.

This particular GPU was tested in both OpenCL and CUDA Compute benchmarks. In the OpenCL benchmark, the chip scored 184096 points, while in the CUDA benchmark it scored 169368 points. Both the 124 and 118 SM parts were running on CUDA 8.0, which once again shows that these GPUs aren't yet fully optimized for the Geekbench 5 benchmark. There's a huge difference in score between the two parts despite just a 5% difference in core count.
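To put that gap in numbers, the sketch below compares the core-count difference with the score difference between the 124 SM and 118 SM parts, using only the Geekbench 5 results quoted in this article. It is a rough illustration: the scores come from pre-release listings and the helper name is hypothetical.

```python
def pct_gap(larger: float, smaller: float) -> float:
    """Percentage by which `larger` exceeds `smaller`."""
    return (larger / smaller - 1) * 100


# Figures quoted in the article.
cores_124_sm, cores_118_sm = 7936, 7552
score_124_sm = 222377            # 124 SM part
score_118_sm_opencl = 184096     # 118 SM part, OpenCL
score_118_sm_cuda = 169368       # 118 SM part, CUDA

core_gap = pct_gap(cores_124_sm, cores_118_sm)
gap_vs_opencl = pct_gap(score_124_sm, score_118_sm_opencl)
gap_vs_cuda = pct_gap(score_124_sm, score_118_sm_cuda)

print(f"Core count gap: {core_gap:.1f}%")
print(f"Score gap vs OpenCL result: {gap_vs_opencl:.1f}%")
print(f"Score gap vs CUDA result: {gap_vs_cuda:.1f}%")
```

A core-count gap of about 5% against a score gap in the 20-30% range suggests the results reflect unfinished drivers or clocks rather than the hardware difference alone.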

NVIDIA's Next-Gen GPU #3 Specifications & Performance

Lastly, we have the 108 SM or 6912 CUDA core variant, which has a reported clock speed of 1.01 GHz, the slowest of all three GPUs. The GPU offers a 35% increase in CUDA core count over the Tesla V100 and apparently packs 46.8 GB of HBM2e memory. This could be an error in how the Geekbench benchmark reads the total memory, and it could actually be 48 GB, which makes more sense. This GPU scores 141654 points in the Geekbench 5 (CUDA) benchmark which, once again, is not the final score due to the lower clock speeds.

NVIDIA Tesla Graphics Cards Comparison

Tesla Graphics Card Name	NVIDIA Tesla M2090	NVIDIA Tesla K40	NVIDIA Tesla K80	NVIDIA Tesla P100	NVIDIA Tesla V100	NVIDIA Tesla Next-Gen #3	NVIDIA Tesla Next-Gen #2	NVIDIA Tesla Next-Gen #1
GPU Architecture	Fermi	Kepler	Kepler	Pascal	Volta	Ampere?	Ampere?	Ampere?
GPU Process	40nm	28nm	28nm	16nm	12nm	7nm?	7nm?	7nm?
GPU Name	GF110	GK110	GK210 x 2	GP100	GV100	GA100?	GA100?	GA100?
Die Size	520mm2	561mm2	561mm2	610mm2	815mm2	TBD	TBD	TBD
Transistor Count	3.00 Billion	7.08 Billion	7.08 Billion	15 Billion	21.1 Billion	TBD	TBD	TBD
CUDA Cores	512 CCs (16 CUs)	2880 CCs (15 CUs)	2496 CCs (13 CUs) x 2	3840 CCs	5120 CCs	6912 CCs	7552 CCs	7936 CCs
Core Clock	Up To 650 MHz	Up To 875 MHz	Up To 875 MHz	Up To 1480 MHz	Up To 1455 MHz	1.08 GHz (Preliminary)	1.11 GHz (Preliminary)	1.11 GHz (Preliminary)
FP32 Compute	1.33 TFLOPs	4.29 TFLOPs	8.74 TFLOPs	10.6 TFLOPs	15.0 TFLOPs	~15 TFLOPs (Preliminary)	~17 TFLOPs (Preliminary)	~18 TFLOPs (Preliminary)
FP64 Compute	0.66 TFLOPs	1.43 TFLOPs	2.91 TFLOPs	5.30 TFLOPs	7.50 TFLOPs	TBD	TBD	TBD
VRAM Size	6 GB	12 GB	12 GB x 2	16 GB	16 GB	48 GB	24 GB	32 GB
VRAM Type	GDDR5	GDDR5	GDDR5	HBM2	HBM2	HBM2e	HBM2e	HBM2e
VRAM Bus	384-bit	384-bit	384-bit x 2	4096-bit	4096-bit	4096-bit?	3072-bit?	4096-bit?
VRAM Speed	3.7 GHz	6 GHz	5 GHz	737 MHz	878 MHz	1200 MHz	1200 MHz	1200 MHz
Memory Bandwidth	177.6 GB/s	288 GB/s	240 GB/s	720 GB/s	900 GB/s	1.2 TB/s?	1.2 TB/s?	1.2 TB/s?
Maximum TDP	250W	300W	235W	300W	300W	TBD	TBD	TBD

Yesterday, AMD announced that it will be splitting its GPUs into separate Gaming and Compute segments, similar to what NVIDIA has been doing since its Pascal architecture. The new CDNA GPU family is expected to launch this year and will be based on the 7nm process node, going up against NVIDIA's HPC lineup. According to the Vice President of Information Technology and Chief Information Officer at Indiana University, which will be deploying its Big Red supercomputer this summer, NVIDIA's next-generation GPUs offer a massive 75% performance uplift over existing Volta-based GPUs. We have also heard similar reports in the past of the GPUs offering up to a 50% performance increase with twice the efficiency, which would be an incredible feat to pull off.

Given that NVIDIA would be on process parity with AMD with its next-generation GPU, and with a brand new architecture too, we could see some truly disruptive performance. These are definitely some big specifications and numbers reported in the rumor for NVIDIA's next-generation GPUs, and while we would advise our readers to take them with a grain of salt, we can expect a full-blown 'official' announcement of the next-gen GPUs by NVIDIA at its GTC 2020 online keynote on the 22nd of March.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
