NVIDIA’s Next-Gen GPU Specifications And Performance Leaks Out – Massive Die With 7936 CUDA Cores (8192 Full Die), Up To 48 GB HBM2e Memory

Hassan Mujtaba • Mar 4, 2020 at 12:49am EST

NVIDIA's next-gen GPU is set to be unveiled very soon, and while GTC has been moved to an online-only event, that isn't holding the green giant back from announcing its biggest GPU to date. We got to see leaked specifications of two unreleased GPUs a few days ago, but a new SKU has now been spotted by Twitter user W_At_Ar_U, and the latest chip is a beast of its own with a total core count of almost 8K cores.

NVIDIA's Next-Gen GPUs Performance & Specifications Leaked - The Ultimate HPC Powerhouse With Up To 8K Cores & 48 GB HBM2e Memory

NVIDIA's next-generation GPU architecture, reportedly codenamed Ampere, has been known about for a while. It is expected to power the company's latest Tesla GPUs, which will be used by the top HPC and cloud datacenter organizations.

The Vice President of Information Technology and Chief Information Officer at Indiana University, which will be deploying its Big Red supercomputer this summer, has revealed that NVIDIA's next-generation GPUs offer a massive 75% performance uplift over existing Volta-based GPUs. Earlier reports pointed to a performance increase of up to 50% with twice the efficiency, which would be an incredible feat to pull off.

Coming to the specifications of the latest GPU spotted in Geekbench, I will also compare it to the previously leaked parts to see what kind of performance uplift we should expect from each variant. Do note that these GPUs were tested all the way back in October and November of 2019, so they have been sitting in the Geekbench database for a few months now, and the specifications have very likely changed since then as these are still early samples. The other thing to note is the low clock speeds, which also point to early designs.

NVIDIA's Next-Gen GPU #1 Specifications & Performance

The first GPU to talk about is the one that was just recently spotted. This GPU features a total SM count of 124 which equals 7936 CUDA cores since NVIDIA's professional GPU architecture comes with a 64 CUDA Core design per streaming multiprocessor. This is also a 55% increase in CUDA cores over the Tesla V100's 5120 Cores. The GPU has a maximum clock speed of 1.1 GHz and at this unfinalized clock, it should deliver around 17.5 - 18 TFLOPs of FP32 horsepower.
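The arithmetic behind these figures can be reproduced with a short sketch. It assumes each CUDA core retires one fused multiply-add (counted as two FP32 operations) per clock, which is the usual convention for peak-throughput figures; the function names are illustrative and not taken from any NVIDIA tool.

```python
CORES_PER_SM = 64        # per the article: 64 CUDA cores per SM on the HPC architecture
FP32_OPS_PER_CLOCK = 2   # assumption: one fused multiply-add counts as 2 operations


def cuda_cores(sm_count: int) -> int:
    """Total CUDA cores for a given number of streaming multiprocessors."""
    return sm_count * CORES_PER_SM


def fp32_tflops(cores: int, clock_ghz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPs."""
    return cores * FP32_OPS_PER_CLOCK * clock_ghz / 1000.0


def percent_increase(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


cores = cuda_cores(124)
print(cores)                                      # 7936
print(round(fp32_tflops(cores, 1.1), 2))          # 17.46
print(round(percent_increase(cores, 5120), 1))    # 55.0
```

At 1.1 GHz the estimate lands at the low end of the quoted 17.5 - 18 TFLOPs range. The same helpers give 7552 cores for 118 SMs and 6912 cores for 108 SMs.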

It carries 32 GB of HBM2e memory clocking in at 1200 MHz and runs across a 4096-bit bus interface. The reason I mention HBM2e is that it is the latest standard and NVIDIA has been known to utilize the most advanced memory standards on its HPC parts at the time of its launch.

In addition to the core and memory specifications, the GPU packs a 32 MB L2 cache which is a 5.33x increase over the Volta GV100 GPU which packs an L2 cache of just 6 MB in comparison. Given the massive amount of cache, we can expect some huge performance uplifts and a huge architectural change on NVIDIA's next-generation GPU which has been years in development.

As far as the performance is concerned, the GPU scores 222377 points in the OpenCL benchmark (CUDA) on Geekbench 5. The platform is running CUDA 8.0 and it is highly likely that the GPU was not fully optimized for it at the time of testing. With that said, the specifications of this card are looking literally insane so let's get on with the other two variants.

NVIDIA's Next-Gen GPU #2 Specifications & Performance

The second GPU features a total of 118 SMs or 7552 CUDA cores, along with a total of 24 MB of L2 cache. This is a 47.5% increase in CUDA cores over the Tesla V100 with its 5120 CUDA cores packed in 80 SMs. This GPU is also clocked at a maximum speed of 1.10 GHz and features 24 GB of HBM2e memory running along a 3072-bit bus at a 1200 MHz clock speed. At these speeds, this chip should deliver a total theoretical compute horsepower of around 16.7 TFLOPs but once again, the clock speeds definitely don't look final and they could end up higher.

For some context :

GV100 : 142837 (Open CL)
Tesla V100 : 154606 (Open CL)
Titan RTX : 132804 (Open CL)

— _rogame 🇵🇸 (@_rogame) February 28, 2020

This particular GPU was tested in both the OpenCL and CUDA compute benchmarks. In the OpenCL benchmark, the chip scored 184096 points while in the CUDA benchmark, it scored 169368 points. Both the 124 and 118 SM parts were running on CUDA 8.0, which once again shows that these GPUs aren't yet fully optimized for the Geekbench 5 benchmark. There's a huge difference in score between the two parts despite just a 5% difference in core count.
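To put that gap in numbers, here is a rough sketch using only the scores and core counts quoted in this article and the tweet above. It assumes the 222377 score of the 124 SM part is directly comparable to the 184096 OpenCL score of the 118 SM part and to the Tesla V100's OpenCL score, which the leak does not confirm; the dictionary and function names are illustrative.

```python
# Figures quoted in the article and the embedded tweet.
cores = {"124 SM": 7936, "118 SM": 7552, "Tesla V100": 5120}
scores = {"124 SM": 222377, "118 SM": 184096, "Tesla V100": 154606}


def ratio(table: dict, a: str, b: str) -> float:
    """How many times larger entry `a` is than entry `b`."""
    return table[a] / table[b]


core_gap = (ratio(cores, "124 SM", "118 SM") - 1) * 100
score_gap = (ratio(scores, "124 SM", "118 SM") - 1) * 100
print(f"core count gap: {core_gap:.1f}%")   # about 5.1%
print(f"score gap:      {score_gap:.1f}%")  # about 20.8%

# Score per CUDA core, normalised to the Tesla V100 (1.0 = same as V100).
v100_per_core = scores["Tesla V100"] / cores["Tesla V100"]
relative_per_core = {
    name: (scores[name] / cores[name]) / v100_per_core
    for name in ("124 SM", "118 SM")
}
for name, value in relative_per_core.items():
    print(name, round(value, 2))
```

Both early samples come out below the Tesla V100 on a per-core basis, which fits the low preliminary clocks, so the scores say more about the state of the samples than about the final products.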

NVIDIA's Next-Gen GPU #3 Specifications & Performance

Lastly, we have the 108 SM or 6912 CUDA core variant, which has a reported clock speed of 1.01 GHz, the slowest of all three GPUs. The GPU offers a 35% increase in CUDA core count over the Tesla V100 and apparently packs 46.8 GB of HBM2e memory. This could be an error in how Geekbench reads the total memory, and it could actually be 48 GB, which makes more sense. This GPU scores 141654 points in the Geekbench 5 (CUDA) benchmark which, once again, is not the final score due to the lower clock speeds.

NVIDIA Tesla Graphics Cards Comparison

Tesla Graphics Card Name	NVIDIA Tesla M2090	NVIDIA Tesla K40	NVIDIA Tesla K80	NVIDIA Tesla P100	NVIDIA Tesla V100	NVIDIA Tesla Next-Gen #3	NVIDIA Tesla Next-Gen #2	NVIDIA Tesla Next-Gen #1
GPU Architecture	Fermi	Kepler	Kepler	Pascal	Volta	Ampere?	Ampere?	Ampere?
GPU Process	40nm	28nm	28nm	16nm	12nm	7nm?	7nm?	7nm?
GPU Name	GF110	GK110	GK210 x 2	GP100	GV100	GA100?	GA100?	GA100?
Die Size	520 mm²	561 mm²	561 mm²	610 mm²	815 mm²	TBD	TBD	TBD
Transistor Count	3.00 Billion	7.08 Billion	7.08 Billion	15 Billion	21.1 Billion	TBD	TBD	TBD
CUDA Cores	512 CCs (16 CUs)	2880 CCs (15 CUs)	2496 CCs (13 CUs) x 2	3840 CCs	5120 CCs	6912 CCs	7552 CCs	7936 CCs
Core Clock	Up To 650 MHz	Up To 875 MHz	Up To 875 MHz	Up To 1480 MHz	Up To 1455 MHz	1.08 GHz (Preliminary)	1.11 GHz (Preliminary)	1.11 GHz (Preliminary)
FP32 Compute	1.33 TFLOPs	4.29 TFLOPs	8.74 TFLOPs	10.6 TFLOPs	15.0 TFLOPs	~15 TFLOPs (Preliminary)	~17 TFLOPs (Preliminary)	~18 TFLOPs (Preliminary)
FP64 Compute	0.66 TFLOPs	1.43 TFLOPs	2.91 TFLOPs	5.30 TFLOPs	7.50 TFLOPs	TBD	TBD	TBD
VRAM Size	6 GB	12 GB	12 GB x 2	16 GB	16 GB	48 GB	24 GB	32 GB
VRAM Type	GDDR5	GDDR5	GDDR5	HBM2	HBM2	HBM2e	HBM2e	HBM2e
VRAM Bus	384-bit	384-bit	384-bit x 2	4096-bit	4096-bit	4096-bit?	3072-bit?	4096-bit?
VRAM Speed	3.7 GHz	6 GHz	5 GHz	737 MHz	878 MHz	1200 MHz	1200 MHz	1200 MHz
Memory Bandwidth	177.6 GB/s	288 GB/s	240 GB/s	720 GB/s	900 GB/s	1.2 TB/s?	~0.9 TB/s?	1.2 TB/s?
Maximum TDP	250W	300W	235W	300W	300W	TBD	TBD	TBD
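The memory bandwidth row follows from the bus width and memory clock. A minimal sketch, assuming HBM2 and HBM2e transfer data twice per clock (double data rate); the function name is illustrative:

```python
def hbm_bandwidth_gbs(bus_width_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth in GB/s, assuming two transfers per clock (double data rate)."""
    transfers_per_second = clock_mhz * 1e6 * 2
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


print(round(hbm_bandwidth_gbs(4096, 878), 1))    # Tesla V100: 899.1
print(round(hbm_bandwidth_gbs(4096, 1200), 1))   # 4096-bit next-gen: 1228.8
print(round(hbm_bandwidth_gbs(3072, 1200), 1))   # 3072-bit next-gen: 921.6
```

The Tesla V100 result matches its listed 900 GB/s, and a 4096-bit bus at 1200 MHz works out to roughly 1.2 TB/s. A 3072-bit bus at the same clock would come in lower, at around 0.92 TB/s.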

It is interesting, however, that the lowest-end GPU features the most memory, which may mean one of two things: either NVIDIA will offer lower-end GPUs with higher memory capacities for specific workloads, or each GPU will come in different memory configurations and 48 GB of HBM2e is simply the highest configuration for this particular SKU. The other interesting takeaway from this leak is that while the next-gen Tesla lineup will have various GPU SKUs, the full GPU should peak at 8192 CUDA cores packed in 128 SMs.

Just like the Volta GV100 GPU, the full fat (next-gen) GPU may never be available to the public since the Tesla V100 peaked at 5120 CUDA cores (80 SMs) despite the full chip containing 5376 CCs or 84 SMs. In a previous interview, NVIDIA's CEO, Jensen Huang, had confirmed that the majority of the orders for their next-generation 7nm GPU will be handled by TSMC while a small portion will be sent to Samsung for production.
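For a sense of how much of the die each leaked part would have enabled, here is a quick sketch. The 128 SM full-die figure is this article's expectation rather than a confirmed specification, and the names are illustrative.

```python
FULL_DIE_SMS = 128    # assumption: the article's expectation for the full next-gen die
GV100_FULL_SMS = 84   # full Volta GV100 die, per the article


def enabled_percent(active_sms: int, full_sms: int) -> float:
    """Share of the full die's SMs that are enabled, in percent."""
    return active_sms / full_sms * 100.0


print(f"Tesla V100: {enabled_percent(80, GV100_FULL_SMS):.0f}% of the full GV100")
for sms in (124, 118, 108):
    print(f"{sms} SMs: {enabled_percent(sms, FULL_DIE_SMS):.0f}% of the full die")
```

Under that assumption the three leaked parts would have roughly 97%, 92% and 84% of the SMs enabled, compared with about 95% for the Tesla V100.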

Finally, Jensen was asked about the launch timeframe of their next-generation 7nm GPU, but he simply replied that it wasn't a convenient time for them to disclose any date at the moment. We know from a recent interview with NVIDIA's CFO, Colette Kress, that they want to surprise everyone with their own 7nm GPU announcement, but they are waiting for the right time to do so.

AMD, on the other hand, is also expected to soon announce its Radeon Instinct MI100 HPC accelerator based on the Arcturus GPU, which is also reportedly packing 8192 SPs and is based on the latest 7nm GPU architecture. However, NVIDIA has proved in the past that it can optimize its architecture to the point where it is super-efficient and competitive against competing GPUs that are based on more advanced nodes (16nm vs 12nm & 12nm vs 7nm).

Given that NVIDIA would be on process parity with AMD with its next-generation GPU, and with a brand new architecture too, we could see some truly disruptive performance. These are some huge specifications for NVIDIA's next-generation GPUs and we can expect a full-blown announcement by NVIDIA at its GTC 2020 online keynote on the 22nd of March.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
