
Nvidia Pascal GTX 1080 Has 8GB GDDR5X & 320GB/s Of Bandwidth, GTX 1070 Has 8GB GDDR5 & 256GB/s – GP104 GPU Supports GDDR5/X

Apr 11, 2016

According to the latest whispers, Nvidia has allegedly designed two reference PCBs, one for GDDR5X and one for GDDR5, for its GP104 based GTX 1080 and GTX 1070 graphics cards. The rumor claims that Nvidia has decided to create a “premium” GP104 board based on the GP104-400 GPU, which is going to power the flagship Pascal GeForce GTX graphics card this year. Otherwise known as the GTX 1080 in the web’s echo chambers, this “premium” board will allegedly feature GDDR5X rather than GDDR5.


Nvidia’s more mainstream GP104 based graphics card, the purported GTX 1070, will meanwhile be based on a cut-down version of the same GP104 chip, code-named GP104-200, and feature 8Gbps GDDR5 chips instead. This rumor comes straight from the Chiphell forums via bitsandchips.it, the same sources that brought us the leaked GP104 die shots a few days ago. So while there may be veracity to these claims, we’d still advise our readers to take this with the usual grain of salt.

Nvidia GeForce GTX 1080 And GeForce GTX 1070 To Feature Different PCBs Due to Different GDDR5X & GDDR5 Pin Layout

According to the same source, two different PCB designs are necessary because GDDR5X and GDDR5 chips have different pin layouts. So whilst the GP104 GPU is claimed to be compatible with both memory technologies, the different pin layout doesn’t allow GDDR5X to be a simple drop-in replacement.

NVIDIA Pascal GP104 GPU

The leaked GP104 die shot revealed that the pictured graphics board features 8Gbps Samsung GDDR5 memory chips. Unfortunately the info inscribed on the die has been omitted, otherwise we would’ve been able to determine whether this is GP104-400 or GP104-200 and validate the rumored claims of GP104-400 using GDDR5X. Assuming the whispers are true, this die shot should be of GP104-200 and this should be a GTX 1070 board rather than a GTX 1080.

The first wave of GDDR5X memory chips, which Micron started sampling last month and will be mass producing in the summer, is rated at 10Gbps, 11Gbps and 12Gbps. This means that the fastest GDDR5X configuration will yield up to 50% more bandwidth than the 8Gbps GDDR5 memory chips pictured above.

The GP104 GPU is configured with a 256-bit memory interface, so with 10Gbps GDDR5X chips the GTX 1080 will have access to 320GB/s of memory bandwidth (256 bits × 10Gbps ÷ 8 bits per byte). That’s roughly 43% more than the GTX 980 and just 5% less than the GTX 980 Ti.
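The bandwidth figures quoted above follow from simple arithmetic: bus width times per-pin data rate, divided by eight bits per byte. Here is a minimal Python sketch of that calculation. The function name `memory_bandwidth_gbs` and the dictionary layout are our own illustration, and the GTX 1080 and GTX 1070 inputs are the rumored specs, not confirmed ones.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s for a given bus width and per-pin data rate."""
    # Each pin moves data_rate_gbps gigabits per second; divide by 8 for bytes.
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbps), as quoted in the article.
# The GTX 1080 and GTX 1070 entries are rumored figures.
cards = {
    "GTX 980 Ti": (384, 7),
    "GTX 980": (256, 7),
    "GTX 1080 (rumored)": (256, 10),
    "GTX 1070 (rumored)": (256, 8),
}

bandwidth = {name: memory_bandwidth_gbs(*spec) for name, spec in cards.items()}
for name, gbs in bandwidth.items():
    print(f"{name}: {gbs:.0f} GB/s")

gtx_1080 = bandwidth["GTX 1080 (rumored)"]
gain_vs_980 = gtx_1080 / bandwidth["GTX 980"] - 1
deficit_vs_980_ti = 1 - gtx_1080 / bandwidth["GTX 980 Ti"]
print(f"vs GTX 980: +{gain_vs_980:.0%}, vs GTX 980 Ti: -{deficit_vs_980_ti:.0%}")

# Micron's three GDDR5X speed grades against 8Gbps GDDR5 on the same 256-bit bus.
baseline = memory_bandwidth_gbs(256, 8)
gddr5x_gain = {
    rate: memory_bandwidth_gbs(256, rate) / baseline - 1 for rate in (10, 11, 12)
}
for rate, gain in gddr5x_gain.items():
    print(f"{rate}Gbps GDDR5X: +{gain:.1%} over 8Gbps GDDR5")
```

On the same 256-bit bus, the gain over 8Gbps GDDR5 depends only on the ratio of data rates, which is why the 12Gbps grade works out to the 50% uplift mentioned earlier.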

Nvidia Pascal Specs

| WCCF | GTX 980 Ti | GTX 980 | GTX 1080 | GTX 1070 | TESLA P100 (GP100) |
|---|---|---|---|---|---|
| GPU | GM200 | GM204 | GP104 | GP104 | GP100 |
| Process Node | 28nm | 28nm | 16nm FinFET | 16nm FinFET | 16nm FinFET |
| Transistors | 8 Billion | 5.2 Billion | TBA | TBA | 15.3 Billion |
| CUDA Cores | 2816 | 2048 | 2560? | 2048? | 3840 |
| VRAM | 6 GB GDDR5 | 4 GB GDDR5 | 8 GB GDDR5X | 8 GB GDDR5 | 16 GB HBM2 |
| Memory Bus | 384-bit | 256-bit | 256-bit | 256-bit | 4096-bit |
| Memory Speed | 7Gbps | 7Gbps | 10Gbps | 8Gbps | 1.4Gbps |
| Bandwidth | 336GB/s | 224GB/s | 320GB/s | 256GB/s | 720GB/s |
| TDP | 250W | 165W | TBA | TBA | 300W |
| Launch Date | May 2015 | September 2014 | June 2016 | June 2016 | Q1 2017 |

Micron announced late last month that it’s already shipping 10Gbps, 11Gbps and 12Gbps samples to its customers. This means that both Nvidia and AMD already have access to GDDR5X chips for testing and should be ready to roll out graphics cards featuring the new memory technology as production ramps up this summer. It also suggests that the decision to use both GDDR5X and GDDR5, as opposed to just GDDR5X, was driven mainly by Nvidia’s desire to reduce cost.

So far all rumors and leaks point towards an announcement at Computex in late May and a June launch of Nvidia’s next generation Pascal GP104 based GTX 1080 and GTX 1070 graphics cards. Whether Nvidia will actually name its GTX 980 and GTX 970 replacements the GTX 1080 and GTX 1070 is subject to speculation at this point, but I fully expect Nvidia to roll out a new naming scheme for its new products this year.

Nvidia’s Pascal Architecture – Fewer, Faster CUDA Cores With Significantly Higher Per Thread Throughput

We dove deep into Nvidia’s Pascal architecture last week after the company’s GTC 2016 reveal of the Tesla P100 and the flagship Pascal GP100 GPU that will be launching in 2017. We discussed all the architectural updates that Nvidia has made to Pascal, and I’d highly recommend that you check that piece out if you’re interested in finding out how much faster Pascal is going to be.

NVIDIA Pascal SMP

A few very significant changes from Maxwell to Pascal stick out. Each Pascal CUDA core has been beefed up considerably compared to Maxwell, and boost clock speeds have gone up by roughly 33%, from 1114 MHz on the Tesla M40 to 1480 MHz on the Tesla P100. So core for core, Pascal will be much faster than Maxwell.

| Tesla Products | Tesla K40 | Tesla M40 | Tesla P100 |
|---|---|---|---|
| GPU | GK110 (Kepler) | GM200 (Maxwell) | GP100 (Pascal) |
| SMs | 15 | 24 | 56 |
| TPCs | 15 | 24 | 28 |
| FP32 CUDA Cores / SM | 192 | 128 | 64 |
| FP32 CUDA Cores / GPU | 2880 | 3072 | 3584 |
| FP64 CUDA Cores / SM | 64 | 4 | 32 |
| FP64 CUDA Cores / GPU | 960 | 96 | 1792 |
| Base Clock | 745 MHz | 948 MHz | 1328 MHz |
| GPU Boost Clock | 810/875 MHz | 1114 MHz | 1480 MHz |
| Compute Performance - FP32 | 5.04 TFLOPS | 6.82 TFLOPS | 10.6 TFLOPS |
| Compute Performance - FP64 | 1.68 TFLOPS | 0.21 TFLOPS | 5.3 TFLOPS |
| Texture Units | 240 | 192 | 224 |
| Memory Interface | 384-bit GDDR5 | 384-bit GDDR5 | 4096-bit HBM2 |
| Memory Size | Up to 12 GB | Up to 24 GB | 16 GB |
| L2 Cache Size | 1536 KB | 3072 KB | 4096 KB |
| Register File Size / SM | 256 KB | 256 KB | 256 KB |
| Register File Size / GPU | 3840 KB | 6144 KB | 14336 KB |
| TDP | 235 Watts | 250 Watts | 300 Watts |
| Transistors | 7.1 billion | 8 billion | 15.3 billion |
| GPU Die Size | 551 mm² | 601 mm² | 610 mm² |
| Manufacturing Process | 28-nm | 28-nm | 16-nm |
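The compute figures in the Tesla table above can be reproduced with a rough back-of-the-envelope sketch. It assumes the usual convention of two floating point operations per CUDA core per clock (one fused multiply-add) and that the quoted TFLOPS are taken at the highest boost clock. Both are our assumptions rather than something the table states, and the helper name `peak_tflops` is purely illustrative.

```python
def peak_tflops(cores: int, clock_mhz: float, flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak throughput in TFLOPS.

    Assumes each core retires one fused multiply-add (2 FLOPs) per clock.
    """
    return cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12


# Core counts and boost clocks copied from the Tesla table above.
# For the K40 we assume the higher of its two boost clocks (875 MHz).
tesla = {
    "Tesla K40": {"fp32_cores": 2880, "fp64_cores": 960, "boost_mhz": 875},
    "Tesla M40": {"fp32_cores": 3072, "fp64_cores": 96, "boost_mhz": 1114},
    "Tesla P100": {"fp32_cores": 3584, "fp64_cores": 1792, "boost_mhz": 1480},
}

results = {}
for name, spec in tesla.items():
    results[name] = {
        "fp32": peak_tflops(spec["fp32_cores"], spec["boost_mhz"]),
        "fp64": peak_tflops(spec["fp64_cores"], spec["boost_mhz"]),
    }
    print(f"{name}: FP32 {results[name]['fp32']:.2f} TFLOPS, "
          f"FP64 {results[name]['fp64']:.2f} TFLOPS")

# Boost clock uplift from Maxwell (M40) to Pascal (P100).
clock_uplift = tesla["Tesla P100"]["boost_mhz"] / tesla["Tesla M40"]["boost_mhz"] - 1
print(f"Boost clock uplift, M40 to P100: {clock_uplift:.0%}")
```

Under these assumptions the results land within a few hundredths of a TFLOPS of the table's numbers, and the clock uplift comes out at about 33%. Note that this only captures theoretical peak throughput; it says nothing about the per-core architectural improvements.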

This should come as a relief to those who have been wondering if Nvidia’s GP104 based graphics cards, the GTX 1080 and GTX 1070, will offer a reasonable speed-up over the GTX 980 Ti and GTX 980. The leaked die shot of GP104 revealed that the chip in question measures only roughly 300mm², half the size of the 3840 CUDA core GP100 GPU.

The GP100 GPU features 6 GPCs (Graphics Processing Clusters). Each contains 10 Pascal SMs (Streaming Multiprocessors), and each SM contains 64 Pascal CUDA cores, which means that each GPC houses 640 Pascal CUDA cores. Since the GP104 GPU is almost exactly half the size of GP100, if Nvidia maintains the same 10 SMs per GPC design, the GTX 1080 should feature 3 GPCs and 1920 CUDA cores. That could rise to 2048 CUDA cores if Nvidia decides to tweak the design and opt for a 4 GPC layout with 8 SMs per GPC.
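The core-count estimates above are just GPCs times SMs per GPC times cores per SM. A small Python sketch of that arithmetic follows. The function name `cuda_cores` is hypothetical, and the two GP104 layouts are the speculative configurations discussed here, not confirmed specs.

```python
PASCAL_CORES_PER_SM = 64  # per the GP100 figures quoted in the article


def cuda_cores(gpcs: int, sms_per_gpc: int, cores_per_sm: int = PASCAL_CORES_PER_SM) -> int:
    """Total CUDA cores for a chip built from identical GPCs."""
    return gpcs * sms_per_gpc * cores_per_sm


gp100_full = cuda_cores(gpcs=6, sms_per_gpc=10)    # full GP100 die
gp104_3x10 = cuda_cores(gpcs=3, sms_per_gpc=10)    # speculative: half of GP100
gp104_4x8 = cuda_cores(gpcs=4, sms_per_gpc=8)      # speculative: tweaked layout

print(f"GP100 (6 GPCs x 10 SMs): {gp100_full} CUDA cores")
print(f"GP104 (3 GPCs x 10 SMs): {gp104_3x10} CUDA cores")
print(f"GP104 (4 GPCs x 8 SMs):  {gp104_4x8} CUDA cores")
```

The 3 GPC layout is exactly half of the full GP100, which lines up with the roughly halved die size, while the 4 GPC variant trades SMs per cluster for slightly more cores overall.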

| GPU | Kepler GK110 | Maxwell GM200 | Pascal GP100 | Volta GV100 |
|---|---|---|---|---|
| Compute Capability | 3.5 | 5.3 | 6.0 | 7.0 |
| Threads / Warp | 32 | 32 | 32 | 32 |
| Max Warps / Multiprocessor | 64 | 64 | 64 | 64 |
| Max Threads / Multiprocessor | 2048 | 2048 | 2048 | 2048 |
| Max Thread Blocks / Multiprocessor | 16 | 32 | 32 | 32 |
| Max 32-bit Registers / SM | 65536 | 65536 | 65536 | 65536 |
| Max Registers / Block | 65536 | 32768 | 65536 | 65536 |
| Max Registers / Thread | 255 | 255 | 255 | 255 |
| Max Thread Block Size | 1024 | 1024 | 1024 | 1024 |
| CUDA Cores / SM | 192 | 128 | 64 | 64 |
| Shared Memory Size / SM Configurations (bytes) | 16K/32K/48K | 96K | 64K | 96K |

In either case, with Pascal’s significant architectural improvements and very high frequency increase in mind, a 1920-2048 CUDA core GTX 1080 graphics card should end up being faster than a GTX 980 Ti. The performance delta should be reminiscent of what the GTX 980 brought to the table when it launched against the GTX 780 Ti: a 15-20% performance increase at a lower price point. That being said, the GTX 1070 should be the star of the lineup, offering performance comparable to the GTX 980 Ti at a much more affordable price point, again similar to what the 970 delivered compared to the 780 Ti.

AMD’s Polaris GCN 4.0 architecture is purported to deliver similar gains over the company’s current GCN iteration, and we’ll be detailing those in an upcoming architectural deep dive piece as well. Polaris 10 and Polaris 11 based cards, the R9 490, 480 and 470 series, are also pegged for a June launch against Pascal, so we can’t wait to see both Pascal and Polaris in action this summer.

 
