According to the latest whispers, Nvidia has allegedly designed two reference PCBs, one for GDDR5X and one for GDDR5, for its GP104 GPU based GTX 1080 and GTX 1070 graphics cards. The rumor claims that Nvidia has decided to create a "premium" GP104 board based on the GP104-400 GPU, which is going to power the flagship Pascal GeForce GTX graphics card this year. This "premium" board, for the card known in the web's echo chambers as the GTX 1080, will allegedly feature GDDR5X rather than GDDR5.
Nvidia's more mainstream GP104 based graphics card, the purported GTX 1070, will meanwhile be based on a cut-down version of the same GP104 chip, code named GP104-200, and feature 8Gbps GDDR5 chips instead. This rumor comes straight from the Chiphell forums via bitsandchips.it, which also brought us the leaked GP104 die shots a few days ago. So while there may be veracity to these claims, we'd still advise our readers to take this with the usual grain of salt.
Nvidia GeForce GTX 1080 And GeForce GTX 1070 To Feature Different PCBs Due to Different GDDR5X & GDDR5 Pin Layout
According to the same source, two different PCB designs are necessary due to the different pin layouts of GDDR5X and GDDR5 chips. So whilst the GP104 GPU is claimed to be compatible with both memory technologies, the different pin layout doesn't allow GDDR5X to be a simple drop-in replacement.
The leaked GP104 die shot revealed that the pictured graphics board features 8Gbps Samsung GDDR5 memory chips. Unfortunately the info inscribed on the die has been omitted; otherwise we would've been able to determine whether this is GP104-400 or GP104-200 and validate the rumored claims of GP104-400 using GDDR5X. Assuming the whispers are true, this die shot should be of GP104-200 and this should be a GTX 1070 board rather than a GTX 1080.
The first wave of GDDR5X memory chips, which Micron started sampling last month and will be mass producing in the summer, is rated at 10Gbps, 11Gbps and 12Gbps. This means that the fastest GDDR5X configuration will yield up to 50% more bandwidth than the 8Gbps GDDR5 memory chips pictured above.
The GP104 GPU is configured with a 256-bit memory interface, so with 10Gbps GDDR5X chips the GTX 1080 will have access to 320GB/s of memory bandwidth. That's about 43% more than the GTX 980 and just 5% less than the GTX 980 Ti.
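The bandwidth figures above follow from simple arithmetic: bus width in bytes multiplied by the per-pin data rate. Here is a minimal Python sketch of that calculation. The 256-bit bus and 10Gbps rate for the GTX 1080 are the rumored values from this article; the GTX 980 (256-bit, 7Gbps) and GTX 980 Ti (384-bit, 7Gbps) configurations are assumptions taken from those cards' public specs, and the function names are purely illustrative.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


def percent_change(new: float, old: float) -> float:
    """Relative difference of `new` versus `old`, in percent."""
    return (new - old) / old * 100


# Rumored GTX 1080 configuration from the article: 256-bit bus, 10Gbps GDDR5X.
gtx_1080 = bandwidth_gb_s(256, 10)

# Assumed reference points (public specs, not stated in the article).
gtx_980 = bandwidth_gb_s(256, 7)
gtx_980_ti = bandwidth_gb_s(384, 7)

print(f"GTX 1080 (rumored): {gtx_1080:.0f} GB/s")
print(f"vs GTX 980:    {percent_change(gtx_1080, gtx_980):+.1f}%")
print(f"vs GTX 980 Ti: {percent_change(gtx_1080, gtx_980_ti):+.1f}%")

# On the same bus width, bandwidth scales with the data rate alone,
# so each GDDR5X speed grade can be compared directly against 8Gbps GDDR5.
for rate in (10, 11, 12):
    print(f"{rate}Gbps GDDR5X vs 8Gbps GDDR5: {percent_change(rate, 8):+.1f}%")
```

The last loop shows where the "up to 50%" figure comes from: it only applies to the 12Gbps speed grade, while 10Gbps chips would offer a smaller uplift over 8Gbps GDDR5.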
Nvidia Pascal Specs
|Graphics Card||GTX 980 Ti||GTX 980||GTX 1080||GTX 1070||Tesla P100 (GP100)|
|Process Node||28nm||28nm||16nm FinFET||16nm FinFET||16nm FinFET|
|Transistors||8 Billion||5.2 Billion||TBA||TBA||15.3 Billion|
|CUDA Cores||2816 CUDA Cores||2048 CUDA Cores||2560 CUDA Cores?||2048 CUDA Cores?||3840 CUDA Cores (3584 enabled on Tesla P100)|
|VRAM||6 GB GDDR5||4 GB GDDR5||8 GB GDDR5X||8 GB GDDR5||16GB HBM2|
|Launch Date||May 2015||September 2014||June 2016||June 2016||Q1 2017|
Micron announced late last month that it's already shipping 10Gbps, 11Gbps and 12Gbps samples to its customers. This means that Nvidia, as well as AMD, already has access to GDDR5X chips to test and will be ready to roll out graphics cards featuring the new memory technology as production ramps up this summer. It also suggests that the decision to use both GDDR5X and GDDR5, as opposed to just GDDR5X, was driven mainly by Nvidia's desire to reduce cost.
So far all rumors and leaks point towards a late May announcement at Computex and a June launch of Nvidia's next generation Pascal GP104 based GTX 1080 and GTX 1070 graphics cards. Whether Nvidia will actually name its next generation GTX 980 and GTX 970 replacements GTX 1080 and GTX 1070 is subject to speculation at this point. But I fully expect Nvidia to roll out a new naming scheme for its new products this year.
Nvidia's Pascal Architecture - Fewer, Faster CUDA Cores With Significantly Higher Per Thread Throughput
We dove deep into Nvidia's Pascal architecture last week after the company's GTC 2016 reveal of the Tesla P100 and the flagship Pascal GP100 GPU that will be launching in 2017. We discussed all the architectural updates that Nvidia has made to Pascal, and I'd highly recommend that you check that piece out if you're interested in finding out how much faster Pascal is going to be.
A few very significant changes from Maxwell to Pascal stick out. Each Pascal CUDA core has been beefed up considerably compared to Maxwell, and boost clock speeds have gone up by roughly 33%. So core for core, Pascal will be much faster than Maxwell.
|Tesla Products||Tesla K40||Tesla M40||Tesla P100|
|GPU||GK110 (Kepler)||GM200 (Maxwell)||GP100 (Pascal)|
|FP32 CUDA Cores / SM||192||128||64|
|FP32 CUDA Cores / GPU||2880||3072||3584|
|FP64 CUDA Cores / SM||64||4||32|
|FP64 CUDA Cores / GPU||960||96||1792|
|Base Clock||745 MHz||948 MHz||1328 MHz|
|GPU Boost Clock||810/875 MHz||1114 MHz||1480 MHz|
|Compute Performance - FP32||5.04 TFLOPS||6.82 TFLOPS||10.6 TFLOPS|
|Compute Performance - FP64||1.68 TFLOPS||0.21 TFLOPS||5.3 TFLOPS|
|Memory Interface||384-bit GDDR5||384-bit GDDR5||4096-bit HBM2|
|Memory Size||Up to 12 GB||Up to 24 GB||16 GB|
|L2 Cache Size||1536 KB||3072 KB||4096 KB|
|Register File Size / SM||256 KB||256 KB||256 KB|
|Register File Size / GPU||3840 KB||6144 KB||14336 KB|
|TDP||235 Watts||250 Watts||300 Watts|
|Transistors||7.1 billion||8 billion||15.3 billion|
|GPU Die Size||551 mm²||601 mm²||610 mm²|
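The compute figures in the table above can be roughly reproduced from the core counts and boost clocks. The following Python sketch assumes the usual peak-throughput convention of one fused multiply-add (2 floating point operations) per core per clock, and uses the higher of the two listed boost clocks for the Tesla K40; both are assumptions on our part rather than something the table states, and the helper name is illustrative.

```python
def peak_tflops(cores: int, boost_mhz: float) -> float:
    """Theoretical peak in TFLOPS, assuming 2 FLOPs (one FMA) per core per clock."""
    return cores * 2 * boost_mhz * 1e6 / 1e12


# Core counts and boost clocks copied from the table above.
cards = {
    "Tesla K40": {"fp32": 2880, "fp64": 960, "boost_mhz": 875},
    "Tesla M40": {"fp32": 3072, "fp64": 96, "boost_mhz": 1114},
    "Tesla P100": {"fp32": 3584, "fp64": 1792, "boost_mhz": 1480},
}

for name, spec in cards.items():
    fp32 = peak_tflops(spec["fp32"], spec["boost_mhz"])
    fp64 = peak_tflops(spec["fp64"], spec["boost_mhz"])
    print(f"{name}: FP32 {fp32:.2f} TFLOPS, FP64 {fp64:.2f} TFLOPS")

# Boost clock increase from Maxwell (Tesla M40) to Pascal (Tesla P100).
clock_gain = (1480 - 1114) / 1114 * 100
print(f"Boost clock gain, M40 to P100: {clock_gain:.1f}%")
```

The results land close to the table's values; small differences, such as for the Tesla M40, likely come down to rounding or to which clock Nvidia used for its official figure.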
This should come as a relief to those who have been wondering if Nvidia's GP104 based GTX 1080 and GTX 1070 graphics cards will offer a reasonable speed-up over the GTX 980 Ti and GTX 980. The leaked die shot of GP104 revealed that the chip in question is only roughly 300mm², about half the size of the 3840 CUDA core GP100 GPU.
The GP100 GPU features 6 GPCs (Graphics Processing Clusters). Each contains 10 Pascal SMs (Streaming Multiprocessors), and each SM contains 64 Pascal CUDA cores, which means that each GPC houses 640 Pascal CUDA cores. Since the GP104 GPU is almost exactly half the size of GP100, if Nvidia maintains the same 10 SM per GPC design, the GTX 1080 should feature 3 GPCs and 1920 CUDA cores. That could rise to 2048 CUDA cores if Nvidia decides to tweak the design and opt for a 4 GPC layout with 8 SMs per GPC.
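The core counts above are just the product of GPCs, SMs per GPC and CUDA cores per SM. A minimal Python sketch of the two speculated GP104 layouts follows; the function name is illustrative, and the GP104 configurations are this article's speculation rather than confirmed specs.

```python
CORES_PER_SM = 64  # Pascal CUDA cores per SM, as stated for GP100


def cuda_cores(gpcs: int, sms_per_gpc: int, cores_per_sm: int = CORES_PER_SM) -> int:
    """Total CUDA cores for a given GPC / SM layout."""
    return gpcs * sms_per_gpc * cores_per_sm


gp100_full = cuda_cores(gpcs=6, sms_per_gpc=10)  # 3840, the full GP100 die
gp104_3gpc = cuda_cores(gpcs=3, sms_per_gpc=10)  # 1920, speculated half-size layout
gp104_4gpc = cuda_cores(gpcs=4, sms_per_gpc=8)   # 2048, speculated tweaked layout

print(f"GP100 (6 GPCs x 10 SMs): {gp100_full} CUDA cores")
print(f"GP104 (3 GPCs x 10 SMs): {gp104_3gpc} CUDA cores")
print(f"GP104 (4 GPCs x 8 SMs):  {gp104_4gpc} CUDA cores")
```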
|GPU||Kepler GK110||Maxwell GM200||Pascal GP100||Volta GV100|
|Threads / Warp||32||32||32||32|
|Max Warps / Multiprocessor||64||64||64||64|
|Max Threads / Multiprocessor||2048||2048||2048||2048|
|Max Thread Blocks / Multiprocessor||16||32||32||32|
|Max 32-bit Registers / SM||65536||65536||65536||65536|
|Max Registers / Block||65536||32768||65536||65536|
|Max Registers / Thread||255||255||255||255|
|Max Thread Block Size||1024||1024||1024||1024|
|CUDA Cores / SM||192||128||64||64|
|Shared Memory Size / SM Configurations (bytes)||16K/32K/48K||96K||64K||96K|
In either case, with Pascal's significant architectural improvements and very high frequency increase in mind, a 1920-2048 CUDA core GTX 1080 graphics card should end up being faster than a GTX 980 Ti. The performance delta should be reminiscent of what the GTX 980 brought to the table when it launched compared to the GTX 780 Ti: a 15-20% performance increase at a lower price point. That being said, the GTX 1070 should be the star of the lineup, offering GTX 980 Ti comparable performance at a much more affordable price point, again similar to what the 970 delivered compared to the 780 Ti.
AMD's Polaris GCN 4.0 architecture is purported to deliver similar gains over the company's current GCN iteration, and we'll be detailing those in an upcoming architectural deep dive piece as well. Polaris 10 and Polaris 11 based cards, the R9 490, 480 and 470 series, are also pegged for a June launch against Pascal. So we can't wait to see both Pascal and Polaris in action this summer.