NVIDIA Pascal GPU To Feature 17 Billion Transistors and 32 GB HBM2 VRAM – Full CUDA Compute Architecture Arrives in 2016
NVIDIA will introduce its next-generation Pascal GPU in 2016, bringing several key new technologies to the green team. Pascal will succeed the current-generation Maxwell GPU and, from the looks of it, it is going to be a beast of a chip. Featuring the latest HBM2 memory and a 16nm FinFET based design, Pascal GPUs will build on NVIDIA’s dominance in both the consumer and corporate worlds.
NVIDIA Pascal GPU Might Feature 17 Billion Transistors, Almost Twice The Transistors of Fiji
In an exclusive report, Fudzilla reveals that NVIDIA’s next-generation Pascal GPU will feature 17 billion transistors crammed inside its core. Currently, the flagship GM200 core found on the GeForce GTX Titan X comes with 8.0 billion transistors, while the competing Radeon R9 Fury X has a total of 8.9 billion transistors inside its Fiji GPU. The 17 billion transistors on the Pascal GPU are roughly twice the count found on either the GM200 Maxwell or the Fiji XT GPU core, which is insane. Pascal is meant to be NVIDIA’s next high-performance, compute-focused graphics architecture and will be found in all market segments, including GeForce, Quadro and even Tesla. Based on TSMC’s 16nm process node, NVIDIA’s Pascal GPU is expected to deliver not only the best graphics performance but also the most power-efficient architecture ever made by a GPU manufacturer.
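To put the "twice the transistors" claim in numbers, here is a minimal sketch that divides the rumored Pascal count by the two figures quoted above. The dictionary keys and the function name are illustrative only, and the Pascal figure is the unconfirmed one from the report.

```python
# Transistor counts in billions, as quoted in the article.
# The Pascal figure is a rumor, not a confirmed specification.
TRANSISTORS_BN = {
    "Pascal (rumored)": 17.0,
    "GM200 (Titan X)": 8.0,
    "Fiji (Fury X)": 8.9,
}


def ratio_to_pascal(name: str) -> float:
    """Return how many times more transistors the rumored Pascal chip has."""
    return TRANSISTORS_BN["Pascal (rumored)"] / TRANSISTORS_BN[name]


for chip in ("GM200 (Titan X)", "Fiji (Fury X)"):
    print(f"Pascal vs {chip}: {ratio_to_pascal(chip):.2f}x")
```

The ratios come out slightly above 2x for GM200 and slightly below 2x for Fiji, which is why the headline says "almost twice" for the latter.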
It was revealed a few days ago that NVIDIA’s Pascal GP100 chip was taped out last month on TSMC’s 16nm FinFET process. This means that we could see a launch of these chips as early as Q2 2016. Provided the transistor count is correct, we can expect an incremental performance increase from Pascal across the range of graphics cards that will be introduced.
TSMC’s 16FF+ (FinFET Plus) technology can provide above 65 percent higher speed, around 2 times the density, or 70 percent less power than its 28HPM technology. Comparing with 20SoC technology, 16FF+ provides extra 40% higher speed and 60% power saving. By leveraging the experience of 20SoC technology, TSMC 16FF+ shares the same metal backend process in order to quickly improve yield and demonstrate process maturity for time-to-market value.
The 17 billion transistors are an insane amount, but what’s more insane is the amount of VRAM that is going to be featured on the new cards. With HBM2, NVIDIA can fit far more memory than what is currently available on HBM1 cards (4 GB HBM on Fury X, Fury, Nano, Fury X2). HBM2 gives NVIDIA access to denser chips, which will result in cards with 16 GB and up to 32 GB of HBM memory across a massive 4096-bit memory interface, enough to drive the next high-resolution 4K and 8K gaming panels. NVIDIA may have to wait a little longer, though, thanks to AMD’s priority access to HBM2 from SK Hynix, the makers of HBM. With 8 Gb per DRAM die and 2 Gbps per pin, we get approximately 256 GB/s of bandwidth per HBM2 stack. With four stacks in total, we will get 1 TB/s of bandwidth on NVIDIA’s GP100 flagship Pascal, which is twice the 512 GB/s on AMD’s Fiji cards and roughly three times the 980 Ti’s 336 GB/s.
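The bandwidth and capacity figures follow from simple arithmetic, sketched below. Several inputs are assumptions not stated in the article: a 1024-bit interface per HBM stack (four of which give the 4096-bit bus), 1 Gbps per pin for HBM1, stacks of four or eight dies, and a 384-bit, 7 Gbps GDDR5 configuration for the 980 Ti. The function names are illustrative.

```python
BITS_PER_BYTE = 8


def bandwidth_gb_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) * per-pin rate (Gbps) / 8."""
    return bus_width_bits * gbps_per_pin / BITS_PER_BYTE


def capacity_gb(die_gbit: int, dies_per_stack: int, stacks: int) -> float:
    """Total capacity in GB from die density (Gbit), stack height and stack count."""
    return die_gbit * dies_per_stack * stacks / BITS_PER_BYTE


STACK_WIDTH_BITS = 1024  # assumption: per-stack interface width
STACKS = 4               # four stacks -> 4096-bit bus, as cited in the article

hbm2_stack = bandwidth_gb_s(STACK_WIDTH_BITS, 2.0)           # one HBM2 stack at 2 Gbps/pin
hbm2_total = hbm2_stack * STACKS                             # four HBM2 stacks
hbm1_total = bandwidth_gb_s(STACK_WIDTH_BITS, 1.0) * STACKS  # assumption: HBM1 at 1 Gbps/pin
gddr5_980ti = bandwidth_gb_s(384, 7.0)                       # assumption: 384-bit, 7 Gbps GDDR5

print(f"HBM2 per stack : {hbm2_stack:.0f} GB/s")
print(f"HBM2 x4 stacks : {hbm2_total:.0f} GB/s")
print(f"HBM1 x4 stacks : {hbm1_total:.0f} GB/s")
print(f"980 Ti GDDR5   : {gddr5_980ti:.0f} GB/s")

# Assumption: stacks of 4 or 8 dies, each die 8 Gbit as stated in the article.
for dies in (4, 8):
    print(f"{dies}-high stacks : {capacity_gb(8, dies, STACKS):.0f} GB")
```

Under these assumptions, four-die stacks account for the 16 GB configuration and eight-die stacks for the 32 GB one.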
The Pascal GPU will also introduce NVLink, the next-generation Unified Virtual Memory link with Gen 2.0 cache coherency features and 5 to 12 times the bandwidth of a regular PCIe connection. This will solve many of the bandwidth issues that high-performance GPUs currently face. One of the latest things we learned about NVLink is that it will allow several GPUs to be connected in parallel, whether in SLI for gaming or for professional usage. NVIDIA CEO Jen-Hsun Huang specifically mentioned that instead of 4 cards, users will be able to use 8 GPUs in their PCs for gaming and professional purposes.
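For a rough sense of what "5 to 12 times PCIe" means in absolute terms, the sketch below assumes the baseline is a PCIe 3.0 x16 slot (8 GT/s per lane with 128b/130b encoding, counted in one direction). The article does not say which PCIe configuration the multiplier refers to, so the baseline and the resulting range are illustrative only.

```python
def pcie3_gb_s(lanes: int = 16) -> float:
    """Peak one-direction PCIe 3.0 bandwidth in GB/s.

    8 GT/s per lane, 128b/130b line encoding, 8 bits per byte.
    """
    return 8.0 * lanes * (128 / 130) / 8


# Assumption: "regular PCIe connection" means PCIe 3.0 x16.
baseline = pcie3_gb_s()
low, high = 5 * baseline, 12 * baseline

print(f"PCIe 3.0 x16 baseline: {baseline:.2f} GB/s")
print(f"5x to 12x range      : {low:.0f} to {high:.0f} GB/s")
```

With that baseline, the quoted multiplier lands somewhere between just under 80 GB/s and just under 190 GB/s.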
With the Pascal GPU, NVIDIA will return to the HPC market with new Tesla products. Maxwell, although great in all other regards, lacked the necessary FP64 hardware and focused only on FP32 performance. This meant that the chip stayed away from HPC markets while NVIDIA offered its years-old Kepler-based cards as the only Tesla options. Pascal will not only improve FP64 performance but also feature mixed precision, which allows NVIDIA cards to compute in 16-bit at double the rate of FP32. This means that the cards will enable three tiers of compute: FP16, FP32 and FP64. NVIDIA’s far-future Volta GPU will further leverage the compute architecture, as it is already planned to be part of the Summit and Sierra supercomputers, which are expected to feature over 150 petaflops of compute performance and launch in 2017. That points to Volta arriving just a year after Pascal for the HPC market.
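To make the three compute tiers concrete, the sketch below stores the value 0.1 in each IEEE 754 format using Python's standard `struct` module and reports the resulting rounding error. It only illustrates the number formats themselves and says nothing about how Pascal hardware implements them; the function name is hypothetical.

```python
import struct
from fractions import Fraction

# struct format codes for IEEE 754 half, single and double precision.
FORMATS = {"FP16": "<e", "FP32": "<f", "FP64": "<d"}


def rounding_error(decimal_text: str, tier: str) -> float:
    """Absolute error when the decimal value is stored in the given format."""
    exact = Fraction(decimal_text)  # exact rational value, e.g. 1/10
    fmt = FORMATS[tier]
    # Pack into the target format, then unpack back to a Python float.
    stored = struct.unpack(fmt, struct.pack(fmt, float(exact)))[0]
    return float(abs(Fraction(stored) - exact))


for tier in ("FP16", "FP32", "FP64"):
    print(f"{tier}: error storing 0.1 = {rounding_error('0.1', tier):.3e}")
```

The error shrinks by several orders of magnitude at each step up, which is the trade-off behind offering separate tiers: narrower formats are cheaper to move and compute, wider ones hold more digits.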