At GTC 2014, NVIDIA announced Pascal, its next-generation high-performance GPU. Pascal is scheduled to launch in 2016 and will pair a new graphics core with technologies aimed at the bandwidth bottlenecks that graphics processing units face today.
NVIDIA Next Generation Pascal GPU To Feature 3D Stacked Memory and NVLINK
NVIDIA's Pascal GPU will replace Maxwell in 2016. It will feature NVIDIA's latest core architecture along with 3D stacked memory, which places stacks of DRAM inside the same package as the GPU and enables bandwidth of up to 1 TB/s. This 3D chip-on-wafer integration will not only provide much more bandwidth but will also deliver up to 4 times the energy efficiency and 2.5 times the VRAM capacity, which should help performance on higher-resolution screens.
Pascal will also introduce NVLink, a next-generation interconnect that supports Unified Virtual Memory with Gen 2.0 cache coherency features and offers 5 to 12 times the bandwidth of a regular PCIe connection. This should relieve many of the bandwidth issues that high-performance GPUs currently face.
First technology we’ll announce today is an important invention called NVLink. It’s a chip-to-chip communication channel. The programming model is PCI Express but enables unified memory and moves 5-12 times faster than PCIe. “This is a big leap in solving this bottleneck,” Jen-Hsun says. NVIDIA
Many readers will be wondering what happened to Volta, which was previously said to be NVIDIA's GPU architecture after Maxwell. Volta's slot on the roadmap has been taken by Pascal, named after Blaise Pascal, who developed the mechanical calculator, probability theory, Pascal's theorem and Pascal's law. However, this does not mean that Volta has been removed from the roadmap: the architecture is still planned for launch, but after Pascal.
NVIDIA wants to showcase its 3D stacked memory and NVLink design early with a GPU architecture that is feasible for the 2016 timeframe, and Pascal seems to be the right choice. NVIDIA didn't reveal any details about its upcoming Maxwell GPUs; the architecture is still scheduled to launch in 2014. One piece of disappointing news is that Maxwell will not get Unified Virtual Memory support and will instead get the base Unified Memory support that comes with CUDA 6.
Although NVIDIA hasn't talked about Maxwell yet, the architectural roadmap confirms that Maxwell will be the first GPU to adopt the unannounced features of DirectX 12, which were hinted at just a few days ago. The Pascal GPU is innovative and could change the way GPUs are developed and used. The Pascal module shown during the event was an engineering unit, and it remains to be seen what the final design of retail Pascal graphics cards will look like. The 1 TB/s bandwidth is a massive increase over the 336 GB/s of NVIDIA's current flagship, the GeForce GTX Titan Black. From these numbers, we can guess that Maxwell will push bandwidth to at least 500 GB/s while delivering extreme power efficiency.
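To put the memory bandwidth figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the two numbers quoted above (1 TB/s and 336 GB/s) and assumes 1 TB = 1000 GB. The 6 GB working set and the function names are illustrative assumptions, and real-world throughput will be lower than these peak figures.

```python
# Peak memory bandwidth figures quoted in the article, in GB/s.
PASCAL_GBPS = 1000.0        # 1 TB/s, assuming 1 TB = 1000 GB
TITAN_BLACK_GBPS = 336.0    # GeForce GTX Titan Black

def speedup(new_gbps, old_gbps):
    """How many times more bandwidth the new part offers over the old one."""
    return new_gbps / old_gbps

def seconds_to_read(size_gb, bandwidth_gbps):
    """Ideal time to stream size_gb of data once at peak bandwidth."""
    return size_gb / bandwidth_gbps

ratio = speedup(PASCAL_GBPS, TITAN_BLACK_GBPS)
print(f"Pascal vs Titan Black: {ratio:.2f}x")   # 2.98x

# Hypothetical 6 GB working set, chosen only for illustration.
WORKING_SET_GB = 6.0
for name, gbps in [("Titan Black", TITAN_BLACK_GBPS), ("Pascal", PASCAL_GBPS)]:
    ms = seconds_to_read(WORKING_SET_GB, gbps) * 1000
    print(f"{name}: {ms:.2f} ms to read {WORKING_SET_GB:.0f} GB once")
```

In other words, the quoted Pascal figure is roughly three times the Titan Black's peak memory bandwidth.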
The GPU has a lot of pins, it’s the biggest chip in the world, interface is extremely wide. Can we go wider? It would make package enormous. Can we go faster? Uses too much energy. So, our next-enabling technology is 3D packaging. We’re going to build chips on other chips. It starts with a base wafer where interconnects are done on the wafer – thousands of bumps on these chips are flipped and bumped onto base wafer. Memory interfaces went from hundreds to thousands of bits. We stacked all the memory chips on top of each other and punch holes through them. The stacked DRAM is stacked on a wafer that sits on a substrate, which has wires that connect to the GPU and together form an interface that delivers an unbelievable amount of bandwidth, which will grow 5X over the next two years. And it will operate at 4X the energy efficiency. NVIDIA
- 3D Memory: Stacks DRAM chips into dense modules with wide interfaces, and brings them inside the same package as the GPU. This lets GPUs get data from memory more quickly – boosting throughput and efficiency – allowing us to build more compact GPUs that put more power into smaller devices. The result: several times greater bandwidth, more than twice the memory capacity and quadrupled energy efficiency.
- Unified Memory: This will make building applications that take advantage of what both GPUs and CPUs can do quicker and easier by allowing the CPU to access the GPU’s memory, and the GPU to access the CPU’s memory, so developers don’t have to allocate resources between the two.
- NVLink: Today’s computers are constrained by the speed at which data can move between the CPU and GPU. NVLink puts a fatter pipe between the CPU and GPU, allowing data to flow at more than 80GB per second, compared to the 16GB per second available now.
- Pascal Module: NVIDIA has designed a module to house Pascal GPUs with NVLink. At one-third the size of the standard boards used today, they’ll put the power of GPUs into more compact form factors than ever before.
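The NVLink numbers above can be checked against each other with some quick arithmetic. The sketch below uses only the quoted figures (80 GB/s for NVLink, 16 GB/s for the current link, and the "5-12 times" claim). Treating 80 GB/s as a lower bound follows the "more than 80GB per second" wording, and the 8 GB dataset is a hypothetical chosen purely for illustration.

```python
# Link bandwidth figures quoted by NVIDIA, in GB/s.
PCIE_GBPS = 16.0      # "16GB per second available now"
NVLINK_GBPS = 80.0    # "more than 80GB per second", so treated as a lower bound

# The "5-12 times faster than PCIe" claim from the keynote.
CLAIM_LOW, CLAIM_HIGH = 5, 12

nvlink_ratio = NVLINK_GBPS / PCIE_GBPS
implied_range = (CLAIM_LOW * PCIE_GBPS, CLAIM_HIGH * PCIE_GBPS)

def transfer_seconds(size_gb, link_gbps):
    """Ideal time to move size_gb across a link at its peak rate."""
    return size_gb / link_gbps

print(f"NVLink vs PCIe: {nvlink_ratio:.1f}x")
print(f"Implied NVLink range: {implied_range[0]:.0f}-{implied_range[1]:.0f} GB/s")

# Hypothetical 8 GB dataset copied from CPU memory to the GPU.
DATASET_GB = 8.0
print(f"PCIe:   {transfer_seconds(DATASET_GB, PCIE_GBPS):.2f} s")
print(f"NVLink: {transfer_seconds(DATASET_GB, NVLINK_GBPS):.2f} s")
```

The 80 GB/s figure corresponds to the low end of the 5-12x claim; the high end would imply up to 192 GB/s if the 16 GB/s baseline is taken at face value.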