NVIDIA Pascal GPU Will Be Manufactured on TSMC 16nm FF Node – Flagship Single Chip Card To Feature 16 GB HBM2 VRAM
NVIDIA’s upcoming Pascal GPU has been confirmed by sources to be manufactured on the TSMC 16nm FinFET node, as Samsung has failed to win the contract for the latest GPU. Pascal is NVIDIA’s latest compute-oriented graphics architecture, which will launch in 2016 and feature new technologies such as HBM2, NVLINK and mixed precision support.
TSMC Wins Contract To Manufacture Next Generation NVIDIA Pascal GPU
No specific reason has been given for NVIDIA choosing TSMC over Samsung. There were reports that NVIDIA would use both semiconductor companies to mass produce Pascal GPUs, but in the end NVIDIA chose TSMC and its 16nm FinFET node, even though Samsung already has 14nm FinFET in production, as shown by Apple’s A9 SoC, which was unveiled just a few weeks back.
It was revealed a few weeks ago that NVIDIA’s Pascal GP100 chip was taped out on TSMC’s 16nm FinFET process last month. This means that we could see a launch of these chips as early as Q2 2016. A doubling of transistor density would put Pascal at somewhere around 16-17 billion transistors, since Maxwell already features 8 billion transistors on the flagship GM200 GPU core.
TSMC’s 16FF+ (FinFET Plus) technology can provide above 65 percent higher speed, around 2 times the density, or 70 percent less power than its 28HPM technology. Comparing with 20SoC technology, 16FF+ provides extra 40% higher speed and 60% power saving. By leveraging the experience of 20SoC technology, TSMC 16FF+ shares the same metal backend process in order to quickly improve yield and demonstrate process maturity for time-to-market value.
According to industry sources on Sept. 15, Nvidia decided to let TSMC mass produce the Pascal GPU, which is scheduled to be released next year, using the production process of 16-nm FinFETs. Some in the industry predicted that both Samsung and TSMC would mass produce the Pascal GPU, but the U.S. firm chose only the Taiwanese firm in the end. Since the two foundries have different manufacturing process of 16-nm FinFETs, the U.S. tech company selected the world’s largest foundry for product consistency.
The reason for Samsung’s determination to win the contract for the Pascal GPU lies in the fact that Nvidia’s new GPU is highly likely to mark a milestone in the next-gen graphic market. Experts are saying that Samsung’s failure to obtain the contract is mainly attributable to its lack of experience. The fact that the Korean tech giant has become TSMC’s rival only two years after it started to produce GPUs itself is considered to have special meaning at the moment. via Business Korea
Furthermore, NVIDIA is holding its GTC conference in Japan on 18th September 2015 and has already released some material on the announcements that will be made at the graphics-focused event. While there isn’t much detail on specific architecture specifications or features, the presentation does show some new details about the NVIDIA Pascal GPU. As we already know, Pascal will feature mixed precision, 3D memory integration and NVLINK. The new details show that the top-of-the-line single-chip Pascal card will feature up to 16 GB of HBM2 VRAM delivering up to 1 TB/s of bandwidth. More details about Pascal are expected to be revealed at the conference.
With HBM2, NVIDIA gets the leverage to feature far more memory than what’s currently available on HBM1 cards (4 GB HBM on Fury X, Fury, Nano, Fury X2). HBM2 gives NVIDIA access to denser chips, which will result in cards with 16 GB and up to 32 GB of HBM memory across a massive 4096-bit memory interface, well suited to the next wave of high-resolution 4K and 8K gaming panels. With 8 Gb per DRAM die and 2 Gbps per pin, we get approximately 256 GB/s of bandwidth per HBM2 stack. With four stacks in total, we will get 1 TB/s of bandwidth on NVIDIA’s GP100 flagship Pascal, which is twice the 512 GB/s on AMD’s Fiji cards and three times the 980 Ti’s 336 GB/s.
Samsung and SK Hynix are planning to commence mass production later this year, so NVIDIA will be able to choose between two manufacturers, while AMD is expected to stick with SK Hynix as its partner for the HBM2 memory on its Arctic Islands cards, which are also suggested to feature around 16-32 GB of HBM2 VRAM. The flagship NVIDIA Pascal GPU will come with up to 16 GB of HBM2 VRAM, while the dual-chip offerings aimed at the GeForce and Tesla markets later on will ship with 32 GB of VRAM.
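The bandwidth and capacity figures above follow from simple arithmetic, which can be sketched in a few lines of Python. This is a rough illustration rather than an official specification: the bus width, stack count, pin speed and die density come from the figures quoted here, while the function names and the 4-high and 8-high stack heights are assumptions made for the example.

```python
# Figures quoted in the article.
BUS_WIDTH_BITS = 4096    # total memory interface width
STACKS = 4               # HBM2 stacks on the package
PIN_SPEED_GBPS = 2       # data rate per pin, in gigabits per second
DIE_DENSITY_GBIT = 8     # capacity per DRAM die, in gigabits


def stack_bandwidth_gbs(bus_width_bits=BUS_WIDTH_BITS, stacks=STACKS,
                        pin_speed_gbps=PIN_SPEED_GBPS):
    """Bandwidth of one HBM2 stack in GB/s."""
    bits_per_stack = bus_width_bits // stacks   # 1024 pins per stack
    return bits_per_stack * pin_speed_gbps / 8  # 8 bits per byte


def total_bandwidth_gbs(stacks=STACKS):
    """Aggregate bandwidth across all stacks in GB/s."""
    return stack_bandwidth_gbs(stacks=stacks) * stacks


def capacity_gb(dies_per_stack, stacks=STACKS,
                die_density_gbit=DIE_DENSITY_GBIT):
    """Total VRAM in GB for a given stack height (assumed, not from the article)."""
    return dies_per_stack * die_density_gbit * stacks / 8


if __name__ == "__main__":
    print("Per stack:", stack_bandwidth_gbs(), "GB/s")
    print("Total:", total_bandwidth_gbs(), "GB/s")
    print("4-high stacks:", capacity_gb(4), "GB")
    print("8-high stacks:", capacity_gb(8), "GB")
```

Under these assumptions, four dies per stack give the 16 GB configuration and eight dies per stack give 32 GB.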
Pascal will also introduce NVLINK, a next-generation Unified Virtual Memory link with Gen 2.0 cache coherency features and 5 to 12 times the bandwidth of a regular PCIe connection. This will solve many of the bandwidth issues that high-performance GPUs currently face. One of the latest things we learned about NVLINK is that it will allow several GPUs to be connected in parallel in HPC-focused platforms, which will feature several nodes fitted with Pascal GPUs for compute-oriented workloads. At speeds of 80 GB/s, the NVLINK interconnect will give the multiple processors inside HPC blocks a faster link than traditional PCIe Gen3 lanes. Pascal GPUs will also feature unified memory support, allowing the CPU and GPU to share the same memory pool, and finally we have mixed precision support.
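To put the 80 GB/s figure in context, here is a rough comparison against a PCIe Gen3 x16 slot. The PCIe numbers (8 GT/s per lane, 128b/130b encoding) come from the PCIe 3.0 specification rather than from NVIDIA's material, and the function names are made up for this sketch. It is also not stated whether the 80 GB/s is per direction or aggregate, so treat the ratio as a ballpark only.

```python
NVLINK_GBS = 80.0  # NVLINK figure quoted in the article


def pcie_gen3_bandwidth_gbs(lanes=16):
    """Usable one-direction bandwidth of a PCIe Gen3 link in GB/s."""
    transfer_rate_gt = 8.0        # gigatransfers per second per lane
    encoding_efficiency = 128 / 130  # 128b/130b line encoding
    return transfer_rate_gt * encoding_efficiency * lanes / 8


def nvlink_speedup(lanes=16):
    """How many times faster the quoted NVLINK figure is than PCIe Gen3."""
    return NVLINK_GBS / pcie_gen3_bandwidth_gbs(lanes)


if __name__ == "__main__":
    print(f"PCIe Gen3 x16: {pcie_gen3_bandwidth_gbs():.2f} GB/s")
    print(f"NVLINK advantage: {nvlink_speedup():.1f}x")
```

Against a full x16 slot, the ratio lands at the low end of the quoted 5 to 12 times range; narrower PCIe links would push it higher.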
With Pascal, NVIDIA will return to the HPC market with new Tesla products. Maxwell, although great in all other regards, was deprived of the necessary FP64 hardware and focused only on FP32 performance. This meant that the chip stayed away from HPC markets while NVIDIA offered its years-old Kepler-based cards as the only Tesla options. Pascal will not only improve FP64 performance but also feature mixed precision, which allows NVIDIA cards to compute in 16-bit (FP16) at double the throughput of FP32. This means that the cards will enable three tiers of compute: FP16, FP32 and FP64.
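The three compute tiers trade precision for speed. As a small illustration of what each format can represent, the sketch below round-trips a value through the IEEE 754 half, single and double encodings using Python's standard `struct` module. It runs on the CPU and says nothing about Pascal's actual hardware; the helper names are made up for this example.

```python
import struct

# struct format codes for IEEE 754 half, single and double precision.
FORMATS = {"FP16": "<e", "FP32": "<f", "FP64": "<d"}


def round_trip(value, fmt):
    """Encode a Python float in the given format and decode it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def rounding_errors(value):
    """Absolute error introduced by storing value in each format."""
    return {name: abs(round_trip(value, fmt) - value)
            for name, fmt in FORMATS.items()}


if __name__ == "__main__":
    for name, err in rounding_errors(0.1).items():
        print(f"{name}: rounding error {err:.3e}")
```

FP16 keeps far fewer significant digits than FP32, which is why it suits workloads that tolerate lower precision in exchange for higher throughput, while FP64 remains the tier for HPC work that needs the extra accuracy.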
|GPU Family|AMD Vega|AMD Navi|NVIDIA Pascal|NVIDIA Volta|
|---|---|---|---|---|
|Flagship GPU|Vega 10|Navi 10|NVIDIA GP100|NVIDIA GV100|
|GPU Process|14nm FinFET|7nm FinFET|TSMC 16nm FinFET|TSMC 12nm FinFET|
|GPU Transistors|15-18 Billion|TBC|15.3 Billion|21.1 Billion|
|GPU Cores (Max)|4096 SPs|TBC|3840 CUDA Cores|5376 CUDA Cores|
|Peak FP32 Compute|13.0 TFLOPs|TBC|12.0 TFLOPs|>15.0 TFLOPs (Full Die)|
|Peak FP16 Compute|25.0 TFLOPs|TBC|24.0 TFLOPs|120 Tensor TFLOPs|
|VRAM|16 GB HBM2|TBC|16 GB HBM2|16 GB HBM2|
|Memory (Consumer Cards)|HBM2|HBM3|GDDR5X|GDDR6|
|Memory (Dual-Chip Professional/HPC)|HBM2|HBM3|HBM2|HBM2|
|HBM2 Bandwidth|484 GB/s (Frontier Edition)|>1 TB/s?|732 GB/s (Peak)|900 GB/s|
|Graphics Architecture|Next Compute Unit (Vega)|Next Compute Unit (Navi)|5th Gen Pascal CUDA|6th Gen Volta CUDA|
|Successor of (GPU)|Radeon RX 500 Series|Radeon RX 600 Series|GM200 (Maxwell)|GP100 (Pascal)|