NVIDIA GeForce GTX 1080 Official Slides Leaked – Async Compute, SMP VR Processing, Higher Efficiency and More Detailed
The official slides of NVIDIA’s upcoming GeForce GTX 1080 graphics card have been leaked ahead of launch by Videocardz. The GeForce GTX 1080 will be the world’s first commercially available graphics card based on the latest FinFET node, which delivers a large increase in transistor density and allows GPUs to reach higher clock speeds than previous generations.
The specifications, features and various details of GeForce 10 series cards have been leaked by Videocardz.
Official NVIDIA GeForce GTX 1080 Slides Leaked – Async Compute, SMP and Various Features In The Spotlight
The NVIDIA GeForce GTX 1080 reviews arrive next week which will soon be followed by launch on 27th May. The GeForce GTX 1080 will be available globally in reference and non-reference flavors. NVIDIA has set their Founders Edition at a price point of $699 US while the non-reference models will be available at a starting price of $599 US.
The NVIDIA GeForce GTX 1080 will cost $50 more than the launch price of the GTX 980 and $50 less than the launch price of the GeForce GTX 980 Ti, yet it delivers better performance than a GTX Titan X and than two GTX 980s in SLI.
The NVIDIA GeForce GTX 1080 Founders Edition will be available to purchase on 27th May for $699 US.
The huge increase in performance is not only due to the move to the FinFET process but also due to the move to a new architecture. The GeForce 10 series cards are powered by NVIDIA’s Pascal architecture, which powers a broad range of GPUs. We have already seen two chips in action, GP100 and GP106, and now GP104 joins them. NVIDIA GP104 GPUs will first power the GeForce 10 series cards and will bring a range of new enhancements to the NVIDIA GPU architecture.
NVIDIA GP104 GPU Block Diagram In Detail – 314 mm² Powerhouse For GeForce 10 Series Graphics Cards
The NVIDIA GeForce 10 series graphics cards are powered by a Pascal GPU known as GP104. This GPU will have several SKUs which will be housed on several graphics cards. The GP104 GPU houses 7.2 Billion transistors and up to 2560 CUDA cores. This arrangement is achieved through 128 cores per streaming multiprocessor which is the same as GM204 but different compared to the GP100 GPU which has 64 CUDA Cores per SM unit.
The NVIDIA GeForce GTX 1080 GPU Block Diagram. (Image Credits: Videocardz)
Tearing down the GP104 block diagram, we can see that there are a total of 4 graphics processing clusters on the GP104 GPU. Each GPC houses 5 streaming multiprocessor units, for 20 SMs in total. Each streaming multiprocessor has 128 cores and 8 TMUs, while the ROPs are tied to the memory controllers at 8 per controller. These add up to 2560 CUDA cores, 160 TMUs and 64 ROPs on the entire chip. There are a total of eight 32-bit memory controllers on the GPU, forming a 256-bit bus that will be able to drive GDDR5 and GDDR5X memory.
The top-end GP104 GPU can deliver up to 9 TFLOPs of compute performance in its stock configuration. All of this circuitry is housed in a 314 mm² die, which is impressive considering it achieves better efficiency and performance than a Titan X, which is based on a 601 mm² die and houses 8 Billion transistors. Going by those figures, transistor density has jumped from roughly 13 million to 23 million per square millimeter, which is a huge increase. Our detailed run down on the NVIDIA Pascal architecture can be found here.
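The chip-wide totals can be cross-checked with a few lines of arithmetic. This is a minimal sketch, not anything taken from the slides: the per-unit counts (5 SMs per GPC, 8 TMUs per SM, 8 ROPs per memory controller) are assumptions picked because they reproduce the 2560 / 160 / 64 totals quoted above, and the function name is made up for illustration.

```python
# Per-unit counts for GP104. SMS_PER_GPC, TMUS_PER_SM and ROPS_PER_CONTROLLER
# are assumptions chosen to reproduce the totals quoted in the article.
GPCS = 4
SMS_PER_GPC = 5
CORES_PER_SM = 128
TMUS_PER_SM = 8
MEMORY_CONTROLLERS = 8
ROPS_PER_CONTROLLER = 8
CONTROLLER_WIDTH_BITS = 32


def gp104_totals():
    """Return chip-wide unit counts derived from the per-unit figures."""
    sms = GPCS * SMS_PER_GPC
    return {
        "sms": sms,
        "cuda_cores": sms * CORES_PER_SM,
        "tmus": sms * TMUS_PER_SM,
        "rops": MEMORY_CONTROLLERS * ROPS_PER_CONTROLLER,
        "bus_width_bits": MEMORY_CONTROLLERS * CONTROLLER_WIDTH_BITS,
    }


if __name__ == "__main__":
    for name, value in gp104_totals().items():
        print(f"{name}: {value}")
```

Note that 10 SMs per GPC would give 5120 cores, so only 5 per GPC is consistent with the 2560-core figure.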
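The density comparison is easy to reproduce from the transistor counts and die sizes quoted above. The sketch below is only that arithmetic; the function name is hypothetical.

```python
def density_millions_per_mm2(transistors_billion, die_area_mm2):
    """Transistor density in millions of transistors per square millimeter."""
    return transistors_billion * 1000 / die_area_mm2


# Figures as quoted in the article.
gp104 = density_millions_per_mm2(7.2, 314)    # GTX 1080, 16nm FinFET
titan_x = density_millions_per_mm2(8.0, 601)  # GTX Titan X, 28nm

if __name__ == "__main__":
    print(f"GP104:   {gp104:.1f} M/mm^2")
    print(f"Titan X: {titan_x:.1f} M/mm^2")
    print(f"Ratio:   {gp104 / titan_x:.2f}x")
```

With these inputs the two chips come out at about 22.9 and 13.3 million transistors per square millimeter, a ratio of roughly 1.7x.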
NVIDIA GeForce GTX 1080 Official Specifications – 8 GB GDDR5X, 2560 Cores, 180W TDP and A Beautiful Reference Cooler
The NVIDIA GeForce GTX 1080 features the GP104-400-A1 GPU with 7.2 Billion transistors. The GPU comes with 2560 CUDA cores, 160 Texture Mapping Units and 64 Raster Operation Units. The card has a base clock of 1607 MHz and a boost clock of 1733 MHz, which deliver up to 9 TFLOPs of single precision compute performance, and it can boost to higher clock speeds with the new GPU Boost 3.0 algorithm.
NVIDIA GPU Boost 3.0 allows a frequency offset to be set per voltage point, which delivers the maximum clock speed at each voltage point. This allows large gains in clock speeds with stock configurations, where the GPU can boost even beyond the reference boost clocks. The GPU can theoretically boost beyond 2 GHz and can be overclocked in a similar way.
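The single precision figure follows from the usual rule of thumb that each CUDA core can retire one fused multiply-add, counted as two floating point operations, per clock. The sketch below applies that rule to the 1607 MHz base clock listed in the specification table, the 1733 MHz boost clock, and a hypothetical 2 GHz boost; the function name is made up for illustration.

```python
def fp32_tflops(cuda_cores, clock_mhz):
    """Peak FP32 throughput, assuming one FMA (2 FLOPs) per core per clock."""
    return cuda_cores * 2 * clock_mhz / 1e6


if __name__ == "__main__":
    for label, clock in (("base", 1607), ("boost", 1733), ("2 GHz", 2000)):
        print(f"{label:>6}: {fp32_tflops(2560, clock):.2f} TFLOPs")
```

At the reference boost clock this gives about 8.87 TFLOPs, which is where the rounded "9 TFLOPs" comes from, and a 2 GHz clock would push the same formula past 10 TFLOPs.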
NVIDIA GeForce GTX 1080 GPUz screenshot reveals the full specifications of the graphics card.
NVIDIA features 8 GB of GDDR5X memory on their GeForce GTX 1080 graphics card. The next generation GDDR memory runs at an effective 10 Gbps per pin along a 256-bit bus. This delivers a total bandwidth of 320 GB/s. The card has a pixel fill rate of 102.8 GPixels/s and a texture fill rate of 257.1 GTexels/s. The card has a 180W TDP and is powered through a single 8-Pin connector. Display outputs include 3 Display Port 1.4a, HDMI 2.0b and a single DVI-D port. A more detailed look at the PCB can be seen here.
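The bandwidth and fill rate figures can be derived from the other specifications. This is a rough sketch under two assumptions: the memory speed is 10 gigabits per second per pin, and the fill rates are computed at the 1607 MHz base clock listed in the specification table. The function names are hypothetical.

```python
def memory_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Peak memory bandwidth in GB/s: per-pin rate times bus width, in bytes."""
    return data_rate_gbps * bus_width_bits / 8


def fill_rate_g_per_s(units, clock_mhz):
    """Peak fill rate in giga-elements/s: one element per unit per clock."""
    return units * clock_mhz / 1000


if __name__ == "__main__":
    print(f"Bandwidth:    {memory_bandwidth_gb_s(10, 256):.0f} GB/s")
    print(f"Pixel fill:   {fill_rate_g_per_s(64, 1607):.1f} GPixels/s")
    print(f"Texture fill: {fill_rate_g_per_s(160, 1607):.1f} GTexels/s")
```

The 64 ROPs and 160 TMUs at base clock reproduce the 102.8 GPixels/s and 257.1 GTexels/s figures, which suggests the quoted fill rates are base clock numbers rather than boost clock numbers.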
NVIDIA Pascal GP102 GTX Titan X Specifications:
|Graphics Card Name||NVIDIA GeForce GTX 980||NVIDIA GeForce GTX 980 Ti||NVIDIA GeForce GTX Titan X||NVIDIA GeForce GTX 1070||NVIDIA GeForce GTX 1080||NVIDIA GeForce GTX Titan (Pascal)|
|Process Node||28nm||28nm||28nm||16nm FinFET||16nm FinFET||16nm FinFET|
|Transistors||5.2 Billion||8 Billion||8 Billion||7.2 Billion||7.2 Billion||12.0 Billion|
|CUDA Cores||2048 CUDA Cores||2816 CUDA Cores||3072 CUDA Cores||1920 CUDA Cores||2560 CUDA Cores||3584 CUDA Cores|
|Base Clock||1126 MHz||1000 MHz||1000 MHz||1506 MHz||1607 MHz||1417 MHz|
|Boost Clock||1216 MHz||1075 MHz||1075 MHz||1683 MHz||1733 MHz||1531 MHz|
|FP32 Compute||5.6 TFLOPs||6.5 TFLOPs||7.0 TFLOPs||6.5 TFLOPs||9.0 TFLOPs||10.1 TFLOPs|
|VRAM||4 GB GDDR5||6 GB GDDR5||12 GB GDDR5||8 GB GDDR5||8 GB GDDR5X||12 GB GDDR5X|
|Bus Interface||256-bit bus||384-bit bus||384-bit bus||256-bit bus||256-bit bus||384-bit bus|
|Power Connector||6+6 Pin Power||8+6 Pin Power||8+6 Pin Power||Single 8-Pin Power||Single 8-Pin Power||8+6 Pin Power|
|Display Outputs||3x Display Port, 1x HDMI 2.0||3x Display Port, 1x HDMI 2.0||3x Display Port, 1x HDMI 2.0||3x Display Port 1.4, 1x HDMI 2.0b||3x Display Port 1.4, 1x HDMI 2.0b||3x Display Port 1.4, 1x HDMI 2.0b|
|Launch Date||September 2014||May 2015||March 2015||10th June 2016||27th May 2016||August 2016|
|Launch Price||$549 US||$649 US||$999 US||$379 US||$599 US||$1200 US|
NVIDIA GeForce GTX 1080 Pictures (Image Credits: GamerSky)
Performance based on these specs can be seen in 3DMark Firestrike and Ashes of The Singularity. NVIDIA also demoed their GTX 1080 graphics card in DOOM running on the Vulkan API and achieving up to 200 FPS. Expect to see more performance metrics when the reviews go live next week.
Talking about display capabilities, the GeForce GTX 1080 can drive up to 6 connectors with a display resolution of 7680×4320 at 60 Hz. The card comes with H.264 Encode (2x 4K @ 60 Hz) / Decode (2x 4K @ 120 Hz, up to 240 Mbps), HEVC Encode (2x 4K @ 60 Hz) / Decode (2x 4K @ 120 Hz or 8K @ 30 Hz, up to 320 Mbps), 10-bit HEVC Encode and Decode, and VP9 Decode. The new graphics cards are fully HDR compliant and can even stream HDR through GameStream technology, provided you are using an HDR TV in your living room.
NVIDIA GeForce 10 Series “Pascal” Features Detailed
The new slides detail how Simultaneous Multi-Projection works, how Async compute is used in NVIDIA’s Pascal architecture, and how Pascal’s dynamic load balancing system operates.
NVIDIA Pascal GeForce 10 Series – Async Compute / Preemption (Pixel/Instruction Level):
First up, NVIDIA mentions Asynchronous compute in their slides, referring to the same feature that has been a highlight since the release of the DirectX 12 API. The slide shows that GeForce 10 series cards will use Async compute in a broad range of applications, including physics, VR and post processing.
This will allow NVIDIA to better utilize the compute capabilities of their Pascal chips in various tasks. While NVIDIA has confirmed this in their slides, we will have to wait and see how much of a performance gain the new architecture provides in games that utilize Async compute. You can dive into our report on NVIDIA’s Async Compute claims to learn more about how asynchronous computing works on modern GPUs.
NVIDIA has also built deeper, finer-grained preemption techniques into Pascal for VR. Preemption in VR would provide improved latency, better speed and fully asynchronous timewarp in VR titles. NVIDIA is calling it the first pixel-level graphics preemption, which allows graphics work to be preempted at the command buffer, triangle or pixel level. Pixel-level graphics and thread-level compute preemption will provide sub-100 µs preemption in games, while instruction-level compute preemption will deliver maximum performance in general computing tasks.
NVIDIA Pascal GeForce 10 Series – Dynamic Load Balancing:
Another feature that is closely related to Async compute is Pascal’s dynamic load balancing system. Usually, graphics and compute workloads are given separate shares of the GPU and see different levels of utilization. If the graphics work finishes first, its share of the GPU sits idle while compute keeps running at its lower utilization level, and the same holds in reverse.
Dynamic balancing lets compute take over the full GPU once the graphics tasks are completed, and vice versa. This leads to better utilization of the GPU, since no part of it is left idle. In both scenarios, compute and graphics run at the same time.
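The benefit can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not a description of NVIDIA's scheduler: the GPU is treated as a single pool that completes one unit of work per unit of time, work is perfectly divisible, and the function names and numbers are invented for the example.

```python
def static_partition_time(graphics_work, compute_work, graphics_share):
    """Finish time when each workload keeps a fixed share of the GPU.

    The slower partition determines the frame time; the other one idles.
    """
    graphics_time = graphics_work / graphics_share
    compute_time = compute_work / (1.0 - graphics_share)
    return max(graphics_time, compute_time)


def dynamic_balance_time(graphics_work, compute_work):
    """Finish time when idle capacity is handed to whichever workload remains."""
    return graphics_work + compute_work


if __name__ == "__main__":
    # Hypothetical frame: 60 units of graphics work, 20 units of compute work.
    static = static_partition_time(60, 20, graphics_share=0.5)
    dynamic = dynamic_balance_time(60, 20)
    print(f"static 50/50 split: {static:.0f} time units")
    print(f"dynamic balancing:  {dynamic:.0f} time units")
```

In this made-up case the fixed 50/50 split takes 120 time units because the compute half sits idle after it finishes, while dynamic balancing finishes the same work in 80.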
NVIDIA Pascal GeForce 10 Series – Simultaneous Multi-Projection:
While we detailed Simultaneous Multi-Projection (SMP) in another post over here, we now know more about this technology. According to NVIDIA, SMP will allow the GeForce cards to perform better in VR applications by focusing on the content that the user actually sees and not rendering the remaining portions, which saves performance.
For instance, the Oculus Rift can display a total of 4.2 MPixels, but not all of it will be seen by each eye. Using a conservative preset, SMP renders only 2.8 MPixels, which improves performance by rendering only the parts of the frame that are seen. SMP also delivers 2 times the geometry processing by applying geometry before the picture is rendered. All of this leads to better performance figures with Pascal’s new and modern architecture design that is geared for VR gaming.
SMP may work with older NVIDIA architectures but it may not work as well as it does on Pascal. This is because NVIDIA updated the Polymorph Engine, now at version 4.0, with a dedicated block to handle SMP. The new block will provide GeForce 10 series and future cards with more than 2 times the performance of their predecessors.
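To put the pixel counts in perspective, the reduction in shaded pixels can be worked out directly from the two figures quoted above. This is a small sketch of that arithmetic only; the function name is hypothetical, and it says nothing about how much frame time is actually saved, since that depends on how pixel-bound a given game is.

```python
def pixel_savings(full_mpixels, rendered_mpixels):
    """Fraction of pixels that no longer need to be rendered."""
    return 1.0 - rendered_mpixels / full_mpixels


if __name__ == "__main__":
    saved = pixel_savings(4.2, 2.8)
    print(f"Pixels skipped with the conservative preset: {saved:.0%}")
```

Going from 4.2 MPixels to 2.8 MPixels means about a third of the pixels are never rendered.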
NVIDIA Pascal GeForce 10 Series – Fourth Generation Memory Compression:
NVIDIA’s Maxwell architecture introduced the most modern memory compression technique to date on NVIDIA graphics cards, but Pascal takes that one step further. The GeForce 10 series graphics cards feature a new, fourth generation delta compression algorithm which delivers up to 1.7x bandwidth conservation compared to Maxwell. Hence the GPU’s and the application’s dependency on memory bandwidth is vastly reduced.
NVIDIA Pascal GeForce 10 Series – High-Bandwidth SLI Bridges:
NVIDIA has also introduced a new technology on their Pascal cards (GTX 1080 and GTX 1070) known as HB SLI (High Bandwidth Scalable Link Interface). Compared to regular SLI, the HB SLI solution provides twice the bandwidth by utilizing both SLI connectors on the GeForce Pascal boards to provide a seamless experience when you run a multi-GPU setup at higher resolutions.
The new HB SLI bridges deliver better latency and performance gains on higher resolutions and surround gaming so it’s best to use these if you aim to run a multi-GPU setup.
First NVIDIA GeForce GTX 1080 Graphics Card With Water Block – Hits Over 2 GHz With Ease
Yesterday, we provided some insight on several models of the GeForce GTX 1080 graphics card. We detailed that watercooled graphics cards will be able to hit some impressive clocks over the 2 GHz range. A Chinese water block manufacturer has shown their upcoming solution for the reference GeForce GTX 1080 with the card and it looks great. You can find the picture below:
We can’t wait to learn more about the GeForce GTX 1080 and GeForce GTX 1070 in the weeks ahead. Expect AIBs to have custom models prepped at Computex which begins in the first week of June.