Nvidia GTX 1080 Benchmarks & Review Roundup – 25% Faster Than GTX 980 Ti, Launching May 27

Khalid Moammer
Posted May 17, 2016

Independent reviews and comprehensive performance benchmarks of Nvidia’s GeForce GTX 1080 graphics card have just hit the web. That’s right folks, the day that every prospective GTX 1080 buyer has been waiting for has arrived. Nvidia’s embargo on GTX 1080 reviews has just lifted and we finally have the opportunity to take a close look at what Nvidia’s new graphics card can really do.

We know that some reviewers received their GTX 1080 samples nearly 10 days ago, and most have had more than a week to put the card through its paces. This left room for a lot of very interesting, and in some cases creative, testing. We should remind everyone, though, that while Nvidia is allowing reviews to go up today, the Founder’s Edition GTX 1080, the reference design, will not be available for purchase today. You’ll be able to grab it for $699 on the 27th of May. Note that the May 27 launch date Nvidia has set applies strictly to the Founder’s Edition GTX 1080.

Custom GTX 1080 graphics cards from Nvidia’s partners will be available shortly after and those will carry an official MSRP of $599, exactly $100 less than the Founder’s Edition. So you’ll definitely want to hang on for a few days before pulling the trigger, especially considering that Nvidia’s partners are rolling out GTX 1080 cards that feature better cooling, better power delivery and better performance via higher clock speeds right out of the box.

Nvidia GeForce GTX 1080 Benchmarked And Reviewed – Up To 25% Faster Than GTX 980 Ti At 2560×1440

Now without any further delay, let’s get to the juicy bits! Below you’ll find a list of all GTX 1080 reviews that have been published to date. We’ll update the list as more reviews become available.

Anandtech
ComputerBase
PCPer
PCGamesHardware
HardOCP
KitGuru
Hexus
TechPowerUp
Guru3D
PCWorld
HotHardware
OverclockersClub
ArsTechnica

GTX 1080 4K benchmark chart. Credit: ComputerBase – GTX 1080 Max = 100% Fan Speed

Anandtech’s take :

Ryan Smith – Anandtech
Translating this into numbers, at 4K we’re looking at 30% performance gain versus the GTX 980 Ti and a 70% performance gain over the GTX 980, so we’re looking at a very significant jump in efficiency and performance over the Maxwell generation.

Power consumption, temperatures & noise via Anandtech


Official Geforce GTX 1080 and Geforce GTX 1070 Specifications

WCCFTech | Nvidia Geforce GTX 1080 | Nvidia Geforce GTX 1070
Architecture | Pascal | Pascal
Transistors | 7.2 Billion | 7.2 Billion
CUDA Cores | 2560 | 1920
Core Clock | 1607 MHz | TBA
Boost Clock | 1733 MHz | 1683 MHz
Memory Type | G5X (GDDR5X) | GDDR5
Memory Speed | 10 Gbps | 8 Gbps
Memory Configuration | 8GB | 8GB
Bus Width | 256-bit | 256-bit
Memory Bandwidth | 320 GB/s | 256 GB/s
Multi Projection | Yes | Yes
HB SLI Bridge Support | Yes | Yes
Nvidia GPU Boost | 3.0 | 3.0
DirectX 12 Feature Level | 12_1 | 12_1
OpenGL | 4.5 | 4.5
Vulkan API | Yes | Yes
Maximum Digital Resolution | 7680x4320@60Hz | 7680x4320@60Hz
Display Connectors | DP 1.4, HDMI 2.0b, DL-DVI | DP 1.4, HDMI 2.0b, DL-DVI
HDCP | 2.2 | 2.2
Power Draw | 180W | 150W
Power Connector | Single 8-Pin | Single 8-Pin
Maximum Operating Temp | 94 C | 94 C
Partner Price (MSRP) | $599 | $379
FE Price (MSRP) | $699 | $449
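
The memory bandwidth figures in the specifications follow directly from the memory speed and bus width rows. As a quick sanity check, here is a minimal Python sketch of that arithmetic; the function name is our own and purely illustrative, and the only inputs are the per-pin data rate and bus width listed above.

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s.

    data_rate_gbps is the per-pin data rate in Gbit/s and bus_width_bits is
    the total bus width; dividing by 8 converts bits to bytes.
    """
    return data_rate_gbps * bus_width_bits / 8


# Values taken from the specification list above.
gtx_1080 = memory_bandwidth_gb_s(10, 256)  # GDDR5X at 10 Gbps on a 256-bit bus
gtx_1070 = memory_bandwidth_gb_s(8, 256)   # GDDR5 at 8 Gbps on a 256-bit bus

print(f"GTX 1080: {gtx_1080:.0f} GB/s")
print(f"GTX 1070: {gtx_1070:.0f} GB/s")
```

With the same 256-bit bus on both cards, the GTX 1080's 25% bandwidth advantage comes entirely from the faster GDDR5X memory.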

The GTX 1080 Dissected – Diving Into Nvidia’s GP104 Pascal Architecture

Historically, both graphics vendors, Nvidia and AMD, have maintained a cadence that ensured significant performance and power efficiency advances with every new architecture, and Pascal is no exception. The very basic building block of Nvidia’s Pascal architecture has been stripped apart and redesigned. This building block, which Nvidia dubs the Streaming Multiprocessor or SM for short, is the engine that drives the graphics and compute horsepower of every Pascal chip.

With Maxwell, Pascal’s predecessor powering the GTX 900 series, Nvidia introduced the Maxwell Streaming Multiprocessor, or SMM. The SMM built on the strengths of Nvidia’s Kepler SM, introduced with the GTX 600 and 700 series, which Nvidia dubs the SMX. It also did away with many unnecessary complexities, which enabled the engine to deliver more throughput and higher clock speeds. The Pascal SM in its own right is an evolution of the Maxwell SM, a smarter, more streamlined engine.

Nvidia GTX 1080 – GP104 Block Diagram

Inside Nvidia’s full GP104 powering the GTX 1080 – a cut back version of the GP104 GPU will be leveraged in the GTX 1070 – there are four Graphics Processing Clusters, or GPCs. Each GPC consists of five Streaming Multiprocessors, or SMs, and each SM contains 128 CUDA cores and eight Texture Mapping Units, or TMUs. Each of the eight 32-bit memory controllers is tied to eight Render Output Units, or ROPs. In total GP104 houses 2560 CUDA cores, 160 TMUs and 64 ROPs. Finally, the engine is connected via those eight 32-bit memory segments, a 256-bit memory controller in all, to 8GB of GDDR5X memory.
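
To see how the chip-level totals fall out of the per-block counts, here is a rough Python sketch. The class and field names are our own and purely illustrative. It assumes eight TMUs per SM and eight ROPs per 32-bit memory controller, which is the split that reproduces the 160 TMU and 64 ROP totals quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GP104Layout:
    """Illustrative unit counts for the full GP104 (names are hypothetical)."""
    gpcs: int = 4
    sms_per_gpc: int = 5
    cuda_cores_per_sm: int = 128
    tmus_per_sm: int = 8          # assumption: 160 TMUs spread over 20 SMs
    memory_controllers: int = 8
    bits_per_controller: int = 32
    rops_per_controller: int = 8  # assumption: 64 ROPs over 8 controllers

    @property
    def sms(self) -> int:
        return self.gpcs * self.sms_per_gpc

    @property
    def cuda_cores(self) -> int:
        return self.sms * self.cuda_cores_per_sm

    @property
    def tmus(self) -> int:
        return self.sms * self.tmus_per_sm

    @property
    def rops(self) -> int:
        return self.memory_controllers * self.rops_per_controller

    @property
    def bus_width_bits(self) -> int:
        return self.memory_controllers * self.bits_per_controller


gp104 = GP104Layout()
print(gp104.sms, gp104.cuda_cores, gp104.tmus, gp104.rops, gp104.bus_width_bits)
```

Running it gives 20 SMs, 2560 CUDA cores, 160 TMUs, 64 ROPs and a 256-bit bus, matching the figures in the text.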

As mentioned, each GP104 streaming multiprocessor includes 128 FP32 CUDA cores, the same as Maxwell. Within each GP104 streaming multiprocessor there are four 32 CUDA core partitions, eight dispatch units, four warp schedulers and a fairly large instruction buffer, twice as large as Maxwell’s.

The GP100 Pascal Streaming Multiprocessor

This arrangement is almost identical to what we’ve seen with the much larger 3840 CUDA core GP100 GPU that’s powering the Tesla P100 and is very likely to power Nvidia’s next Titan graphics card. The difference is that inside GP100 each SM contains exactly half the number of CUDA cores, dispatch units and warp schedulers vs GP104, but in turn there are twice as many SMs per GPC.
So the only difference with GP104 is that Nvidia is grouping 64 CUDA core SMs in pairs of 128 CUDA cores each and in turn naming the larger 128-core unit an SM instead. This is all while maintaining the exact same ratio of dispatch units, warp schedulers and instruction buffers per CUDA core that we’ve seen with GP100.

So think of it as Nvidia just pairing 64 CUDA core groups together in a single SM. This decision is likely influenced by the significant reduction of FP64, double precision, CUDA cores per SM inside GP104 vs GP100. GP104 only contains 1 FP64 CUDA core for every 32 FP32 CUDA cores, while GP100 has 1 FP64 CUDA core for every 2 FP32 CUDA cores, 16 times more than GP104.
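
The pairing and the FP64 ratios are easier to compare side by side. The Python sketch below is only an illustration built from the numbers quoted above (2560 and 3840 CUDA cores, 128 and 64 cores per SM, 1:32 and 1:2 FP64 ratios); the dictionary layout and function name are our own.

```python
from fractions import Fraction

GROUP_SIZE = 64  # the 64 CUDA core group that GP100 calls an SM

# Figures as quoted in the text; 3840 is the full GP100 configuration.
chips = {
    "GP104": {"cuda_cores": 2560, "cores_per_sm": 128, "fp64_ratio": Fraction(1, 32)},
    "GP100": {"cuda_cores": 3840, "cores_per_sm": 64, "fp64_ratio": Fraction(1, 2)},
}


def summarize(chip: dict) -> dict:
    """Derive group, SM and FP64 counts from the quoted per-chip figures."""
    return {
        "groups_of_64": chip["cuda_cores"] // GROUP_SIZE,
        "sms": chip["cuda_cores"] // chip["cores_per_sm"],
        "fp64_per_sm": int(chip["cores_per_sm"] * chip["fp64_ratio"]),
    }


summary = {name: summarize(chip) for name, chip in chips.items()}
fp64_advantage = chips["GP100"]["fp64_ratio"] / chips["GP104"]["fp64_ratio"]

for name, row in summary.items():
    print(name, row)
print("GP100 FP64 ratio advantage:", fp64_advantage)
```

GP104's 40 groups of 64 cores collapse into 20 SMs with 4 FP64 cores each, while GP100 keeps each group as its own SM with 32 FP64 cores, which is where the 16x ratio comes from.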

Additionally, each GP104 SM has twice the number of registers as Maxwell. This means that not only can Pascal accommodate more threads than Maxwell, but each thread also has access to more registers and thus more throughput. Finally, each warp scheduler can dispatch two instructions per clock.

Nvidia’s Senior Architect, Lars Nyland, admitted that the 16nm FinFET process played an important role in realizing the team’s power efficiency goals for Pascal, but maintains that numerous architectural improvements, including the use of new on-chip voltage signaling technologies, helped further reduce the energy footprint of the architecture.
