Nvidia's latest GeForce flagship gaming card, the GTX 1080, is already in the hands of reviewers as Nvidia begins seeding evaluation samples to the press who will be putting the company's latest product through its paces ahead of the card's May 27th launch date.
[UPDATED 9:14 AM Sunday May 8th 2016] Reviews will go live on May 17th according to videocardz.com.
By now you probably know everything there is to know about Nvidia's latest Pascal-powered gaming card and are eagerly awaiting thorough, independent gaming benchmarks to see what this card is capable of in the real world. Worry not, as the wait is quickly coming to an end. Note, however, that Nvidia's May 27th launch date is specific to the $699 "Founders Edition" GTX 1080, so we're likely to get reviews of only the reference design before the 27th, with custom AIB GTX 1080 cards coming later at $599.
Nvidia GTX 1080 Cards Already In Reviewers' Hands - Beautiful Box Art Revealed
A lucky few, including friend of the site Jason Evangelho - PC Tech contributing editor at Forbes - have already received their GeForce GTX 1080 review samples. The GTX 1080 box maintains the emblematic look and feel of the Nvidia GeForce brand, displaying the name of the card in bold silver and green letters with the famous "Inspired By Gamers, Built By Nvidia" tagline below.
It's happening. pic.twitter.com/Gal9ypIGnm
— Radeon Jason ✈️ E3 (@killyourfm) May 8, 2016
But of course you're less interested in the box than in what's inside it. With independent benchmarks and reviews still two weeks away and the actual product launch roughly three weeks away, you'll need something to quench your thirst for more GTX 1080 information. Fortunately, we don't have to wait until the 17th or the 27th to tell you all about not only what's inside the box, but also what's inside the GPU powering the GTX 1080: Pascal, Nvidia's latest and most refined CUDA graphics architecture yet.
Credit : Hardwareluxx
Official GeForce GTX 1080 and GeForce GTX 1070 Specifications
|WCCFTech||Nvidia GeForce GTX 1080||Nvidia GeForce GTX 1070|
|Transistors||7.2 Billion||7.2 Billion|
|Core Clock||1607 MHz||TBA|
|Boost Clock||1733 MHz||1683 MHz|
|Memory Type||G5X (GDDR5X)||GDDR5|
|Memory Speed||10 Gbps||8 Gbps|
|Memory Bandwidth||320 GB/s||256 GB/s|
|HB SLI Bridge Support||Yes||Yes|
|Nvidia GPU Boost||3.0||3.0|
|DirectX 12 Feature Level||12_1||12_1|
|Maximum Digital Resolution||7680x4320@60Hz||7680x4320@60Hz|
|Display Connectors||DP 1.4, HDMI 2.0b, DL-DVI||DP 1.4, HDMI 2.0b, DL-DVI|
|Power Connector||Single 8-Pin||Single 8-Pin|
|Maximum Operating Temp||94 C||94 C|
|Partner Price (MSRP)||$599||$379|
|FE Price (MSRP)||$699||$449|
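The memory rows in the table above are tied together by simple arithmetic: peak bandwidth is the per-pin data rate multiplied by the bus width, divided by eight to convert bits to bytes. The short Python sketch below works backwards from the listed figures to the bus width they imply. The function names are made up for illustration, and the bus width is derived from the table rather than quoted from Nvidia.

```python
# Figures copied from the specification table above.
CARDS = {
    "GTX 1080": {"data_rate_gbps": 10, "bandwidth_gb_s": 320},
    "GTX 1070": {"data_rate_gbps": 8, "bandwidth_gb_s": 256},
}


def implied_bus_width_bits(data_rate_gbps, bandwidth_gb_s):
    """Bus width (bits) implied by a per-pin data rate and a total bandwidth."""
    # bandwidth (GB/s) = data rate per pin (Gbit/s) * bus width (bits) / 8
    return bandwidth_gb_s * 8 / data_rate_gbps


def peak_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Peak memory bandwidth (GB/s) for a given data rate and bus width."""
    return data_rate_gbps * bus_width_bits / 8


for name, spec in CARDS.items():
    width = implied_bus_width_bits(**spec)
    print(f"{name}: implied bus width {width:.0f} bits")
```

Both cards work out to the same implied bus width, so the bandwidth gap between them comes entirely from the faster G5X memory on the GTX 1080.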
Diving Into Nvidia's Pascal Architecture, CUDA Refined
Historically, both graphics vendors, Nvidia and AMD, have maintained a cadence that ensured significant performance and power efficiency advances with every new architecture, and Pascal is no exception. The basic building block of Nvidia's Pascal architecture has been stripped apart and redesigned. This building block, which Nvidia dubs the Streaming Multiprocessor, or SM for short, is the engine that drives the graphics and compute horsepower of every Pascal chip.
With Maxwell, Pascal's predecessor powering the GTX 900 series, Nvidia introduced the Maxwell SM, also called the SMM. The SMM built on the strengths of Nvidia's Kepler SM, which Nvidia dubs the SMX and which was introduced with the GTX 600 and 700 series. It also did away with many unnecessary complexities, which enabled the engine to deliver more throughput and higher clock speeds. The Pascal SM is in turn an evolution of the Maxwell SM: a smarter, more streamlined engine.
Each Pascal streaming multiprocessor includes 64 FP32 CUDA cores, half as many as a Maxwell SM. Within each Pascal SM there are two 32 CUDA core partitions, two dispatch units, a warp scheduler and a fairly large instruction buffer. That instruction buffer is the same size as in a Maxwell SM, but twice the size per CUDA core, since a Pascal SM has only half as many CUDA cores as a Maxwell SM.
Additionally, because each Pascal SM has the same number of registers as a Maxwell SM, each Pascal CUDA core has access to twice as many registers. This in turn means that not only can Pascal accommodate more threads than Maxwell, but each thread has access to more registers, which should translate into considerably more throughput. Finally, each warp scheduler can dispatch two instructions per clock.
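The "twice the registers per core" claim follows directly from the per-SM figures in the comparison table at the end of this article. As a rough sketch (the helper name is made up for illustration), dividing the register file by the CUDA core count gives:

```python
# Per-SM figures taken from the GPU comparison table at the end of the article.
REGISTERS_PER_SM = 65536  # 32-bit registers, identical for Maxwell and Pascal
CORES_PER_SM = {"Maxwell GM200": 128, "Pascal GP100": 64}


def registers_per_core(cores_per_sm, registers_per_sm=REGISTERS_PER_SM):
    """32-bit registers available per CUDA core within one SM."""
    return registers_per_sm // cores_per_sm


per_core = {gpu: registers_per_core(cores) for gpu, cores in CORES_PER_SM.items()}
ratio = per_core["Pascal GP100"] / per_core["Maxwell GM200"]

for gpu, regs in per_core.items():
    print(f"{gpu}: {regs} registers per CUDA core")
print(f"Pascal / Maxwell ratio: {ratio:.1f}x")
```

The same division applies to any other fixed per-SM resource, which is why the instruction buffer also doubles on a per-core basis.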
Nvidia's Senior Architect, Lars Nyland, admitted that the 16nm FinFET process played an important role in realizing the team's power efficiency goals for Pascal, but maintains that numerous architectural improvements, including the use of new on-chip voltage signaling technologies, further reduced the energy footprint of the architecture.
Async Compute, Still The Hottest Item On The DirectX 12 Menu
According to Nvidia, the end result is that each Pascal SM requires less power and area to manage data transfers, even compared to a Kepler SMX, which improves both performance and power efficiency. Pascal also includes an updated scheduler that not only improves CUDA core utilization inside each SM but is also more intelligent and more power efficient. This hardware scheduler will play a key role in improving Pascal's DirectX 12 Async Compute capability and in allowing Nvidia to catch up to AMD's GCN architecture in this area.
Async Compute has been a hot subject of debate ever since it was announced. A couple of months ago we dove deep into this peculiar DirectX 12 feature in our two-thousand-word analysis piece, "AMD’s Secret DirectX 12 Weapon That Nvidia Had To Trade Off – Demystifying Async Compute". We explained the inherent architectural differences between Nvidia and AMD graphics cards and why they handle asynchronous game code so differently. I'd highly recommend giving it a read if you're looking to wrap your head around this topic and get down to the core of it all.
The addition of an updated hardware scheduler in Pascal signals a change of heart for Nvidia. It represents a walk-back on some of the trade-offs that the company decided to make with Maxwell to achieve its power efficiency goals. Some could argue those trade-offs were reasonably sound for DirectX 11 and traditional generic APIs, but not so much for the new era of VR and low-level APIs such as DirectX 12 and Vulkan, where executing code asynchronously has proven to benefit latency and performance.
Without a doubt Pascal will handle Async Compute better than Maxwell. How much better? We'll find out soon enough. In the meantime we'll keep an eye on the other side of the fence, as Nvidia's rival AMD plans and executes its next move with Polaris to win over its own share of gamers looking to jump on the next-generation VR and high-resolution gaming bandwagon this summer. Word on the street is that it won't be long before AMD reciprocates and shows its own hand with the Radeon 400 series. Whether that hand packs an ace to take on the GTX 1080 and 1070 or something entirely different, unexpected even, gamers will find out soon enough.
|GPU||Kepler GK110||Maxwell GM200||Pascal GP100||Volta GV100|
|Threads / Warp||32||32||32||32|
|Max Warps / Multiprocessor||64||64||64||64|
|Max Threads / Multiprocessor||2048||2048||2048||2048|
|Max Thread Blocks / Multiprocessor||16||32||32||32|
|Max 32-bit Registers / SM||65536||65536||65536||65536|
|Max Registers / Block||65536||32768||65536||65536|
|Max Registers / Thread||255||255||255||255|
|Max Thread Block Size||1024||1024||1024||1024|
|CUDA Cores / SM||192||128||64||64|
|Shared Memory Size / SM Configurations (bytes)||16K/32K/48K||96K||64K||96K|
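To see how the limits in this table interact, here is a deliberately simplified occupancy estimate in Python. It is a sketch under stated assumptions, not Nvidia's occupancy calculator: it considers only the thread, block and register limits from the table and ignores shared memory, register allocation granularity and the per-block register cap. The `SMLimits` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SMLimits:
    """Per-SM limits copied from the comparison table above."""
    max_threads: int
    max_blocks: int
    registers: int
    threads_per_warp: int = 32


LIMITS = {
    "Kepler GK110": SMLimits(max_threads=2048, max_blocks=16, registers=65536),
    "Maxwell GM200": SMLimits(max_threads=2048, max_blocks=32, registers=65536),
    "Pascal GP100": SMLimits(max_threads=2048, max_blocks=32, registers=65536),
}


def resident_blocks(limits, block_size, regs_per_thread):
    """Blocks that fit on one SM, taking the tightest of the three limits."""
    by_threads = limits.max_threads // block_size
    by_registers = limits.registers // (block_size * regs_per_thread)
    return min(limits.max_blocks, by_threads, by_registers)


def occupancy(limits, block_size, regs_per_thread):
    """Fraction of the SM's warp slots occupied by resident blocks."""
    warps_per_block = -(-block_size // limits.threads_per_warp)  # ceiling division
    max_warps = limits.max_threads // limits.threads_per_warp
    blocks = resident_blocks(limits, block_size, regs_per_thread)
    return blocks * warps_per_block / max_warps


for gpu, limits in LIMITS.items():
    small = occupancy(limits, block_size=64, regs_per_thread=32)
    heavy = occupancy(limits, block_size=256, regs_per_thread=64)
    print(f"{gpu}: 64-thread blocks {small:.0%}, register-heavy kernel {heavy:.0%}")
```

Under these assumptions, small 64-thread blocks are capped by Kepler's 16-block limit, while Maxwell and Pascal can keep twice as many blocks resident. A kernel using 64 registers per thread is limited by the register file on all three.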