AMD Vega Demoed, Outperforms Nvidia’s GTX 1080 – Features 8GB of HBM2 & 512GB/s Of Bandwidth

Khalid Moammer • Dec 12, 2016 at 12:19pm EST

AMD just demoed the gaming capability of an upcoming enthusiast Radeon graphics card powered by its next-gen Vega GPU, and the results are in. In a head-to-head performance showdown with an overclocked GTX 1080, an equally specced Vega-equipped system was able to outperform its counterpart running Doom at 4K in Vulkan.

AMD Showcases Vega In Action - Running Doom In Vulkan At 4K & DeepBench GEMM

At the event, AMD showcased two systems featuring graphics cards powered by its next-generation Vega GPU. The first had a Radeon Instinct MI25 graphics accelerator equipped with 16GB of HBM2, second-generation high bandwidth memory. This system was used to demonstrate Vega's capabilities in deep learning, which were quite impressive. The MI25 outperformed Nvidia's Tesla P100 accelerator, based on the GP100 GPU, at two key AI workloads.

The second system was configured with the consumer version of Vega, equipped with 8GB of HBM2. Unlike the MI25, however, AMD was very secretive about what this consumer card looked like. It was not shown to the press, and to keep its appearance a secret all ventilation and fan inlets were taped shut, which obviously deprived the machine of any airflow. The folks over at PCGamesHardware.de made a point of noting that the graphics card may well have been throttling, as AMD's secretive measures made things quite toasty inside.

AMD Vega demo system, photo courtesy of pcgameshardware.de

Additionally, the demo was conducted using an ordinary Fiji (Radeon R9 Fury X, Fury and Nano) driver with an additional debugging layer. No Vega-optimized driver was used. Despite this, the consumer Vega graphics card was able to outperform a GTX 1080 running at 1911MHz by 10%. It should be noted, though, that Doom's Vulkan implementation has been shown to run faster on AMD GPUs.

Even so, with optimized drivers and proper cooling it's likely that we'll see AMD squeeze more performance out of Vega before launch. The folks over at pcgameshardware.de have also confirmed that this is the very same graphics card, "687F:C1", that we spotted mingling with GTX 1080s on AOTS's benchmark leaderboard a couple of weeks back.

Vega's Confirmed Specs

Members of the press inside the demo room were able to spot some key specifications pertaining to Vega by taking a look at the expanded statistics in Doom. 8GB of HBM2 memory for the consumer version of Vega was confirmed.

Additionally, an employee let slip a key specification that wasn't supposed to be made public yet: Vega 10 features 512GB/s of memory bandwidth. Together, the memory capacity and bandwidth clearly indicate that Vega 10 has a 2048-bit wide memory interface, half that of its older sibling, Fiji. However, because HBM2 is rated at twice the per-pin speed of HBM1, Vega 10 is able to achieve the same 512GB/s of memory bandwidth.
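The bandwidth arithmetic behind that inference can be checked in a few lines. This is only a rough sketch: it assumes the nominal per-pin data rates of about 1Gbps for HBM1 and 2Gbps for HBM2, which are commonly quoted figures rather than numbers AMD gave at the event, and the function name is purely illustrative.

```python
def memory_bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8

# Assumed nominal per-pin data rates (not stated at the demo).
HBM1_GBPS_PER_PIN = 1.0
HBM2_GBPS_PER_PIN = 2.0

fiji = memory_bandwidth_gbs(4096, HBM1_GBPS_PER_PIN)    # Fiji: 4096-bit HBM1
vega10 = memory_bandwidth_gbs(2048, HBM2_GBPS_PER_PIN)  # Vega 10: 2048-bit HBM2

print(f"Fiji:    {fiji:.0f} GB/s")
print(f"Vega 10: {vega10:.0f} GB/s")
```

Under those assumptions both configurations land on 512GB/s, which is why halving the bus width costs Vega 10 nothing against Fiji.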

In terms of graphics horsepower, the Vega 10 powered MI25 accelerator is rated at a staggering 12.5 TERAFLOPS of single precision floating point compute and double that in half precision FP16 compute. That's 1.5 TERAFLOPS more than Nvidia's Tesla P100 accelerator, powered by the monstrous 610mm² GP100 GPU and 2.5 TERAFLOPS more than the GTX 1080.

The MI25 is a professional, passively cooled product. The gaming oriented variant of Vega, equipped with more aggressive cooling solutions and running at higher clock speeds, would naturally be expected to achieve an even higher figure.

Vega's Next Generation Compute Unit Architecture

Vega is based on a brand new graphics architecture, the particulars of which we had already detailed briefly in our exclusive piece about Vega 10 and Vega 11. AMD confirmed today in its announcement what we had brought you back in October, which is that Vega makes use of a brand new compute unit design called NCU, short for Next Compute Unit.

AMD hasn't discussed any details pertaining to the new design. However, we're going to give you an exclusive high-level look at NCU. This new architecture holds several key advantages over its predecessor. Chief among them is that each Vega NCU is now capable of simultaneously processing variable-length wavefronts. To understand why this is such a big deal, we have to look at AMD's current GCN implementation.

In AMD's current GCN implementation, each compute unit has four 16-wide vector SIMD units, capable of executing four 16-wide wavefronts (groups of threads) over four cycles, in addition to one scalar unit capable of executing one instruction per cycle. The scalar unit is delegated time-critical tasks for which the four-cycle turnaround of the SIMD units isn't sufficient.

Unfortunately, these 16-wide SIMD units work exactly the same way no matter how small a wavefront they're fed. Executing a 4-wide wavefront takes just as long as executing a 16-wide one, leaving the other 12 ALUs inside the SIMD completely idle. And as graphics workloads are inherently non-uniform, it's effectively impossible to find any scenario where all 16-wide SIMD units are fully occupied at any given time.

Variable Width SIMDs, Getting More Performance Out Of Fewer Cycles

This is no longer the case in AMD's new GCN implementation inside Vega. The V9 architecture includes clever new schedulers and coherency subsystems that allow several smaller wavefronts to be executed simultaneously inside any SIMD that's able to accommodate the workload. In effect, this allows each NCU to finish considerably more work in the same amount of time than its predecessor, while also freeing up valuable cache and memory resources for other compute units.
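To make the utilization argument concrete, here is a toy model of the behavior described above. It is not AMD's scheduler: the first-fit packing rule, the function names and the workload are all made up for illustration, and it simply compares giving every wavefront its own 16-lane issue slot with packing several narrow wavefronts into one slot.

```python
SIMD_WIDTH = 16  # lanes per vector SIMD unit, as described in the article

def fixed_width_slots(wavefront_widths):
    """Each wavefront occupies a whole SIMD issue slot, however narrow it is."""
    return len(wavefront_widths)

def packed_slots(wavefront_widths):
    """Hypothetical packing: first-fit-decreasing of wavefronts into 16-lane slots."""
    free_lanes = []  # free lanes remaining in each slot opened so far
    for width in sorted(wavefront_widths, reverse=True):
        for i, free in enumerate(free_lanes):
            if width <= free:
                free_lanes[i] -= width
                break
        else:
            # No open slot has room, so start a new one.
            free_lanes.append(SIMD_WIDTH - width)
    return len(free_lanes)

def utilization(wavefront_widths, slots):
    """Fraction of ALU lanes doing useful work across all issue slots."""
    return sum(wavefront_widths) / (SIMD_WIDTH * slots)

workload = [16, 4, 4, 8, 12, 4]  # made-up wavefront widths, each at most 16

for name, slots in (("fixed", fixed_width_slots(workload)),
                    ("packed", packed_slots(workload))):
    print(f"{name}: {slots} slots, {utilization(workload, slots):.0%} lane utilization")
```

For this invented workload the fixed-width model needs six issue slots at 50% lane utilization, while the packed model needs three. Real gains would depend entirely on the actual mix of wavefront sizes.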

AMD Vega architecture

It's very hard to predict how much of a difference an improvement of this size in resource utilization and CU occupancy will make, given how unpredictable and inherently variable graphics workloads are. That brings us neatly to Vega's rumored specs.

Vega, The Rumored Specs

One of the few things that AMD has not talked about regarding Vega's specifications to date is the number of GCN stream processors it actually has. Vega 10 is believed to have 4096 GCN stream processors, according to the LinkedIn page of a leading engineer, which leaked earlier this year.

Assuming that this figure is accurate, Vega 10 would have to operate at a frequency 20% higher than Polaris 10 to achieve the 12.5 TFLOPS of the Radeon Instinct MI25. We're talking 1520MHz+ on a passively cooled enterprise GPU, a clock speed that only a few, mostly liquid-cooled, overclocked RX 480 cards can achieve. None of AMD's current or past professional-grade graphics cards or accelerators come close to that. We've also never seen such a large hike in clock speeds from one graphics generation to another on the same process node generation.
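The clock estimate can be reproduced with a quick calculation. This sketch assumes the usual peak-throughput convention of two floating point operations (one fused multiply-add) per stream processor per clock, and takes the reference RX 480 boost clock of 1266MHz as the Polaris 10 baseline; the function and constant names are illustrative only.

```python
def required_clock_mhz(tflops: float, stream_processors: int) -> float:
    """Clock needed for a peak FP32 rating, assuming 2 FLOPs (one FMA) per SP per clock."""
    flops = tflops * 1e12
    return flops / (stream_processors * 2) / 1e6

RX_480_BOOST_MHZ = 1266  # reference RX 480 boost clock, used as the assumed baseline

mi25_clock = required_clock_mhz(12.5, 4096)  # rumored 4096 stream processors
uplift = mi25_clock / RX_480_BOOST_MHZ - 1

print(f"Required clock: {mi25_clock:.0f} MHz")
print(f"Uplift over RX 480 boost: {uplift:.1%}")
```

With 4096 stream processors the required clock comes out at roughly 1526MHz, about 20% above the RX 480's reference boost clock, in line with the figures quoted above.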

AMD Vega Lineup

Graphics Card	Radeon R9 Fury X	Radeon RX 480	Radeon RX Vega Frontier Edition	Radeon Vega Pro	Radeon RX Vega (Gaming)	Radeon RX Vega Pro Duo
GPU	Fiji XT	Polaris 10	Vega 10	Vega 10	Vega 10	2x Vega 10
Process Node	28nm	14nm FinFET	FinFET	FinFET	FinFET	FinFET
Stream Processors	4096	2304	4096	3584	4096 (?)	Up to 8192
Performance	8.6 TFLOPS / 8.6 TFLOPS (FP16)	5.8 TFLOPS / 5.8 TFLOPS (FP16)	~13 TFLOPS / ~25 TFLOPS (FP16)	11 TFLOPS / 22 TFLOPS (FP16)	>13 TFLOPS / >25 TFLOPS (FP16)	TBA
Memory	4GB HBM	8GB GDDR5	16GB HBM2	TBA	TBA	TBA
Memory Bus	4096-bit	256-bit	2048-bit	2048-bit	2048-bit	4096-bit
Bandwidth	512GB/s	256GB/s	480GB/s	400GB/s	TBA	TBA
TDP	275W	150W	TBA	TBA	TBA	TBA
Launch	2015	2016	June 2017	June 2017	July 2017	TBA

It's more plausible that this 20% improvement actually comes from the IPC (instructions per clock) improvement of the new architecture. In fact, it's not unlikely that the MI25 runs at an even lower frequency than the RX 480, especially considering it's a 300W, passively cooled enterprise part. That would indicate that 20%+ of the chip's performance stems directly from architecture-based enhancements.

Whether that's actually the case or not remains to be seen. A combination of IPC uplift and higher clock speeds is probably the most plausible scenario.

About the author: PC hardware & tech evangelist. Been building PCs for over a decade & following the industry for just as long. Also a doctor specializing in Preventive Medicine.
