AMD Unveils Radeon Pro Vega 64, Vega 56 & Vega Die Shot – 25-22 TFLOPS, 400GB/s, 16-8 GB HBM2 & 4096-3584 SPs

Khalid Moammer • Jun 5, 2017 at 07:49pm EDT

AMD today officially introduced the Radeon Pro Vega, a powerful next-generation professional graphics card that will power Apple's new iMac Pro system.

AMD Vega 10 Die Shot Detailed

Before we get back to AMD's Radeon Pro Vega, let's first discuss what sits at the heart of every Radeon Pro Vega graphics card: the Vega 10 GPU. This is the very same GPU that will power AMD's upcoming Radeon Vega Frontier Edition and Radeon RX Vega graphics cards.

The Vega 10 GPU is significantly larger than the Polaris 10/20 chips that the RX 480 and RX 580 are based on. It features 256 texture mapping units and 64 next generation Vega compute units arranged in two islets, each housing two compute engines. Every compute engine includes two distinct compute clusters. Each of those clusters features 512 stream processors and 32 texture mapping units. The chip in its entirety has a total of 4096 stream processors and 256 texture mapping units.
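As a quick sanity check on those figures, the hierarchy can be multiplied out. This is a minimal sketch that uses only the counts quoted above; the constant and function names are illustrative, not AMD terminology.

```python
# Counts taken from the paragraph above; names are illustrative only.
ISLETS = 2
ENGINES_PER_ISLET = 2
CLUSTERS_PER_ENGINE = 2
SPS_PER_CLUSTER = 512
TMUS_PER_CLUSTER = 32
COMPUTE_UNITS = 64


def vega10_totals():
    """Multiply out the islet -> engine -> cluster hierarchy."""
    clusters = ISLETS * ENGINES_PER_ISLET * CLUSTERS_PER_ENGINE
    stream_processors = clusters * SPS_PER_CLUSTER
    return {
        "clusters": clusters,
        "stream_processors": stream_processors,
        "texture_units": clusters * TMUS_PER_CLUSTER,
        "sps_per_cu": stream_processors // COMPUTE_UNITS,
        "cus_per_cluster": COMPUTE_UNITS // clusters,
    }


totals = vega10_totals()
# Stream processors left when only 56 of the 64 compute units are enabled.
cut_down_sps = 56 * totals["sps_per_cu"]
print(totals, cut_down_sps)
```

The 64 stream processors per compute unit that fall out of this also line up with the cut-down part: 56 compute units give 3584 stream processors.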

On the back-end side of things, there are 64 render output units that make up 16 distinct render back-ends, which connect to the 2048-bit wide-IO HBM2 memory interface. The whole Vega 10 die sits on an interposer alongside two HBM2 stacks in a 2.5D arrangement. Each stack can be configured with up to 8 GB of memory, for a total of up to 16 GB across both stacks.

For a more detailed look at the Vega 10 GPU specs, make sure to check out our in-depth Vega 10 spec breakdown here.

AMD Vega 10 GPU Specifications

GPU	Polaris 10 XT	Vega 10 XT
Process Node	14nm	14nm
Shader Engines	4	4
Stream Processors	2304	4096
Performance	5.8 TFLOPS (FP32), 5.8 TFLOPS (FP16)	12.5 TFLOPS (FP32), 25 TFLOPS (FP16)
Render Output Units	32	64
Texture Mapping Units	144	256
Hardware Threads	4	8
Memory Interface	256-bit	2048-bit
Memory	8GB GDDR5	Up To 16GB HBM2

AMD Radeon Pro Vega 64 & Radeon Pro Vega 56

The Radeon Pro Vega graphics card features AMD's latest and greatest Vega architecture and Vega 10 GPU. The card comes in two flavors, a full-fat version and a skimmed version. The latter is what will come standard with all iMac Pros, while the Vega 64 will be an option users can upgrade to. The Radeon Pro Vega 64 will feature the cream of the crop "Vega 10 XT" GPU configuration with all of its 64 compute units, hence the name, and 16GB of HBM2.

The Radeon Pro Vega 56 is based on a cut-back "Vega 10 Pro" GPU with only 56 compute units instead of the full 64. This leaves this variant with 3584 GCN stream processors, 512 fewer than both its bigger brother, the Radeon Pro Vega 64, and the Radeon Vega Frontier Edition that AMD is launching on the 27th of this month. Even this "skimmed" version delivers a whopping 22 TFLOPS of half-precision (FP16) compute and 400GB/s of memory bandwidth, and it will come with 8 gigabytes of second-generation vertically stacked High Bandwidth Memory.
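The 400GB/s figure can be related to the 2048-bit HBM2 interface with a little arithmetic. The sketch below assumes 1 GB = 10^9 bytes and that bandwidth is simply bus width times per-pin data rate. The function names are made up for illustration, and the per-pin rates it derives are back-calculated here, not numbers AMD has quoted.

```python
def per_pin_rate_gbps(bandwidth_gb_s, bus_width_bits):
    """Per-pin data rate in Gbit/s implied by a total bandwidth in GB/s."""
    return bandwidth_gb_s * 8 / bus_width_bits


def total_bandwidth_gb_s(per_pin_gbps, bus_width_bits):
    """Total bandwidth in GB/s for a given per-pin rate and bus width."""
    return per_pin_gbps * bus_width_bits / 8


BUS_WIDTH_BITS = 2048  # memory interface width quoted for Vega 10

# 400GB/s is the Radeon Pro Vega 56 figure; 484GB/s is the Frontier Edition
# figure from the lineup table.
pro_vega_56_rate = per_pin_rate_gbps(400, BUS_WIDTH_BITS)
frontier_rate = per_pin_rate_gbps(484, BUS_WIDTH_BITS)

print(f"Radeon Pro Vega 56: {pro_vega_56_rate} Gbit/s per pin")
print(f"Frontier Edition:   {frontier_rate} Gbit/s per pin")
```

Under these assumptions the two cards differ only in how fast each memory pin is driven, since the bus width is the same.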

AMD Radeon Vega Lineup:

Graphics Card	Radeon R9 Fury X	Radeon RX 480	Radeon RX Vega Frontier Edition	Radeon RX Vega 64	Radeon RX Vega 56	Radeon Pro Vega 64	Radeon Pro Vega 56
GPU	Fiji XT	Polaris 10	Vega 10	Vega 10 XTX/XT	Vega 10 XL	Vega 10	Vega 10
Process Node	28nm	14nm FinFET	FinFET	FinFET	FinFET	FinFET	FinFET
Compute Units	64	36	64	64	56	64	56
Stream Processors	4096	2304	4096	4096	3584	4096	3584
Performance	8.6 TFLOPS (FP32), 8.6 TFLOPS (FP16)	5.8 TFLOPS (FP32), 5.8 TFLOPS (FP16)	13 TFLOPS (FP32), 26 TFLOPS (FP16)	Up to 13+ TFLOPS (FP32), 26+ TFLOPS (FP16)	TBA	~13 TFLOPS (FP32), ~25 TFLOPS (FP16)	11 TFLOPS (FP32), 22 TFLOPS (FP16)
Texture Mapping Units	256	144	256	256	TBA	256	224
Render Output Units	64	32	64	64	TBA	64	64
Memory	4GB HBM	8GB GDDR5	16GB HBM2	TBA	TBA	16GB HBM2	8GB HBM2
Memory Bus	4096-bit	256-bit	2048-bit	2048-bit	2048-bit	2048-bit	2048-bit
Bandwidth	512GB/s	256GB/s	484GB/s	TBA	TBA	TBA	400GB/s
TDP	275W	150W	300-375W	TBA	TBA	TBA	TBA
Launch	2015	2016	June 2017	July 2017	July 2017	December 2017	December 2017
Price	$649 US	$199 (4 GB) $229 (8 GB)	$999 (Reference) $1499 (Liquid)	$499 (Reference) $549 (Limited Air) $599 (Liquid) $649 (Liquid LE)	$399	TBD	TBD

The Vega Architecture

High Bandwidth Cache And Unique Memory Sub-System

With the Vega architecture, AMD is introducing several new cutting-edge technologies, chief among them a brand new memory engine. In Vega 10, the HBM2 storage acts as a superfast cache thanks to a specialized processor that AMD dubs the High Bandwidth Cache Controller. The HBCC works to seamlessly stream data in and out of the memory, allowing Vega GPUs to have an insanely large address space of up to 512TB. In practice, the usable address space is limited only by the system's overall storage space.
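To put the 512TB figure in perspective, it helps to count address bits. The sketch below assumes "TB" is meant in the binary sense (2^40 bytes) and compares the virtual address space with the 16GB of local HBM2 on the full Vega 10 configuration. The variable names are illustrative.

```python
TIB = 2 ** 40  # assumption: "TB" read as a binary terabyte
GIB = 2 ** 30

virtual_space = 512 * TIB   # address space quoted for Vega
local_hbm2 = 16 * GIB       # on-package memory of the full configuration

# Number of bits needed to address every byte in the virtual space.
address_bits = (virtual_space - 1).bit_length()

# How many times larger the virtual space is than the local HBM2.
ratio = virtual_space // local_hbm2

print(address_bits, ratio)  # 49 32768
```

In other words, under this reading the HBM2 acts as a cache for an address space tens of thousands of times larger than itself.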

Vega Next Generation Compute Engine

The next generation compute unit the company is debuting with Vega can execute half precision 16-bit floating point ops at twice the rate of FP32, which software can opportunistically take advantage of to increase throughput and reduce the thermal and power footprints of the GPU.

Next-Gen Compute Units (NCUs) provide super-charged pathways for doubling processing throughput when using 16-bit data types. In cases where a full 32 bits of precision is not necessary to obtain the desired result, they can pack twice as much data into each register and use it to execute two parallel operations. This is ideal for a wide range of computationally intensive applications including image/video processing, ray tracing, artificial intelligence, and game rendering.
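The headline TFLOPS numbers follow from the stream processor count. The sketch below assumes the usual convention that each stream processor retires one fused multiply-add (counted as two floating point operations) per clock at FP32, and that packed FP16 doubles that rate as described above. The clock speeds it prints are back-calculated from the quoted TFLOPS figures, not official specifications, and the function names are illustrative.

```python
FLOPS_PER_FMA = 2  # assumption: one fused multiply-add counts as two FLOPs


def peak_tflops(stream_processors, clock_ghz, packed_width=1):
    """Peak throughput; packed_width=2 models two FP16 values per register."""
    return stream_processors * FLOPS_PER_FMA * packed_width * clock_ghz / 1000


def implied_clock_ghz(fp32_tflops, stream_processors):
    """Clock speed that would be needed to reach a quoted FP32 figure."""
    return fp32_tflops * 1000 / (stream_processors * FLOPS_PER_FMA)


# (name, stream processors, quoted FP32 TFLOPS) from the spec tables.
cards = [
    ("Vega 10 XT", 4096, 12.5),
    ("Radeon Pro Vega 56", 3584, 11.0),
]

for name, sps, fp32 in cards:
    clock = implied_clock_ghz(fp32, sps)
    fp16 = peak_tflops(sps, clock, packed_width=2)
    print(f"{name}: ~{clock:.2f} GHz implied, {fp16:.1f} TFLOPS FP16")
```

Both parts land on roughly the same implied clock, so under these assumptions the gap between 25 and 22 TFLOPS comes almost entirely from the eight disabled compute units.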

Geometry Engine

Vega also features a new programmable geometry engine that delivers twice the performance per clock. In combination with the engine’s new primitive shader discard capability, Vega is now significantly faster at tessellation and at rendering complex geometry and detail-rich scenes.

The most challenging workloads for a GPU can present it with millions of geometry primitives per frame, all of which must be evaluated to determine their contribution to the final image. New primitive shader technology allows Radeon Pro Vega graphics to perform geometry culling at an accelerated rate, eliminating unnecessary work for the rest of the GPU. An advanced workload distribution mechanism then assigns processing tasks to the available pipelines in a way that maximizes their utilization and avoids idle time. The result is that Radeon Pro Vega graphics can render extremely complex 3D models and scenes smoothly in real time.

The geometry pipeline also includes a new Primitive Discard Accelerator that detects parts of the geometry that are obscured by other objects or sit outside the scene and discards them, saving power and improving performance. The PDA ensures that only visible parts of the scene are rendered and that no energy is wasted on rendering invisible geometry. The issue of wasting cycles on rendering the invisible has led to unnecessarily slow performance in numerous games including Crysis 2, where it would make GPUs wastefully tessellate entire oceans of invisible water hidden below the surface.
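The discard idea can be illustrated in software, although the real hardware works very differently. The following is a minimal sketch, assuming 2D triangles already projected into a viewport spanning -1 to 1 on both axes and counter-clockwise winding for front faces. It drops back-facing, zero-area, and fully off-screen triangles; occlusion by other objects is not modelled. All names here are hypothetical.

```python
def signed_area(tri):
    """Signed area of a 2D triangle; positive means counter-clockwise."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def outside_view(tri):
    """True if the triangle's bounding box misses the [-1, 1] viewport."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return max(xs) < -1 or min(xs) > 1 or max(ys) < -1 or min(ys) > 1


def discard_invisible(triangles):
    """Keep only front-facing, non-degenerate triangles that touch the view."""
    return [
        tri for tri in triangles
        if signed_area(tri) > 0 and not outside_view(tri)
    ]


visible = ((0, 0), (1, 0), (0, 1))          # front-facing, on screen
back_facing = ((0, 0), (0, 1), (1, 0))      # clockwise winding
off_screen = ((2, 2), (3, 2), (2, 3))       # entirely outside the viewport
degenerate = ((0, 0), (0.5, 0.5), (1, 1))   # zero area

kept = discard_invisible([visible, back_facing, off_screen, degenerate])
print(len(kept))  # 1
```

Only one of the four triangles survives, so every later stage in this toy pipeline would have a quarter of the work to do.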

Pixel Engine

Another key part of the Vega architecture is AMD's brand new pixel engine, which is able to break work down into batches that can then enter the cache directly rather than reside in memory. This saves power and cycles, increases effective bandwidth, and renders the scene faster.

Another clever technology that will be debuting with the Vega architecture is the shade-once technology, which works just like the Primitive Discard Accelerator but on the pixel scale. It analyses pixels early in the graphics pipe and discards any that are hidden behind other objects in the scene, again saving power and cycles and rendering the scene faster.
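Conceptually, this amounts to resolving visibility before running the expensive pixel shader. Here is a small sketch of that ordering, assuming fragments arrive as (x, y, depth, material) tuples with smaller depth meaning closer to the camera. The `shade_once` helper and the data layout are illustrative and say nothing about how Vega implements it in hardware.

```python
def shade_once(fragments, shade):
    """Pick the nearest fragment per pixel first, then shade each pixel once."""
    nearest = {}
    for x, y, depth, material in fragments:
        pixel = (x, y)
        if pixel not in nearest or depth < nearest[pixel][0]:
            nearest[pixel] = (depth, material)
    # The shader runs only for fragments that ended up visible.
    return {pixel: shade(material) for pixel, (_, material) in nearest.items()}


shader_calls = []


def expensive_shade(material):
    """Stand-in for a costly pixel shader; records each invocation."""
    shader_calls.append(material)
    return f"shaded {material}"


fragments = [
    (0, 0, 0.9, "wall"),   # hidden behind the crate at the same pixel
    (0, 0, 0.2, "crate"),
    (1, 0, 0.5, "wall"),
]

image = shade_once(fragments, expensive_shade)
print(image, len(shader_calls))
```

Three fragments were submitted, but the shader ran only twice, because the hidden wall fragment was discarded before shading.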

The new pixel engine is also linked directly to the on-chip cache rather than to the off-chip memory. This approach allows for some key optimization opportunities that developers are already familiar with on the gaming consoles.

Vega Architecture Key Features

– 4x Power Efficiency
– 2x Peak Throughput/Performance Per Clock
– High Bandwidth Cache
– 2x Bandwidth per pin
– 8x Capacity Per stack (2nd Generation High Bandwidth Memory)
– 512TB Virtual Address Space
– Next Generation Compute Engine
– Next Generation Pixel Engine
– Next Generation Compute Unit optimized for higher clock speeds
– Rapid Packed Math
– Draw Stream Binning Rasterizer
– Primitive Shaders

You can read about the Vega architecture in full detail here.