AMD Vega 10 and Vega 20 GPU Details Leak Out – 12 TFLOPs of Single Precision Compute, Dual GPU Landing in 2H 2017

Sep 20, 2016

Our colleagues over at Videocardz.com have posted some very tasty information on the upcoming Vega 10 flagship from Radeon Technologies Group, AMD. This particular leak is very significant, because not only is this the first time we have had concrete numbers on the performance of the upcoming flagship but it also confirms the existence of a dual Vega based graphics card from AMD that is in the works. AMD is expected to introduce its first Vega 10 based flagship sometime in the first half of 2017.

New information about the Vega 10 and Vega 11 GPUs – Dual Vega 10 based graphics card inbound

Vega 10 will contain 64 Compute Units, which (assuming the same ratio of 64 Stream Processors per CU as the current iteration of GCN) translates to exactly 4096 Stream Processors. The internal codename of this GPU is GFX9. Remember all our internal nomenclature analysis? Well, it's the same thing, only in a more appealing format. Hawaii was GFX7, Polaris is GFX8 and Vega is GFX9.

The GPU is stated to be manufactured on the 14nm FinFET node, which means you are looking at primarily GlobalFoundries based chips here (with Samsung based chips as required under the amended WSA). It will ship with 16GB of HBM2 memory and roughly 512 GB/s of bandwidth (the source states 512 Gbps, but I believe that is a typo). The TDP is slated to be around 225W.
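As a rough sanity check on the bandwidth figure, peak memory bandwidth is bus width times per-pin data rate, divided by 8 bits per byte. The sketch below assumes the 2048-bit bus (two HBM2 stacks) listed in the lineup table at the end of this article and a per-pin rate of 2 Gbit/s; the per-pin rate and the function name are assumptions for illustration, not part of the leak.

```python
def peak_bandwidth_gb_per_s(bus_width_bits: int, pin_rate_gbit_per_s: float) -> float:
    """Peak bandwidth in GB/s = bus width (bits) * per-pin rate (Gbit/s) / 8."""
    return bus_width_bits * pin_rate_gbit_per_s / 8


# Assumption: two HBM2 stacks of 1024 bits each, running at 2 Gbit/s per pin.
vega10_bandwidth = peak_bandwidth_gb_per_s(bus_width_bits=2 * 1024, pin_rate_gbit_per_s=2.0)
print(vega10_bandwidth)  # 512.0 (GB/s)

# If the source's "512 Gbps" were taken literally as gigabits per second,
# the card would only move 64 GB/s, which is why it reads like a typo.
literal_gbps_reading = 512 / 8
print(literal_gbps_reading)  # 64.0 (GB/s)
```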

Here comes the interesting part, however: according to VCZ, the GPU will have roughly 24 TeraFLOPs of 16-bit compute. 16-bit compute is, of course, half-precision work, and if Vega has native 16-bit compute support (two half-precision operations packed into each single-precision lane), then we can find the single precision performance by simply cutting the number in half: 12 TeraFLOPs of compute. A solid, concrete number is more than any tech journo can ask for, and it allows us to easily reverse engineer the clock speed the GPU will run at.

With a single precision compute of 12 TeraFLOPs on a GPU with 4096 cores, and considering that FLOPs is Clock Speed * 2 Operations Per Clock (one fused multiply-add) * Cores, you are looking at a Vega 10 graphics card that is clocked at roughly 1465 MHz. Considering the Polaris 10 GPU is clocked at 1266 MHz, this is a fairly significant step up from the last iteration and is probably due to the increasing maturity of the 14nm node over at GlobalFoundries. On the other hand, just in case this information turns out to be inaccurate later on, I can tell you that even if we assume a clock rate similar to Polaris 10 (1266 MHz), you are still looking at a single precision compute of roughly 10.4 TeraFLOPs, which is still a huge performance leap over the mainstream-oriented Polaris 10.
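The arithmetic above can be sketched in a few lines. This is only an illustration of the formula used in the text; the function names are made up for the example, and the factor of 2 operations per clock assumes one fused multiply-add per core per cycle.

```python
def peak_tflops(cores: int, clock_mhz: float, ops_per_clock: int = 2) -> float:
    """Peak throughput in TFLOPs = cores * ops per clock * clock (Hz) / 1e12."""
    return cores * ops_per_clock * clock_mhz * 1e6 / 1e12


def implied_clock_mhz(tflops: float, cores: int, ops_per_clock: int = 2) -> float:
    """Invert the formula: the clock (MHz) needed to reach a given peak TFLOPs."""
    return tflops * 1e12 / (cores * ops_per_clock) / 1e6


fp16_tflops = 24.0             # leaked half-precision figure
fp32_tflops = fp16_tflops / 2  # assumes FP16 runs at exactly twice the FP32 rate

print(round(implied_clock_mhz(fp32_tflops, cores=4096)))  # 1465
print(round(peak_tflops(cores=4096, clock_mhz=1266), 2))  # 10.37
```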

The original roadmap, as revealed by AMD.

So how does this compare to the Nvidia high end? Well, trouble is, we do not know the specs for Nvidia's GP102 based GPU yet. There had been a leak some time ago, but considering we were fairly sure it was fake, I never published it. However, we do know that it will probably have fewer cores than the P100 but will be clocked higher. A Vega 10 GPU, at either clock rate (1465 MHz or 1266 MHz), easily beats the P100 on paper, which has a single precision performance of 9.3 TeraFLOPs for PCIe based cards. It goes without saying that since we are comparing across two completely different architectures here, what is on paper can be different in real life. The ball is now undoubtedly in Nvidia's court and we shall see what its high end looks like in the coming months.

AMD will be replacing the Polaris 10 chip with the Vega 11 GPU sometime next year. The latter will be based on 14nm FinFET just like the former but should have a higher spec count (since Polaris 10 is already on the 14nm node, it wouldn't make sense to release a new chip otherwise). So all in all, we are looking at a pretty much top to bottom revamp of the Radeon lineup, something that should really reinvigorate the company's GPU side, which has been lacking lately. I won't go into a lengthy rant, but the timing of these GPUs is not a coincidence. By offering brand new graphics cards for the entire customer spectrum, AMD is allowing customers to build PCs which are completely red. With Zen releasing next year as well, and the company having already introduced its AM4 motherboards and memory sticks, pretty much the only components that won't be AMD branded will be the casing and the PSU.

The leak also states that AMD will be releasing a dual-GPU based graphics card later on in the second half of 2017. Knowing AMD, this dual Vega 10 GPU graphics card will probably have the full blown cores at lowered clock rates (TDP is said to be approximately 300W). On paper, and at a rough estimate without any knowledge of clock rates, the card should be capable of 19-22 TeraFLOPs of single-precision compute – which is an absolutely ginormous amount of graphical horsepower. 4K Eyefinity setups might be coming next year folks.
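To see what clock rates the 19-22 TeraFLOPs estimate would imply, the same peak-throughput formula can be inverted for a card with two full 4096-core chips. This is a back-of-the-envelope sketch only; it assumes perfect scaling across both GPUs and 2 operations per core per clock, and the function name is hypothetical.

```python
def implied_clock_mhz(tflops: float, cores: int, ops_per_clock: int = 2) -> float:
    """Clock (MHz) needed for a given peak TFLOPs, assuming ideal scaling."""
    return tflops * 1e12 / (cores * ops_per_clock) / 1e6


dual_gpu_cores = 2 * 4096  # two full Vega 10 chips on one card

low_clock = implied_clock_mhz(19.0, dual_gpu_cores)
high_clock = implied_clock_mhz(22.0, dual_gpu_cores)
print(round(low_clock), round(high_clock))  # 1160 1343
```

Both figures sit below the roughly 1465 MHz implied for the single-GPU card, which is consistent with the idea of full chips running at lowered clocks.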

Vega 20 GPU will be landing on the 7nm node

Vega 20 will be building on AMD's philosophy of curtailing excess and should bring significant leaps in power efficiency and performance per watt. The GPU will have the same core count as the Vega 10 GPU at 64 Compute Units and will be built on the 7nm FinFET process. Since the designation remains GFX9, we can reasonably assume that this is, in fact, a simple node shrink of the Vega 10 GPU ported over to 7nm with bigger and faster memory (it will have 32 GB of HBM2 with 1 TB/s of bandwidth).

A clean room in Fab 1, GlobalFoundries.

The graphics card will support the PCI Express 4.0 standard and will have a TDP of 150 Watts. It is possible that the node shrink will allow it to achieve higher clocks resulting in the card effectively beating the Vega 10 not only in terms of performance per watt but in terms of actual graphical horsepower as well.
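To put the performance-per-watt claim into rough numbers, the sketch below divides peak single-precision throughput by TDP. The Vega 10 inputs (12 TFLOPs, 225W) come from the leak discussed earlier; the Vega 20 throughput of 15 TFLOPs is the estimate from the lineup table at the end of this article, not a leaked figure, so treat the result as illustrative only.

```python
def gflops_per_watt(tflops: float, tdp_watts: float) -> float:
    """Peak GFLOPs delivered per watt of TDP."""
    return tflops * 1000 / tdp_watts


vega10_efficiency = gflops_per_watt(tflops=12.0, tdp_watts=225)
vega20_efficiency = gflops_per_watt(tflops=15.0, tdp_watts=150)  # 15 TFLOPs is an estimate

print(round(vega10_efficiency, 1))                      # 53.3
print(round(vega20_efficiency, 1))                      # 100.0
print(round(vega20_efficiency / vega10_efficiency, 1))  # 1.9
```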

As far as the time horizon goes, according to foundry roadmaps, you are looking at 7nm landing by 2017 and being ready for high-powered ASIC production by 2018. Knowing that foundry schedules are about as solid as the weather, this isn't anything more than an estimated timeframe; it could easily take longer (or less time).

From what I can gauge from these specifications, the Vega 20 GPU will be able to offer the power of a very high-end GPU at a much more reasonable price (remember, node shrinks bring economies of scale with them!) unless AMD decides to chase margins. The 150 Watt TDP means that it will be able to fit into most power supplies and effectively put serious gaming power in the hands of the mainstream gamer.

But what about Navi 10 and 11?

Navi is the next generation architecture (next-next generation?) which will succeed Vega. However, according to this leak, it has been delayed by one year due to the introduction of the new chips in the Vega lineup. It will now be landing sometime in 2019. Navi 10 and Navi 11 will replace Vega 10 and 11 respectively and should offer a significant upgrade over their predecessors due to an increase in core count. Since the architecture is pretty far out on the horizon, there isn't much point talking about it right now. That said, however, it is clear that RTG is introducing the Vega 10 based dual-GPU to keep high-end enthusiasts happy till the time Navi arrives – considering it will probably be able to rock more power than the latter (which is a single chip card).

AMD Next Generation Vega 10, 11, 20 and Dual GPU Graphics Card Rumored Lineup:

WCCFTech | Polaris 10 | Vega 11 | Vega 10 | Vega 10 Dual GPU | Vega 20
--- | --- | --- | --- | --- | ---
Process | 14nm FinFET | 14nm FinFET | 14nm FinFET | 14nm FinFET | 7nm FinFET
Transistors (Billions) | 5.7 | TBA | TBA | TBA | TBA
Stream Processors | 2304 | 2304+ (est.) | 4096 | 8192 | 4096
Clock Speed | 1266 MHz | TBA | 1526 MHz | 1100 MHz+ (est.) | 1800 MHz+ (est.)
Performance | 5.8 TFLOPs | TBA | 12.5 TFLOPs | 19 - 24 TFLOPs (est.) | 15 TFLOPs+
Memory | 8GB GDDR5 | TBA | 8GB/16GB HBM2 | 16-32GB HBM2 | 16-32GB HBM2
Memory Bus | 256-bit | TBA | 2048-bit (2 Stacks) | 4096-bit (2048-bit x2) | 4096-bit (4 Stacks)
PCI Express | 3.0 | TBA | 3.0 | 3.0 | 4.0
Bandwidth | 256 GB/s | TBA | 512 GB/s | 1 TB/s | 1 TB/s