
AMD’s Next Generation Navi GPU Will Be Launching in August 2018 at SIGGRAPH – Monolithic vs MCM Design Yields Explored

Oct 9, 2017

A report published by TweakTown states that the site expects AMD to release its next-generation professional graphics card at SIGGRAPH 2018. It will be based on the Navi GPU and fabricated on the 7nm process. Considering that the Radeon Pro SSG has yet to appear on shelves since it was unveiled, this would mean AMD is speeding up its rollout to counter NVIDIA’s increasing influence on the graphics market.

AMD’s Navi GPU will be ready in the July-August 2018 time frame, with a professional graphics card launching at SIGGRAPH 2018

RTG head Raja Koduri has also taken a sabbatical until early 2018, which means that we really weren’t expecting Navi to land so soon. However, if the report is to be believed, we should expect the architecture to be ready sometime in the July-August time frame of next year. This of course depends on the foundries having the 7nm process ready for high-performance ASIC fabrication by that time. Since the next-generation architecture is still in its infancy, we don’t really know much about it except that it will be fabricated on the 7nm process.

AMD made a brilliant comeback in the x86 department and, against all odds, was able to establish itself firmly as the best bang-for-the-buck mainstream x86 king out there. The same, however, cannot be said for its GPUs, which are currently lacking in power and are also in short supply. As we have reported previously, we expect this to change soon as custom AIB variants enter the ring and SK Hynix starts manufacturing HBM2 memory (bringing costs down for all parties).

NVIDIA is readying its GeForce GTX 1070 Ti for the holiday season as we speak, and if AMD does not get custom AIB variants sorted before then, it might be too late for Vega. A Vega refresh could solve the hiccups of the current generation and achieve something the original was never able to do. It is clear that the current Vega lineup will not hold up against any form of Pascal refresh, should NVIDIA decide to make one. However, if AMD chooses to go for Navi this early in the game, it might have to forgo any form of Vega refresh and go straight for the prize. It would be a risky move but could potentially put it back in full form if successful.

Exploring the multi-chip module die philosophy for GPUs

Here’s the thing, however: AMD has proven itself to be exceptionally good at creating MCM-based products. The Threadripper parts (the 1920X and 1950X at any rate) were absolutely disruptive to the HEDT market space. They single-handedly turned what was usually a very expensive 6-core affair into an affordable 16-core combo. The power of servers and Xeons was finally in the hands of average consumers, so why can’t the same philosophy work for GPUs as well?

Well, theoretically speaking, it should work even better for GPUs, which are parallel devices, than for CPUs, which are largely serial devices. Not only that, but you are looking at massive yield gains just from shifting to an MCM-based approach instead of a monolithic die. A single huge die has abysmal yields, is expensive to produce and usually has high wastage. Multiple chips totaling the same die area would offer yield increases straight off the bat.
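
To put a rough number on why a huge die yields poorly, here is a minimal sketch using the simple Poisson yield model, Y = exp(-A × D0), where A is the die area and D0 is the defect density. Both the choice of model and the defect density of 0.2 defects per cm² are assumptions for illustration only; they are not figures from AMD or any foundry, and early 7nm numbers could look very different.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Expected fraction of defect-free dies under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm² = 1 cm²
    return math.exp(-area_cm2 * defects_per_cm2)


# Hypothetical defect density, chosen only to illustrate the trend.
ASSUMED_D0 = 0.2  # defects per cm²

for area_mm2 in (484.0, 121.0):
    y = poisson_yield(area_mm2, ASSUMED_D0)
    print(f"{area_mm2:.0f}mm² die: about {y:.0%} of dies expected to be defect-free")
```

Under this model the chance that four specific 121mm² dies are all good is the same as the chance that one 484mm² die is good. The advantage of MCM comes from testing the small dies individually and discarding only the bad ones, so a single defect costs 121mm² of silicon instead of 484mm².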

I took the liberty of doing some rough approximations using the lovely Silicon Edge tool and was not surprised to see instant yield gains. The Vega 64 has a die measuring 484mm², which equates to a square of roughly 22mm by 22mm. Splitting this monolithic die into four dies of 11mm by 11mm gives you the same net surface area (4 × 121mm² = 484mm²) and will also result in yield gains. How much? Let’s see. According to the approximation, a 200mm wafer should be able to produce 45 monolithic dies (22×22) or 202 smaller dies (11×11). Since we need 4 smaller dies to equal 1 monolithic part, we end up with 50 MCM parts of 484mm² each (202 ÷ 4, rounded down). That’s 50 parts per wafer instead of 45, a gain of about 11% right there.

The yield gains are even larger for bigger chips. The upper limit of lithographic techniques (with reasonable yields) is roughly 625mm². On a single 200mm wafer, we can get about 33 of these (25×25) or 154 smaller dies (12.5×12.5). That gives us a total of 38 MCM parts (154 ÷ 4, rounded down) for an increase of about 15%. Now, full disclosure: this is a very rough approximation and does not take into account several factors such as packaging yields, the more complicated high-level design, etc., but the basic idea holds well. At the same time, it also does not take into account the additional gains from lowered wastage: a faulty 625mm² monolithic die wastes far more silicon than a single faulty 156mm² one!
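
The arithmetic behind the two examples above can be checked with a short sketch. The die counts per wafer are the tool's approximations quoted in the text; the helper name `mcm_gain` is made up for this illustration.

```python
def mcm_gain(monolithic_per_wafer: int, small_dies_per_wafer: int,
             dies_per_package: int = 4) -> tuple[int, float]:
    """Return (MCM parts per wafer, fractional gain over monolithic dies)."""
    # Leftover small dies that cannot fill a whole package are dropped.
    packages = small_dies_per_wafer // dies_per_package
    gain = packages / monolithic_per_wafer - 1.0
    return packages, gain


# (monolithic dies, quarter-size dies) per 200mm wafer, as quoted above.
cases = {
    "484mm² (22×22 vs 11×11)": (45, 202),
    "625mm² (25×25 vs 12.5×12.5)": (33, 154),
}

for label, (mono, small) in cases.items():
    packages, gain = mcm_gain(mono, small)
    print(f"{label}: {packages} MCM parts vs {mono} monolithic, about {gain:.0%} more")
```

Note that this only counts how many candidate dies fit on the wafer; it says nothing about how many of them are defect-free, which is where the lowered-wastage argument adds to the gain.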

Long story short, AMD is perfectly capable of creating an MCM-based GPU and would even get some serious yield benefits if it chooses to run with this approach for Navi. Considering the 7nm node is still at a very early, bleeding-edge stage, yields can’t be too good even by mid-2018 for very large high-performance ASICs. Switching to smaller dies in an MCM-based approach would solve that problem and even allow AMD to surpass the roughly 600mm² total surface area limitation of monolithic dies. NVIDIA is also actively pursuing this path for the same reasons.
