AMD Navi GPUs Will Not Use MCM Design, Feature Single Monolithic Die Instead, Reveals RTG SVP – Yet To Conclude If MCM Can Be Used in Traditional Gaming Graphics Cards
It looks like AMD is going to stick with traditional monolithic dies rather than move to Multi-Chip Module (MCM) designs for its next-generation GPUs. This was revealed to PCGamesN in an interview with the SVP of AMD RTG.
AMD RTG SVP: Yet To Conclude If MCM Can Be Used For Gaming Graphics Cards, Looking Into It, AMD Navi GPUs Will Stick With Traditional Designs
The interview was conducted with David Wang, Senior Vice President of Engineering at AMD's Radeon Technologies Group. Asked whether the AMD Navi GPUs would use an MCM (Multi-Chip Module) approach, Wang replied that while AMD is looking into MCM, it hasn't yet concluded whether that is viable for traditional gaming graphics cards. Following is the quote from PCGamesN:
“We are looking at the MCM type of approach,” says Wang, “but we’ve yet to conclude that this is something that can be used for traditional gaming graphics type of application.” via PCGamesN
We know that GPUs are designed years in advance, and once a design is finalized there is little room for changes, since companies are on a tight schedule and engineering teams have to move on to the next design. We saw this with the Vega GPUs, which AMD started designing as soon as Raja Koduri took the helm of the Radeon Technologies Group back in 2015. In 2016, the company celebrated a key development milestone, a year before the release of the Radeon RX Vega GPUs.
We know that Navi GPUs are headed for launch next year, with most of the design work already completed and the development phase set to begin soon. AMD already has experience with the 7nm process, since its Vega 20 parts based on the new process are heading out early next year. But as much as we wanted to see an MCM design on the Navi GPUs, it looks like we will have to wait a bit longer.
With Navi GPUs, AMD is going to stick with the traditional monolithic design that we see on all modern GPUs. Unlike the MCM approach that AMD is taking on its HEDT Threadripper and server EPYC parts, its GPUs have yet to use the full potential of AMD's Infinity Fabric, something Raja Koduri wanted to implement on the next-gen Radeon parts. Unfortunately, Raja Koduri left AMD for Intel, where he is chief architect of the Core and Visual Computing Group and is confirmed to be working on Intel's first discrete graphics cards, aimed at the gaming market for a 2020 release.
“To some extent you’re talking about doing CrossFire on a single package,” says Wang. “The challenge is that unless we make it invisible to the ISVs [independent software vendors] you’re going to see the same sort of reluctance.
“We’re going down that path on the CPU side, and I think on the GPU we’re always looking at new ideas. But the GPU has unique constraints with this type of NUMA [non-uniform memory access] architecture, and how you combine features... The multithreaded CPU is a bit easier to scale the workload. The NUMA is part of the OS support so it’s much easier to handle this multi-die thing relative to the graphics type of workload.”
So, is it possible to make an MCM design invisible to a game developer so they can address it as a single GPU without expensive recoding?
“Anything’s possible…” says Wang.
“That’s gaming,” AMD’s Scott Herkelman tells us. “In professional and Instinct workloads multi-GPU is considerably different, we are all in on that side. Even in blockchain applications we are all in on multi-GPU. Gaming on the other hand has to be enabled by the ISVs. And ISVs see it as a tremendous burden.”
Does that mean we might end up seeing diverging GPU architectures for the professional and consumer spaces to enable MCM on one side and not the other?
“Yeah, I can definitely see that,” says Wang, “because of one reason we just talked about, one workload is a lot more scalable, and has different sensitivity on multi-GPU or multi-die communication. Versus the other workload or applications that are much less scalable on that standpoint. So yes, I can definitely see the possibility that architectures will start diverging.”
We have covered in several earlier articles what it takes for an MCM design to work, what the yields would be like, and what performance we should expect. An MCM design places several separate dies on one package and connects them through a high-speed interconnect. On EPYC chips, AMD connects four eight-core Zen dies with its Infinity Fabric interconnect. These small dies combine to make a high-core-count chip that runs efficiently and performs on par with or better than single monolithic chips. AMD's EPYC has even led Intel to start developing its own MCM solution. NVIDIA is also researching MCM solutions for future GPU architectures, which will probably end up in the server space first before we ever get to see them on consumer cards.
But on the gaming graphics card side, Wang's comments suggest that the real hurdle is software rather than hardware. It's much the same situation as CrossFire or any other multi-GPU interface that exists for gamers. Unless there's proper software support, you won't see any gains, and with MCM you are basically trying to make multiple GPU dies look like a single GPU. You already know how things went with SLI and CrossFire, which are effectively dead and have next to nil support from game developers. But the issue only exists in the gaming industry, since in the server space multiple GPUs can work together seamlessly with proper performance scaling, unlike CrossFire and SLI. So for Navi, an MCM design seems out of the equation, but hopefully AMD will be able to deliver a working solution in its Radeon cards in the future.