One of DirectX 12's many promises is to let mixed configurations of AMD and Nvidia GPUs work together through the explicit asynchronous multi-GPU feature. That promise was made a long while back, but it looks like what Microsoft promised with DirectX 12 is actually coming to fruition with Oxide's upcoming DX12-enabled real-time strategy game, Ashes of the Singularity.
There are two types of Explicit Multi-Adapter (EMA) enabled through DirectX 12. The first is dubbed Linked and the second Unlinked.
This is because, instead of tasking each GPU with rendering a whole frame on its own, each GPU can render a different part of the same frame using Split Frame Rendering (SFR for short), or even render different stages of the frame and pass the rest on to a different GPU. This in turn removes the need to mirror resources in the separate memory pools of the two GPUs. That mirroring is the limitation of the traditional alternate frame rendering (AFR) technique used today, and it prevents developers from addressing the two separate memory pools as one larger common pool of graphics memory.
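To make the difference concrete, here is a minimal sketch of how work might be handed out under each scheme. It is an illustration only, not code from Oxide or from the DirectX 12 API: the function names `afr_assignment` and `sfr_bands` are hypothetical, and the equal-height horizontal split is just one assumed way of dividing a frame.

```python
def afr_assignment(frame_index, gpu_count):
    """AFR: each whole frame goes to one GPU, in round-robin order."""
    return frame_index % gpu_count


def sfr_bands(frame_height, gpu_count):
    """SFR: every frame is cut into horizontal bands, one per GPU.

    Returns a list of (gpu, top_row, bottom_row) tuples, where bottom_row
    is exclusive. Equal-height bands are an assumption for illustration.
    """
    base, extra = divmod(frame_height, gpu_count)
    bands = []
    top = 0
    for gpu in range(gpu_count):
        # Spread any leftover rows across the first few GPUs.
        height = base + (1 if gpu < extra else 0)
        bands.append((gpu, top, top + height))
        top += height
    return bands


if __name__ == "__main__":
    # Under AFR, frames 0..3 alternate between the two GPUs.
    print([afr_assignment(i, 2) for i in range(4)])
    # Under SFR, both GPUs work on every frame, each on its own band.
    print(sfr_bands(1080, 2))
```

In the AFR case each GPU needs everything required to draw a complete frame, which is why resources end up mirrored. In the SFR case each GPU only touches its own band of the frame.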
DirectX 12 Enables Cross-SLI Between AMD And Nvidia GPUs
I know what you’re thinking: all of this sounds fantastic and wonderful, but does it work? That’s the million-dollar question, but before we get into the tests and benchmarks we should familiarize ourselves with some of the real-world technical and non-technical hurdles that stand in the way.
Let’s take a step back and examine Mantle, AMD’s own low-level API. Mantle actually supports explicit asynchronous multi-GPU control. In fact, this feature was used in Civilization: Beyond Earth, one of Mantle's last titles. The developers at Firaxis used explicit asynchronous multi-GPU to implement split frame rendering (SFR) for CrossfireX support under Mantle, while the DX11 implementation of CrossfireX used the standard alternate frame rendering (AFR) technique explained earlier in the article.
While explicit control of multi-GPU configurations is very beneficial, it also requires a fairly significant amount of effort from the developer, not only to ensure that multi-GPU configurations work as intended but also that there’s a sufficient performance advantage over regular old AFR. What it essentially does is take some of the responsibility away from the hardware vendor and the driver team and put it in the hands of the game developer, who has to ensure that the feature is leveraged and implemented well.
A good SFR implementation requires a considerable amount of skill and talent, but if done right it can yield noticeable improvements over AFR, especially in reducing input latency. Implementing SFR is a different technical challenge for each game. Some game engines and genres are more suitable for it than others. For example, input latency isn’t as important in a turn-based strategy game as it is in a fast-paced FPS.
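The latency point can be shown with some back-of-the-envelope arithmetic. The sketch below is an idealized model under stated assumptions: work scales perfectly across GPUs, and there is no synchronization, copy or driver overhead. Real results will be worse on both counts. The function names are hypothetical.

```python
def afr_estimate(single_gpu_frame_ms, gpu_count):
    """Idealized AFR: frames are delivered more often, but each frame
    still takes the full single-GPU render time to produce."""
    return {
        "frame_interval_ms": single_gpu_frame_ms / gpu_count,
        "frame_latency_ms": single_gpu_frame_ms,
    }


def sfr_estimate(single_gpu_frame_ms, gpu_count):
    """Idealized SFR: all GPUs share one frame, so the frame itself
    finishes sooner (assuming a perfect split and zero overhead)."""
    per_frame = single_gpu_frame_ms / gpu_count
    return {
        "frame_interval_ms": per_frame,
        "frame_latency_ms": per_frame,
    }


if __name__ == "__main__":
    # A frame that takes 40 ms on one GPU, spread over two GPUs.
    print("AFR:", afr_estimate(40, 2))
    print("SFR:", sfr_estimate(40, 2))
```

Under these assumptions both schemes deliver frames at the same rate, but only SFR shortens the time it takes to render any individual frame, which is where the input-latency benefit would come from.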
Splitting each frame and assigning the parts to different graphics processors in the system is a technical challenge in itself, especially if those processors have different performance characteristics (a high-end discrete GPU vs an integrated GPU, for example). On top of that, there’s the challenge of dealing with different GPU architectures that support different sets of features. Here the developers could choose to dive into the nitty-gritty details of assigning different elements of each part of the rendering to the most suitable GPU, or simply program for the lowest common denominator of the different DX12 GPU architectures made by the different hardware vendors. Interestingly, we're told that the code base to achieve this in DX12 is actually fairly clean and isn't as much of a hassle as one would first imagine.
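Both of those choices can be sketched in a few lines. This is a hypothetical illustration, not how Oxide or any shipping engine does it: the relative performance weights, the feature names and the functions `weighted_bands` and `common_features` are all assumptions made up for the example.

```python
def weighted_bands(frame_height, weights):
    """Split a frame into horizontal bands sized in proportion to each
    GPU's relative performance weight (an assumed, pre-measured number).

    Returns (gpu, top_row, bottom_row) tuples, bottom_row exclusive.
    """
    total = sum(weights)
    bands = []
    top = 0
    cumulative = 0
    for gpu, weight in enumerate(weights):
        cumulative += weight
        # Round the running boundary so the bands always cover the frame.
        bottom = round(frame_height * cumulative / total)
        bands.append((gpu, top, bottom))
        top = bottom
    return bands


def common_features(feature_sets):
    """The 'lowest common denominator': features every GPU supports."""
    if not feature_sets:
        return set()
    return set.intersection(*(set(fs) for fs in feature_sets))


if __name__ == "__main__":
    # A discrete GPU assumed to be three times faster than an integrated one.
    print(weighted_bands(1080, [3, 1]))
    # Placeholder feature names, not real DX12 capability flags.
    print(common_features([{"feature_a", "feature_b"},
                           {"feature_b", "feature_c"}]))
```

The weighted split gives the faster GPU the larger share of the frame, while the intersection is the safe option when the developer would rather not tailor work to each architecture.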
This finally brings us to how this technology holds up today. Ryan Smith from Anandtech.com did a really amazing job of testing different configurations at different settings and resolutions in his full write-up about the technology, which you should definitely go and check out.
Credit: Anandtech.com
Interestingly, the best results are achieved when an AMD GPU is used as the main adapter, while an Nvidia GPU is used as the secondary adapter. This yields the highest average framerates consistently, even compared to multi-GPU configurations that contain two similar AMD GPUs or two similar Nvidia GPUs.
Also of note is the case of the GTX 680 and HD 7970. The combination actually results in a drastic performance loss when the GTX 680 is used as the main adapter, and a considerable performance improvement when the HD 7970 is used as the main adapter.
The trend, then, seems to be that AMD GPUs are best utilized as the primary or "lead" adapters, as Ryan put it, and Nvidia GPUs as the secondary adapters. This yielded the best results in terms of framerates and frametimes. This peculiar pattern emerged consistently throughout all the tests but is not yet well understood.
This is the first ever testable demonstration of DirectX 12's EMA technology, and it's shaping up far better than one would have reasonably expected.
