Exclusive: The Nvidia and AMD DirectX 12 Editorial – Complete DX12 Graphic Card List with Specifications, Asynchronous Shaders and Hardware Features Explained

Sep 2, 2015 at 10:08pm EDT

AMD Radeon DirectX 12 Graphic Cards List and Supported Features

Foreword: There has been a lot of debate recently about the DirectX 12 capabilities of Nvidia and AMD graphic cards, and allegations have been flying left and right. While most of these are completely unfounded, the subtle technicalities involved are often badly misconstrued, usually to better support the popular narrative. In light of this, I thought it was high time that I tackled the beast myself and made a one-click resource for everything DirectX 12 and GPU related that is relevant to today's gamers.


A thorough look at AMD and Nvidia DirectX 12 support and the AotS controversy

This editorial will not only cover all the basics and frequently asked questions about DirectX 12, it will also attempt to shed neutral light on the recent controversy and, hopefully, steer things away from the redundant debate. This is my attempt to show just how things actually stand. It will also contain a complete list of all AMD and Nvidia DirectX 12 capable base models and their respective capabilities, with these capabilities explained beforehand. Of course, it will not be possible (not to mention inadvisable) to go into every explicit detail; however, I will go into more depth than the usual posts designed for gamers do.

Here is what you will find in this article:

  1. A statement of problem, which is the DirectX 12 hype

  2. A complete overview of the technicalities that are critical to understanding "DirectX 12 Support"

  3. Addressing the ASync Question: Nvidia support and AMD advantage

  4. A complete list of AMD graphic cards that support DirectX12 and the extent of support.

  5. A complete list of Nvidia graphic cards that support DirectX12 and the extent of support.

  6. Our foray into the AotS controversy and an attempt to look at the problem in a new way.

 

Eligibility Criteria: Here are our inclusion and exclusion criteria for the DX12 GPU list:

Disclaimer: Every attempt has been made to ensure the accuracy of the data present in this piece. However, we accept the possibility of a mistake or accidental omission due to human error. If any such hiccup is spotted, please let me know and I will update the piece at the earliest.

"DirectX 12 is a revolutionary API". This statement, and variations thereof, has been echoing throughout the tech world for the past year or so. Everyone, WCCF included, has hailed the upcoming API as the harbinger of a new age in gaming technology. And to a certain extent, it is true. There is, however, a very big problem with this mindset. While any tech enthusiast, myself included, will attest that the DirectX 12 API is a very significant update, the general gaming world has been quietly taking away a subtly different meaning from all this.

 

With hype at an all-time high, even by hardware standards, it was only a matter of time before the overinflated bubble burst. There are amazing things that DirectX 12 can achieve, but magically adding performance that a hardware configuration was never theoretically capable of in the first place is not one of them. Confused? Let me elaborate.

Differentiating between Untapped Potential and Maximum Potential

The key difference between the hardware enthusiast and the general masses is knowing what the words 'untapped potential' mean. Contrary to what most gamers might believe, there are very few cases in which a graphic card is being utilized completely, at a true 100% capacity - especially at the high end of the spectrum. Usually there are bottlenecks in place that stop that from happening, and usually these exist in the software layer that acts as a bridge between the GPU and the end user software.

The point of DirectX 12 is to move towards an approach that goes by many names - 'low level', 'to the metal access' and so on. This is basically an implementation where the bloated middle software layer (the driver) interferes as little as possible, and the API hands developers the ability to control specific parts of the hardware directly. At the same time, more autonomy is given to the GPU. Traditionally, the graphic card is a slave of the processor and can only work as fast as the CPU can feed it work. With DX12, much of that load is lifted off the CPU so the GPU can work towards its true potential.

So let's look at the following scenarios:

In the first case, the untapped potential of the graphic card was enormous - and DirectX 12's low level API was able to unlock it to a very, very impressive extent. In the second case, the card was already operating at its maximum operational capacity and DX12 API did not make a lot of difference. Because of the consistent over-hyping, everyone now expects DirectX 12 to be nothing short of a miracle worker.
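To put rough numbers on the two cases, here is a toy model of my own (not a measurement). It assumes a frame takes as long as the slower of the CPU and the GPU, and that DX12 only shrinks the CPU side. The millisecond figures and the 50% overhead reduction are made up purely for illustration.

```python
def frame_rate(cpu_ms, gpu_ms):
    """Frames per second when each frame waits for the slower of CPU and GPU."""
    return 1000.0 / max(cpu_ms, gpu_ms)


def dx11_vs_dx12(cpu_ms, gpu_ms, cpu_overhead_reduction):
    """Return (fps before, fps after) if only the CPU cost per frame shrinks.

    cpu_overhead_reduction is a hypothetical fraction (0.5 = CPU cost halved).
    """
    before = frame_rate(cpu_ms, gpu_ms)
    after = frame_rate(cpu_ms * (1.0 - cpu_overhead_reduction), gpu_ms)
    return before, after


# Case 1: CPU-bound setup, lots of untapped GPU potential (made-up numbers).
cpu_bound = dx11_vs_dx12(cpu_ms=25.0, gpu_ms=10.0, cpu_overhead_reduction=0.5)

# Case 2: GPU already at its limit, nothing left to unlock (made-up numbers).
gpu_bound = dx11_vs_dx12(cpu_ms=8.0, gpu_ms=10.0, cpu_overhead_reduction=0.5)

print("CPU-bound:", cpu_bound)
print("GPU-bound:", gpu_bound)
```

In this model the CPU-bound setup doubles its frame rate while the GPU-bound one does not move at all, even though both received exactly the same API improvement.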

Differentiating between the DirectX 12 'Low Level' API and DirectX 12 'Hardware Features'

Now granted, the vast majority of cases are going to have one bottleneck or another that DX12 successfully overcomes. This means that most users, be they on AMD or Nvidia, are going to benefit from the DirectX 12 API in general. But this is where another problem starts. People see the "DirectX 12" tag on a GPU and will undoubtedly expect it to use and benefit from every single "feature" that DX12 unlocks.

As far as I can see, the average gamer fails to differentiate the API from the various hardware features that it can access on a particular card. The point of the DirectX 12 API is to provide low level access and GPU autonomy. This translates into more draw calls (among various other things), more flexibility for developers and performance gains that are basically universal, though their extent varies from one configuration to the next.

What is not universal is the availability, use and advantage of a given hardware feature. For convenience's sake, let's call these DX12 hardware features. There are many of them: some are available only from select vendors, some are limited to certain graphics generations and some are redundant for a particular IHV. Before we go any further, a short overview of feature levels is in order.

Before we get into the nitty gritty details, let's differentiate between APIs, feature levels and hardware specifications such as Resource Binding. DirectX 12 is an API, or Application Programming Interface. Simply put, it is code that forms a bridge between the GPU and any end user software. Everyone is thoroughly excited about the DirectX 12 API because its low level capabilities are a huge upgrade over its predecessor.

Low level access, as most of our readers know, means the ability of the API in question to access parts of the GPU directly. Now we come to feature levels. Feature levels are predefined standards of GPU hardware capability and, in the very strict sense, have almost nothing to do with the API. The DirectX 12 API requires graphics hardware that conforms to Feature Level 11_0 at the very least. But even after a new feature level is defined, many old GPUs and graphics architectures can still qualify for it. For example, a graphics card previously rated at Feature Level 11_1 may very well meet all the requirements to fully support Feature Level 12_0.

The various DirectX 12 feature levels

A feature level will, however, usually require a similarly named API to access its features in their entirety. So basically, all GPUs conforming to FL 11_0 through FL 12_1 can run the DirectX 12 API completely and fully. The much hyped reduction of CPU overhead is something everyone will get (provided the card falls in the FL 11_0 to 12_1 band). The thing is, however, that the newer GPUs have new hardware features that only the DirectX 12 API can finally access, so new standards had to be created: namely FL 12_0 and 12_1.
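The rule above fits in a few lines. This is a minimal sketch, assuming feature levels are written as strings like "11_0"; the function names are my own and are not part of any DirectX interface.

```python
def parse_feature_level(level):
    """Turn a string such as '11_1' into a comparable tuple such as (11, 1)."""
    major, minor = level.split("_")
    return int(major), int(minor)


def can_run_dx12_api(hardware_level):
    """The DirectX 12 API needs hardware at Feature Level 11_0 or above."""
    return parse_feature_level(hardware_level) >= parse_feature_level("11_0")


def meets_feature_level(hardware_level, required_level):
    """True when the hardware's feature level is at least the required one."""
    return parse_feature_level(hardware_level) >= parse_feature_level(required_level)


for level in ("10_1", "11_0", "11_1", "12_0", "12_1"):
    print(level, "runs the DX12 API:", can_run_dx12_api(level),
          "| meets FL 12_0:", meets_feature_level(level, "12_0"))
```

The point to take away is that the two questions are separate: an 11_0 card passes the first check (it runs the API) but fails the second (it does not expose the 12_0 hardware features).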

Graphic cards supporting the following feature levels can run DirectX 12. The qualifying requirement for the particular feature level itself is also given:

Now that we know what the definitions are, here is the complete specification table of all IHVs with released hardware (including the latest Skylake iGPU and GM200):

IHV Hardware Specification Comparison

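| Vendor | Architecture | Resource Binding | Conservative Rasterization | Tiled Resources | Raster Order Views | Typed UAV Formats | Feature Level Specification |
| --- | --- | --- | --- | --- | --- | --- | --- |
| AMD | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 |
| AMD | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 |
| AMD | GCN 1.2 | 3 | 0 | 2 | No | 2 | 12_0 |
| Nvidia | Fermi | 1 | 0 | 1 | No | 0 | 11_0 + Partial 11_1 support |
| Nvidia | Kepler | 2 | 0 | 2 | No | 0 | 11_0 + Partial 11_1 support |
| Nvidia | Maxwell 1.0 | 2 | 0 | 2 | No | 1 | 12_0 |
| Nvidia | Maxwell 2.0 | 2 | 2 | 3 | Yes | 1 | 12_1 |
| Intel | Haswell | 1 | 0 | 1 | Yes | 0 | 11_1 |
| Intel | Broadwell | 1 | 0 | 1 | Yes | 0 | 11_1 |
| Intel | Skylake | 2 | 0 | 2 | Yes | 1 | 12_0 |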

So here is the thing. Maxwell 2.0 (GM200) has the hardware characteristics necessary to get the 12_1 stamp, so it does. However, AMD's GCN has had Resource Binding Tier 3 for a very long time now, not to mention Typed UAV Formats Tier 2 and Asynchronous Shaders for parallel work. Similarly, Intel has supported Raster Order Views since its Haswell iGPUs while sitting at Feature Level 11_1. To put this into perspective, Nvidia's architectures support ROV only from Maxwell 2.0 onwards. You can clearly see that no hardware vendor has the undisputed best GPU hardware specification around.

Every IHV has a weakness or a missing specification in some form or other. So who exactly has the best relative specification, all things considered? This is where it gets really tricky and mostly unanswerable.
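One way to see why there is no clear winner is to check whether any architecture matches or beats every other one on every row of the comparison table. The sketch below does that with the tiers from the table, assuming a higher tier is always better and encoding Raster Order Views as 0 (No) or 1 (Yes). The helper names are my own.

```python
# (Resource Binding, Conservative Rasterization, Tiled Resources,
#  Raster Order Views as 0/1, Typed UAV Formats), copied from the table.
SPECS = {
    "GCN 1.0":     (3, 0, 1, 0, 1),
    "GCN 1.1":     (3, 0, 2, 0, 2),
    "GCN 1.2":     (3, 0, 2, 0, 2),
    "Fermi":       (1, 0, 1, 0, 0),
    "Kepler":      (2, 0, 2, 0, 0),
    "Maxwell 1.0": (2, 0, 2, 0, 1),
    "Maxwell 2.0": (2, 2, 3, 1, 1),
    "Haswell":     (1, 0, 1, 1, 0),
    "Broadwell":   (1, 0, 1, 1, 0),
    "Skylake":     (2, 0, 2, 1, 1),
}


def beats(a, b):
    """True if spec a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def unbeaten(specs):
    """Architectures that no other architecture beats across the board."""
    return sorted(name for name, spec in specs.items()
                  if not any(beats(other, spec) for other in specs.values()))


def overall_winners(specs):
    """Architectures that match or exceed every other one on every feature."""
    return [name for name, spec in specs.items()
            if all(all(x >= y for x, y in zip(spec, other)) for other in specs.values())]


print("Unbeaten:", unbeaten(SPECS))
print("Overall winners:", overall_winners(SPECS))
```

With the table's values, GCN 1.1, GCN 1.2 and Maxwell 2.0 are each left unbeaten, and the list of overall winners comes back empty.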

Hardware vendors DirectX 12 hardware specifications compared

The question we can answer, however, is this: which specification (or lack thereof) will actually translate to a better (or worse) gaming experience at the end of the day? Here, the answer is relatively simple to explain.

Let's start with AMD's edge. Since I will be tackling Async shaders on the next page, let's start with Resource Binding. Resource Binding is basically the process of linking resources (such as textures, vertex buffers and index buffers) to the graphics pipeline so that the shaders can process them. With Tier 3 Resource Binding, AMD's architecture is mostly limited only by memory, and while this is a desirable trait, it is something that happens out of sight, without translating to anything a gamer can observe on-screen. Similarly, Typed UAV Formats aren't something an end user can observe. Currently there isn't a fully developed ecosystem for these, and only when VR becomes mainstream will they affect anything but a very small minority.

Asynchronous compute shaders, however, are a performance enhancing feature, so their benefit comes not from new visual effects but from improved performance.

Intel has supported Raster Order Views since the Haswell days (fulfilling one half of the requirement for 12_1) and now with Skylake it also boasts full DirectX 12 API support with the Feature Level 12_0.

Finally, we come to Nvidia. Nvidia has something that no other IHV currently has: the combination of Conservative Rasterization and Raster Order Views. While the qualifying requirement is only Tier 1, GM200 has Tier 2 Conservative Rasterization support.

Here is the thing, however: Conservative Rasterization is a technique that marks a pixel as covered whenever the primitive in question touches any part of it, rather than only when the primitive covers the pixel's sample point, which makes coverage much more accurate than conventional rasterization. In other words, it will make a difference to the end user in the form of special graphical effects. Conservative Raster itself will pave the way for many interesting graphical techniques - Hybrid Ray Traced Shadows for one.
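To make the difference concrete, here is a small software sketch of the two coverage rules. It is my own simplified illustration using edge functions on a pixel grid, not a description of how GM200 (or any GPU) implements the feature in hardware, and the triangle coordinates are made up.

```python
def edge(a, b, p):
    """Edge function: >= 0 when p is on the inner side of edge a->b (counter-clockwise triangle)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(tri, width, height, conservative):
    """Return the set of (x, y) pixels of a width x height grid that count as covered."""
    a, b, c = tri
    if edge(a, b, c) < 0:  # force counter-clockwise winding
        b, c = c, b
    edges = [(a, b), (b, c), (c, a)]
    min_x, max_x = min(v[0] for v in tri), max(v[0] for v in tri)
    min_y, max_y = min(v[1] for v in tri), max(v[1] for v in tri)

    covered = set()
    for y in range(height):
        for x in range(width):
            if conservative:
                # Pixels wholly outside the triangle's bounding box can never be touched.
                if x + 1 < min_x or x > max_x or y + 1 < min_y or y > max_y:
                    continue
                # Test each edge at its most favourable pixel corner (a slight overestimate).
                corners = [(x, y), (x + 1, y), (x, y + 1), (x + 1, y + 1)]
                hit = all(max(edge(p, q, s) for s in corners) >= 0 for p, q in edges)
            else:
                # Conventional rule: only the pixel centre is sampled.
                centre = (x + 0.5, y + 0.5)
                hit = all(edge(p, q, centre) >= 0 for p, q in edges)
            if hit:
                covered.add((x, y))
    return covered


# A thin sliver that slips between pixel centres.
sliver = ((0.1, 0.1), (3.9, 0.3), (0.2, 0.4))
standard_px = rasterize(sliver, 4, 2, conservative=False)
conservative_px = rasterize(sliver, 4, 2, conservative=True)
print(sorted(standard_px), sorted(conservative_px))
```

For this sliver the conventional rule reports no covered pixels at all, while the conservative rule reports the whole bottom row that the triangle passes through. That guarantee of never missing a touched pixel is what effects built on conservative raster rely on.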

All right, now that we have gotten that out of the way, let's begin with the heart of the current controversy: Asynchronous Shaders and the AotS benchmarks. To make this much simpler, let me list the benchmarks tested and their basic configurations here, along with the DirectX 11 and DirectX 12 average frames per second:

Explaining the AotS benchmarks with what we know of DirectX 12

These benchmarks were mostly taken at face value and the usual frame war erupted over the raw numbers. The problem is, when talking about an API that eliminates overhead, we need to look at the context as well. In both AMD tests the numbers rise, but in the first test that is because the processor is obviously a bottleneck, so the configuration had a lot of untapped potential. In the second test the processor was once again arguably a bottleneck, since Xeons are clocked pretty low.

In the second Nvidia test, DX12 also performs as expected when coupling a reasonably powerful GPU with a decent processor. In all three of these scenarios, the processor bottleneck was eliminated.

The actual anomalous test is the first one, which pairs an incredibly powerful CPU with an incredibly powerful GPU. Theoretically, this configuration has very little bottleneck, if any. DirectX 12 would not have been expected to yield any major performance increase because the configuration is already very near its maximum potential. But the funny thing is, switching to DX12 actually results in a lower value than before. That is something that shouldn't have happened. To understand just what is going on, we need to look at what was happening behind the scenes.

An overview of Synchronous and Asynchronous Shaders in different GPU Architectures

Now what exactly are Asynchronous Shaders? Traditionally, there is one graphics queue available for work to be scheduled. Whatever work needs to be done is scheduled in serial order in that queue. The problem with this approach is that it usually results in bottlenecking and the GPU not working at its full capacity. For understanding's sake, you can imagine the queue as a thread. And as you might know, the multi-threaded approach to computation is the future. So Asynchronous Shaders is basically where there is:

A copy queue is also available, but since that is irrelevant to our current topic, I won't be going into that.

Now contrary to popular belief, Nvidia's Maxwell 2.0 does support "Asynchronous Shaders". Do bear in mind that documentation on these things is very limited - most of it comes from engineer comments and documentation on HyperQ (Nvidia's multiple queue implementation). The following data shows the Queue Engines of various AMD and Nvidia architectures:

There are two ways the extra compute threads can be used: a "Pure Compute" mode, which is expensive because it requires a mode switch, and a "Mixed Mode", which is what Asynchronous Shaders is all about. In all AMD GPUs with ASync enabled, the card will be running 1 graphics queue and at least 8 compute queues. This means that in-game tasks that require compute can be offloaded onto the GPU (if, and only if, it has extra horsepower to spare). This naturally translates to the GPU becoming more autonomous in cases where the CPU is the bottleneck or the GPU is not being used to its full potential.

As you can see, Maxwell 2.0 does support mixed mode, with up to 31 compute queues in conjunction with its primary graphics queue. No other Nvidia architecture has this capability without involving a complete mode switch. Now this is where things get a bit muddy. Since there is no clear documentation, and since Nvidia has yet to release an official statement on the matter, it is alleged (read: not confirmed in any way) that Nvidia's mixed mode requires the use of a software scheduler, which is why it is actually expensive to deploy even on Maxwell 2.0.
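The "if and only if it has extra horsepower to spare" condition is easy to show with a toy model. This is my own back-of-the-envelope sketch, not a simulation of any real scheduler: each graphics task is a (duration in ms, fraction of the GPU it keeps busy) pair, compute work is measured in the milliseconds it would take on the whole GPU, and all the numbers are invented.

```python
def serial_time(graphics_tasks, compute_ms):
    """Single queue: compute work simply runs after the graphics work."""
    return sum(duration for duration, _ in graphics_tasks) + compute_ms


def mixed_mode_time(graphics_tasks, compute_ms):
    """Graphics plus compute queues: compute work first fills whatever the graphics tasks leave idle."""
    graphics_ms = sum(duration for duration, _ in graphics_tasks)
    idle_ms = sum(duration * (1.0 - busy) for duration, busy in graphics_tasks)
    overlapped = min(compute_ms, idle_ms)
    return graphics_ms + (compute_ms - overlapped)


# GPU with spare capacity: the first task only keeps half the chip busy.
spare = [(10, 0.5), (6, 1.0)]
# GPU already saturated by graphics work: nothing left to fill.
saturated = [(10, 1.0), (6, 1.0)]

for name, tasks in (("spare capacity", spare), ("saturated", saturated)):
    print(name, "serial:", serial_time(tasks, 4), "mixed mode:", mixed_mode_time(tasks, 4))
```

In this model the card with idle units hides the compute work entirely inside the graphics frame, while the saturated card finishes at exactly the same time either way.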

Different architectural approaches to achieving the same result: gaming excellence

There is something else that we have to consider too. The chips currently employing the Maxwell 2.0 architecture are the GM200, GM204 and GM206. These chips were not designed to be compute intensive. AMD's architecture, on the other hand, has always been exceptional in terms of compute. So using compute threads to supplement the graphics threads will always be better on a Radeon. That is a fact.

Picture credits: Nvidia

 

However, the question remains (and is currently unanswered) whether Nvidia cards need ASync to achieve their maximum potential at all. There is no evidence to suggest that the Maxwell architecture would benefit from ASync. There is no evidence to suggest it wouldn't benefit either. But if we are to trust each vendor to know its own architecture, then Nvidia has, these past generations, focused on creating graphics processors that specialize in single precision and gaming performance.

Double precision and compute have taken something of a back seat since the Fermi era. Dynamic Parallelism is one example of a compute technology present in post-Fermi architectures, but such technologies are usually only ever used in the HPC sector. This is also one of the reasons why gamers should still focus on the maximum potential, or the raw frames per second achieved by the graphic card, instead of focusing on the performance gain achieved by tapping into the untapped with DX12.

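| Model | GPU | Stream Processors | TMU | ROP | Architecture | Resource Binding Level | Conservative Rasterization | Tiled Resources | Raster Order Views | Typed UAV Formats | DirectX 12 Feature Level | Queue Engines (Graphics/Compute) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Radeon HD 7730 | Cape Verde LE | 384 | 24 | 8 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/8 |
| Radeon HD 7750 | Cape Verde PRO | 512 | 32 | 16 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/8 |
| Radeon HD 7770 | Cape Verde XT | 640 | 40 | 16 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/8 |
| Radeon HD 7790 | Bonaire XT | 896 | 56 | 26 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/16 |
| Radeon HD 7850 | Pitcairn PRO | 1024 | 64 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon HD 7870 | Pitcairn XT | 1280 | 80 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon HD 7870 XT | Tahiti LE | 1536 | 96 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon HD 7950 | Tahiti Pro | 1792 | 112 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon HD 7970 | Tahiti XT | 2048 | 128 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon HD 7990 | Tahiti XT x2 | 2048 x2 | 128 x2 | 32 x2 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 x2 |
| Radeon R7 240 | Oland Pro | 320 | 20 | 8 | GCN 1.1 | 3 | 0 | 1 | No | 1 | 12_0 | 1/8 |
| Radeon R7 250 | Oland XT | 384 | 24 | 8 | GCN 1.1 | 3 | 0 | 1 | No | 1 | 12_0 | 1/8 |
| Radeon R7 260 | Bonaire PRO | 768 | 48 | 16 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/16 |
| Radeon R7 260X | Bonaire XT | 896 | 56 | 16 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/16 |
| Radeon R7 265 | Curacao PRO | 1024 | 64 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon R9 270X | Curacao XT | 1280 | 80 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon R9 280 | Tahiti PRO | 1792 | 112 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon R9 280X | Tahiti XT2/XTL | 2048 | 128 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon R9 285 | Tonga PRO | 1792 | 112 | 32 | GCN 1.2 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |
| Radeon R9 290 | Hawaii PRO | 2560 | 160 | 64 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |
| Radeon R9 290X | Hawaii XT | 2816 | 176 | 64 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |
| Radeon R9 295X2 | Vesuvius | 2816 x2 | 176 x2 | 64 x2 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 x2 |
| Radeon R7 360 | Tobago | 768 | 48 | 16 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/16 |
| Radeon R7 370 | Trinidad Pro | 1024 | 64 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon R9 370X | Trinidad XT | 1280 | 80 | 32 | GCN 1.0 | 3 | 0 | 1 | No | 1 | 11_1 | 1/16 |
| Radeon R9 380 | Antigua | 1792 | 112 | 32 | GCN 1.2 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |
| Radeon R9 390 | Grenada Pro | 2560 | 160 | 64 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |
| Radeon R9 390X | Grenada XT | 2816 | 176 | 64 | GCN 1.1 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |
| Radeon R9 Fury | Fiji PRO | 3584 | 224 | 64 | GCN 1.2 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |
| Radeon R9 Fury X | Fiji XT | 4096 | 256 | 64 | GCN 1.2 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |
| Radeon R9 Nano | Fiji XT | 4096 | 256 | 64 | GCN 1.2 | 3 | 0 | 2 | No | 2 | 12_0 | 1/64 |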

AMD DX12 Graphic Cards List Notes:

Model GPUCUDA Cores TMUROPArchitectureResource Binding Conservative RasterTiled ResourcesRaster Order ViewsTyped UAV FormatDX12 API Feature Level Mixed Mode Queue Engines (Graphics/Compute)
Geforce GT 440 GF10896164Fermi101No0D3D12 Support Delayed.N/A
GeForce GTS 450GF1061923216Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 460 SEGF1042884832Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 460GF1043365624Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 465GF1003524432Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 470GF1004485640Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 480GF1004806048Fermi101No0D3D12 Support Delayed.N/A
GeForce 520GF1194884Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 545 DDR3GF1161442416Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 550 TiGF1161923224Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 560GF1143365632Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 560 TiGF1143846432Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 560 Ti 448 Cores Limited EditionGF1104485640Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 570GF1104806040Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 580GF1105126448Fermi101No0D3D12 Support Delayed.N/A
GeForce GTX 590GF110 (x2)102412896Fermi101No0D3D12 Support Delayed.N/A
Geforce GT 610 GF1194884Fermi101No0D3D12 Support Delayed.N/A
Geforce GT 620GF10896164Fermi101No0D3D12 Support Delayed.N/A
Geforce GT 640 GF1161442424Fermi101No0D3D12 Support Delayed.N/A
GeForce GT 730
(128b DDR3)
GF10896164Fermi101No0D3D12 Support Delayed.N/A
Geforce GT 630 GK107 1921616Kepler202No011_0 + Partial 11_1 support1
Geforce GT 630 Rev 2GK208384168Kepler202No011_0 + Partial 11_1 support1
Geforce GT 640 Rev 2Gk208384328Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 650GK1073843216Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 650 TiGK1067686416Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 650 Ti BoostGK1067686424Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 660GK1069608024Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 660 TiGK104134411224Kepler202No011_0 + Partial 11_1 support1
Geforce GTX 670GK104134411232Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 680GK104153612832Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 690GK104 (x2)307225664Kepler202No011_0 + Partial 11_1 support1
GeForce GT 720GK208192168Kepler202No011_0 + Partial 11_1 support1
GeForce GT 730 (GDDR5)GK208384328Kepler202No011_0 + Partial 11_1 support1
GeForce GT 740GK1073843216Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 760GK10411529632Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 760 TiGK104134411232Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 770GK104153612832Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 780GK110230419248Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 780 TiGK110288024048Kepler202No011_0 + Partial 11_1 support1
GeForce GTX TitanGK110268822448Kepler202No011_0 + Partial 11_1 support1
GeForce GTX Titan BlackGK110288024048Kepler202No011_0 + Partial 11_1 support1
GeForce GTX Titan ZGK110 (x2)576048096Kepler202No011_0 + Partial 11_1 support1
GeForce GTX 750GM1075123216Maxwell 1st Generation202No112_01
GeForce GTX 750 TiGM1076404016Maxwell 1st Generation202No112_01
GeForce GTX 950GM2067684832Maxwell 2nd Generation223Yes112_11/31
GeForce GTX 960GM20610246432Maxwell 2nd Generation223Yes112_11/31
GeForce GTX 970GM204166410456Maxwell 2nd Generation223Yes112_11/31
GeForce GTX 980GM204204812864Maxwell 2nd Generation223Yes112_11/31
GeForce GTX 980 TiGM200281617696Maxwell 2nd Generation223Yes112_11/31
GeForce GTX Titan XGM200307219296Maxwell 2nd Generation223Yes112_11/31

Nvidia Graphic Cards List Notes:

This editorial serves as my first (and hopefully only) foray into the AotS controversy that has been plaguing the interwebs recently. As I mentioned at the beginning of this editorial, rivalry between the two giants has been part and parcel of their history. With the advent of DirectX 12, it was only bound to increase tenfold.

So when something as technical as DX12 was hyped up to gigantic proportions for the laymen, I thought it was only fair that we put the 'technicality' back into the hype.

We have explained away the setups in which an Nvidia card appeared to gain less advantage with DX12, and we have offered a probable explanation for the one anomalous scenario, in which switching to DX12 actually lowered performance.

I think I would like to quote Robert Hallock from AMD here:

“I think gamers are learning an important lesson: there’s no such thing as “full support” for DX12 on the market today. There have been many attempts to distract people from this truth through campaigns that deliberately conflate feature levels, individual untiered features and the definition of “support.”

So summarizing, all hardware vendors fully and completely support the DirectX 12 API.

No hardware vendor can claim 100% support of all hardware features, and the differences are usually negligible in nature. If one is deciding by features observable by the end user and by gaming experience, the vote might fall in favour of Nvidia, with its Feature Level 12_1 support allowing advanced illumination effects in next generation games. That said, there are ways to simulate those effects on Radeons as well without much of a performance hit. If we are talking about performance increase (in terms of untapped potential, not maximum potential) then an argument can be made for AMD with its ASync advantage.

Also remember that developers usually code for the lowest common denominator, which means that both AMD's and Nvidia's edge depends entirely on how many devs use it, and the expected mean result is a win-win for owners of cards from both vendors.

All that said and done, we will be looking out for more DX12 titles (AotS is, after all, a single DX12 title, and drawing a conclusion from a single data source invites far too much bias) and seeing how they fare in terms of both the untapped potential that was unlocked and the maximum potential (which is what we should actually be looking at).

If you take away one thing from this editorial, let it be that there is very little black and white advantage in terms of DX12 compatibility for either vendor.

 

About the author: PC Hardware and Technology Enthusiast, Blood of Silicon (1 nm),
