
Nvidia GeForce and AMD Radeon Graphics Card Memory Analysis


Memory has always been one of the more important components of a graphics card. Its importance has become even more prominent in recent times, particularly thanks to the advent of HBM technology. While the industry's focus has shifted to maximizing bandwidth while retaining an acceptable physical footprint, there lurks in the shadows a menace that could have far-reaching consequences: the power consumption of memory. I thought it was time we did a short editorial on the effects of bandwidth and power consumption.

The untold story of memory, bandwidth and power consumption in modern graphics cards

As most of you will know, the memory standard for graphics cards of the past generation is GDDR5, and while many future offerings will undoubtedly retain this memory type, there is a new player in town: High Bandwidth Memory. HBM, or stacked DRAM (which comes in many different names and flavors), is a recent innovation with significantly increased bandwidth and drastically lowered power consumption. The entire point of HBM was to bring high bandwidth to an affordable level in terms of power cost. While it has largely succeeded in doing that for the near future, the problem persists over the long-term horizon. Nvidia held its GPU Technology Theater at SC15 a while back, and Dr. Stephen W. Keckler, Senior Director of Architectural Research at Nvidia, mentioned that the problem of memory power consumption remains unsolved, even with HBM.

Nvidia Looming Memory Crisis SC15

The graph shows an exponentially increasing trend in the power consumption of the two memory types as bandwidth increases. We can see the structural break in the graph where the industry shifts from GDDR5 to HBM, but the end result remains the same, albeit delayed. Now, why is this a problem, one might ask? Taken on their own, these numbers are well within the TDP range that enthusiast rigs are used to. But a GPU that requires so much bandwidth would consume quite a lot of power on its own too, and the combined TDP would not be within acceptable ranges. The energy consumed by a GPU is a very important number, and if the memory alone requires such a large amount of power, then the total would be pretty huge.

The power of graphics cards is increasing at an unrelenting pace, which means that within a few more years we will require bandwidth in the region of a few thousand GB/s. Even with the energy efficiency afforded by stacked DRAM, memory power consumption at that point would rise above the 100W mark. Considering that the GPUs themselves will be drawing quite a lot of power on their own, that means total wattage above the 250W mark that high-end cards currently employ.

Some of our readers will probably be wondering why we need bandwidth in the thousands of GB/s. Well, you have to remember that graphics cards are actually processors that have huge clusters of incredibly small cores. You can call them CUDA cores, or you can call them stream processors; either way, it is these tiny processors that require the bandwidth. While the number of these cores has been increasing at a very steady rate, the bandwidth afforded to them has not been able to keep pace. To illustrate this, we calculated the bandwidth available per core for recent Nvidia and AMD graphics cards. A very clear trend is immediately visible.

Bandwidth per core analysis of Nvidia and AMD graphics cards

Figure: Bandwidth per core of different Radeons, with the number of stream processors in ascending order.

Figure: Bandwidth per core of various GeForce cards, with the number of CUDA cores sorted in ascending order.

You will notice that the bandwidth available to each core decreases as the GPU grows more powerful. This is because of the inherently slow pace of progress in memory technology. Interestingly, AMD has been able to keep the gradient of bandwidth per core relatively constant, whereas Nvidia has a slightly steeper gradient. Nvidia's cards mostly range from 107 MB/s to 120 MB/s per core, going as high as roughly 167 MB/s per core on the GTX 760. This is of course something we have known all along: AMD is more generous with allotting bandwidth to its graphics cards than Nvidia, which uses color compression technology to compensate for the lack of bandwidth. AMD cards mostly range from 110 MB/s to 140 MB/s per core, going as high as 175 MB/s per core for the R7 370. Interestingly, the only single-GPU card in our calculations that dropped below the 100 MB/s per core mark is the R9 285, at 98 MB/s per core.
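The per-core figures are simple to reproduce. The snippet below is a minimal sketch, assuming bandwidth per core is just memory bandwidth divided by the shader core count and that 1 GB/s equals 1000 MB/s; the function name and the small sample of cards are illustrative choices, with the numbers taken from the tables in this article.

```python
# Bandwidth per core = memory bandwidth / number of shader cores.
# Sample entries: (core count, memory bandwidth in GB/s), from the tables in this article.
cards = {
    "GeForce GTX 760": (1152, 192.3),
    "GeForce GTX Titan X": (3072, 336.0),
    "Radeon R7 370": (1024, 179.2),
    "Radeon R9 285": (1792, 176.0),
}


def bandwidth_per_core_mb(cores, bandwidth_gbs):
    """Return bandwidth per core in MB/s, assuming 1 GB/s = 1000 MB/s."""
    return bandwidth_gbs * 1000 / cores


for name, (cores, bandwidth) in cards.items():
    print(f"{name}: {bandwidth_per_core_mb(cores, bandwidth):.1f} MB/s per core")
```

Sorting the cards by core count before printing would give the same ordering as the charts.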

Let's explore the power dilemma a bit more. The power consumption of volatile memory is a function of the clock speed and bus width (which together determine bandwidth). According to a statement by AMD, the latest version of GDDR5 delivers on average about 10.66 GB/s of bandwidth per watt, whereas HBM is good for about 35 GB/s per watt. This is a phenomenal increase in power efficiency, roughly 3.3x, but is it enough? We computed the approximate power consumption of the various graphics cards and arrived at some very interesting conclusions.
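As a rough illustration of how such an estimate can be made, the sketch below divides a card's memory bandwidth by the efficiency figures quoted above. It assumes the efficiency is constant across the whole bandwidth range, which is a simplification; the constant and function names are hypothetical and chosen only for this example.

```python
# Efficiency figures quoted from AMD's statement in the text (GB/s delivered per watt).
GDDR5_GBS_PER_WATT = 10.66
HBM_GBS_PER_WATT = 35.0


def memory_power_watts(bandwidth_gbs, gbs_per_watt):
    """Estimate memory power draw in watts, assuming constant efficiency."""
    return bandwidth_gbs / gbs_per_watt


# Radeon R9 390X: 384 GB/s of GDDR5
print(f"R9 390X: {memory_power_watts(384, GDDR5_GBS_PER_WATT):.1f} W")
# Radeon R9 Fury X: 512 GB/s of HBM
print(f"Fury X:  {memory_power_watts(512, HBM_GBS_PER_WATT):.1f} W")
```

Rounded to whole watts, these come out at 36W and 15W, which matches the memory power column in the tables at the end of the article.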

Memory power consumption trend of AMD and Nvidia graphics cards

As expected with Nvidia, the power consumed by the memory is increasing at a rate approximately proportional to the average power of the graphics card. Keep in mind that cores across generations are not completely comparable (the average Maxwell core is 15%-32% more powerful depending on the card), so the actual number after adjusting for performance differences would be slightly higher. We see that on average an AMD card uses around 15W to 36W to power its GDDR5 memory, whereas Nvidia cards use around 18W-32W depending on the exact card. With GDDR5, the maximum wattage so far is near the 40W mark (after accounting for real-world line losses). With HBM, however, things take an altogether different turn. HBM uses only about 15W and provides more bandwidth than the highest-clocked GDDR5. In fact, if you look at the graph you will notice that in the case of the Fury lineup, the curve breaks from its expected path and actually slopes downwards, indicating the structural break in trend we talked about.
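To put the memory figures in context, the estimated memory power can be expressed as a share of each card's total TDP. The following is a minimal sketch under the same assumption of constant GB/s-per-watt efficiency (10.66 for GDDR5 and 35 for HBM, as quoted earlier); the function name is hypothetical.

```python
def memory_share_of_tdp(bandwidth_gbs, gbs_per_watt, total_tdp_watts):
    """Return estimated memory power as a percentage of the card's total TDP."""
    memory_watts = bandwidth_gbs / gbs_per_watt
    return 100 * memory_watts / total_tdp_watts


# GeForce GTX 980 Ti: 336 GB/s of GDDR5, 250W TDP
print(f"GTX 980 Ti: {memory_share_of_tdp(336, 10.66, 250):.1f}%")
# Radeon R9 Fury X: 512 GB/s of HBM, 275W TDP
print(f"Fury X:     {memory_share_of_tdp(512, 35.0, 275):.1f}%")
```

Note that the percentage is computed from the unrounded memory wattage, so it can differ slightly from dividing the rounded wattage by the TDP.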

By sticking to the HBM standard we can compute what the bandwidth will be like at the power levels stated by Nvidia. At 120W, HBM (at today's efficiency) would be able to deliver 4200 GB/s, which is an absolutely huge number. Unfortunately, efficiency does not scale linearly, and we might well be looking at 20-25 GB/s per watt at throughputs that high. At the lower end of that range, this gives us around 2400 GB/s, a much more reasonable number that we can expect to see in the near future. Currently, the power consumption of the memory ranges anywhere from 8-15% of the total TDP of a GPU, but as bandwidth increases this number will go up as well. Of course, there are many alternative technologies already in the pipeline, including standards being worked on by Intel, Rambus and Micron. This projection does not (and cannot) take into account disruptive new technologies that reset the curve once more, a possibility that is more than likely.
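The projection is a straightforward multiplication of a power budget by an efficiency figure. The sketch below shows it for the 120W budget at today's 35 GB/s per watt and at the degraded 20-25 GB/s per watt range mentioned above; the function name is hypothetical and the degraded figures are the article's own rough guesses rather than measured values.

```python
def projected_bandwidth_gbs(power_budget_watts, gbs_per_watt):
    """Return the bandwidth (GB/s) a memory power budget buys at a given efficiency."""
    return power_budget_watts * gbs_per_watt


POWER_BUDGET_WATTS = 120

# 35 GB/s per watt is today's HBM figure; 25 and 20 are the assumed degraded efficiencies.
for efficiency in (35, 25, 20):
    bandwidth = projected_bandwidth_gbs(POWER_BUDGET_WATTS, efficiency)
    print(f"{efficiency} GB/s per watt at {POWER_BUDGET_WATTS}W: {bandwidth} GB/s")
```

At 120W the assumed 20-25 GB/s per watt range therefore spans roughly 2400 to 3000 GB/s.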

Nvidia Geforce - Memory Analysis

| Model | CUDA Cores | Bandwidth | Bandwidth per Core (GB/s) | Memory Power | Total TDP | Memory % of TDP |
|---|---|---|---|---|---|---|
| GeForce GTX 760 | 1152 | 192.3 GB/s | 0.1669 | 18W | 170W | 10.6% |
| GeForce GTX 760 Ti | 1344 | 192.3 GB/s | 0.1431 | 18W | 170W | 10.6% |
| GeForce GTX 770 | 1536 | 224 GB/s | 0.1458 | 21W | 230W | 9.1% |
| GeForce GTX 780 | 2304 | 288.4 GB/s | 0.1252 | 27W | 250W | 10.8% |
| GeForce GTX 780 Ti | 2880 | 336.5 GB/s | 0.1168 | 32W | 250W | 12.6% |
| GeForce GTX Titan | 2688 | 288.4 GB/s | 0.1073 | 27W | 250W | 10.8% |
| GeForce GTX Titan Black | 2880 | 336.5 GB/s | 0.1168 | 32W | 250W | 12.6% |
| GeForce GTX Titan Z | 5760 | 336.5 GB/s | 0.0584 | 32W | 375W | 8.4% |
| GeForce GTX 750 | 512 | 80 GB/s | 0.1563 | 8W | 55W | 13.6% |
| GeForce GTX 750 Ti | 640 | 88 GB/s | 0.1375 | 8W | 60W | 13.8% |
| GeForce GTX 950 | 768 | 106 GB/s | 0.1380 | 10W | 90W | 11.0% |
| GeForce GTX 960 | 1024 | 112 GB/s | 0.1094 | 11W | 120W | 8.8% |
| GeForce GTX 970 | 1664 | 196 GB/s + 28 GB/s | 0.1178 | 21W | 145W | 14.5% |
| GeForce GTX 980 | 2048 | 224 GB/s | 0.1094 | 21W | 165W | 12.7% |
| GeForce GTX 980 Ti | 2816 | 336 GB/s | 0.1193 | 32W | 250W | 12.6% |
| GeForce GTX Titan X | 3072 | 336 GB/s | 0.1094 | 32W | 250W | 12.6% |

AMD Radeon - Memory Analysis

| Model | Stream Processors | Bandwidth | Bandwidth per Core (GB/s) | Memory Power | Total TDP | Memory % of TDP |
|---|---|---|---|---|---|---|
| Radeon R9 270X | 1280 | 179.2 GB/s | 0.1400 | 17W | 180W | 9.3% |
| Radeon R9 280 | 1792 | 240 GB/s | 0.1339 | 23W | 250W | 9.0% |
| Radeon R9 280X | 2048 | 288 GB/s | 0.1406 | 27W | 250W | 10.8% |
| Radeon R9 285 | 1792 | 176 GB/s | 0.0982 | 17W | 190W | 8.7% |
| Radeon R9 290 | 2560 | 320 GB/s | 0.1250 | 30W | 250W | 12.0% |
| Radeon R9 290X | 2816 | 320 GB/s | 0.1136 | 30W | 250W | 12.0% |
| Radeon R9 295X2 | 5632 | 640 GB/s | 0.1136 | 60W | 500W | 12.0% |
| Radeon R7 360 | 768 | 104 GB/s | 0.1354 | 10W | 100W | 9.8% |
| Radeon R7 370 | 1024 | 179.2 GB/s | 0.1750 | 17W | 110W | 15.3% |
| Radeon R9 370X | 1280 | 179.2 GB/s | 0.1400 | 17W | 180W | 9.3% |
| Radeon R9 380 | 1792 | 182.4 GB/s | 0.1018 | 17W | 190W | 9.0% |
| Radeon R9 390 | 2560 | 384 GB/s | 0.1500 | 36W | 275W | 13.1% |
| Radeon R9 390X | 2816 | 384 GB/s | 0.1364 | 36W | 275W | 13.1% |
| Radeon R9 Fury | 3584 | 512 GB/s | 0.1429 | 15W | 275W | 5.3% |
| Radeon R9 Fury X | 4096 | 512 GB/s | 0.1250 | 15W | 275W | 5.3% |
| Radeon R9 Nano | 4096 | 512 GB/s | 0.1250 | 15W | 175W | 8.4% |