NVIDIA GeForce GTX 980 Graphics Card and GM204 GPU Detailed – 64 ROPs, HDMI 2.0, 2048 CUDA Cores and 5.2 Billion Transistors Operating At 165W
The final specifications and architectural details of the GM204 GPU inside NVIDIA’s next-generation flagship Maxwell graphics card, the GeForce GTX 980, have been revealed by Videocardz. NVIDIA’s GeForce GTX 980 and GeForce GTX 970 launch in just three days, on Friday, but we have already seen preliminary performance results and several non-reference graphics cards from AIB vendors, along with tons of new information regarding these chips.
NVIDIA GeForce GTX 980 Graphics Card and GM204 GPU Detailed
Let’s start with the GM204 GPU, which uses the second-generation Maxwell core architecture. It offers faster per-core performance than the first-generation Maxwell chips (GM107) before it and adds several new features that deliver better performance and great power efficiency, making the GeForce GTX 980 one of the most efficient flagship offerings in history. All of this is achieved on the 28nm process node, so one can imagine the numbers we can expect when NVIDIA moves to a smaller process in the future. Some people still worry that staying on 28nm instead of moving to 20nm means the GTX 980 is not a worthy successor to the Kepler architecture, but that’s not true: the card has achieved some great feats, and we will be able to see that in the performance reviews which arrive this weekend.
Alright, so the GM204 has two variants: the GM204-400, which is used on the GeForce GTX 980, and the GM204-200, which is used on the GeForce GTX 970. We have known a lot about the GeForce GTX 970 for a few weeks now, but the real flagship part to discuss is the GeForce GTX 980’s GM204-400 chip. The GM204 chip features 4 GPCs (Graphics Processing Clusters) with four SMM blocks each. Each SMM block includes four logic units of 32 cores, so a single SMM has 128 cores and the 16 blocks available on the GM204-400 chip add up to 2048 CUDA cores. The GM204-200 has three fewer SMM units, which results in a lower core count of 1664 and makes it around as fast as the GeForce GTX 780, while the GTX 980 will tackle the outgoing Kepler flagship with a good 15-20% performance lead.
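The core counts above are simple multiplication, which a few lines of Python can confirm. This is a minimal sketch that uses only the figures quoted in this article; the function name `cuda_cores` is ours for illustration and not an NVIDIA term.

```python
# Per-SMM layout as described above: 4 logic units of 32 cores each.
UNITS_PER_SMM = 4
CORES_PER_UNIT = 32


def cuda_cores(smm_count: int) -> int:
    """Total CUDA cores for a chip with the given number of enabled SMMs."""
    return smm_count * UNITS_PER_SMM * CORES_PER_UNIT


full_smm_count = 4 * 4                      # 4 GPCs x 4 SMMs each
gm204_400 = cuda_cores(full_smm_count)      # GTX 980: all 16 SMMs enabled
gm204_200 = cuda_cores(full_smm_count - 3)  # GTX 970: three SMMs fewer

print(gm204_400, gm204_200)
```

With 16 and 13 SMMs respectively, this gives the 2048 and 1664 cores quoted for the two cards.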
The most critical detail of the chip is the transistor count. We all remember that the GK110 chip was a performance and computing beast at 7.08 billion transistors, while the GK104 included 3.54 billion transistors. The GM204 includes 5.2 billion transistors crammed inside a die that measures around 398 mm2, just 2 mm2 shy of 400 mm2. The GK104 and GK110 measure around 294 mm2 and 581 mm2 respectively. The die is considerably larger than that of GK104, the chip’s generational predecessor. The GK110 will be replaced by GM200, but that is far from launch at the moment. Even so, NVIDIA has managed to pack more into the 28nm process while keeping power consumption at just 165W on the GTX 980 and 148W on the GTX 970, which is simply mind boggling.
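To put the transistor counts and die sizes side by side, one can work out a rough transistor density. The sketch below is our own derived comparison, not a figure from NVIDIA or Videocardz, and it relies on the approximate die areas quoted above; the helper name `density_m_per_mm2` is hypothetical.

```python
# (transistors in billions, die area in mm^2), as quoted in this article.
chips = {
    "GK104": (3.54, 294),
    "GM204": (5.2, 398),
    "GK110": (7.08, 581),
}


def density_m_per_mm2(transistors_billion: float, area_mm2: float) -> float:
    """Millions of transistors per square millimetre."""
    return transistors_billion * 1000 / area_mm2


for name, (transistors, area) in chips.items():
    print(f"{name}: {density_m_per_mm2(transistors, area):.1f} M transistors/mm^2")
```

Under these numbers the GM204 comes out at roughly 13 million transistors per mm2, against roughly 12 million for both Kepler chips, all on the same 28nm node.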
Image Credits: Videocardz!
Then we have another update when it comes to ROPs and TMUs. The GM204 has 128 texture mapping units, the same as the GTX 680, but the raster operation units are increased to 64, twice the number featured on GK104. That is a step up from GK110 as well, although the GK110 does come with a very high TMU count of 240. NVIDIA compensates for this by clocking the GM204 chip higher, which helps its texture fill rate. Maxwell was also meant to improve the way the GPU handles bandwidth, and NVIDIA is limiting the bandwidth dependency of its cards by increasing the L2 cache to 2 MB, which is 512 KB more than GK110. The GK104 had just 512 KB of L2 cache, so that’s a major update there.
The theoretical single-precision compute of the chip is rated at around 4.6 TFLOPs, which is really close to the GK110, which pumps out 5.1 TFLOPs. The 144 GT/s texture fill rate is a bit low by comparison, but the pixel fill rate is considerably higher at 72.1 GP/s compared to 53.3 GP/s on the GTX 780 Ti.
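These theoretical figures can be reproduced from the unit counts and the base clock. The sketch below assumes the usual peak-rate conventions: two floating-point operations per CUDA core per clock (one fused multiply-add), one texel per TMU per clock and one pixel per ROP per clock. The function names are ours for illustration.

```python
def sp_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Peak single-precision TFLOPs, assuming 2 FLOPs (one FMA) per core per clock."""
    return cuda_cores * 2 * clock_mhz * 1e6 / 1e12


def fill_rate_giga(units: int, clock_mhz: float) -> float:
    """Peak fill rate in billions per second, assuming one texel/pixel per unit per clock."""
    return units * clock_mhz * 1e6 / 1e9


BASE_CLOCK_MHZ = 1126  # GTX 980 base clock quoted in this article

print(f"Compute:      {sp_tflops(2048, BASE_CLOCK_MHZ):.2f} TFLOPs")
print(f"Texture rate: {fill_rate_giga(128, BASE_CLOCK_MHZ):.1f} GT/s")
print(f"Pixel rate:   {fill_rate_giga(64, BASE_CLOCK_MHZ):.1f} GP/s")
```

At the 1126 MHz base clock this works out to about 4.61 TFLOPs, 144.1 GT/s and 72.1 GP/s. Real-world throughput depends on boost behaviour and workload, so treat these as upper bounds.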
NVIDIA has some new software-side enhancements enabled by the hardware in Maxwell. These include Dynamic Super Resolution, which is basically a second take on downsampling: it aims to give a 1080p display image quality approaching that of 4K resolution. There’s also Delta Color Compression, which is similar to the color compression we saw on AMD’s Tonga but a more refined version, which saves images in local memory to be used later on to increase memory efficiency. Then there’s Multi-Pixel Programmable Sampling technology, which improves the randomization of each sample and reduces quantization artifacts for better geometry processing and anti-aliasing filtering. An update on the display side is that the GeForce GTX 980 adopts the HDMI 2.0 standard, which fits in well with the new display configuration of three DisplayPort 1.2, one DVI and one HDMI output set by NVIDIA for its flagship offering.
NVIDIA GeForce GTX 980 Specifications At a Glance:
So the final specifications of the GeForce GTX 980 include 2048 CUDA cores, 128 TMUs and 64 ROPs. The core is clocked at 1126 MHz base and 1216 MHz boost, while the memory runs at a 7 GHz effective clock, which results in 224 GB/s of bandwidth. That might be enough thanks to the lower bandwidth dependency and increased efficiency. The TDP of the card is set at 165W, and power is fed through dual 6-pin connectors.
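The 224 GB/s figure follows directly from the memory clock and the bus width. A minimal sketch, taking the 256-bit bus width from the PCB description later in this article; the function name is ours for illustration.

```python
def bandwidth_gb_s(bus_width_bits: int, effective_clock_ghz: float) -> float:
    """Peak memory bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_ghz


print(bandwidth_gb_s(256, 7.0))  # GTX 980: 256-bit bus at 7 GHz effective
```

A 256-bit bus moves 32 bytes per transfer, so 7 billion transfers per second gives 224 GB/s.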
The GeForce GTX 980 uses an updated revision of the cooler introduced on the GeForce GTX Titan Black, with an all-black naming logo etched on the shroud near the I/O plate and an all-black heatsink which can be spotted through the mirror cut-out in the center of the shroud. The card obviously makes use of a vapor chamber which is cooled by a blower fan. We were unable to find the dual axial fan design which NVIDIA patented a few months back and which was rumored to be a part of the new graphics card series, but I expect the cooler as it is will do a great job cooling the card, considering it can dissipate up to 275W of heat while the GeForce GTX 980 will have a maximum thermal dissipation power of just under 180W. So that’s a ton of cooling being supplied to the core, and we can expect massive overclocking headroom for a card which is already clocked past the 1050 MHz barrier.
Back to the cooler design: the NVTTM does include some minor changes along the display ports, isolating it inside the shroud entirely. One of the changes I like the most is the addition of the backplate, which is carried over from the GeForce GTX Titan Z. The card features two SLI gold fingers which will allow 4-Way SLI multi-GPU functionality. The GeForce GTX 980 is fed power through dual 6-pin connectors, and while there is space for an 8-pin connector, NVIDIA will just feature two 6-pin connectors on the reference design, leaving its AIB partners to do the rest in the form of custom designs. Display outputs include DVI, HDMI and three DisplayPort outputs, which is one reason for the unusually large size of the display connector. The bracket is also updated with a new layout, since the cut-outs for exhaust look similar to the ones featured on the GeForce GTX Titan Z.
The PCB has been modified into a more robust design. NVIDIA can be seen using eight Samsung K4G41325FC-HC28 128M x 32 memory modules, which add up to 4 GB of GDDR5 VRAM across a 256-bit bus. The voltage controller has been moved below the power connectors, and the power delivery includes 5 phases compared to 6 on the GeForce GTX 780 Ti. At the same time, we can see a large array of VRMs alongside the chokes, which should deliver plenty of overclocking headroom even on the reference designs.
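The memory configuration can be checked with a little arithmetic. This sketch assumes that "128M x 32" denotes 128 Mi addressable words of 32 bits per chip and that every chip is wired with its full 32-bit interface; the function names are ours for illustration.

```python
MEBI = 2**20


def chip_capacity_bits(words_mebi: int, width_bits: int) -> int:
    """Capacity of one memory chip in bits (words x word width)."""
    return words_mebi * MEBI * width_bits


def total_capacity_gib(chip_count: int, words_mebi: int, width_bits: int) -> float:
    """Total capacity of all chips in GiB."""
    return chip_count * chip_capacity_bits(words_mebi, width_bits) / 8 / 2**30


def bus_width_bits(chip_count: int, width_bits: int) -> int:
    """Aggregate bus width when each chip contributes its full interface."""
    return chip_count * width_bits


print(total_capacity_gib(8, 128, 32), "GiB")  # eight 128M x 32 chips
print(bus_width_bits(8, 32), "bit")
```

Each chip holds 4 Gbit (512 MB), so eight of them give the 4 GB and 256-bit bus quoted above.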
NVIDIA GeForce GTX 970 and GTX 980 Specifications:
|Graphics Card||GeForce GTX 470||GeForce GTX 480||GeForce GTX 570||GeForce GTX 580||GeForce GTX 670||GeForce GTX 680||GeForce GTX 770||GeForce GTX 780||GeForce GTX 780 Ti||GeForce GTX 970||GeForce GTX 980|
|SM Units||14 x 32||15 x 32||15 x 32||16 x 32||7 x 192||8 x 192||8 x 192||12 x 192||14 x 192||13 x 128||16 x 128|
|Core Clock||607 MHz||700 MHz||732 MHz||772 MHz||915 MHz||1006 MHz||1046 MHz||863 MHz||875 MHz||1051 MHz||1126 MHz|
|Boost Clock||1215 MHz (Shader Clock)||1401 MHz (Shader Clock)||1464 MHz (Shader Clock)||1544 MHz (Shader Clock)||980 MHz||1058 MHz||1085 MHz||900 MHz||928 MHz||1178 MHz||1216 MHz|
|Memory||1.2 GB GDDR5||1.5 GB GDDR5||1.2 GB GDDR5||1.5 GB GDDR5||2 GB GDDR5||2 GB GDDR5||2 GB GDDR5||3 GB GDDR5||3 GB GDDR5||4 GB GDDR5||4 GB GDDR5|
|Memory Clock||3.34 GHz||3.69 GHz||3.80 GHz||4.0 GHz||6.0 GHz||6.0 GHz||7.0 GHz||6.0 GHz||7.0 GHz||7.0 GHz||7.0 GHz|
|Memory Bandwidth||133.34 GB/s||177.4 GB/s||152.00 GB/s||192.4 GB/s||192.0 GB/s||192.0 GB/s||224.5 GB/s||288.6 GB/s||336.0 GB/s||224.5 GB/s||224.5 GB/s|
|Texture Fill Rate (GT/s)||34||42||43.92||49.41||102.5||128.8||134||166||210||145.0||TBC|
|Power Connectors||8+6 Pin||8+6 Pin||6+6 Pin||8+6 Pin||6+6 Pin||6+6 Pin||8+6 Pin||8+6 Pin||8+6 Pin||6+6 Pin||6+6 Pin|
|DirectX 12 Support||Yes||Yes||Yes||Yes||Yes||Yes||Yes||Yes||Yes||Yes||Yes|
|Launch||March 26th 2010||March 26th 2010||December 7th 2010||November 9th 2010||May 10th 2012||March 22nd 2012||May 30th 2013||May 23rd 2013||December 2013||September 18th 2014||September 18th 2014|
|Price||$349 US||$499 US||$349 US||$499 US||$349 US||$499 US||$349 US||$499 US||$699 US||$299 Reference