NVIDIA GeForce RTX 20 Series Review Ft. RTX 2080 Ti & RTX 2080 Founders Edition Graphics Cards – Turing Ray Traces The Gaming Industry
NVIDIA GeForce RTX 2080 Ti & GeForce RTX 2080
19th September, 2018
NVIDIA Turing GPU - Turing GDDR6 Memory Subsystem Deep Dive
Turing is a very powerful core for its generation, and there is nothing else like it. High-performance GPUs of this caliber need to be fed with a lot of bandwidth. NVIDIA's Volta GPUs use HBM2, the fastest memory standard in the industry, but while that makes sense from an HPC standpoint, it is far too expensive to feature on consumer-level products. GDDR5, meanwhile, has already been pushed to its maximum potential with GDDR5X (G5X). This is where GDDR6 enters the industry.
While GDDR6 follows an evolutionary path from GDDR5 and GDDR5X, there are still some significant changes in the underlying architecture that boost memory bandwidth while saving power. This makes it a viable VRAM option for next-generation consumer graphics cards such as NVIDIA's new GeForce lineup. Furthermore, unlike GDDR5X, which was supported and produced only by Micron, GDDR6 has the backing of all three major players: Samsung, SK Hynix, and Micron. NVIDIA is continuing its partnership with Micron and featuring Micron memory on the GeForce RTX cards, but if it ever runs into a shortage, that won't be an issue, as NVIDIA can source from the other manufacturers too.
For those who want to know the difference between GDDR5 and GDDR6, the official specifications published by JEDEC show that the two memory standards are not a whole lot different from each other, but they aren't the same thing either. GDDR6 is built upon the DNA of GDDR5X and has been updated to deliver twice the data rate and denser die capacities.
While the new memory technology is very similar to GDDR5X, there are a few differences, of which the major ones include:
- The introduction of an FBGA180 ball package with increased pitch
- A dual channel architecture
A lot of design changes went into developing GDDR6 to achieve faster transfer speeds and higher bandwidth in a package that consumes about the same power or even less. Samsung states that GDDR6 has 35% lower power input than GDDR5 DRAM.
Coming to the specifications in detail, Samsung's 16 Gb GDDR6 memory die will be built on the 10nm process node, which Samsung calls its most advanced memory node to date. It will double the density of their GDDR5 solution, which was composed of a 20nm 8 Gb die. According to Samsung, their solution will operate at up to 18 Gbps per pin against a previous standard speed of 16 Gbps, and that is a big deal here. Each die will be able to deliver a data transfer rate of 72 GB/s and hold 2 GB of VRAM. The solution will be able to do all of this with 35% lower power input, at just 1.35V compared to 1.55V for GDDR5.
This means that a solution based on a 384-bit interface and surrounded by 12 DRAM dies could feature up to 24 GB of VRAM, while a 256-bit solution could house up to 16 GB. That's twice the VRAM capacity of current-generation cards. Beyond capacity, the maximum bandwidth on a 384-bit card can reach a blistering 672 GB/s, while a 256-bit solution can reach 448 GB/s, both on the existing 14 Gbps dies that are already in full production.
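The capacity and bandwidth figures above come down to simple arithmetic. Here is a minimal Python sketch of it, assuming each GDDR6 die sits on 32 bits of the bus (which is what 12 dies on a 384-bit interface implies). The function and constant names are illustrative only and do not come from any vendor tool.

```python
# Illustrative helpers; all names here are hypothetical.
DIE_INTERFACE_BITS = 32   # assumption: one die per 32 bits of bus (384 / 12)
DIE_CAPACITY_GBIT = 16    # the 16 Gb die described above


def die_count(bus_width_bits: int) -> int:
    """Number of dies needed to populate the whole memory bus."""
    return bus_width_bits // DIE_INTERFACE_BITS


def capacity_gb(bus_width_bits: int) -> float:
    """Total VRAM in gigabytes (8 bits per byte)."""
    return die_count(bus_width_bits) * DIE_CAPACITY_GBIT / 8


def bandwidth_gbs(pin_speed_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate times bus width, divided by 8."""
    return pin_speed_gbps * bus_width_bits / 8


for bus in (384, 256):
    print(f"{bus}-bit: {die_count(bus)} dies, "
          f"{capacity_gb(bus):.0f} GB, "
          f"{bandwidth_gbs(14, bus):.0f} GB/s at 14 Gbps")

# A single die at the 18 Gbps top speed quoted by Samsung
print(f"one die at 18 Gbps: {bandwidth_gbs(18, DIE_INTERFACE_BITS):.0f} GB/s")
```

Under these assumptions the loop reproduces the 24 GB / 672 GB/s and 16 GB / 448 GB/s figures, and the last line gives the 72 GB/s per-die rate.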
GPU Memory Technology Updates
|Graphics Card Name||Memory Technology||Memory Speed||Memory Bus||Memory Bandwidth||Release|
|NVIDIA GeForce GTX 1080||GDDR5X||10.0 Gbps||256-bit||320 GB/s||2016|
|NVIDIA GeForce RTX 2080||GDDR6||14.0 Gbps||256-bit||448 GB/s||2018|
|AMD Radeon RX Vega 64||HBM2||1.9 Gbps||2048-bit||483 GB/s||2017|
|AMD Radeon R9 Fury X||HBM1||1.0 Gbps||4096-bit||512 GB/s||2015|
|NVIDIA Titan Xp||GDDR5X||11.4 Gbps||384-bit||547 GB/s||2017|
|NVIDIA GeForce RTX 2080 Ti||GDDR6||14.0 Gbps||352-bit||616 GB/s||2018|
|NVIDIA Titan V||HBM2||1.7 Gbps||3072-bit||652.8 GB/s||2017|
|NVIDIA Tesla P100||HBM2||1.4 Gbps||4096-bit||720 GB/s||2016|
|NVIDIA Tesla V100||HBM2||1.7 Gbps||4096-bit||901 GB/s||2017|
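The bandwidth column in the table can be recomputed from the speed and bus columns. The sketch below does that for a handful of rows copied from the table; the helper name is hypothetical, and small gaps between computed and listed values are most likely just rounding of the listed figures.

```python
# (card, per-pin speed in Gbps, bus width in bits, listed bandwidth in GB/s)
# Rows copied from the table above; the helper name is illustrative.
ROWS = [
    ("GeForce GTX 1080", 10.0, 256, 320.0),
    ("GeForce RTX 2080", 14.0, 256, 448.0),
    ("Titan Xp", 11.4, 384, 547.0),
    ("Titan V", 1.7, 3072, 652.8),
]


def peak_bandwidth_gbs(speed_gbps: float, bus_bits: int) -> float:
    """Peak bandwidth in GB/s = per-pin rate (Gbps) * bus width (bits) / 8."""
    return speed_gbps * bus_bits / 8


for name, speed, bus, listed in ROWS:
    computed = peak_bandwidth_gbs(speed, bus)
    print(f"{name}: computed {computed:.1f} GB/s, listed {listed} GB/s")
```

The same formula covers both memory families. GDDR parts reach their bandwidth with a fast per-pin rate on a narrow bus, while HBM parts use a slow per-pin rate on a very wide bus.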
NVIDIA Turing GPUs With Better Memory Compression – Effective Memory Bandwidth Increased Up To 50% Over Pascal GPUs, Over 1.5 TB/s
One of the key improvements of Pascal over Maxwell was its faster memory compression, which delivered very high effective bandwidth through various compression and caching techniques.
With Turing, we are looking at the third generation of the memory compression architecture, which is said to deliver up to a 50% boost in effective bandwidth compared to Pascal GPUs. We know that these algorithms boosted the Pascal-based GeForce GTX 1080 Ti from a raw 484.4 GB/s to an effective 1.2 TB/s, and NVIDIA says we should expect up to 50% more effective bandwidth from Turing with Memory Compression 3.0.
Since Turing GPUs already have higher raw bandwidth than Pascal GPUs (the RTX 2080 Ti offers 616 GB/s), we can expect the effective bandwidth with the new algorithms to reach past 1.5 TB/s, which should help games deliver even better performance at the higher resolutions these graphics cards are aiming at.
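The effective-bandwidth estimate can be sanity-checked with a few lines of Python. This is a rough sketch that treats the "up to 50%" claim as a best-case multiplier on Pascal's quoted effective figure; real gains depend on how compressible a game's data is, and the function names are made up for illustration.

```python
# Figures quoted in the text above; function names are illustrative only.
PASCAL_RAW_GBS = 484.4          # GTX 1080 Ti raw bandwidth
PASCAL_EFFECTIVE_GBS = 1200.0   # 1.2 TB/s effective bandwidth
TURING_UPLIFT_MAX = 0.50        # "up to 50%": a best case, not a measurement


def implied_ratio(raw_gbs: float, effective_gbs: float) -> float:
    """How many times the raw bandwidth the effective figure represents."""
    return effective_gbs / raw_gbs


def effective_with_uplift(baseline_effective_gbs: float, uplift: float) -> float:
    """Effective bandwidth after applying a fractional uplift to the baseline."""
    return baseline_effective_gbs * (1 + uplift)


def uplift_needed(baseline_effective_gbs: float, target_gbs: float) -> float:
    """Fractional uplift over the baseline required to reach the target."""
    return target_gbs / baseline_effective_gbs - 1


print(f"Pascal effective/raw ratio: "
      f"{implied_ratio(PASCAL_RAW_GBS, PASCAL_EFFECTIVE_GBS):.2f}x")
print(f"Turing best case: "
      f"{effective_with_uplift(PASCAL_EFFECTIVE_GBS, TURING_UPLIFT_MAX):.0f} GB/s")
print(f"Uplift needed for 1.5 TB/s: "
      f"{uplift_needed(PASCAL_EFFECTIVE_GBS, 1500.0):.0%}")
```

On these numbers the best case works out to 1.8 TB/s, and any uplift above 25% is enough to clear the 1.5 TB/s mark.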