NVIDIA GeForce RTX 20 Series Review Ft. RTX 2080 Ti & RTX 2080 Founders Edition Graphics Cards – Turing Ray Traces The Gaming Industry

Hassan Mujtaba & Keith May • Sep 19, 2018 at 09:10am EDT

Product Info

Product: NVIDIA GeForce RTX 2080 Ti & GeForce RTX 2080
Date: 19th September, 2018
Type: Graphics Cards
Price: $1199 US / $799 US

NVIDIA Turing GPU - Turing Streaming Multiprocessor Deep Dive

Let's retrace the road to Turing. In 2016, NVIDIA announced its Pascal GPUs, which would soon be featured across its entire GeForce 10 series lineup, from top to bottom. After the launch of Maxwell, NVIDIA had gained considerable experience in efficiency, an area it has focused on since its Kepler GPUs.

Now, with an enhanced FinFET process available, NVIDIA is pushing its efficiency lead beyond what was previously possible, a lead the competition has yet to match. With Volta, NVIDIA focused on the AI and HPC markets, but most of the features Volta supported aren't needed for gaming. Take, for instance, the double precision floating point execution units. With Pascal, NVIDIA began to separate its consumer and HPC GPUs, and this time it is taking a more aggressive approach, placing the consumer GPU in a category of its own. This is where Turing comes in: a GPU designed solely for the consumer segment.

Starting with the most significant part of the Turing GPU architecture, the Turing SM, we are seeing an entirely new graphics core. The Turing SM is made up of a combination of INT32, FP32, and the new Tensor cores.

Coming to the new execution units, or cores, Turing has separate INT32 and FP32 units that can execute concurrently. This design lets Turing run floating point and integer operations in parallel, which allows for up to 36% higher throughput in standard floating point operations.
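One way to read the 36% figure is as an issue-slot budget. The sketch below assumes a hypothetical shader that issues 36 integer instructions for every 100 floating point instructions. That mix is chosen here to reproduce the quoted number and is not something measured in this review. It compares a single shared datapath with idealized, perfectly overlapping FP32 and INT32 datapaths. All function names are illustrative.

```python
def shared_pipe_slots(fp_instr: int, int_instr: int) -> int:
    """Issue slots needed when FP and INT instructions share one datapath."""
    return fp_instr + int_instr


def split_pipe_slots(fp_instr: int, int_instr: int) -> int:
    """Issue slots needed when FP32 and INT32 datapaths run concurrently.

    Idealized model: the two streams overlap perfectly, so the longer
    stream determines the total.
    """
    return max(fp_instr, int_instr)


def fp_throughput_gain(fp_instr: int, int_instr: int) -> float:
    """Relative speedup of the split design over the shared design."""
    shared = shared_pipe_slots(fp_instr, int_instr)
    split = split_pipe_slots(fp_instr, int_instr)
    return shared / split - 1.0


# Assumed instruction mix: 100 FP instructions plus 36 INT instructions.
gain = fp_throughput_gain(100, 36)
print(f"Throughput gain from concurrent execution: {gain:.0%}")
```

Real shaders have dependencies between integer and floating point work, so the actual gain depends on the instruction mix and on how well the two streams overlap.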

The Turing SM is partitioned into four processing blocks, each with 16 FP32 Cores, 16 INT32 Cores, two Tensor Cores, one warp scheduler, and one dispatch unit. This adds up to 64 FP32 Cores, 64 INT32 Cores, 8 Tensor Cores, 4 Warp Schedulers, and 4 Dispatch Units on a single Turing SM. Each block also includes a new L0 instruction cache and a 64 KB register file.
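The per-SM totals follow directly from multiplying each block's resources by four. The sketch below uses only the per-block counts quoted above; the class and function names are hypothetical and exist only for this illustration.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProcessingBlock:
    """Resources in one Turing SM processing block, as listed in the text."""
    fp32_cores: int = 16
    int32_cores: int = 16
    tensor_cores: int = 2
    warp_schedulers: int = 1
    dispatch_units: int = 1
    register_file_kb: int = 64


BLOCKS_PER_SM = 4


def sm_totals(block: ProcessingBlock) -> dict:
    """Scale every per-block resource up to a full SM."""
    return {name: count * BLOCKS_PER_SM for name, count in asdict(block).items()}


totals = sm_totals(ProcessingBlock())
for name, count in totals.items():
    print(f"{name}: {count}")
```

The same multiplication gives a combined register file of 256 KB per SM, a total the text does not state outright.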

The four processing blocks share a combined 96 KB L1 data cache/shared memory. Traditional graphics workloads partition the 96 KB L1/shared memory as 64 KB of dedicated graphics shader RAM and 32 KB for texture cache and register file spill area. Compute workloads can divide the 96 KB into 32 KB shared memory and 64 KB L1 cache, or 64 KB shared memory and 32 KB L1 cache.
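The two compute splits can be treated as a small lookup: a kernel that needs more shared memory gives up L1 cache in return. The sketch below is a hypothetical helper rather than a CUDA API. It picks the split that leaves the most L1 cache while still satisfying a shared memory request, using only the sizes quoted above.

```python
from typing import Tuple

L1_SHARED_TOTAL_KB = 96

# Compute-mode splits named in the text: (shared memory KB, L1 cache KB).
COMPUTE_SPLITS = ((32, 64), (64, 32))

# Graphics-mode split: (shader RAM KB, texture cache + register spill KB).
GRAPHICS_SPLIT = (64, 32)

# Every split must account for the full 96 KB.
assert all(sum(split) == L1_SHARED_TOTAL_KB for split in COMPUTE_SPLITS)
assert sum(GRAPHICS_SPLIT) == L1_SHARED_TOTAL_KB


def pick_compute_split(shared_needed_kb: int) -> Tuple[int, int]:
    """Return the (shared, L1) split with the most L1 that fits the request."""
    # Sorted by shared memory size, so the first fit leaves the most L1.
    for shared_kb, l1_kb in sorted(COMPUTE_SPLITS):
        if shared_needed_kb <= shared_kb:
            return shared_kb, l1_kb
    raise ValueError(
        f"{shared_needed_kb} KB exceeds the largest shared memory split"
    )


print(pick_compute_split(16))
print(pick_compute_split(48))
```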

The blocks of the SM work together to deliver higher performance and better texture caching, enabling up to 50% better performance per CUDA core compared to the previous generation.

Many of these Turing SMs combine to form the Turing GPU. Each TPC inside the Turing GPU houses 2 Turing SMs, which are linked to the raster engine. Each GPC, or Graphics Processing Cluster, contains 6 TPCs, or 12 Turing SMs. The top TU102 configuration comes with 6 GPCs connected to 6 MB of L2 cache, ROPs, TMUs, memory controllers, and the NVLink high-speed I/O hub. All of this combines to form the massive Turing GPU. Following are some peak performance figures for the top Turing graphics cards.

NVIDIA GeForce RTX 2080 Ti

14.2 TFLOPS of peak single precision (FP32) performance
28.5 TFLOPS of peak half-precision (FP16) performance
14.2 TIPS concurrent with FP, through independent integer execution units
113.8 Tensor TFLOPS
10 Giga Rays/sec
78 Tera RTX-OPS

NVIDIA Quadro RTX 8000

16.3 TFLOPS of peak single precision (FP32) performance
32.6 TFLOPS of peak half-precision (FP16) performance
16.3 TIPS concurrent with FP, through independent integer execution units
130.5 Tensor TFLOPS
10 Giga Rays/sec
84 Tera RTX-OPS
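The peak FP32, FP16, and Tensor figures above can be approximated from the SM hierarchy and a boost clock. The sketch below rests on several assumptions that are not stated in this article: 68 enabled SMs at 1635 MHz for the RTX 2080 Ti Founders Edition, 72 SMs at 1770 MHz for the Quadro RTX 8000, one fused multiply-add (2 operations) per FP32 core per clock, FP16 at twice the FP32 rate, and 64 FMAs (128 operations) per Tensor Core per clock. Treat it as a back-of-the-envelope check, not NVIDIA's own methodology.

```python
# Hierarchy from the text: 6 GPCs x 6 TPCs x 2 SMs in a full TU102.
GPCS, TPCS_PER_GPC, SMS_PER_TPC = 6, 6, 2
FULL_TU102_SMS = GPCS * TPCS_PER_GPC * SMS_PER_TPC  # 72

FP32_CORES_PER_SM = 64
TENSOR_CORES_PER_SM = 8

# Assumed per-clock operation counts (not taken from the article).
FP32_OPS_PER_CORE = 2        # one fused multiply-add
TENSOR_OPS_PER_CORE = 128    # 64 FP16 FMAs

# Assumed (enabled SMs, boost clock in GHz) per card.
CARDS = {
    "GeForce RTX 2080 Ti FE": (68, 1.635),
    "Quadro RTX 8000": (FULL_TU102_SMS, 1.770),
}


def peak_rates(sms: int, clock_ghz: float) -> dict:
    """Return peak FP32, FP16, and Tensor throughput in TFLOPS."""
    fp32 = sms * FP32_CORES_PER_SM * FP32_OPS_PER_CORE * clock_ghz / 1000.0
    tensor = sms * TENSOR_CORES_PER_SM * TENSOR_OPS_PER_CORE * clock_ghz / 1000.0
    return {"fp32": fp32, "fp16": 2 * fp32, "tensor": tensor}


for name, (sms, clock_ghz) in CARDS.items():
    rates = peak_rates(sms, clock_ghz)
    print(name, {key: round(value, 1) for key, value in rates.items()})
```

Under these assumptions, the results round to the FP32, FP16, and Tensor figures listed for both cards. The Giga Rays/sec and RTX-OPS numbers are not modeled here.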

In terms of shading performance, which is a direct result of the enhanced core design and the revamped GPU architecture, the Turing GPU offers an average uplift of 50% per core compared to Pascal GPUs. In VR games, shading performance is a good 2x what Pascal achieved, while many modern gaming titles show a ~50% lead over Pascal thanks to Turing's enhanced core design.

It should be pointed out that these are per core performance gains at the same clock speeds, without counting the benefits of the other technologies that Turing brings. Those would further increase performance across a wide variety of gaming applications: we have already seen the GeForce RTX 2080 run 50% faster than the GTX 1080 on average, and twice as fast with the new DLSS technology.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
