NVIDIA GeForce RTX 3080 10 GB “Ampere” Graphics Card Review

Hassan Mujtaba & Keith May

Sep 16, 2020 at 08:59am EDT

Keeping alive its tradition of launching a new graphics architecture every two years, NVIDIA introduces its Ampere GPU this year. The Ampere GPU is built upon the foundation set by Turing. Billed as NVIDIA's biggest generational leap, the Ampere GPUs are said to surpass previous generations at everything.

The Ampere GPU architecture has a lot to be talked about in this review, but so does the new RTX lineup. The Ampere lineup offers faster shader performance, faster ray tracing performance, and faster AI performance. Built on a brand new process node and featuring an architecture designed from the ground up, Ampere is a killer product with lots of numbers to talk about.

The fundamental goal of Ampere was to take everything NVIDIA learned with its Turing architecture and not only refine it but also use its DNA to form a product in a completely new performance category. NVIDIA made tall claims when it introduced its Ampere lineup earlier this month, and we will be finding out whether NVIDIA ticked all the boxes with its Ampere architecture. This review will be your guide to what makes up Ampere and how it performs against its predecessors.

Today, we will be taking a look at the NVIDIA GeForce RTX 3080 Founders Edition graphics card. The card was provided by NVIDIA for the sole purpose of this review & we will be taking a look at its technology, design, and performance metrics in detail.

NVIDIA GeForce RTX 30 Series Gaming Graphics Cards - The Biggest GPU Performance Leap in Recent History

Turing wasn't just any graphics core, it was the graphics core that was to become the foundation of future GPUs. That future is realized now, with next-generation consoles talking in depth about ray tracing and AI-assisted super-sampling techniques. NVIDIA had a head start with Turing, and its Ampere generation will only do these things far better.

The Ampere GPU does many traditional things which we would expect from a GPU, but at the same time, also breaks the barrier when it comes to untraditional GPU operations. Just to sum up some features:

New Streaming Multiprocessor (SM)
New 3rd Generation Tensor Cores
New Real-Time Ray Tracing Acceleration
New Shading Enhancements
New Deep Learning Features For Graphics & Inference
New GDDR6X High-Performance Memory Subsystem
New 2nd Generation NVLINK Interconnect
New HDMI 2.1 Display Engine & Next-Gen NVENC/NVDEC

The technologies mentioned above are some of the main building blocks of the Ampere GPU, but there's more within the graphics core itself which we will talk about in detail so let's get started.

Let's take a trip down the road to Ampere. In 2016, NVIDIA announced its Pascal GPUs, which would soon be featured in its top-to-bottom GeForce 10 series lineup. By then, after the launch of Maxwell, NVIDIA had gained a lot of experience in efficiency, an area it had focused on since its Kepler GPUs. Two years ago, rather than offering another standard leap in the rasterization performance of its GPUs, NVIDIA took a different approach & introduced two key technologies in its Turing line of consumer GPUs: AI-assisted acceleration with the Tensor Cores and hardware-level acceleration for ray tracing with its brand new RT cores.

With Ampere and its brand new Samsung 8nm fabrication process, NVIDIA is adding even more to its gaming graphics lineup. Starting with the most significant part of the Ampere GPU architecture, the Ampere SM, we are seeing an entirely new graphics core. The Ampere SM features the next-gen FP32, INT32, Tensor Cores, and RT cores.

Coming to the new execution units or cores, Ampere has both INT32 and FP32 units which can execute concurrently. This architectural design allows Ampere to execute floating-point and non-floating-point operations in parallel, which allows for higher throughput in standard floating-point operations. According to NVIDIA, the updated Ampere graphics core delivers up to 1.7x faster traditional rasterization performance and up to 2x faster ray-tracing performance compared to the Turing GPUs.

The Ampere SM is partitioned into four processing blocks, each with 32 FP32 Cores, 16 INT32 Cores, one Tensor Core, one warp scheduler, and one dispatch unit. This is made possible by an updated design with two datapaths per block: one offers 16 FP32 execution units while the other offers either 16 FP32 or 16 INT32 execution units. This adds up to 128 FP32 Cores, 64 INT32 Cores, 4 Tensor Cores, 4 Warp Schedulers, and 4 Dispatch Units on a single Ampere SM. Each block also includes a new L0 instruction cache and a 64 KB register file for a total of 256 KB register file per SM.

One of the key design goals for the Ampere 30-series SM was to achieve twice the throughput for FP32 operations compared to the Turing SM. To accomplish this goal, the Ampere SM includes new datapath designs for FP32 and INT32 operations. One datapath in each partition consists of 16 FP32 CUDA Cores capable of executing 16 FP32 operations per clock. Another datapath consists of both 16 FP32 CUDA Cores and 16 INT32 Cores. As a result of this new design, each Ampere SM partition is capable of executing either 32 FP32 operations per clock, or 16 FP32 and 16 INT32 operations per clock. All four SM partitions combined can execute 128 FP32 operations per clock, which is double the FP32 rate of the Turing SM, or 64 FP32 and 64 INT32 operations per clock.
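The two issue modes described above can be tallied in a few lines. This is only a bookkeeping sketch of the per-clock numbers quoted in the text; the function and constant names are ours, not NVIDIA's.

```python
# Per-partition datapaths of an Ampere SM, as described in the text.
PARTITIONS_PER_SM = 4
FP32_ONLY_LANES = 16   # datapath 1: FP32 only
SHARED_LANES = 16      # datapath 2: FP32 or INT32 in a given clock


def sm_ops_per_clock(int32_on_shared_path):
    """Return (fp32_ops, int32_ops) issued per clock by one whole SM."""
    if int32_on_shared_path:
        fp32, int32 = FP32_ONLY_LANES, SHARED_LANES
    else:
        fp32, int32 = FP32_ONLY_LANES + SHARED_LANES, 0
    return fp32 * PARTITIONS_PER_SM, int32 * PARTITIONS_PER_SM


print(sm_ops_per_clock(False))  # (128, 0): pure FP32 work
print(sm_ops_per_clock(True))   # (64, 64): mixed FP32 + INT32 work
```

The doubled FP32 rate therefore only applies when the shared datapath is not busy with integer work, which is why real-world gains depend on the instruction mix.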

Doubling the processing speed for FP32 improves performance for a number of common graphics and compute operations and algorithms. Modern shader workloads typically have a mixture of FP32 arithmetic instructions such as FFMA, floating point additions (FADD), or floating point multiplications (FMUL), combined with simpler instructions such as integer adds for addressing and fetching data, floating point compare, or min/max for processing results, etc. Performance gains will vary at the shader and application level depending on the mix of instructions. Ray tracing denoising shaders are good examples that might benefit greatly from doubling FP32 throughput.

Doubling math throughput required doubling the data paths supporting it, which is why the Ampere SM also doubled the shared memory and L1 cache performance for the SM. (128 bytes/clock per Ampere SM versus 64 bytes/clock in Turing). Total L1 bandwidth for GeForce RTX 3080 is 219 GB/sec versus 116 GB/sec for GeForce RTX 2080 Super.

Like prior NVIDIA GPUs, Ampere is composed of Graphics Processing Clusters (GPCs), Texture Processing Clusters (TPCs), Streaming Multiprocessors (SMs), Raster Operators (ROPS), and memory controllers.

The GPC is the dominant high-level hardware block with all of the key graphics processing units residing inside the GPC. Each GPC includes a dedicated Raster Engine, and now also includes two ROP partitions (each partition containing eight ROP units), which is a new feature for NVIDIA Ampere Architecture GA10x GPUs. More details on the NVIDIA Ampere architecture can be found in NVIDIA’s Ampere Architecture White Paper, which will be published in the coming days.

The four processing blocks share a combined 128 KB L1 data cache/shared memory. Traditional graphics workloads partition the 128 KB L1/shared memory as 64 KB of dedicated graphics shader RAM and 64 KB for texture cache and register file spill area. In compute mode, the GA10x SM will support the following configurations:

128 KB L1 + 0 KB Shared Memory
120 KB L1 + 8 KB Shared Memory
112 KB L1 + 16 KB Shared Memory
96 KB L1 + 32 KB Shared Memory
64 KB L1 + 64 KB Shared Memory
28 KB L1 + 100 KB Shared Memory
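As a quick sanity check, every split listed above adds up to the same 128 KB pool. The sketch below verifies that and picks the split that keeps the most L1 for a given shared memory request. `pick_config` is a hypothetical helper written for this illustration, not a CUDA API.

```python
TOTAL_KB = 128
# (L1 KB, shared memory KB) pairs, copied from the list above
CONFIGS = [(128, 0), (120, 8), (112, 16), (96, 32), (64, 64), (28, 100)]


def pick_config(shared_kb_needed):
    """Return the (l1_kb, shared_kb) split with the most L1 that fits the request."""
    for l1_kb, shared_kb in CONFIGS:  # ordered from most L1 to least
        if shared_kb >= shared_kb_needed:
            return l1_kb, shared_kb
    raise ValueError("request exceeds the largest 100 KB shared memory split")


# Every configuration carves up the same 128 KB of on-chip memory.
assert all(l1 + shared == TOTAL_KB for l1, shared in CONFIGS)

print(pick_config(20))  # (96, 32)
```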

Ampere also ties its ROPs to the GPC and houses a total of 16 ROP units per GPC. The full GA102 GPU features 112 ROPs while the GeForce RTX 3080 comes with a total of 96 ROPs.

The block diagram of the NVIDIA Ampere SM Gaming GPUs.

The entire SM works in harmony by using different blocks to deliver high performance and better texture caching, enabling up to twice the CUDA core performance of the previous generation.

A block diagram of the GA102 GPU featured on the NVIDIA GeForce RTX 3080 graphics card.

Many of these Ampere SMs combine to form the Ampere GPU. Each TPC inside the Ampere GPU houses 2 Ampere SMs which are linked to the raster engine. There are a total of 6 TPCs or 12 Ampere SMs arranged inside each GPC or Graphics Processing Cluster. The top configured GA102 GPU comes with 7 GPCs with a total of 42 TPCs and 84 SMs that are connected to 10 MB of L1 and 6 MB of L2 cache, ROPs, TMUs, memory controllers, and NVLINK HighSpeed I/O hub. All of this combines to form the massive Ampere GA102 GPU. The following are some performance figures for the top Ampere graphics cards.

NVIDIA GeForce RTX 3090

35.58 TFLOPS of peak single-precision (FP32) performance
35.58 TFLOPS of peak half-precision (FP16) performance
17.79 TIPS concurrent with FP, through independent integer execution units
285 Tensor TFLOPS
69 RT-TFLOPs

NVIDIA GeForce RTX 3080

30 TFLOPS of peak single-precision (FP32) performance
30 TFLOPS of peak half-precision (FP16) performance
15 TIPS concurrent with FP, through independent integer execution units
238 Tensor TFLOPS
58 RT-TFLOPs
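The FP32 figures above follow from simple arithmetic. Here is a minimal sketch, assuming the usual convention that one fused multiply-add per CUDA core per clock counts as two floating-point operations, and using the core counts and boost clocks from the specification table in this review.

```python
def peak_fp32_tflops(cuda_cores, boost_mhz):
    """Peak FP32 rate, counting one FMA per core per clock as 2 FLOPs."""
    return cuda_cores * 2 * boost_mhz * 1e6 / 1e12


print(round(peak_fp32_tflops(10496, 1695), 2))  # RTX 3090: 35.58
print(round(peak_fp32_tflops(8704, 1710), 2))   # RTX 3080: 29.77
```

The RTX 3080 result of 29.77 is what gets rounded up to the 30 TFLOPS quoted above.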

In terms of shading performance, which is the direct result of the enhanced core design and GPU architecture revamp, the Ampere GPU offers up to 70% better performance per core compared to Turing GPUs.

It should be pointed out that these are just per core performance gains at the same clock speeds without adding the benefits of other technologies that Ampere comes with. That would further increase the performance in a wide variety of gaming applications.

NVIDIA Ampere "GeForce RTX 30" GPUs Full Breakdown:

Graphics Card	NVIDIA GeForce RTX 2070 SUPER	NVIDIA GeForce RTX 3070	NVIDIA GeForce RTX 2080	NVIDIA GeForce RTX 3080	NVIDIA Titan RTX	NVIDIA GeForce RTX 3090
GPU Codename	TU104	GA104	TU104	GA102	TU102	GA102
GPU Architecture	NVIDIA Turing	NVIDIA Ampere	NVIDIA Turing	NVIDIA Ampere	NVIDIA Turing	NVIDIA Ampere
GPCs	5 or 6	6	6	6	6	7
TPCs	20	23	23	34	36	41
SMs	40	46	46	68	72	82
CUDA Cores / SM	64	128	64	128	64	128
CUDA Cores / GPU	2560	5888	2944	8704	4608	10496
Tensor Cores / SM	8 (2nd Gen)	4 (3rd Gen)	8 (2nd Gen)	4 (3rd Gen)	8 (2nd Gen)	4 (3rd Gen)
Tensor Cores / GPU	320 (2nd Gen)	184 (3rd Gen)	368 (2nd Gen)	272 (3rd Gen)	576 (2nd Gen)	328 (3rd Gen)
RT Cores	40 (1st Gen)	46 (2nd Gen)	46 (1st Gen)	68 (2nd Gen)	72 (1st Gen)	82 (2nd Gen)
GPU Boost Clock (MHz)	1770	1725	1800	1710	1770	1695
Peak FP32 TFLOPS (non-Tensor)	9.1	20.3	10.6	29.8	16.3	35.6
Peak FP16 TFLOPS (non-Tensor)	18.1	20.3	21.2	29.8	32.6	35.6
Peak BF16 TFLOPS (non-Tensor)	NA	20.3	NA	29.8	NA	35.6
Peak INT32 TOPS (non-Tensor)	9.1	10.2	10.6	14.9	16.3	17.8
Peak FP16 Tensor TFLOPS with FP16 Accumulate	72.5	81.3/162.6	84.8	119/238	130.5	142/284
Peak FP16 Tensor TFLOPS with FP32 Accumulate	36.3	40.6/81.3	42.4	59.5/119	65.2	71/142
Peak BF16 Tensor TFLOPS with FP32 Accumulate	NA	40.6/81.3	NA	59.5/119	NA	71/142
Peak TF32 Tensor TFLOPS	NA	20.3/40.6	NA	29.8/59.5	NA	35.6/71
Peak INT8 Tensor TOPS	145	162.6/325.2	169.6	238/476	261	284/568
Peak INT4 Tensor TOPS	290	325.2/650.4	339.1	476/952	522	568/1136
Frame Buffer Memory Size and Type	8 GB GDDR6	8 GB GDDR6	8 GB GDDR6	10 GB GDDR6X	24 GB GDDR6	24 GB GDDR6X
Memory Interface	256-bit	256-bit	256-bit	320-bit	384-bit	384-bit
Memory Clock (Data Rate)	14 Gbps	14 Gbps	14 Gbps	19 Gbps	14 Gbps	19.5 Gbps
Memory Bandwidth	448 GB/sec	448 GB/sec	448 GB/sec	760 GB/sec	672 GB/sec	936 GB/sec
ROPs	64	96	64	96	96	112
Pixel Fill-rate (Gigapixels/sec)	113.3	165.6	115.2	164.2	169.9	193
Texture Units	160	184	184	272	288	328
Texel Fill-rate (Gigatexels/sec)	283.2	317.4	331.2	465	509.8	566
L1 Data Cache/Shared Memory	3840 KB	5888 KB	4416 KB	8704 KB	6912 KB	10496 KB
L2 Cache Size	4096 KB	4096 KB	4096 KB	5120 KB	6144 KB	6144 KB
Register File Size	10240 KB	11776 KB	11776 KB	17408 KB	18432 KB	20992 KB
TGP (Total Graphics Power)	215W	220W	225W	320W	280W	350W
Transistor Count	13.6 Billion	17.4 Billion	13.6 Billion	28.3 Billion	18.6 Billion	28.3 Billion
Die Size	545 mm2	392.5 mm2	545 mm2	628.4 mm2	754 mm2	628.4 mm2
Manufacturing Process	TSMC 12 nm FFN (FinFET NVIDIA)	Samsung 8 nm 8N NVIDIA Custom Process	TSMC 12 nm FFN (FinFET NVIDIA)	Samsung 8 nm 8N NVIDIA Custom Process	TSMC 12 nm FFN (FinFET NVIDIA)	Samsung 8 nm 8N NVIDIA Custom Process
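Several rows in the table above can be derived from other rows. The sketch below reproduces the memory bandwidth and fill-rate figures for the GeForce RTX 3080 from its bus width, data rate, unit counts, and boost clock. The function names are ours and the formulas are the standard back-of-the-envelope ones, so treat this as a cross-check rather than NVIDIA's own method.

```python
def memory_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Bus width (bits) x per-pin data rate (Gbps), converted to bytes."""
    return bus_width_bits * data_rate_gbps / 8


def fill_rate_giga(units, boost_mhz):
    """ROPs (pixels) or texture units (texels) x boost clock, in giga-ops/sec."""
    return units * boost_mhz / 1000


print(memory_bandwidth_gb_s(320, 19))       # RTX 3080: 760.0 GB/sec
print(memory_bandwidth_gb_s(384, 19.5))     # RTX 3090: 936.0 GB/sec
print(round(fill_rate_giga(96, 1710), 1))   # RTX 3080 pixel fill-rate: 164.2
print(round(fill_rate_giga(272, 1710), 1))  # RTX 3080 texel fill-rate: 465.1
```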

NVIDIA Ampere GPUs - GA102 & GA104 For The First Wave of Gaming Cards

NVIDIA is first introducing two brand new Ampere GPUs which include the GA102 and the GA104. The GA102 GPU is going to be featured on the GeForce RTX 3090 and GeForce RTX 3080 graphics cards while the GA104 GPU is going to be featured on the GeForce RTX 3070 graphics cards. The Ampere GPUs are based on the Samsung 8nm custom process node for NVIDIA and as such, the resultant GPU dies are slightly smaller than their Turing based predecessors but do come with a denser transistor layout. There will be several variations of each GPU featured across the RTX 30 series lineup. Following is what the complete GA102 and GA104 GPUs have to offer.

NVIDIA Ampere GA102 GPU

The full GA102 GPU is made up of 7 graphics processing clusters with 12 SM units on each cluster. That makes up 84 SM units for a total of 10752 cores in a 28.3 billion transistor package measuring 628.4mm2.

NVIDIA Ampere GA104 GPU

The full GA104 GPU is made up of 6 graphics processing clusters with 8 SM units on each cluster. That makes up 48 SM units for a total of 6144 cores in a 17.4 billion transistor package measuring 392.5mm2.
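The full-die numbers for both chips fall out of the GPC layout. A small sketch, using the 128 CUDA cores per SM and 16 ROPs per GPC stated earlier in this review (the helper name is ours):

```python
CORES_PER_SM = 128
ROPS_PER_GPC = 16


def full_die(gpcs, sms_per_gpc):
    """Unit counts for a fully enabled die with the given GPC layout."""
    sms = gpcs * sms_per_gpc
    return {"sms": sms, "cuda_cores": sms * CORES_PER_SM, "rops": gpcs * ROPS_PER_GPC}


print(full_die(7, 12))  # GA102: {'sms': 84, 'cuda_cores': 10752, 'rops': 112}
print(full_die(6, 8))   # GA104: {'sms': 48, 'cuda_cores': 6144, 'rops': 96}

# Shipping cards disable some SMs: 82, 68 and 46 SMs are enabled respectively.
for card, sms in (("RTX 3090", 82), ("RTX 3080", 68), ("RTX 3070", 46)):
    print(card, sms * CORES_PER_SM)  # 10496, 8704, 5888 CUDA cores
```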

NVIDIA has also introduced its 3rd Generation Tensor core architecture and 2nd Generation RT cores on Ampere GPUs. Tensor cores have been available since Volta, and consumers got a taste of them with the Turing GPUs. One of the key areas where Tensor Cores are put to use for AAA games is DLSS. There's a whole software stack that leverages Tensor cores, known as NVIDIA NGX. These software-based technologies will help enhance graphics fidelity with features such as Deep Learning Super Sampling (DLSS), AI InPainting, AI Super Rez, RTX Voice, and AI Slow-Mo.

While its initial debut was a bit flawed, DLSS in its 2nd iteration (DLSS 2.0) has done wonders to not only improve gaming performance but also image quality. In titles such as Death Stranding and Control, games are shown to offer higher visual fidelity than at native resolution while running at much higher framerates. With Ampere, we can expect an even higher boost in terms of DLSS 2.0 (and DLSS Next-Gen) performance as the deep-learning model continues working its magic in DLSS supported titles. NVIDIA will also be adding 8K DLSS support to its Ampere GPU lineup which would be great to test out with the 24 GB RTX 3090 graphics card.

With Ampere, Tensor cores add TF32 and BF16 precision in addition to FP16, INT8, and INT4, which are still fully supported. NVIDIA has been at the helm of the deep learning revolution by supporting it since its Kepler generation of graphics cards. Today, NVIDIA has some of the most powerful AI graphics accelerators and a software stack that is widely adopted by this fast-growing industry.

For its 3rd Gen Tensor cores, NVIDIA is using the same sparsity architecture that it uses on the Ampere HPC line of GPUs. While Ampere features 4 Tensor cores per SM compared to Turing's 8 Tensor cores per SM, they are not only based on the new 3rd Generation design but also scale with the larger SM array. Each Ampere Tensor core can execute 128 FP16 FMA operations per clock, and with sparsity, it can do up to 256. The total FP16 FMA operations per SM are 512, or 1024 with sparsity. That's up to a 2x increase over the Turing SM in terms of inference performance with the updated Tensor design.
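The per-clock Tensor figures translate directly into the Tensor TFLOPS numbers quoted in the specification table. A rough sketch, assuming one FMA counts as two operations and using the boost clock (the function name is ours):

```python
TENSOR_CORES_PER_SM = 4
FMA_PER_TENSOR_CORE = {"dense": 128, "sparse": 256}  # FP16 FMA per clock


def tensor_tflops(sms, boost_mhz, mode="dense"):
    """Peak FP16 Tensor throughput with FP16 accumulate, in TFLOPS."""
    fma_per_sm = TENSOR_CORES_PER_SM * FMA_PER_TENSOR_CORE[mode]
    return sms * fma_per_sm * 2 * boost_mhz * 1e6 / 1e12


print(round(tensor_tflops(68, 1710, "dense"), 1))   # RTX 3080: 119.1
print(round(tensor_tflops(68, 1710, "sparse"), 1))  # RTX 3080: 238.1
```

The sparse result is where the 238 Tensor TFLOPS headline figure for the RTX 3080 comes from.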

2nd Gen RT Cores, RTX, and Real-Time Ray Tracing Dissected

Next up, we have the RT Cores, which are what will power real-time ray tracing. NVIDIA isn't going to distance itself from traditional rasterization-based rendering, but is instead following a hybrid rendering model. The new 2nd Generation RT cores offer increased performance and double the ray/triangle intersection testing rate of Turing RT cores.

There's one RT core per SM and all of them combined accelerate Bounding Volume Hierarchy (BVH) traversal and ray/triangle intersection testing (ray casting) functions. RT Cores work together with advanced denoising filtering, a highly-efficient BVH acceleration structure developed by NVIDIA Research, and RTX compatible APIs to achieve real-time ray tracing on a single Ampere GPU.

RT Cores traverse the BVH autonomously, and by accelerating traversal and ray/triangle intersection tests, they offload the SM, allowing it to handle other vertex, pixel, and compute shading work. Functions such as BVH building and refitting are handled by the driver, and ray generation and shading are managed by the application through new types of shaders.

To better understand the function of RT Cores, and what exactly they accelerate, we should first explain how ray tracing is performed on GPUs or CPUs without a dedicated hardware ray tracing engine. Essentially, BVH traversal would need to be performed by shader operations, taking thousands of instruction slots per ray cast to test against bounding box intersections in the BVH until finally hitting a triangle. The color at the point of intersection then contributes to the final pixel color (or, if no triangle is hit, the background color may be used to shade the pixel).

Ray tracing without hardware acceleration requires thousands of software instruction slots per ray to test successively smaller bounding boxes in the BVH structure until possibly hitting a triangle. It’s a computationally-intensive process making it impossible to do on GPUs in real-time without hardware-based ray tracing acceleration.

The RT Cores in Ampere can process all the BVH traversal and ray-triangle intersection testing, saving the SM from spending the thousands of instruction slots per ray, which could be an enormous amount of instructions for an entire scene. The RT Core includes two specialized units. The first unit does bounding box tests, and the second unit does ray-triangle intersection tests.
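To give a feel for what those two units replace, here is a plain software version of both tests. This is a sketch using two textbook algorithms, the slab test for boxes and Möller-Trumbore for triangles. We are not claiming this is how the RT Core implements them internally; it only shows the kind of per-ray math a shader would otherwise have to run at every BVH node.

```python
import math


def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: True if the ray intersects the axis-aligned bounding box."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if o < lo or o > hi:   # parallel to this slab and outside it
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:         # slab intervals do not overlap
            return False
    return True


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_hits_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: return the hit distance t, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:             # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv            # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv    # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None


ray_o, ray_d = (0.0, 0.0, -5.0), (0.0, 0.0, 1.0)
print(ray_hits_box(ray_o, ray_d, (-1, -1, -1), (1, 1, 1)))                 # True
print(ray_hits_triangle(ray_o, ray_d, (-1, -1, 0), (1, -1, 0), (0, 1, 0)))  # 5.0
```

A real traversal repeats the box test down the tree for every ray and only runs the triangle test at the leaves, which is where the thousands of instruction slots per ray come from.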

The SM only has to launch a ray probe, and the RT core does the BVH traversal and ray-triangle tests and returns a hit or no hit to the SM. Also, unlike the last generation, the Ampere SM can process two compute workloads simultaneously, allowing ray-tracing & graphics/compute workloads to be done concurrently.

In a visual demonstration, NVIDIA has shown how RT and Tensor cores help speed up ray tracing and shader workloads significantly. A fully ray-traced frame from Wolfenstein Youngblood was taken as an example. The last-gen RTX 2080 SUPER takes 51ms to render the frame if it does it all with its shaders (CUDA Cores). With RT cores and shaders working in tandem, the processing time is reduced to just 20ms, or less than half the time. Adding in Tensor cores reduces the rendering time even further to just 12ms (~83 FPS).

However, with Ampere, each standard processing block receives a huge performance uplift. With an RTX 3080, the same frame can be rendered within 37ms on the Shader cores alone, 11ms with the RT+Shader cores, and 6.7ms (150 FPS) with all three core technologies working together. That's nearly half the time Turing took to render the same scene.
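The frame times from NVIDIA's demonstration convert to frame rates and generational speedups as follows. The timings are the ones quoted above; the dictionary layout and helper names are ours.

```python
# Frame times in milliseconds from NVIDIA's Wolfenstein Youngblood example
FRAME_MS = {
    "RTX 2080 SUPER": {"shaders": 51.0, "shaders+rt": 20.0, "shaders+rt+tensor": 12.0},
    "RTX 3080": {"shaders": 37.0, "shaders+rt": 11.0, "shaders+rt+tensor": 6.7},
}


def fps(frame_ms):
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_ms


def speedup(mode):
    """How many times faster the RTX 3080 renders the frame in a given mode."""
    return FRAME_MS["RTX 2080 SUPER"][mode] / FRAME_MS["RTX 3080"][mode]


for mode in FRAME_MS["RTX 3080"]:
    print(f"{mode}: {fps(FRAME_MS['RTX 3080'][mode]):.1f} FPS, {speedup(mode):.2f}x vs Turing")
```

With all three core types active, 12ms versus 6.7ms works out to roughly a 1.8x speedup.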

The Micron GDDR6X memory brings a lot of new features to the table. It is faster, doubles the I/O data rate, and is the first to implement PAM4 multi-level signaling in memory dies. With the GeForce RTX 3090 class products, Micron's GDDR6X memory achieves a bandwidth of up to 1 TB/s which is used to power the next-generation gaming experiences at high-fidelity resolutions such as 8K.

Micron GDDR6X graphics memory doubles input/output (I/O) performance while minimizing the cost of memory. Working with AI-innovation leader NVIDIA, Micron delivers higher bandwidth by enabling multi-level signaling in the form of four-level pulse amplitude modulation (PAM4) technology in this memory device. (via Micron)

The new GDDR6X SGRAM:

Doubles the data rate of SGRAM at a lower power per transaction while enabling breaking of the 1 Terabyte per second (TB/s) system memory bandwidth boundary for graphics card applications;

Is the first discrete graphics memory device that employs PAM4 encoded signaling between the processor and the DRAM, using four voltage levels to encode and transfer two bits of data per interface clock.
Can be designed and operated stably at high speeds, and built in mass production.

As mentioned, GDDR6X features the brand new PAM4 multilevel signaling technique, which helps transfer data much faster, doubling the I/O rate and pushing the capability of each memory die from 64 GB/s to 84 GB/s. The Micron GDDR6X memory dies are also the only graphics DRAM that can be mass-produced while featuring PAM4 signaling.
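The per-die figures line up with a 32-bit wide device (two 16-bit channels, as listed in the I/O width row of the memory comparison table in this section) running at 16 Gbps for GDDR6 and 21 Gbps for GDDR6X. A quick sketch of that arithmetic, with names of our own choosing:

```python
PINS_PER_DIE = 32  # two x16 channels per device


def die_bandwidth_gb_s(data_rate_gbps):
    """Per-die bandwidth: data pins x per-pin rate (Gbps), converted to bytes."""
    return PINS_PER_DIE * data_rate_gbps / 8


print(die_bandwidth_gb_s(16))    # fastest GDDR6: 64.0 GB/s
print(die_bandwidth_gb_s(21))    # fastest GDDR6X: 84.0 GB/s
print(die_bandwidth_gb_s(19))    # as clocked on the RTX 3080: 76.0 GB/s
print(die_bandwidth_gb_s(19.5))  # as clocked on the RTX 3090: 78.0 GB/s
```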

What is interesting is that Micron states that its GDDR6X memory can hit speeds of up to 21 Gbps, whereas we have only got to see 19.5 Gbps in action on the GeForce RTX 3090. It is likely that AIBs could utilize higher binned dies as they become available. Micron also confirms that it plans to offer speeds higher than 21 Gbps moving into 2021, but we will have to wait and see whether any cards will utilize them.

It's not just about faster speeds: Micron's GDDR6X provides higher bandwidth while drawing 15% less power per transferred bit than the previous generation GDDR6 memory. PAM4 signaling is a big upgrade from the two-level NRZ signaling on the GDDR6 memory.

Instead of transmitting two binary bits of data each clock cycle (one bit on the rising edge and one bit on the falling edge of the clock), PAM4 sends two bits each clock edge, encoded using four different voltage levels. The voltage levels are divided into 250 mV steps with each level representing two bits of data - 00, 01, 10, or 11 sent on each clock edge (still DDR technology).
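A toy encoder makes the idea concrete. The 250 mV step comes from the description above, but the assignment of bit pairs to voltage levels here is an assumption for illustration (plain binary order); the actual device may order its levels differently.

```python
STEP_MV = 250  # spacing between adjacent voltage levels, per the text above

# Assumed mapping for illustration only: bit pairs in plain binary order.
PAIR_TO_LEVEL = {"00": 0, "01": 1, "10": 2, "11": 3}
LEVEL_TO_PAIR = {level: pair for pair, level in PAIR_TO_LEVEL.items()}


def pam4_encode(bits):
    """Turn a bit string into voltages (mV above the lowest level), 2 bits each."""
    if len(bits) % 2:
        raise ValueError("PAM4 carries two bits per symbol; need an even bit count")
    return [PAIR_TO_LEVEL[bits[i:i + 2]] * STEP_MV for i in range(0, len(bits), 2)]


def pam4_decode(voltages_mv):
    """Recover the bit string by snapping each voltage to the nearest level."""
    levels = (min(3, max(0, round(v / STEP_MV))) for v in voltages_mv)
    return "".join(LEVEL_TO_PAIR[level] for level in levels)


symbols = pam4_encode("00011011")
print(symbols)               # [0, 250, 500, 750]
print(pam4_decode(symbols))  # 00011011
print(len("00011011"), "bits sent in", len(symbols), "symbols")
```

The same eight bits would need eight symbols with two-level NRZ signaling, which is where the doubled data rate per clock edge comes from.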

Micron GDDR6X Memory

Feature	GDDR5	GDDR5X	GDDR6	GDDR6X
Density	From 512Mb to 8Gb	8Gb	8Gb, 16Gb	8Gb, 16Gb
VDD and VDDQ	Either 1.5V or 1.35V	1.35V	Either 1.35V or 1.25V	Either 1.35V or 1.25V
VPP	N/A	1.8V	1.8V	1.8V
Data rates	Up to 8 Gb/s	Up to 12Gb/s	Up to 16 Gb/s	19 Gb/s, 21 Gb/s, >21 Gb/s
Channel count	1	1	2	2
Access granularity	32 bytes	64 bytes 2x 32 bytes in pseudo 32B mode	2 ch x 32 bytes	2 ch x 32 bytes
Burst length	8	16 / 8	16	8 in PAM4 mode 16 in RDQS mode
Signaling	POD15/POD135	POD135	POD135/POD125	PAM4 POD135/POD125
Package	BGA-170 14mm x 12mm 0.8mm ball pitch	BGA-190 14mm x 12mm 0.65mm ball pitch	BGA-180 14mm x 12mm 0.75mm ball pitch	BGA-180 14mm x 12mm 0.75mm ball pitch
I/O width	x32/x16	x32/x16	2 ch x16/x8	2 ch x16/x8
Signal count	61 - 40 DQ, DBI, EDC - 15 CA - 6 CK, WCK	61 - 40 DQ, DBI, EDC - 15 CA - 6 CK, WCK	70 or 74 - 40 DQ, DBI, EDC - 24 CA - 6 or 10 CK, WCK	70 or 74 - 40 DQ, DBI, EDC - 24 CA - 6 or 10 CK, WCK
PLL, DCC	PLL	PLL	PLL, DCC	DCC
CRC	CRC-8	CRC-8	2x CRC-8	2x CRC-8
VREFD	External or internal per 2 bytes	Internal per byte	Internal per pin	Internal per pin 3 sub-receivers per pin
Equalization	N/A	RX/TX	RX/TX	RX/TX
VREFC	External	External or Internal	External or Internal	External or Internal
Self refresh (SRF)	Yes Temp. Controlled SRF	Yes Temp. Controlled SRF Hibernate SRF	Yes Temp. Controlled SRF Hibernate SRF VDDQ-off	Yes Temp. Controlled SRF Hibernate SRF VDDQ-off
Scan	SEN	IEEE 1149.1 (JTAG)	IEEE 1149.1 (JTAG)	IEEE 1149.1 (JTAG)

With each new generation of graphics cards, NVIDIA delivers a new range of display technologies. This generation is no different and we see some significant updates to not only the display engine but also the graphics interconnect. With the adoption of faster GDDR6X memory, which provides higher bandwidth, along with faster compression and more cache, gaming applications can now run at higher resolutions, supporting more details on the display.

The Ampere Display Engine supports two new display technologies, HDMI 2.1 and DisplayPort 1.4a with DSC 1.2a. HDMI 2.1 allows for up to 48 Gbps of total bandwidth and allows for up to 4K 240Hz HDR and 8K 60Hz HDR.

DisplayPort 1.4a allows for up to 8K resolutions with 60Hz refresh rates and includes VESA's display stream compression 1.2 technology with visually lossless compression. You can run up to two 8K displays at 60 Hz using two cables, one for each display. In addition to that, Ampere also supports HDR processing natively with tone mapping added to the HDR pipeline.
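A rough calculation shows why compression matters at these resolutions. The sketch below counts only active pixel data and assumes 10 bits per color channel (30 bits per pixel) for HDR; it ignores blanking intervals and link encoding overhead, so real requirements are somewhat higher. The 48 Gbps figure is the HDMI 2.1 total quoted above.

```python
HDMI_2_1_GBPS = 48  # total HDMI 2.1 bandwidth quoted in the text


def active_pixel_gbps(width, height, refresh_hz, bits_per_pixel=30):
    """Uncompressed active-pixel data rate in Gbps (no blanking, no overhead)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


modes = {"4K 120Hz": (3840, 2160, 120), "4K 240Hz": (3840, 2160, 240), "8K 60Hz": (7680, 4320, 60)}
for name, (w, h, hz) in modes.items():
    rate = active_pixel_gbps(w, h, hz)
    verdict = "fits uncompressed" if rate <= HDMI_2_1_GBPS else "needs DSC or chroma subsampling"
    print(f"{name}: {rate:.1f} Gbps, {verdict}")
```

Note that 4K at 240Hz and 8K at 60Hz push exactly the same number of pixels per second, which is why the two modes are listed side by side.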

Ampere GPUs also ship with the fifth-generation NVDEC decoder unit, which adds AV1 hardware decode support. Ampere's new NVDEC decoder supports the decoding of MPEG-2, VC-1, H.264 (AVCHD), H.265 (HEVC), VP8, VP9, and AV1.

Ampere also features the 7th Generation NVENC encoder, offering seamless hardware-accelerated encoding of up to 4K on H.264 and 8K on HEVC.

NVIDIA RTX IO - Blazing Fast Read Speeds With GPU Utilization

As storage sizes have grown, so has storage performance. Gamers are increasingly turning to SSDs to reduce game load times: while hard drives are limited to 50-100 MB/sec throughput, the latest M.2 PCIe Gen4 SSDs deliver up to 7 GB/sec. With the traditional storage model, game data is read from the hard disk, then passed through the system memory and CPU before being passed to the GPU.

Historically games have read files from the hard disk, using the CPU to decompress the game image. Developers have used lossless compression to reduce install sizes and to improve I/O performance. However, as storage performance has increased, traditional file systems and storage APIs have become a bottleneck. For example, decompressing game data from a 100 MB/sec hard drive takes only a few CPU cores, but decompressing data from a 7 GB/sec PCIe Gen4 SSD can consume more than twenty AMD Ryzen Threadripper 3960X CPU cores!

Using the traditional storage model, game decompression can consume all 24 cores on a Threadripper CPU. Modern game engines have exceeded the capability of traditional storage APIs. A new generation of I/O architecture is needed. Data transfer rates are the gray bars, CPU cores required are the black/blue blocks.

NVIDIA RTX IO is a suite of technologies that enable rapid GPU-based loading and decompression of game assets, accelerating I/O performance by up to 100x compared to hard drives and traditional storage APIs. When used with Microsoft’s new DirectStorage for Windows API, RTX IO offloads dozens of CPU cores’ worth of work to your RTX GPU, improving frame rates, enabling near-instantaneous game loading, and opening the door to a new era of large, incredibly detailed open-world games.

Object pop-in and stutter can be reduced, and high-quality textures can be streamed at incredible rates, so even if you’re speeding through a world, everything runs and looks great. In addition, with lossless compression, game download and install sizes can be reduced, allowing gamers to store more games on their SSD while also improving their performance.

NVIDIA RTX IO plugs into Microsoft’s upcoming DirectStorage API which is a next-generation storage architecture designed specifically for state-of-the-art NVMe SSD-equipped gaming PCs and the complex workloads that modern games require. Together, streamlined and parallelized APIs specifically tailored for games allow dramatically reduced IO overhead and maximize performance/bandwidth from NVMe SSDs to your RTX IO-enabled GPU.

Specifically, NVIDIA RTX IO brings GPU-based lossless decompression, allowing reads through DirectStorage to remain compressed and delivered to the GPU for decompression. This removes the load from the CPU, moving the data from storage to the GPU in a more efficient, compressed form, and improving I/O performance by a factor of two.
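To put the factor of two in perspective, here is a simple load-time model. The 20 GB asset size is a made-up example, the 2:1 compression ratio is inferred from the factor of two quoted above, and the model assumes decompression keeps pace with the drive so that storage throughput is the only bottleneck.

```python
def load_seconds(asset_gb, read_gb_per_s, compression_ratio=1.0):
    """Time to deliver asset_gb of uncompressed data from storage.

    Compressed assets take asset_gb / compression_ratio on disk, so the
    effective delivery rate is the read rate times the compression ratio.
    """
    return asset_gb / (read_gb_per_s * compression_ratio)


ASSET_GB = 20  # hypothetical level size, for illustration only

print(f"Hard drive, 100 MB/sec:        {load_seconds(ASSET_GB, 0.1):.1f} s")
print(f"Gen4 SSD, 7 GB/sec:            {load_seconds(ASSET_GB, 7):.2f} s")
print(f"Gen4 SSD, 2:1 compressed data: {load_seconds(ASSET_GB, 7, 2.0):.2f} s")
```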

GeForce RTX GPUs will deliver decompression performance beyond the limits of even Gen4 SSDs, offloading potentially dozens of CPU cores’ worth of work to ensure maximum overall system performance for next-generation games. Lossless decompression is implemented with high-performance compute kernels, asynchronously scheduled. This functionality leverages the DMA and copy engines of Turing and Ampere, as well as the advanced instruction set and architecture of these GPUs' SMs.

The advantage of this is that the enormous compute power of the GPU can be leveraged for burst or bulk loading (at level load, for example), when GPU resources can act as a high-performance I/O processor, delivering decompression performance well beyond the limits of Gen4 NVMe. During streaming scenarios, the bandwidths involved are a tiny fraction of the GPU's capability, further leveraging the advanced asynchronous compute capabilities of Turing and Ampere. Microsoft is targeting a developer preview of DirectStorage for Windows for game developers next year, and NVIDIA Turing & Ampere gamers will be able to take advantage of RTX IO enhanced games as soon as they become available.

NVLINK For GeForce RTX 3090 And Titan Class Products Only!

NVIDIA has said farewell to its SLI (Scalable Link Interface) interconnect for consumer graphics cards. It now uses the NVLINK interconnect, which was already featured on its Turing GPUs. The reason is that SLI simply could not provide enough bandwidth to feed Ampere GPUs.

A single x8 NVLINK channel provides 25 GB/s peak bandwidth. There are 4 x4 links on the GA102 GPU. The GA102 GPU features 50 GB/s of bandwidth in parallel and 100 GB/s bandwidth bi-directionally. Using NVLINK on high-end cards would be beneficial in high-resolution gaming but there's a reason NVIDIA still restricts users from doing 3 and 4 way SLI.

Multi-GPU still isn't optimized so you won't see many benefits unless you are running the highest-end graphics cards. That's another reason why the RTX 3080 & RTX 3070 are deprived of NVLINK connectors. The NVLINK connectors cost $79 US each and are sold separately.

The NVIDIA GeForce RTX 3080 is a force to be reckoned with. It is the ultimate gaming GPU: not only is it surprisingly faster than the GeForce RTX 2080 Ti, its Turing based predecessor, but it is also much cheaper at just $699 US. The GeForce RTX 3080 carries more cores, more memory, higher performance efficiency, and also carries next-generation ray-tracing and tensor cores that make this a truly next-generation graphics card.

NVIDIA designed the GeForce RTX 3080 not just for any gamer but for all gamers who want to have the best graphics performance at hand to power the next generation of AAA gaming titles with superb visuals and insane fluidity. It's not just the FPS that matters these days; it's visuals and a smoother frame rate too, and this is exactly what the GeForce RTX 30 series is made to excel at. There's a lot to talk about regarding NVIDIA's flagship Ampere gaming graphics cards so let's start off with the specifications.

Marvels of NVIDIA Ampere Architecture - 2nd Generation RTX
Enabling the blistering performance of the new RTX 30 Series GPUs and the NVIDIA Ampere architecture are cutting-edge technologies and over two decades of graphics R&D, including:

New streaming multiprocessors: The building block for the world’s fastest, most efficient GPU, delivering 2x the FP32 throughput of the previous generation, and 30 Shader-TFLOPS of processing power.
Second-gen RT Cores: New dedicated RT Cores deliver 2x the throughput of the previous generation, plus concurrent ray tracing and shading and compute, with 58 RT-TFLOPS of processing power.
Third-gen Tensor Cores: New dedicated Tensor Cores, with up to 2x the throughput of the previous generation, making it faster and more efficient to run AI-powered technologies, like NVIDIA DLSS, and 238 Tensor-TFLOPS of processing power.
NVIDIA RTX IO: Enables rapid GPU-based loading and game asset decompression, accelerating input/output performance by up to 100x compared with hard drives and traditional storage APIs. In conjunction with Microsoft’s new DirectStorage for Windows API, RTX IO offloads dozens of CPU cores’ worth of work to the RTX GPU, improving frame rates and enabling near-instantaneous game loading.
World’s fastest graphics memory: NVIDIA has worked with Micron to create the world’s fastest discrete graphics memory for the RTX 30 Series, GDDR6X. It provides data speeds of close to 1TB/s system memory bandwidth for graphics card applications, maximizing game and app performance.
Next-gen process technology: New 8N NVIDIA custom process from Samsung, which allows for higher transistor density and more efficiency.

NVIDIA GeForce RTX 3080 Graphics Card Specifications - GA102 GPU & 10 GB GDDR6X Memory

At the heart of the NVIDIA GeForce RTX 3080 graphics card lies the GA102 GPU. The GA102 is one of the many Ampere GPUs coming to the gaming segment, and it is the fastest gaming GPU that NVIDIA has produced. The GPU is based on Samsung's 8nm custom process node designed specifically for NVIDIA and features a total of 28 billion transistors. It measures 628 mm², which makes it the second-biggest gaming GPU ever produced, right below the Turing TU102 GPU.

The new shader core on the NVIDIA Ampere architecture is 2.7x faster, the new RT cores are 1.7x faster, and the new Tensor cores are up to 2.7x faster than those of the previous-generation Turing GPUs. The 2nd Generation RT core delivers dedicated hardware-accelerated ray-tracing performance and features twice the ray/triangle intersection throughput, with ray tracing, graphics, and compute operations running concurrently.

For the GeForce RTX 3080, NVIDIA has enabled a total of 68 SM units on its flagship which results in a total of 8704 CUDA cores. In addition to the CUDA cores, NVIDIA's GeForce RTX 3080 also comes packed with next-generation RT (Ray-Tracing) cores, Tensor cores, and brand new SM or streaming multi-processor units. The GPU runs at a base clock speed of 1440 MHz and a boost clock speed of 1710 MHz. The card has a TDP of 320W.
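The core count and the roughly 30 TFLOPs FP32 figure both follow from the SM count and the boost clock. A minimal sketch of that arithmetic, assuming 128 FP32 CUDA cores per Ampere SM and two floating-point operations per core per clock (one fused multiply-add); the function names are ours, for illustration only:

```python
# Sketch: derive CUDA cores and peak FP32 throughput from the SM count.
# Assumptions: 128 FP32 cores per GA10x SM, 2 FLOPs per core per clock (FMA).
CORES_PER_SM = 128
FLOPS_PER_CORE_PER_CLOCK = 2


def cuda_cores(sm_count: int) -> int:
    """Total CUDA cores for a given number of enabled SMs."""
    return sm_count * CORES_PER_SM


def peak_fp32_tflops(sm_count: int, boost_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPs at the rated boost clock."""
    flops = cuda_cores(sm_count) * FLOPS_PER_CORE_PER_CLOCK * boost_mhz * 1e6
    return flops / 1e12


rtx_3080_cores = cuda_cores(68)                 # 68 SMs on the RTX 3080
rtx_3080_tflops = peak_fp32_tflops(68, 1710)    # 1710 MHz boost clock
print(rtx_3080_cores, round(rtx_3080_tflops, 2))
```

This is a paper figure at the rated boost clock; sustained clocks in games can sit above or below it.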

In terms of memory, the GeForce RTX 3080 comes packed with 10 GB of next-generation GDDR6X memory. With Micron's latest and greatest graphics memory dies, the RTX 3080 can deliver GDDR6X memory speeds of 19.0 Gbps per pin. That, along with a 320-bit bus interface, delivers a cumulative bandwidth of 760 GB/s.
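The bandwidth figure is simply the per-pin data rate multiplied by the bus width, divided by eight to convert bits to bytes. A small sketch of that calculation (the helper name is ours, not an NVIDIA or Micron API):

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s.

    data_rate_gbps: per-pin data rate in Gbit/s (e.g. 19.0 for the RTX 3080)
    bus_width_bits: memory bus width in bits (e.g. 320)
    """
    return data_rate_gbps * bus_width_bits / 8  # 8 bits per byte


rtx_3080_bw = memory_bandwidth_gb_s(19.0, 320)
rtx_3090_bw = memory_bandwidth_gb_s(19.5, 384)
print(rtx_3080_bw, rtx_3090_bw)
```

The same formula reproduces the other entries in the specifications table from their memory speed and bus width.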

NVIDIA GeForce RTX 30 Series 'Ampere' Graphics Card Specifications:

Graphics Card Name	NVIDIA GeForce RTX 3060	NVIDIA GeForce RTX 3060 Ti	NVIDIA GeForce RTX 3070	NVIDIA GeForce RTX 3080	NVIDIA GeForce RTX 3090
GPU Name	Ampere GA106-300	Ampere GA104-200	Ampere GA104-300	Ampere GA102-200	Ampere GA102-300
Process Node	Samsung 8nm	Samsung 8nm	Samsung 8nm	Samsung 8nm	Samsung 8nm
Die Size	TBC	395.2mm2	395.2mm2	628.4mm2	628.4mm2
Transistors	TBC	17.4 Billion	17.4 Billion	28 Billion	28 Billion
CUDA Cores	3584	4864	5888	8704	10496
TMUs / ROPs	112 / 64	152 / 80	184 / 96	272 / 96	328 / 112
Tensor / RT Cores	112 / 28	152 / 38	184 / 46	272 / 68	328 / 82
Base Clock	1320 MHz	1410 MHz	1500 MHz	1440 MHz	1400 MHz
Boost Clock	1780 MHz	1665 MHz	1730 MHz	1710 MHz	1700 MHz
FP32 Compute	13 TFLOPs	16 TFLOPs	20 TFLOPs	30 TFLOPs	36 TFLOPs
RT TFLOPs	25 TFLOPs	32 TFLOPs	40 TFLOPs	58 TFLOPs	69 TFLOPs
Tensor-TOPs	101 TOPs	192 TOPs	163 TOPs	238 TOPs	285 TOPs
Memory Capacity	12 GB GDDR6	8 GB GDDR6	8 GB GDDR6	10 GB GDDR6X	24 GB GDDR6X
Memory Bus	192-bit	256-bit	256-bit	320-bit	384-bit
Memory Speed	16 Gbps	14 Gbps	14 Gbps	19 Gbps	19.5 Gbps
Bandwidth	384 GB/s	448 GB/s	448 GB/s	760 GB/s	936 GB/s
TGP	170W	175W	220W	320W	350W
Price (MSRP / FE)	$329 US	$399 US	$499 US	$699 US	$1499 US
Launch (Availability)	25th February 2021	2nd December 2020	29th October 2020	17th September 2020	24th September 2020
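One rough way to read the table is theoretical FP32 throughput per dollar at MSRP. The sketch below uses only the FP32 and price rows above; it is a paper comparison under that assumption and says nothing about measured gaming performance or street pricing.

```python
# (FP32 TFLOPs, MSRP in USD) taken from the specifications table above.
cards = {
    "RTX 3060": (13, 329),
    "RTX 3060 Ti": (16, 399),
    "RTX 3070": (20, 499),
    "RTX 3080": (30, 699),
    "RTX 3090": (36, 1499),
}


def gflops_per_dollar(tflops: float, price_usd: float) -> float:
    """Theoretical FP32 GFLOPs per US dollar of MSRP."""
    return tflops * 1000 / price_usd


value = {name: gflops_per_dollar(t, p) for name, (t, p) in cards.items()}
best = max(value, key=value.get)

for name, v in value.items():
    print(f"{name}: {v:.1f} GFLOPs per dollar")
print("Best on paper:", best)
```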

NVIDIA GeForce RTX 3080 Graphics Card Cooling & Design- Next-Gen NVTTM Founders Edition Design

NVIDIA has developed one of its best and most powerful Founders Edition cooling designs to date for the GeForce RTX 30 series graphics cards. NVIDIA explained that higher performance requires a new form of cooling solution and, as such, it has prepared a unique cooler for its next-gen cards which keeps the GPUs running cool while staying quiet by utilizing several new and existing technologies.

The Founders Edition cooler uses a full aluminum alloy heatsink built around a hybrid vapor chamber, with dual-sided axial-tech based fans. The heatsink is coated with a nano-carbon coating and should do a really good job of keeping temperatures under control.

The design is interesting in that it goes all out with its fin and heat pipe layout. This is the first design of its kind since the original Founders Edition GeForce GTX 780 to make use of a much larger heatsink area.

It also comes with a unique fan placement, one on the front and one at the bottom. This push and pull fan configuration, as it is referred to, is said to push heat out of the exhaust vents much more effectively. Some air will be blown into the case from the back of the card itself, but that shouldn't be a major cause for concern as modern CPU air or liquid coolers do a really good job of venting air out of the case.

Acoustically, the new Founders Edition design is quieter than traditional dual axial coolers, while still delivering nearly 2x the cooling performance of previous-generation solutions. The aforementioned NVLink and power design changes help here, creating more space for airflow through the largest fin stack seen to date, and the larger bracket vents improve airflow in concert with individually shaped shroud fins. In fact, wherever you look, every aspect of the Founders Edition cards is designed to maximize airflow, minimize temperatures, and enable the highest levels of performance with the least possible noise.

NVIDIA GeForce RTX 3080 Graphics Card PCB & 12-Pin Power Input

One of the biggest changes on the Founders Edition GeForce RTX 30 series graphics cards is the PCB design. The GeForce RTX 3090 and GeForce RTX 3080 come with a unique and compact PCB package that is unlike anything we've seen in the consumer space before. But being compact doesn't mean that the cards don't pack a punch. There's some serious horsepower on these compact PCBs that NVIDIA has designed.

The PCB features over 18 power chokes, which makes it a more premium design than the flagship non-reference RTX 20 series cards. The GeForce RTX 3080 is powered by an 18-phase design that is built to be overclocked, with unprecedented GPU overclocking headroom that most users can leverage to gain even faster performance.

In addition to that, GeForce RTX 30 series Founders Edition cards will be featuring the 12-pin Micro-Fit 3.0 power connectors. These connectors don't require a power supply upgrade as the cards will ship with bundled 2x 8-pin to 1x 12-pin connectors so you can run your latest graphics card without any compatibility issues.
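A quick connector budget shows why the bundled dual 8-pin adapter is enough on paper. The sketch assumes the standard PCI Express ratings of 150W per 8-pin connector and 75W from the x16 slot, and compares them with the card's 320W rating; it only covers connector limits, not whether a given power supply has enough total capacity.

```python
# Assumed PCI Express ratings (not measured values).
PCIE_8PIN_W = 150   # per 8-pin PCIe power connector
PCIE_SLOT_W = 75    # delivered through the x16 slot
RTX_3080_TGP_W = 320


def available_power_w(eight_pin_connectors: int) -> int:
    """Rated power available from the slot plus N 8-pin connectors."""
    return PCIE_SLOT_W + eight_pin_connectors * PCIE_8PIN_W


budget = available_power_w(2)          # the bundled 2x 8-pin to 12-pin adapter
headroom = budget - RTX_3080_TGP_W
print(budget, headroom)
```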

The placement of the 12-pin connector on the PCB is also noteworthy. It is placed in a vertical position and, judging by the PCB design, we can tell why NVIDIA moved to a single 12-pin plug instead of the standard dual 8-pin design. Room on the PCB is limited and, as such, it was necessary to go for a smaller and more compact power input.

NVIDIA GeForce RTX 3080 Graphics Card Price & Availability - Both Custom & Reference Designs at Launch

The NVIDIA GeForce RTX 3080 is being announced today and will launch to consumers on the 17th of September, 2020. The first wave of graphics cards to hit the market will be the reference Founders Edition variant, which costs $699 US (MSRP). Custom models will vary in price depending on their design and the extra horsepower that they have to offer.

NVIDIA isn't sharing detailed performance numbers right now, but from what has been showcased, the GeForce RTX 3070 is faster than an RTX 2080 Ti, the RTX 3080 is a good bit ahead of the RTX 2080 Ti, and the RTX 3090 is as much as 50% faster than the RTX 2080 Ti, which is very impressive across the full lineup. The NVIDIA GeForce RTX 3080 itself is twice as fast as the RTX 2080 and considerably faster than the RTX 2080 Ti, making it a perfect 60 FPS 4K gaming graphics card.

The official heatsink of the NVIDIA GeForce RTX 3080 Founders Edition graphics card.

In terms of cooler noise and performance, the GeForce RTX 3080 operates at a peak temperature of 78C when hitting its peak TBP of 320W with a noise output of just 30dBA. For comparison, the Turing Founders Edition coolers peak out at 81C with a noise output of 32dBA when hitting their TBP of 240W (RTX 2080 SUPER). In NVIDIA's own testing, they reveal that the GeForce RTX 3080 averages at around 1920 MHz with a GPU power draw of 310W and a peak temperature of 76C.

This is also where NVIDIA gets its 1.9x efficiency figure, as the RTX 3080 can deliver over 100 FPS while running cooler and quieter, versus the 60 FPS of its Turing-generation predecessor.

NVIDIA GeForce RTX 3090 & RTX 3080 Graphics Card PCB & Power - Designed To Be Overclocked!

One of the biggest changes on the Founders Edition GeForce RTX 3090 graphics cards is the PCB design. The GeForce RTX 3090 and GeForce RTX 3080 come with a unique and compact PCB package that is unlike anything we've seen in the consumer space before. But being compact doesn't mean that the cards don't pack a punch. There's some serious horsepower on these compact PCBs that NVIDIA has designed.

The PCB features over 20 power chokes, which makes it a more premium design than the flagship non-reference RTX 20 series cards. The GPU is powered by 18 phases while the memory receives power from 2 phases. NVIDIA touts this PCB as an overclocking marvel with unprecedented GPU overclocking headroom that most users can leverage to gain even faster performance. But as we pointed out earlier, the Founders Edition PCB is not the reference design; the reference design will come with a standard rectangular PCB. Water block manufacturers have also confirmed this, as we reported earlier.

The official PCB of the NVIDIA GeForce RTX 3080 Founders Edition graphics card.

NVIDIA GeForce RTX 3080 Founders Edition Box and presentation

Top of the NVIDIA GeForce RTX 3080 Founders Edition

Bottom of the GeForce RTX 3080 Founders Edition

Tour de NVIDIA GeForce RTX 3080 Founder's Edition

The Rear I/O with an HDMI 2.1 and three DisplayPort 1.4a

The 12-pin Microfit connector

We used the following test system for comparison between the different graphics cards. The latest drivers that were available at the time of testing were used from AMD and NVIDIA on an updated version of Windows 10. All games that were tested were patched to the latest version for better performance optimization for NVIDIA and AMD GPUs.

Test System

Components	X570
CPU	Ryzen 9 3900X 4.3GHz All Core Lock (disable one CCD for 3600X Results)
Memory	32GB Hyper X Predator DDR4 3600
Motherboard	ASUS TUF Gaming X570 Plus-WiFi
Storage	TeamGroup Cardea 1TB NVMe PCIe 4.0
PSU	Cooler Master V1200 Platinum
Windows Version	Latest version of Windows at the time of testing
Hardware-Accelerated GPU Scheduling	On if supported by GPU and driver.

Graphics Cards Tested

GPU	Architecture	Core Count	Clock Speed	Memory Capacity	Memory Speed
NVIDIA RTX 3080 FE	Ampere	8704	1440/1710	10GB GDDR6X	19Gbps
NVIDIA RTX 2080ti FE	Turing	4352	1350/1635	11GB GDDR6	14Gbps
NVIDIA RTX 2080 SUPER FE	Turing	3072	1650/1815	8GB GDDR6	15.5Gbps
NVIDIA GTX 1080 FE	Pascal	2560	1607/1733	8GB GDDR5X	10Gbps
AMD Radeon RX 5700XT	Navi 10	2560	1605/1755/1905	8GB GDDR6	14Gbps

Drivers Used

Drivers
Radeon Settings	20.8.3
GeForce	456.16

All games were tested on 2560×1440 (2K), 3440x1440 and 3840x2160 (4K) resolutions.
Image Quality and graphics configurations are provided with each game description.
The "reference" cards are the stock configs.
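The game charts in this review report an average FPS and a 1% percentile figure. The review does not spell out how the latter is computed, so the following is only a sketch of one common convention: take the 99th-percentile frame time (nearest-rank) from a frame-time capture and convert it to FPS. The function name and the sample capture are hypothetical.

```python
def fps_metrics(frame_times_ms):
    """Return (average FPS, 1% low FPS) from per-frame times in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    n = len(frame_times_ms)
    # Average FPS is total frames divided by total time.
    avg_fps = 1000.0 * n / sum(frame_times_ms)
    # Nearest-rank 99th percentile: 99% of frames were at least this fast.
    ordered = sorted(frame_times_ms)
    rank = (99 * n + 99) // 100          # integer ceil(0.99 * n)
    slow_frame_ms = ordered[rank - 1]
    return avg_fps, 1000.0 / slow_frame_ms


# Hypothetical capture: 98 frames at 10 ms and 2 slow frames at 20 ms.
sample = [10.0] * 98 + [20.0] * 2
avg, low = fps_metrics(sample)
print(round(avg, 2), low)
```

A run can average close to 100 FPS and still show a much lower 1% figure if a handful of frames take twice as long, which is why both numbers are charted.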

Firestrike

Firestrike runs on the DX11 API and is still a good measure of GPU scaling performance. In this test we ran the Extreme and Ultra versions of Firestrike, which run at 1440p and 4K respectively, and we recorded the Graphics Score only, since the Physics and Combined scores are not pertinent to this review.

3DMark Firestrike Extreme Graphics [chart: Score; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

3DMark Firestrike Ultra Graphics [chart: Score; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Time Spy

Time Spy runs on the DX12 API and we used it in the same manner as Firestrike Extreme, recording only the Graphics Score, as the Physics score measures CPU performance and isn't important to the testing we are doing here.

3DMark Time Spy Graphics [chart: Score; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

3DMark Time Spy Extreme Graphics [chart: Score; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Port Royal

Port Royal is another great tool in the 3DMark suite, but this one is 100% targeting Ray Tracing performance. It loads up ray traced shadows, reflections, and global illumination to really tax the performance of the graphics cards that either have hardware-based or software-based ray tracing support.

3DMark Port Royal Score [chart: Score; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Thermals

Thermals were measured on our open test bench after running Time Spy graphics test 2 on a loop for 30 minutes, recording the highest temperature reported. The room was climate controlled and kept at a constant 22°C throughout the testing.
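Reducing a logged run to the figure that gets charted is straightforward: take the highest reading and, if useful, its rise over the 22°C ambient. A minimal sketch follows; the function name and the sample readings are hypothetical and are not measurements from this review.

```python
def peak_temperature(samples_c, ambient_c=22.0):
    """Return (peak temperature, rise over ambient) in degrees Celsius."""
    if not samples_c:
        raise ValueError("need at least one temperature sample")
    peak = max(samples_c)
    return peak, peak - ambient_c


# Hypothetical readings logged during a looped run (illustrative only).
log_c = [41, 63, 72, 76, 75, 76]
peak, rise = peak_temperature(log_c)
print(peak, rise)
```

Reporting the rise over ambient alongside the peak makes results easier to compare with tests run in warmer or cooler rooms.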

Temperatures (22c Ambient) [chart: Load and Idle; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Forza Horizon 4

Forza Horizon 4 carries on the open-world racing tradition of the Horizon series. The latest DX12 powered entry is beautifully crafted and amazingly well executed and is a great showcase of DX12 games. We use the benchmark run while having all of the settings set to non-dynamic with an uncapped framerate to gather these results.

Forza Horizon 4 1440p Ultra [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job putting DX12 to use and delivers higher performance than its DX11 counterpart, so we test this title in DX12. We use the second segment of the benchmark run to gather these numbers, as it is more indicative of in-game scenarios where the foliage is heavy.

Shadow of the Tomb Raider 1440p DX12 Highest [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Rainbow 6 Siege

Rainbow 6 Siege has maintained a massive following since its launch and is consistently among Steam's top ten games by player count. In a tactical yet fast-paced competitive title, a higher framerate is essential, so we include this title despite its ludicrously high framerates. We use the Vulkan Ultra preset with the High Definition Texture Pack and gather our results from the built-in benchmarking tool.

Rainbow 6 Siege 1440p Vulkan Ultra [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

DOOM Eternal

DOOM Eternal brings hell to earth with the Vulkan-powered idTech 7. We test this game using the Ultra Nightmare Preset and follow our in-game benchmarking to stay as consistent as possible.

DOOM Eternal 1440p Ultra Nightmare [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Gears Tactics

Gears Tactics is the latest in the Gears franchise and takes things in a completely different direction with the gameplay design. It is built on a DX12-based Unreal Engine 4 build. We used the Maximum settings allowed but refrained from enabling Variable Rate Shading, as not all cards are capable of supporting this feature.

Gears Tactics 1440p Maximum [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Ghost Recon Breakpoint

Ghost Recon Breakpoint is powered by the latest iteration of the Anvil Next 2.0 game engine. This is the same engine that was used in Assassin's Creed Odyssey but in Breakpoint has been updated to support the Vulkan API. We performed our tests using the High Preset with the Vulkan API.

Ghost Recon Breakpoint 1440p Vulkan High [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Call of Duty Modern Warfare (2019)

Call of Duty Modern Warfare is back, this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows; those results are in the RT section. We tested in the 'Fog of War' mission, the same one we use for our RT performance run. At 1440p we set all settings to High.

Call of Duty Modern Warfare 1440p Highest [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Resident Evil 3

The Resident Evil 3 Remake has surpassed the RE2 Remake in visuals and is the latest use of the RE Engine. While it does have DX12 support the DX11 implementation is far superior and because of that, we will be sticking to DX11 for this title. We use the cutscene where Jill and Carlos enter the subway car for the first time and a 2 minute capture at that point.

Resident Evil 3 1440p DX11 Maximum [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Borderlands 3

Borderlands 3 has made its way into the test lineup thanks to strong demand by gamers and simply delivering MORE Borderlands. This game is rather intensive beyond the Medium preset, but since we're testing the 'Ultimate 1440p' card, High it is. We tested using the built-in benchmark utility.

Borderlands 3 1440p High [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Total War Saga: Troy

Total War Saga: Troy is powered by their TW Engine 3 (Total War Engine 3) and in this iteration, they have stuck to a strictly DX11 release. We tested the game using the built-in benchmark using the Dynasty model that represents a battle with many soldiers interacting at once and is more representative of normal gameplay.

Total War Saga [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Forza Horizon 4

Forza Horizon 4 carries on the open-world racing tradition of the Horizon series. The latest DX12 powered entry is beautifully crafted and amazingly well executed and is a great showcase of DX12 games. We use the benchmark run while having all of the settings set to non-dynamic with an uncapped framerate to gather these results.

Forza Horizon 4 UW 1440p Ultra [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job putting DX12 to use and delivers higher performance than its DX11 counterpart, so we test this title in DX12. We use the second segment of the benchmark run to gather these numbers, as it is more indicative of in-game scenarios where the foliage is heavy.

Shadow of the Tomb Raider UW 1440p DX12 Highest [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Rainbow 6 Siege

Rainbow 6 Siege has maintained a massive following since its launch and is consistently among Steam's top ten games by player count. In a tactical yet fast-paced competitive title, a higher framerate is essential, so we include this title despite its ludicrously high framerates. We use the Vulkan Ultra preset with the High Definition Texture Pack and gather our results from the built-in benchmarking tool.

Rainbow 6 Siege UW 1440p Vulkan Ultra [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

DOOM Eternal

DOOM Eternal brings hell to earth with the Vulkan powered idTech 7. We test this game using the Ultra Nightmare Preset and follow our in-game benchmarking to stay as consistent as possible.

DOOM Eternal UW 1440p Ultra Nightmare [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Gears Tactics

Gears Tactics is the latest in the Gears franchise and takes things in a completely different direction with the gameplay design. It is built on a DX12-based Unreal Engine 4 build. We used the Maximum settings allowed but refrained from enabling Variable Rate Shading, as not all cards are capable of supporting this feature.

Gears Tactics UW 1440p Maximum [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Ghost Recon Breakpoint

Ghost Recon Breakpoint is powered by the latest iteration of the Anvil Next 2.0 game engine. This is the same engine that was used in Assassin's Creed Odyssey but in Breakpoint has been updated to support the Vulkan API. We performed our tests using the High Preset with the Vulkan API.

Ghost Recon Breakpoint UW 1440p Vulkan High [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Call of Duty Modern Warfare (2019)

Call of Duty Modern Warfare is back, this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows; those results are in the RT section. We tested in the 'Fog of War' mission, the same one we use for our RT performance run. At UW 1440p we set all settings to High.

Call of Duty Modern Warfare UW 1440p Highest [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Resident Evil 3

The Resident Evil 3 Remake has surpassed the RE2 Remake in visuals and is the latest use of the RE Engine. While it does have DX12 support the DX11 implementation is far superior and because of that, we will be sticking to DX11 for this title. We use the cutscene where Jill and Carlos enter the subway car for the first time and a 2 minute capture at that point.

Resident Evil 3 UW 1440p DX11 Maximum [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Borderlands 3

Borderlands 3 has made its way into the test lineup thanks to strong demand by gamers and simply delivering MORE Borderlands. This game is rather intensive beyond the Medium preset, but since we're testing the 'Ultimate UW 1440p' card, High it is. We tested using the built-in benchmark utility.

Borderlands 3 UW 1440p High [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Total War Saga: Troy

Total War Saga: Troy is powered by their TW Engine 3 (Total War Engine 3) and in this iteration, they have stuck to a strictly DX11 release. We tested the game using the built-in benchmark using the Dynasty model that represents a battle with many soldiers interacting at once and is more representative of normal gameplay.

Total War Saga [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Forza Horizon 4

Forza Horizon 4 carries on the open-world racing tradition of the Horizon series. The latest DX12 powered entry is beautifully crafted and amazingly well executed and is a great showcase of DX12 games. We use the benchmark run while having all of the settings set to non-dynamic with an uncapped framerate to gather these results.

Forza Horizon 4 4K Ultra [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job putting DX12 to use and delivers higher performance than its DX11 counterpart, so we test this title in DX12. We use the second segment of the benchmark run to gather these numbers, as it is more indicative of in-game scenarios where the foliage is heavy.

Shadow of the Tomb Raider 4K DX12 Highest [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Rainbow 6 Siege

Rainbow 6 Siege has maintained a massive following since its launch and is consistently among Steam's top ten games by player count. In a tactical yet fast-paced competitive title, a higher framerate is essential, so we include this title despite its ludicrously high framerates. We use the Vulkan Ultra preset with the High Definition Texture Pack and gather our results from the built-in benchmarking tool.

Rainbow 6 Siege 4K Vulkan Ultra [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

DOOM Eternal

DOOM Eternal brings hell to earth with the Vulkan powered idTech 7. We test this game using the Ultra Nightmare Preset and follow our in-game benchmarking to stay as consistent as possible.

DOOM Eternal 4K Ultra Nightmare [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Gears Tactics

Gears Tactics is the latest in the Gears franchise and takes things in a completely different direction with the gameplay design. It is built on a DX12-based Unreal Engine 4 build. We used the Maximum settings allowed but refrained from enabling Variable Rate Shading, as not all cards are capable of supporting this feature.

Gears Tactics 4K Maximum [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Ghost Recon Breakpoint

Ghost Recon Breakpoint is powered by the latest iteration of the Anvil Next 2.0 game engine. This is the same engine that was used in Assassin's Creed Odyssey but in Breakpoint has been updated to support the Vulkan API. We performed our tests using the High Preset with the Vulkan API.

Ghost Recon Breakpoint 4K Vulkan High [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Call of Duty Modern Warfare (2019)

Call of Duty Modern Warfare is back, this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows; those results are in the RT section. We tested in the 'Fog of War' mission. At 4K we set all settings to High.

Call of Duty Modern Warfare 4K Highest [chart: AVG FPS and 1% Percentile; cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080]

Resident Evil 3

The Resident Evil 3 Remake has surpassed the RE2 Remake in visuals and is the latest use of the RE Engine. While it does have DX12 support the DX11 implementation is far superior and because of that, we will be sticking to DX11 for this title. We use the cutscene where Jill and Carlos enter the subway car for the first time and a 2 minute capture at that point.

[Chart: Resident Evil 3 4K DX11 Maximum. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080.]
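For readers curious how the two series in these charts relate to a timed capture like the one described above, here is a minimal Python sketch that turns a list of frame times into an average FPS and a 1% percentile FPS. The function name and the exact percentile definition are assumptions for illustration; capture tools differ in how they compute the 1% figure, and this is not necessarily the method used for the charts in this review.

```python
def fps_summary(frame_times_ms):
    """Return (average FPS, 1% percentile FPS) for frame times given in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")

    # Average FPS: number of frames divided by the total capture time in seconds.
    total_seconds = sum(frame_times_ms) / 1000.0
    avg_fps = len(frame_times_ms) / total_seconds

    # 1% percentile (assumed definition): the frame rate implied by the frame
    # time at the 99th percentile, i.e. the slow end of the capture.
    ordered = sorted(frame_times_ms)
    index = min(len(ordered) - 1, (len(ordered) * 99) // 100)
    percentile_fps = 1000.0 / ordered[index]

    return avg_fps, percentile_fps
```

A capture with mostly fast frames and a few slow ones will show a healthy average but a much lower 1% percentile, which is why both numbers are worth charting.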

Borderlands 3

Borderlands 3 has made its way into the test lineup thanks to strong demand by gamers and simply delivering MORE Borderlands. This game is rather intensive after the Medium preset but since we're testing the 'Ultimate UW 1440p' card, High it is. We tested using the built-in benchmark utility.

[Chart: Borderlands 3 4K High. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080.]

Total War Saga: Troy

Total War Saga: Troy is powered by the TW Engine 3 (Total War Engine 3), and in this iteration the developers have stuck to a strictly DX11 release. We tested the game using the built-in benchmark with the Dynasty model, which represents a battle with many soldiers interacting at once and is more representative of normal gameplay.

[Chart: Total War Saga. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080.]

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job putting DX12 to use, delivering higher performance than its DX11 mode, so we test this title in DX12. I use the second segment of the benchmark run to gather these numbers, as it is more indicative of in-game scenarios where the foliage is heavy. SotTR features Ray Traced Shadows as well as DLSS, and we used both in the benchmarks, with the game set to the 'Highest' preset, RT Shadows at Ultra, and DLSS enabled.

[Chart: Shadow of the Tomb Raider 1440p 'Highest', RT Shadows Ultra, DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Modern Warfare

Call of Duty Modern Warfare is back and this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows. We ran our RT performance test in the 'Fog of War' mission. At 1440p we set the settings all to High with ray-traced shadows enabled.

[Chart: Call of Duty Modern Warfare 1440p 'High' RT Shadows Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Control

Control is powered by Remedy's Northlight Storytelling Engine but severely pumped up to support multiple functions of ray-traced effects. We ran this through our test run in the cafeteria with all ray tracing functions on high and the game set to high. DLSS was enabled for this title in the quality setting.

[Chart: Control 1440p 'High', RT Reflections, RT Shadows, DLSS Quality. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Battlefield V

Battlefield V was one of the earlier games in the RTX 20 Series lifecycle to receive a DXR update. Battlefield V was tested on the opening sequence of the Tirailleur war story, as it's been consistently one of the more demanding scenes for the ray-traced reflections featured in this game. DLSS was enabled for this game.

[Chart: Battlefield V 1440p 'Ultra' RT Reflections, DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Metro Exodus

Metro Exodus was the third entry into the Metro series, and as Artyom ventures away from the Metro he, and you, are able to explore the world with impressive RT Global Illumination. RTGI has proven to be quite the intense feature to run. Metro Exodus also supports DLSS so it was used in our testing. Advanced PhysX was left disabled, but Hairworks was left on.

[Chart: Metro Exodus 1440p 'Ultra' Ray Tracing 'Ultra' DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Minecraft

Minecraft, yup Minecraft. When it comes to ray tracing Minecraft has it all. The Minecraft with RTX update has recently been updated to DXR1.1 so it gets the latest treatment in that regard. But, we're talking a fully path-traced version of Minecraft here. We set up a run in the RTX world of Crystal Palace and set the Chunks to the maximum of 24, up from the default 8, in order to really turn the wrenches. Minecraft with RTX supports DLSS so it was used here.

[Chart: Minecraft 1440p with RTX '24 Chunks' DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Quake 2 RTX

Quake II RTX is much like Minecraft with RTX in the sense that it is fully path-traced, so no rasterization here. This one however doesn't support DLSS so you're going to have to brute force it to acceptable framerates. Thankfully if these numbers don't do it for you then you can always adjust the resolution slider and enjoy a healthy performance boost.

[Chart: Quake II RTX 1440p. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Boundary

Boundary is a multiplayer tactical shooter...in space. It's not out yet so treat this one as more of a synthetic benchmark as there are likely to be quite a few improvements but for now, we had access to the benchmark and it's a doozy to run. Featuring full raytracing effects for the benchmark as well as DLSS, we ran that in Quality mode.

[Chart: Boundary 1440p RT Enabled, DLSS Quality. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Bright Memory

Bright Memory is an action shooter that is currently in early access on Steam and will later be called Bright Memory Infinite when it fully releases. A one-man team has turned this game into a showstopper and now it features RT reflections as well as DLSS. We ran it at the High preset with DLSS set to Balanced for our testing.

[Chart: Bright Memory 1440p RT High, DLSS Balanced. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Amid Evil

Amid Evil is a high-energy old-school shooter that seems like an unlikely recipient of RT features, but here we are with insane DXR support in a modern retro shooter. Featuring RT Reflections, RT Shadows, and NVIDIA's DLSS support, we had to put this one through the rounds and see how things went. The RTX version of this game is still in beta but publicly available for those who want to try it. We tested with all RT features on and DLSS enabled.

[Chart: Amid Evil 1440p RT Reflections, RT Shadows, Lights 100%, DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Death Stranding

Sam Porter Bridges has delivered one of PS4's most anticipated games to the PC community and opened a whole new world of possibilities. This was the first game to feature the Decima Engine on PC and unarguably did it the best. Death Stranding may not feature ray tracing effects but it does showcase that DLSS can be used effectively even when RT isn't around. We tested this one just like we did in our launch coverage with DLSS enabled.

[Chart: Death Stranding 1440p Highest Settings DLSS Quality. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job putting DX12 to use, delivering higher performance than its DX11 mode, so we test this title in DX12. I use the second segment of the benchmark run to gather these numbers, as it is more indicative of in-game scenarios where the foliage is heavy. SotTR features Ray Traced Shadows as well as DLSS, and we used both in the benchmarks, with the game set to the 'Highest' preset, RT Shadows at Ultra, and DLSS enabled.

[Chart: Shadow of the Tomb Raider 4K 'Highest', RT Shadows Ultra, DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Modern Warfare

Call of Duty Modern Warfare is back and this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows. We ran our RT performance test in the 'Fog of War' mission. At 4K we set the settings all to High with ray-traced shadows enabled.

[Chart: Call of Duty Modern Warfare 4K 'High' RT Shadows Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Control

Control is powered by Remedy's Northlight Storytelling Engine but severely pumped up to support multiple functions of ray-traced effects. We ran this through our test run in the cafeteria with all ray tracing functions on high and the game set to high. DLSS was enabled for this title in the quality setting.

[Chart: Control 4K 'High', RT Reflections, RT Shadows, DLSS Quality. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Battlefield V

Battlefield V was one of the earlier games in the RTX 20 Series lifecycle to receive a DXR update. Battlefield V was tested on the opening sequence of the Tirailleur war story, as it's been consistently one of the more demanding scenes for the ray-traced reflections featured in this game. DLSS was enabled for this game.

[Chart: Battlefield V 4K 'Ultra' RT Reflections, DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Metro Exodus

Metro Exodus was the third entry into the Metro series, and as Artyom ventures away from the Metro he, and you, are able to explore the world with impressive RT Global Illumination. RTGI has proven to be quite an intense feature to run. Metro Exodus also supports DLSS so it was used in our testing. Advanced PhysX was left disabled, but Hairworks was left on.

[Chart: Metro Exodus 4K 'Ultra' Ray Tracing 'Ultra' DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Minecraft

Minecraft, yup Minecraft. When it comes to ray tracing Minecraft has it all. The Minecraft with RTX update has recently been updated to DXR1.1 so it gets the latest treatment in that regard. But, we're talking a fully path traced version of Minecraft here. We set up a run in the RTX world of Crystal Palace and set the Chunks to the maximum of 24, up from the default 8 in order to really turn the wrenches. DLSS was enabled for this game.

[Chart: Minecraft 4K with RTX '24 Chunks' DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Quake 2 RTX

Quake II RTX is much like Minecraft with RTX in the sense that it is fully path-traced, so no rasterization here. This one however doesn't support DLSS so you're going to have to brute force it to acceptable framerates. Thankfully if these numbers don't do it for you then you can always adjust the resolution slider and enjoy a healthy performance boost.

[Chart: Quake II RTX 4K. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Boundary

Boundary is a multiplayer tactical shooter...in space. It's not out yet so treat this one as more of a synthetic benchmark as there are likely to be quite a few improvements but for now, we had access to the benchmark and it's a doozy to run. Featuring full raytracing effects for the benchmark as well as DLSS, we ran that in Quality mode.

[Chart: Boundary 4K RT Enabled, DLSS Quality. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Bright Memory

Bright Memory is an action shooter that is currently in early access on Steam and will later be called Bright Memory Infinite when it fully releases. A one-man team has turned this game into a showstopper and now it features RT reflections as well as DLSS. We ran it at the High preset with DLSS set to Balanced for our testing.

[Chart: Bright Memory 4K RT High, DLSS Balanced. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Amid Evil

Amid Evil is a high-energy old-school shooter that seems like an unlikely recipient of RT features, but here we are with insane DXR support in a modern retro shooter. Featuring RT Reflections, RT Shadows, and NVIDIA's DLSS support, we had to put this one through the rounds and see how things went. The RTX version of this game is still in beta but publicly available for those who want to try it. We tested with all RT features on and DLSS enabled.

[Chart: Amid Evil 4K RT Reflections, RT Shadows, Lights 100%, DLSS Enabled. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Death Stranding

Sam Porter Bridges has delivered one of PS4's most anticipated games to the PC community and opened a whole new world of possibilities. This was the first game to feature the Decima Engine on PC and unarguably did it the best. Death Stranding may not feature ray tracing effects but it does showcase that DLSS can be used effectively even when RT isn't around. We tested this one just like we did in our launch coverage with DLSS enabled.

[Chart: Death Stranding 4K Highest Settings DLSS Quality. Series: AVG FPS, 1% Percentile. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

Graphics cards and power draw have always been closely tied together in terms of how much performance a card puts out for the power it takes in. Measuring this has not always been straightforward when it comes to accuracy and methods for reviewers and end-users. NVIDIA has developed its PCAT system, or Power Capture Analysis Tool, to capture direct power consumption from ALL graphics cards that plug into the PCIe slot, so that you can get a very clear barometer of actual power usage without relying on hacked-together methods.

The Old Way

The old method, for most anyway, was to simply use something along the lines of a Kill-A-Watt wall meter for power capture. This isn't the worst way, but as stated in our reviews, it doesn't quite capture the amount of power that the graphics card alone is using. This results in some mental gymnastics to figure out how much the graphics card is using: you take the system idle, the CPU load, and the GPU load, then estimate about where the graphics card lands. Not very accurate, to say the least.

Another way is to use GPU-Z. This is the least reliable method, as you have to rely entirely on the software reading from the graphics card, and graphics cards vary in how they report power usage to software. Some will only report what the GPU core itself is using and not consider what the memory or any other component is drawing.

The last way I'll mention is the use of a multimeter amperage clamp on the PCIe slot, by way of a riser cable with separate cables, plus more clamps on all the PCIe power cables going into the graphics card. This method is very accurate for graphics card power but is also very cumbersome, and it typically results in you having to watch the numbers and document them as you see them rather than plotting them across a spreadsheet.
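To make the wall-meter "mental gymnastics" described at the start of this section concrete, here is a minimal Python sketch of that kind of estimate. Everything in it is an assumption for illustration: the function name, the idea of subtracting a separately measured CPU contribution, and the default PSU efficiency are hypothetical, not a method taken from this review.

```python
def estimate_gpu_power_from_wall(wall_load_w, wall_idle_w, cpu_load_w,
                                 psu_efficiency=0.90):
    """Rough GPU power estimate in watts from wall-meter readings.

    Assumptions (hypothetical, for illustration only):
    - wall_idle_w covers everything that is not CPU or GPU load
    - cpu_load_w is the extra wall power the CPU adds in the same workload
    - psu_efficiency converts wall (AC) power into delivered (DC) power
    """
    if not 0 < psu_efficiency <= 1:
        raise ValueError("psu_efficiency must be in (0, 1]")
    # Whatever is left after removing idle and CPU draw is blamed on the GPU.
    extra_at_wall = wall_load_w - wall_idle_w - cpu_load_w
    # Clamp at zero so noisy readings never yield a negative estimate.
    return max(0.0, extra_at_wall * psu_efficiency)
```

Every input here carries its own error, which is exactly why this approach only gets you in the neighborhood of the real graphics card power.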

The PCAT Way

This is where PCAT (Power Capture Analysis Tool) comes into play. NVIDIA has developed quite a robust tool for measuring graphics card power at the hardware level and taking the guesswork out of the equation. The tool is quite simple to set up and get going. The components are: a riser board for the GPU with a 4-pin Dupont cable, the PCAT module itself that everything plugs into with an OLED screen attached, 3 PCIe cables for when a card calls for more than 2x 8-pin connectors, and a Micro-USB cable that allows you to capture the data on the system you're hooked up to or on a secondary monitoring system.

Well, that's what it looks like when all hooked up on a test bench; you're not going to want to run this one in a case for sure. Before anyone gets worried, performance is not affected at all by this and the riser board is fully compliant with PCIe Gen 4.0. I'm not so certain about those exposed power points, however, so I will be getting the hot glue gun out soon for that. Now, what does this do at this point? Well, there are two options: plug it into the computer that it's all running on and let FrameView include the metrics, but that's for NVIDIA cards only so it's a pass, OR (what we do) plug it into a separate monitoring computer and observe and capture during testing scenarios.

The PCAT Power Profile Analyzer is the software tool provided to capture and monitor power readings across the PCI Express power profile. The breadth of this tool is exceptionally useful for us here on the site to really explore what we can monitor. The most useful feature to me is the ability to monitor power across all sources: the PCIe power cables (individually) and the PCIe slot itself.
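As a rough illustration of why per-source monitoring matters, total graphics card power is simply the PCIe slot draw plus the draw on each PCIe power cable. The Python sketch below assumes a hypothetical sample layout of `(slot_watts, [cable_watts, ...])`; it does not reflect the actual export format of the PCAT software.

```python
def board_power(slot_w, cable_w):
    """Total card power for one sample: PCIe slot plus every PCIe power cable."""
    return slot_w + sum(cable_w)


def average_board_power(samples):
    """Average total power over a capture.

    Each sample is a (slot_w, [cable_w, ...]) tuple; this layout is an
    assumption made for the example, not a real PCAT data format.
    """
    if not samples:
        raise ValueError("need at least one sample")
    totals = [board_power(slot_w, cable_w) for slot_w, cable_w in samples]
    return sum(totals) / len(totals)
```

Keeping the sources separate until the final sum also lets you spot a card that leans unusually hard on the slot or on a single cable.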

Those who would rather pull long-form spreadsheets to make their own charts are fully able to do so, and can even quickly build performance-per-watt metrics. We've found that a very fun metric to monitor is actually watts per frame: how many watts does it take for the graphics card to produce one frame at a locked 60FPS in various games? We'll get into that next.
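The watts-per-frame idea boils down to one division: graphics card power at the locked frame rate, divided by that frame rate. Here is a minimal Python sketch of it; the function names and the dictionary layout are our own assumptions for illustration, and no measured values from this review are used.

```python
def watts_per_fps(gpu_power_w, fps):
    """Watts the card draws per frame per second delivered (lower is better)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return gpu_power_w / fps


def rank_by_efficiency(results):
    """Sort {card: (gpu_power_w, fps)} from fewest to most watts per FPS."""
    scored = {card: watts_per_fps(power, fps)
              for card, (power, fps) in results.items()}
    return sorted(scored.items(), key=lambda item: item[1])
```

Because every card is held to the same locked frame rate, the ranking comes down to which card needs the least power to deliver the same experience.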

Control Power

Control was the first game that we wanted to take a look at, running at 1440p with RT and DLSS on, and then again with DLSS off. This is the game that NVIDIA used when showcasing the performance per watt improvements of Ampere, and well, they were right in the claim there.

[Chart: Control 1440p 'High' RT High, DLSS On. Series: GPU Full Load, Total System, 1440p60 Power Load. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

[Chart: Control RT Watts-Per-FPS. Series: Watts-Per-FPS. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080.]

[Chart: Control 1440p 'High' No RT, DLSS Off. Series: GPU Full Load, Total System, 1440p60 Power Load. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700 XT.]

[Chart: Control non-RT Watts-Per-FPS. Series: Watts-Per-FPS. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700 XT.]

These results for Control show that NVIDIA's measurements and claims of improvement were accurate, but that's not always the case. We tested Forza Horizon 4 the same way, but this time at 4K, looking at power draw when targeting a locked 4K60 in this game.

[Chart: Forza Horizon 4 4K Ultra. Series: GPU Idle, GPU Full Load, Total System, 1440p60 Power Load. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080.]

[Chart: Forza Horizon 4 Watts-Per-FPS. Series: Watts-Per-FPS. Cards: RTX 3080, RTX Titan, RTX 2080Ti, RTX 2080, RX 5700XT, GTX 1080Ti, GTX 1080.]

Overclocking the GA102-powered GeForce RTX 3080 will get much more attention down the road, but for now, we did a quick and dirty (but still stable) overclock using MSI Afterburner. We took the power slider to the 115% mark and toyed with balancing the GPU and memory as we crept up. Some have gotten massive overclocks on the memory; we did not. We managed a +500 offset on our memory, which pushed us up to 20Gbps and a memory bandwidth of 800GB/s. We tried for +750, but memory protection kicked in and our performance suffered greatly.
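The 800GB/s figure follows from the usual bandwidth arithmetic: per-pin data rate multiplied by the bus width, divided by 8 bits per byte. The short Python sketch below shows that calculation; the 320-bit bus width is not stated in this section and is taken here as an assumption from the card's published specification, and the function name is ours.

```python
def memory_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Peak memory bandwidth in GB/s.

    data_rate_gbps: effective per-pin data rate in Gbps
    bus_width_bits: memory bus width in bits
    """
    return data_rate_gbps * bus_width_bits / 8


# Assumption: 320-bit memory bus, per the RTX 3080 10 GB published spec.
BUS_WIDTH_BITS = 320

# 20 Gbps is the overclocked memory speed quoted in the text above.
overclocked_bandwidth = memory_bandwidth_gb_s(20, BUS_WIDTH_BITS)  # 800.0 GB/s
```

The same function makes it easy to see how much bandwidth each extra Gbps of memory speed buys on a bus of a given width.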

The core could be pushed to +125 if we left the memory at +250, but we found the memory gains more appreciable, so we scaled the core back to +75, resulting in a gaming average clock rate between 1925 and 1960MHz. Once overclocked this way, the power draw rose quite a bit, with the total system pulling over 500W and the graphics card alone accounting for 369W of that. For the 5% or so uplift at 4K, I would likely leave overclocking alone for now until you find you need it down the road.

Firestrike

Firestrike runs on the DX11 API and is still a good measure of GPU scaling performance. In this test we ran the Ultra version of Firestrike, which runs at 4K, and we recorded the Graphics Score only, since the Physics and Combined scores are not pertinent to this review.

[Chart: 3DMark Firestrike Ultra Graphics. Series: Score. Cards: RTX 3080 OC, RTX 3080.]

Time Spy

Time Spy runs on the DX12 API, and we used it in the same manner as Firestrike: we only recorded the Graphics Score, as the Physics score records CPU performance and isn't important to the testing we are doing here.

[Chart: 3DMark Time Spy Extreme Graphics. Series: Score. Cards: RTX 3080 OC, RTX 3080.]

Forza Horizon 4

Forza Horizon 4 carries on the open-world racing tradition of the Horizon series. The latest DX12 powered entry is beautifully crafted and amazingly well executed and is a great showcase of DX12 games. We use the benchmark run while having all of the settings set to non-dynamic with an uncapped framerate to gather these results.

[Chart: Forza Horizon 4 4K Ultra. Series: AVG FPS, 1% Percentile. Cards: RTX 3080 OC, RTX 3080.]

Rainbow 6 Siege

Rainbow 6 Siege has maintained a massive following since its launch, and it is consistently among Steam's top ten games by player count. Higher framerates are essential in this tactical yet fast-paced competitive title, so we include it despite its ludicrously high framerates. We use the Vulkan Ultra preset with the High Definition Texture Pack as well and gather our results from the built-in benchmarking tool.

[Chart: Rainbow 6 Siege 4K Vulkan Ultra. Series: AVG FPS, 1% Percentile. Cards: RTX 3080 OC, RTX 3080.]

Resident Evil 3

The Resident Evil 3 Remake has surpassed the RE2 Remake in visuals and is the latest use of the RE Engine. While it does have DX12 support, the DX11 implementation is far superior, and because of that we will be sticking to DX11 for this title. We use the cutscene where Jill and Carlos enter the subway car for the first time and take a 2-minute capture from that point.

[Chart: Resident Evil 3 4K DX11 Maximum. Series: AVG FPS, 1% Percentile. Cards: RTX 3080 OC, RTX 3080.]

Thermals

Thermals were measured on our open test bench after running Time Spy graphics test 2 on loop for 30 minutes, recording the highest temperatures reported. The room was climate controlled and kept at a constant 22°C throughout the testing.
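For anyone logging their own temperatures during a similar loop, the reduction described above is just the maximum of the logged samples, optionally expressed as a rise over ambient. The Python sketch below is a minimal illustration; the function name and the list-of-samples input are assumptions, and only the 22°C ambient comes from the text.

```python
def peak_and_delta(samples_c, ambient_c=22.0):
    """Return (peak temperature, rise over ambient) in degrees Celsius.

    samples_c: temperatures logged during the stress loop (assumed input)
    ambient_c: room temperature; 22.0 matches the ambient quoted above
    """
    if not samples_c:
        raise ValueError("need at least one temperature sample")
    peak = max(samples_c)
    return peak, peak - ambient_c
```

Reporting the rise over ambient alongside the peak makes results easier to compare with tests run in warmer or cooler rooms.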

[Chart: Temperatures (22°C Ambient). Series: Load, Idle. Cards: RTX 3080 OC, RTX 3080.]

Power Consumption While Overclocked

[Chart: Overclocked Power Draw. Series: GPU Idle, GPU Full Load, Total System. Cards: RTX 3080 OC, RTX 3080.]

When Turing launched there were a lot of unknowns regarding future support. Two years later, we have a much better view of the market. With the upcoming gaming landscape, we can see the benefits of what Turing brought to the market finally coming to fruition. But now those unknowns are a real thing, and they're growing. The Ampere-powered NVIDIA GeForce RTX 3080 simply brings it in a big way and completely obliterates everything on the market when it comes to performance. The Titan RTX fell prey to what NVIDIA is calling its new flagship card, with the Titan only besting it in Gears Tactics.

The design of the new Founders Edition cooler is quite interesting no matter how you look at it: heatsink everywhere and really quiet operation while keeping the 320W monster card cool as a cucumber. There was a lot of concern from the community coming into this one regarding thermal performance, and it's clear that NVIDIA was NOT going to succumb to the Hot & Loud label that has befallen others when even trying to come close to the level of performance that this card is capable of.

Speaking of performance, it's there. There was a time I had resigned myself to the idea that this level of generational performance gain was gone forever. But, here we are. Even at the RTX 3080's worst uplift over the RTX 2080, we get a typical generational performance boost, and on average we're seeing what would typically be a generation PLUS one tier up in performance. The GeForce RTX 3080 simply leaves the RTX 2080Ti, at nearly double the price, in its dust. I can now eat my words from when I said on Twitter that we were still quite some time away from having a really good 4K gaming experience without it requiring a $1000+ graphics card. We're there. If you're running 1440p then you're going to be killing it on a 144Hz panel and want more, but if you're on a 3440x1440 144Hz screen you're going to be in heaven.

I can't go over performance without mentioning DXR and ray tracing performance. It's there. People argued that the RTX 20 Series just wasn't quite there, but the RTX 30 Series is it. We're seeing great performance in all games that run the RT features, and even better when they're paired with DLSS. Early implementations of DLSS may have been marred by image quality issues, but the later DLSS 2.0 has been spectacular. The latest update to DXR1.1 in Minecraft with RTX has shown that this is continuing to get better, and if you look at upcoming games, the net is getting wider. This time you get a massive uplift no matter what kind of games you're playing.

The cooler on this card does its thing, and with flying colors. There is still some work to be done on our end regarding in-case effectiveness and how it impacts other components, but the time I've spent with it in my personal gaming system, alongside a Ryzen 5 3600X in an NZXT H210i case, has been more than acceptable in terms of thermals and noise. More on that in a follow-up piece.

The elephant in the room with the RTX 3080 has likely been its power requirements and the PSU concerns that come with them. You're going to have to feed this beast; it goes all out and doesn't make any apologies about it. The performance comes along with it, so you're not just cranking the power without the payoff. In our testing under the default configuration for the RTX 3080, we found our total system power under 500W, far off from the 'you need 1kW PSUs' meme we've seen lately. Power draw high? Yes. Performance high? Also yes.

If you've been holding out like so many with a 1080Ti, you might have just found your price-equivalent upgrade. Even if you have an RTX 2080 Ti, you could see the benefit of upgrading if you feel you need quite the boost. Ampere delivers, and it's quite easy to see why NVIDIA was so AMP'd.
