NVIDIA GeForce RTX 3070 8GB Founders Edition Graphics Card Review

Oct 27, 2020 at 08:58am EDT

NVIDIA Ampere GPU - Next-Gen Display Engine, HDMI 2.1, RTX IO & More

Keeping alive its tradition of launching a new graphics architecture every two years, NVIDIA this year introduces its Ampere GPU. The Ampere GPU is built upon the foundation set by Turing. Termed its biggest generational leap, the NVIDIA Ampere GPUs excel over previous generations at everything. This time, we are looking at the GeForce RTX 3070 Founders Edition.

The Ampere GPU architecture has a lot to be talked about in this review, but so does the new RTX lineup. The Ampere lineup offers faster shader performance, faster ray tracing performance, and faster AI performance. Built on a brand new process node and featuring an architecture designed from the ground up, Ampere is a killer product with lots of numbers to talk about.

The fundamental idea behind Ampere was to take everything NVIDIA learned with its Turing architecture and not only refine it but also use its DNA to form a product in a completely new performance category. Tall claims were made by NVIDIA when it introduced its Ampere lineup earlier this month & we will be finding out whether NVIDIA ticked all the boxes with its Ampere architecture, as this review will be your guide to what makes up Ampere and how it performs against its predecessors.

Today, we will be taking a look at the NVIDIA GeForce RTX 3070 Founders Edition graphics card. The card was provided by NVIDIA for the sole purpose of this review & we will be taking a look at their technology, design, and performance metrics in detail.

NVIDIA GeForce RTX 30 Series Gaming Graphics Cards - The Biggest GPU Performance Leap in Recent History

Turing wasn't just any graphics core, it was the graphics core that was to become the foundation of future GPUs. That future is being realized now, with next-generation consoles going deep into ray tracing and AI-assisted super-sampling techniques. NVIDIA had a head start with Turing and its Ampere generation will only do things far better.

The Ampere GPU does many traditional things which we would expect from a GPU, but at the same time, also breaks the barrier when it comes to untraditional GPU operations. Just to sum up some features:

The technologies mentioned above are some of the main building blocks of the Ampere GPU, but there's more within the graphics core itself which we will talk about in detail so let's get started.

Let's take a trip down the road to Ampere. In 2016, NVIDIA announced their Pascal GPUs which would soon be featured in their top to bottom GeForce 10 series lineup. After the launch of Maxwell, NVIDIA gained a lot of experience in the efficiency department, which they had put a focus on since their Kepler GPUs. Two years ago, NVIDIA, rather than offering another standard leap in the rasterization performance of its GPUs, took a different approach & introduced two key technologies in its Turing line of consumer GPUs, one being AI-assisted acceleration with the Tensor Cores and the second being hardware-level acceleration for Ray Tracing with its brand new RT cores.

With Ampere and its brand new Samsung 8nm fabrication process, NVIDIA is adding even more to its gaming graphics lineup. Starting with the most significant part of the Ampere GPU architecture, the Ampere SM, we are seeing an entirely new graphics core. The Ampere SM features the next-gen FP32, INT32, Tensor Cores, and RT cores.

Coming to the new execution units or cores, Ampere has both INT32 and FP32 units which can execute concurrently. This architectural design allows Ampere to execute floating-point and non-floating-point operations in parallel, which allows for higher throughput in standard floating-point operations. According to NVIDIA, the updated Ampere graphics core delivers up to 1.7x faster traditional rasterization performance and up to 2x faster ray-tracing performance compared to the Turing GPUs.

The Ampere SM is partitioned into four processing blocks, each with 32 FP32 Cores, 16 INT32 Cores, one Tensor Core, one warp scheduler, and one dispatch unit. This is made possible by an updated design with two data paths, one offering 16 FP32 execution units while the other offers either 16 FP32 or 16 INT32 execution units. This adds up to 128 FP32 Cores, 64 INT32 Cores, 4 Tensor Cores, 4 Warp Schedulers, and 4 Dispatch Units on a single Ampere SM. Each block also includes a new L0 instruction cache and a 64 KB register file for a total of 256 KB register file per SM.

 One of the key design goals for the Ampere 30-series SM was to achieve twice the throughput for FP32 operations compared to the Turing SM. To accomplish this goal, the Ampere SM includes new datapath designs for FP32 and INT32 operations. One datapath in each partition consists of 16 FP32 CUDA Cores capable of executing 16 FP32 operations per clock. Another datapath consists of both 16 FP32 CUDA Cores and 16 INT32 Cores. As a result of this new design, each Ampere SM partition is capable of executing either 32 FP32 operations per clock, or 16 FP32 and 16 INT32 operations per clock. All four SM partitions combined can execute 128 FP32 operations per clock, which is double the FP32 rate of the Turing SM, or 64 FP32 and 64 INT32 operations per clock.
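The per-clock arithmetic above can be sketched in a few lines of Python. This is only an illustrative model built from the lane counts quoted in the paragraph; the function and constant names are made up for this example and the real scheduler is far more nuanced.

```python
# Illustrative per-clock throughput model of one Ampere SM.
# Lane counts come from the paragraph above; names are hypothetical.
PARTITIONS_PER_SM = 4
FP32_ONLY_LANES = 16   # first datapath: FP32 only
SHARED_LANES = 16      # second datapath: FP32 or INT32 in a given clock


def sm_ops_per_clock(shared_path_runs_int32: bool) -> tuple:
    """Return (fp32_ops, int32_ops) per clock for a whole SM."""
    if shared_path_runs_int32:
        fp32, int32 = FP32_ONLY_LANES, SHARED_LANES
    else:
        fp32, int32 = FP32_ONLY_LANES + SHARED_LANES, 0
    return fp32 * PARTITIONS_PER_SM, int32 * PARTITIONS_PER_SM


print(sm_ops_per_clock(False))  # pure FP32 workload: (128, 0)
print(sm_ops_per_clock(True))   # mixed workload: (64, 64)
```

The second case is why real-world gains depend on the instruction mix: every clock the shared datapath spends on integer work is a clock it cannot spend on FP32.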

Doubling the processing speed for FP32 improves performance for a number of common graphics and compute operations and algorithms. Modern shader workloads typically have a mixture of FP32 arithmetic instructions such as FFMA, floating point additions (FADD), or floating point multiplications (FMUL), combined with simpler instructions such as integer adds for addressing and fetching data, floating point compare, or min/max for processing results, etc. Performance gains will vary at the shader and application level depending on the mix of instructions. Ray tracing denoising shaders are good examples that might benefit greatly from doubling FP32 throughput.

Doubling math throughput required doubling the data paths supporting it, which is why the Ampere SM also doubled the shared memory and L1 cache performance for the SM. (128 bytes/clock per Ampere SM versus 64 bytes/clock in Turing). Total L1 bandwidth for GeForce RTX 3080 is 219 GB/sec versus 116 GB/sec for GeForce RTX 2080 Super.
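As a rough sanity check, the quoted L1 figures appear to line up with bytes per clock multiplied by the boost clock of a single SM. The sketch below assumes a 1710 MHz boost clock for the RTX 3080 and 1815 MHz for the RTX 2080 Super; the latter is not stated in this article and is an assumption.

```python
# Hedged check: L1 bandwidth per SM = bytes per clock * boost clock.
# The 1815 MHz boost clock for the RTX 2080 Super is an assumption.
def l1_bandwidth_gb_s(bytes_per_clock: int, boost_mhz: int) -> float:
    return bytes_per_clock * boost_mhz * 1e6 / 1e9


ampere = l1_bandwidth_gb_s(128, 1710)  # RTX 3080
turing = l1_bandwidth_gb_s(64, 1815)   # RTX 2080 Super (assumed clock)

print(f"RTX 3080:       {ampere:.0f} GB/sec")
print(f"RTX 2080 Super: {turing:.0f} GB/sec")
```

Under these assumptions the results round to the 219 GB/sec and 116 GB/sec figures quoted above.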

Like prior NVIDIA GPUs, Ampere is composed of Graphics Processing Clusters (GPCs), Texture Processing Clusters (TPCs), Streaming Multiprocessors (SMs), Raster Operators (ROPS), and memory controllers.

The GPC is the dominant high-level hardware block with all of the key graphics processing units residing inside the GPC. Each GPC includes a dedicated Raster Engine, and now also includes two ROP partitions (each partition containing eight ROP units), which is a new feature for NVIDIA Ampere Architecture GA10x GPUs. More details on the NVIDIA Ampere architecture can be found in NVIDIA’s Ampere Architecture White Paper, which will be published in the coming days.

The four processing blocks share a combined 128 KB L1 data cache/shared memory. Traditional graphics workloads partition the 128 KB L1/shared memory as 64 KB of dedicated graphics shader RAM and 64 KB for texture cache and register file spill area. In compute mode, the GA10x SM will support the following configurations:

  • 128 KB L1 + 0 KB Shared Memory
  • 120 KB L1 + 8 KB Shared Memory
  • 112 KB L1 + 16 KB Shared Memory
  • 96 KB L1 + 32 KB Shared Memory
  • 64 KB L1 + 64 KB Shared Memory
  • 28 KB L1 + 100 KB Shared Memory

Ampere also ties its ROPs to the GPC and houses a total of 16 ROP units per GPC. The full GA102 GPU features 112 ROPs while the GeForce RTX 3080 comes with a total of 96 ROPs.

The block diagram of the NVIDIA Ampere SM Gaming GPUs.

The entire SM works in harmony by using different blocks to deliver high performance and better texture caching, enabling up to twice as much CUDA core performance when compared to the previous generation.

A block diagram of the GA102 GPU featured on the NVIDIA GeForce RTX 3080 graphics card.

Many of these Ampere SMs combine to form the Ampere GPU. Each TPC inside the Ampere GPU houses 2 Ampere SMs which are linked to the raster engine. There are a total of 6 TPCs or 12 Ampere SMs arranged inside each GPC or Graphics Processing Cluster. The top configured GA102 GPU comes with 7 GPCs with a total of 42 TPCs and 84 SMs that are connected to 10 MB of L1 and 6 MB of L2 cache, ROPs, TMUs, memory controllers, and the NVLINK High Speed I/O hub. All of this combines to form the massive Ampere GA102 GPU. The following are some perf figures for the top Ampere graphics cards.

NVIDIA GeForce RTX 3090

  • 35.58 TFLOPS of peak single-precision (FP32) performance
  • 71.16 TFLOPS of peak half-precision (FP16) performance
  • 17.79 TIPS concurrent with FP, through independent integer execution units
  • 285 Tensor TFLOPS
  • 69 RT-TFLOPs

NVIDIA GeForce RTX 3080

  • 30 TFLOPS of peak single-precision (FP32) performance
  • 60 TFLOPS of peak half-precision (FP16) performance
  • 15 TIPS concurrent with FP, through independent integer execution units
  • 238 Tensor TFLOPS
  • 58 RT-TFLOPs

NVIDIA GeForce RTX 3070

  • 20.3 TFLOPS of peak single-precision (FP32) performance
  • 40.6 TFLOPS of peak half-precision (FP16) performance
  • 10.1 TIPS concurrent with FP, through independent integer execution units
  • 162.6 Tensor TFLOPS
  • 39.7 RT-TFLOPs
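The peak FP32 figures in these lists follow from a simple formula: CUDA cores times two floating-point operations per clock (one fused multiply-add) times the boost clock. A minimal sketch, using the core counts and boost clocks listed in the specification table of this review:

```python
# Peak FP32 throughput = CUDA cores * 2 FLOPs per clock (one FMA) * boost clock.
# Core counts and boost clocks are taken from the spec table in this review.
def peak_fp32_tflops(cuda_cores: int, boost_mhz: int) -> float:
    return cuda_cores * 2 * boost_mhz * 1e6 / 1e12


cards = {
    "RTX 3090": (10496, 1695),
    "RTX 3080": (8704, 1710),
    "RTX 3070": (5888, 1725),
}

for name, (cores, boost_mhz) in cards.items():
    print(f"{name}: {peak_fp32_tflops(cores, boost_mhz):.2f} TFLOPS")
```

The RTX 3080 works out to roughly 29.8 TFLOPS, which is commonly rounded to 30. These are theoretical peaks at the rated boost clock, not sustained performance in games.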

In terms of shading performance which is the direct result of the enhanced core design and GPU architecture revamp, the Ampere GPU offers an uplift of up to 70% better performance per core compared to Turing GPUs.

It should be pointed out that these are just per core performance gains at the same clock speeds without adding the benefits of other technologies that Ampere comes with. That would further increase the performance in a wide variety of gaming applications.

NVIDIA Ampere "GeForce RTX 30" GPUs Full Breakdown:

| Graphics Card | NVIDIA GeForce RTX 2070 SUPER | NVIDIA GeForce RTX 3070 | NVIDIA GeForce RTX 2080 | NVIDIA GeForce RTX 3080 | NVIDIA Titan RTX | NVIDIA GeForce RTX 3090 |
| --- | --- | --- | --- | --- | --- | --- |
| GPU Codename | TU104 | GA104 | TU104 | GA102 | TU102 | GA102 |
| GPU Architecture | NVIDIA Turing | NVIDIA Ampere | NVIDIA Turing | NVIDIA Ampere | NVIDIA Turing | NVIDIA Ampere |
| GPCs | 5 or 6 | 6 | 6 | 6 | 6 | 7 |
| TPCs | 20 | 23 | 23 | 34 | 36 | 41 |
| SMs | 40 | 46 | 46 | 68 | 72 | 82 |
| CUDA Cores / SM | 64 | 128 | 64 | 128 | 64 | 128 |
| CUDA Cores / GPU | 2560 | 5888 | 2944 | 8704 | 4608 | 10496 |
| Tensor Cores / SM | 8 (2nd Gen) | 4 (3rd Gen) | 8 (2nd Gen) | 4 (3rd Gen) | 8 (2nd Gen) | 4 (3rd Gen) |
| Tensor Cores / GPU | 320 (2nd Gen) | 184 (3rd Gen) | 368 | 272 (3rd Gen) | 576 (2nd Gen) | 328 (3rd Gen) |
| RT Cores | 40 (1st Gen) | 46 (2nd Gen) | 46 (1st Gen) | 68 (2nd Gen) | 72 (1st Gen) | 82 (2nd Gen) |
| GPU Boost Clock (MHz) | 1770 | 1725 | 1800 | 1710 | 1770 | 1695 |
| Peak FP32 TFLOPS (non-Tensor) | 9.1 | 20.3 | 10.6 | 29.8 | 16.3 | 35.6 |
| Peak FP16 TFLOPS (non-Tensor) | 18.1 | 20.3 | 21.2 | 29.8 | 32.6 | 35.6 |
| Peak BF16 TFLOPS (non-Tensor) | NA | 20.3 | NA | 29.8 | NA | 35.6 |
| Peak INT32 TOPS (non-Tensor) | 9.1 | 10.2 | 10.6 | 14.9 | 16.3 | 17.8 |
| Peak FP16 Tensor TFLOPS with FP16 Accumulate | 72.5 | 81.3/162.6 | 84.8 | 119/238 | 130.5 | 142/284 |
| Peak FP16 Tensor TFLOPS with FP32 Accumulate | 36.3 | 40.6/81.3 | 42.4 | 59.5/119 | 65.2 | 71/142 |
| Peak BF16 Tensor TFLOPS with FP32 Accumulate | NA | 40.6/81.3 | NA | 59.5/119 | NA | 71/142 |
| Peak TF32 Tensor TFLOPS | NA | 20.3/40.6 | NA | 29.8/59.5 | NA | 35.6/71 |
| Peak INT8 Tensor TOPS | 145 | 162.6/325.2 | 169.6 | 238/476 | 261 | 284/568 |
| Peak INT4 Tensor TOPS | 290 | 325.2/650.4 | 339.1 | 476/952 | 522 | 568/1136 |
| Frame Buffer Memory Size and Type | 8 GB GDDR6 | 8 GB GDDR6 | 8 GB GDDR6 | 10 GB GDDR6X | 24 GB GDDR6 | 24 GB GDDR6X |
| Memory Interface | 256-bit | 256-bit | 256-bit | 320-bit | 384-bit | 384-bit |
| Memory Clock (Data Rate) | 14 Gbps | 14 Gbps | 14 Gbps | 19 Gbps | 14 Gbps | 19.5 Gbps |
| Memory Bandwidth | 448 GB/sec | 448 GB/sec | 448 GB/sec | 760 GB/sec | 672 GB/sec | 936 GB/sec |
| ROPs | 64 | 96 | 64 | 96 | 96 | 112 |
| Pixel Fill-rate (Gigapixels/sec) | 113.3 | 165.6 | 115.2 | 164.2 | 169.9 | 193 |
| Texture Units | 160 | 184 | 184 | 272 | 288 | 328 |
| Texel Fill-rate (Gigatexels/sec) | 283.2 | 317.4 | 331.2 | 465 | 509.8 | 566 |
| L1 Data Cache/Shared Memory | 3840 KB | 5888 KB | 4416 KB | 8704 KB | 6912 KB | 10496 KB |
| L2 Cache Size | 4096 KB | 4096 KB | 4096 KB | 5120 KB | 6144 KB | 6144 KB |
| Register File Size | 10240 KB | 11776 KB | 11776 KB | 17408 KB | 18432 KB | 20992 KB |
| TGP (Total Graphics Power) | 215W | 220W | 225W | 320W | 280W | 350W |
| Transistor Count | 13.6 Billion | 17.4 Billion | 13.6 Billion | 28.3 Billion | 18.6 Billion | 28.3 Billion |
| Die Size | 545 mm2 | 392.5 mm2 | 545 mm2 | 628.4 mm2 | 754 mm2 | 628.4 mm2 |
| Manufacturing Process | TSMC 12 nm FFN (FinFET NVIDIA) | Samsung 8 nm 8N NVIDIA Custom Process | TSMC 12 nm FFN (FinFET NVIDIA) | Samsung 8 nm 8N NVIDIA Custom Process | TSMC 12 nm FFN (FinFET NVIDIA) | Samsung 8 nm 8N NVIDIA Custom Process |

NVIDIA Ampere GPUs - GA102 & GA104 For The First Wave of Gaming Cards

NVIDIA is first introducing two brand new Ampere GPUs which include the GA102 and the GA104. The GA102 GPU is going to be featured on the GeForce RTX 3090 and GeForce RTX 3080 graphics cards while the GA104 GPU is going to be featured on the GeForce RTX 3070 graphics cards. The Ampere GPUs are based on the Samsung 8nm custom process node for NVIDIA and as such, the resultant GPU dies are slightly smaller than their Turing based predecessors but do come with a denser transistor layout. There will be several variations of each GPU featured across the RTX 30 series lineup. Following is what the complete GA102 and GA104 GPUs have to offer.

NVIDIA Ampere GA102 GPU

The full GA102 GPU is made up of 7 graphics processing clusters with 12 SM units on each cluster. That makes up 84 SM units for a total of 10752 cores in a 28.3 billion transistor package measuring 628.4mm2.

NVIDIA Ampere GA104 GPU

The full GA104 GPU is made up of 6 graphics processing clusters with 8 SM units on each cluster. That makes up 48 SM units for a total of 6144 cores in a 17.4 billion transistor package measuring 392.5mm2.
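The full-die numbers for both chips follow directly from the GPC and SM counts, with 128 CUDA cores per SM. A small sketch of that arithmetic, which also shows how much of each die is enabled on the shipping cards (the enabled SM counts come from the breakdown table earlier in this review):

```python
# Full-die SM and CUDA core counts from GPC layout (128 CUDA cores per Ampere SM).
CUDA_CORES_PER_SM = 128


def full_die(gpcs: int, sms_per_gpc: int) -> tuple:
    """Return (total_sms, total_cuda_cores) for a fully enabled die."""
    sms = gpcs * sms_per_gpc
    return sms, sms * CUDA_CORES_PER_SM


ga102 = full_die(7, 12)  # (84, 10752)
ga104 = full_die(6, 8)   # (48, 6144)
print("GA102:", ga102)
print("GA104:", ga104)

# Enabled SMs per shipping card, taken from the breakdown table.
for card, enabled, total in [("RTX 3090", 82, ga102[0]),
                             ("RTX 3080", 68, ga102[0]),
                             ("RTX 3070", 46, ga104[0])]:
    print(f"{card}: {enabled}/{total} SMs, {enabled * CUDA_CORES_PER_SM} CUDA cores")
```

None of the three launch cards uses a fully enabled die, which leaves room for higher-binned variants.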

NVIDIA has also introduced its 3rd Generation Tensor core architecture and 2nd Generation RT cores on Ampere GPUs. Tensor cores have been available since Volta and consumers got a taste of them with the Turing GPUs. One of the key areas where Tensor Cores are put to use for AAA games is DLSS. There's a whole software stack that leverages Tensor cores, known as NVIDIA NGX. These software-based technologies will help enhance graphics fidelity with features such as Deep Learning Super Sampling (DLSS), AI InPainting, AI Super Rez, RTX Voice, and AI Slow-Mo.

While its initial debut was a bit flawed, DLSS in its 2nd iteration (DLSS 2.0) has done wonders to not only improve gaming performance but also image quality. In titles such as Death Stranding and Control, games are shown to offer higher visual fidelity than at native resolution while running at much higher framerates. With Ampere, we can expect an even higher boost in terms of DLSS 2.0 (and DLSS Next-Gen) performance as the deep-learning model continues working its magic in DLSS supported titles. NVIDIA will also be adding 8K DLSS support to its Ampere GPU lineup which would be great to test out with the 24 GB RTX 3090 graphics card.

With Ampere, Tensor cores add TF32 and BF16 precision modes in addition to FP16, INT8, and INT4, which are all still fully supported. NVIDIA has been at the helm of the deep learning revolution by supporting it since its Kepler generation of graphics cards. Today, NVIDIA has some of the most powerful AI graphics accelerators and a software stack that is widely adopted by this fast-growing industry.

For its 3rd Gen Tensor cores, NVIDIA is using the same sparsity architecture that they've used on the Ampere HPC line of GPUs. While Ampere features 4 Tensor cores per SM compared to Turing's 8 tensor cores per SM, they are not only based on the new 3rd Generation design but also get an increased count with the larger SM array. The Ampere GPU can execute 128 FP16 FMA operations per tensor core utilizing its entire INT16 cores and with sparsity, it can do up to 256. The total FP16 FMA operations per SM are increased to 512 and 1024 with sparsity. That's a 2x increase over the Turing GPU in terms of inference performance with the updated Tensor design.
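The per-SM figures above can be turned into the Tensor TFLOPS numbers quoted in the tables. The sketch below assumes 2 FLOPs per FMA and uses the SM counts and boost clocks listed in this review; the function name is made up for illustration.

```python
# Peak FP16 Tensor throughput from the per-SM FMA rate quoted above.
# Assumes 2 FLOPs per FMA; sparsity doubles the dense rate.
FMA_PER_TENSOR_CORE_PER_CLOCK = 128
TENSOR_CORES_PER_SM = 4


def tensor_fp16_tflops(sms: int, boost_mhz: int, sparse: bool = False) -> float:
    fma_per_sm = FMA_PER_TENSOR_CORE_PER_CLOCK * TENSOR_CORES_PER_SM  # 512
    dense = sms * fma_per_sm * 2 * boost_mhz * 1e6 / 1e12
    return dense * 2 if sparse else dense


for name, sms, boost_mhz in [("RTX 3070", 46, 1725),
                             ("RTX 3080", 68, 1710),
                             ("RTX 3090", 82, 1695)]:
    dense = tensor_fp16_tflops(sms, boost_mhz)
    sparse = tensor_fp16_tflops(sms, boost_mhz, sparse=True)
    print(f"{name}: {dense:.1f} dense / {sparse:.1f} sparse TFLOPS")
```

The results land within rounding of the dense/sparse pairs in the breakdown table (for example 81.3/162.6 for the RTX 3070).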

2nd Gen RT Cores, RTX, and Real-Time Ray Tracing Dissected

Next up, we have the RT Cores which are what will power Real-Time Raytracing. NVIDIA isn't going to distance itself from traditional rasterization-based rendering, but is instead following a hybrid rendering model. The new 2nd Generation RT cores offer increased performance, with double the ray/triangle intersection testing rate of the Turing RT cores.

There's one RT core per SM and all of them combined accelerate Bounding Volume Hierarchy (BVH) traversal and ray/triangle intersection testing (ray casting) functions. RT Cores work together with advanced denoising filtering, a highly-efficient BVH acceleration structure developed by NVIDIA Research, and RTX compatible APIs to achieve real-time ray tracing on a single GPU.

RT Cores traverse the BVH autonomously, and by accelerating traversal and ray/triangle intersection tests, they offload the SM, allowing it to handle other vertex, pixel, and compute shading work. Functions such as BVH building and refitting are handled by the driver, and ray generation and shading are managed by the application through new types of shaders.

To better understand the function of RT Cores, and what exactly they accelerate, we should first explain how ray tracing is performed on GPUs or CPUs without a dedicated hardware ray tracing engine. Essentially, the process of BVH traversal would need to be performed by shader operations and take thousands of instruction slots per ray cast to test against bounding box intersections in the BVH until finally hitting a triangle, and the color at the point of intersection contributes to the final pixel color (or if no triangle is hit, the background color may be used to shade a pixel).

Ray tracing without hardware acceleration requires thousands of software instruction slots per ray to test successively smaller bounding boxes in the BVH structure until possibly hitting a triangle. It’s a computationally-intensive process making it impossible to do on GPUs in real-time without hardware-based ray tracing acceleration.

The RT Cores in Ampere can process all the BVH traversal and ray-triangle intersection testing, saving the SM from spending the thousands of instruction slots per ray, which could be an enormous amount of instructions for an entire scene. The RT Core includes two specialized units. The first unit does bounding box tests, and the second unit does ray-triangle intersection tests.

The SM only has to launch a ray probe, and the RT core does the BVH traversal and ray-triangle tests, and returns a hit or no hit to the SM. Also unlike the last generation, the Ampere SM can process two compute workloads simultaneously, allowing ray-tracing & graphics/compute workloads to be done concurrently.

In a visual demonstration, NVIDIA has shown how RT and Tensor cores help speed up ray tracing and shader workloads significantly. A fully ray-traced frame from Wolfenstein Youngblood was taken as an example. The last-gen RTX 2080 SUPER takes 51ms to render the frame if it does it all with its shaders (CUDA Cores). With RT cores and shaders working in tandem, the processing time is reduced to just 20ms, or less than half the time. Adding in Tensor cores helps reduce the rendering time even further to just 12ms (~83 FPS).

However, with Ampere, each standard processing block receives a huge performance uplift. With an RTX 3080, the same frame can be rendered within 37ms on the Shader cores alone, 11ms with the RT+Shader cores, and 6.7ms (150 FPS) with all three core technologies working together. That's half the time of what Turing took to render the same scene.
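The frame rates in parentheses are simply the inverse of the frame time. A quick sketch converting NVIDIA's quoted frame times into FPS and comparing the two generations:

```python
# Convert the quoted frame times (ms) to frames per second and compare GPUs.
def fps(frame_time_ms: float) -> float:
    return 1000.0 / frame_time_ms


# Frame times quoted above: (shaders only, shaders + RT, shaders + RT + Tensor)
frame_times_ms = {
    "RTX 2080 SUPER": (51.0, 20.0, 12.0),
    "RTX 3080": (37.0, 11.0, 6.7),
}

for card, times in frame_times_ms.items():
    print(card, [f"{fps(t):.1f} FPS" for t in times])

speedup = frame_times_ms["RTX 2080 SUPER"][2] / frame_times_ms["RTX 3080"][2]
print(f"Ampere speedup with all three core types: {speedup:.2f}x")
```

With these numbers, 12ms works out to about 83 FPS and 6.7ms to about 149 FPS, so the RTX 3080 finishes the frame in a little over half the time Turing needed.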

With each new generation of graphics cards, NVIDIA delivers a new range of display technologies. This generation is no different and we see some significant updates to not only the display engine but also the graphics interconnect. With the adoption of faster GDDR6X memory which provides higher bandwidth, faster compression, and more cache, gaming applications can now run at higher resolutions with more detail on the display.

The Ampere Display Engine supports two new display technologies, HDMI 2.1 and DisplayPort 1.4a with DSC 1.2a. HDMI 2.1 allows for up to 48 Gbps of total bandwidth and allows for up to 4K 240Hz HDR and 8K 60Hz HDR.

DisplayPort 1.4a allows for up to 8K resolutions with 60Hz refresh rates and includes VESA's display stream compression 1.2 technology with visually lossless compression. You can run up to two 8K displays at 60 Hz using two cables, one for each display. In addition to that, Ampere also supports HDR processing natively with tone mapping added to the HDR pipeline.
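To put these display modes in perspective, here is a rough estimate of the raw pixel data rate they imply. The sketch assumes 10 bits per color channel (30 bits per pixel) for HDR and ignores blanking intervals and link encoding overhead, so real link requirements are higher than what it prints.

```python
# Rough uncompressed pixel data rate for a display mode.
# Assumptions: 30 bits per pixel (10-bit RGB), no blanking, no link encoding overhead.
HDMI_2_1_LINK_GBPS = 48  # total HDMI 2.1 bandwidth quoted above


def pixel_data_rate_gbps(width: int, height: int, refresh_hz: int,
                         bits_per_pixel: int = 30) -> float:
    return width * height * refresh_hz * bits_per_pixel / 1e9


modes = {
    "4K 240Hz": (3840, 2160, 240),
    "8K 60Hz": (7680, 4320, 60),
}

for name, (w, h, hz) in modes.items():
    rate = pixel_data_rate_gbps(w, h, hz)
    fits = "fits" if rate <= HDMI_2_1_LINK_GBPS else "exceeds"
    print(f"{name}: {rate:.1f} Gbps uncompressed, {fits} {HDMI_2_1_LINK_GBPS} Gbps")
```

Both modes push the same number of pixels per second, and under these assumptions both exceed 48 Gbps uncompressed, which suggests why Display Stream Compression matters at these resolutions and refresh rates.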

Ampere GPUs also ship with the Fifth Generation NVDEC decoder unit that adds AV1 hardware decode support. Ampere's new NVDEC decoder supports the decoding of MPEG-2, VC-1, H.264 (AVCHD), H.265 (HEVC), VP8, VP9, and AV1.

Ampere also adds the 7th Generation NVENC encoder by offering seamless hardware-accelerated encoding of up to 4K on H.264 and 8K on HEVC.

NVIDIA RTX IO - Blazing Fast Read Speeds With GPU Utilization

As storage sizes have grown, so has storage performance. Gamers are increasingly turning to SSDs to reduce game load times: while hard drives are limited to 50-100 MB/sec throughput, the latest M.2 PCIe Gen4 SSDs deliver up to 7 GB/sec. With the traditional storage model, game data is read from the disk, then passed through system memory and the CPU before being handed to the GPU.
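To get a feel for the gap between those throughput figures, here is a small sketch of raw read times. The 50 GB payload is a hypothetical example size, and the calculation ignores decompression, seek times, and file system overhead.

```python
# Raw sequential read time for a hypothetical 50 GB of game data.
# Throughput figures are the ones quoted above; the payload size is an assumption.
def read_time_seconds(size_gb: float, throughput_mb_s: float) -> float:
    return size_gb * 1000 / throughput_mb_s


GAME_DATA_GB = 50  # hypothetical payload

for label, mb_s in [("HDD (50 MB/sec)", 50),
                    ("HDD (100 MB/sec)", 100),
                    ("PCIe Gen4 SSD (7 GB/sec)", 7000)]:
    print(f"{label}: {read_time_seconds(GAME_DATA_GB, mb_s):.1f} s")
```

The raw throughput gap is roughly 70x to 140x, which is the headroom that traditional storage APIs and CPU-side decompression struggle to exploit.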

Historically games have read files from the hard disk, using the CPU to decompress the game image. Developers have used lossless compression to reduce install sizes and to improve I/O performance. However, as storage performance has increased, traditional file systems and storage APIs have become a bottleneck. For example, decompressing game data from a 100 MB/sec hard drive takes only a few CPU cores, but decompressing data from a 7 GB/sec PCIe Gen4 SSD can consume more than twenty AMD Ryzen Threadripper 3960X CPU cores!

Using the traditional storage model, game decompression can consume all 24 cores on a Threadripper CPU. Modern game engines have exceeded the capability of traditional storage APIs. A new generation of I/O architecture is needed. Data transfer rates are the gray bars, CPU cores required are the black/blue blocks.

NVIDIA RTX IO is a suite of technologies that enable rapid GPU-based loading and decompression of game assets, accelerating I/O performance by up to 100x compared to hard drives and traditional storage APIs. When used with Microsoft’s new DirectStorage for Windows API, RTX IO offloads dozens of CPU cores’ worth of work to your RTX GPU, improving frame rates, enabling near-instantaneous game loading, and opening the door to a new era of large, incredibly detailed open-world games.

Object pop-in and stutter can be reduced, and high-quality textures can be streamed at incredible rates, so even if you’re speeding through a world, everything runs and looks great. In addition, with lossless compression, game download and install sizes can be reduced, allowing gamers to store more games on their SSD while also improving their performance.

NVIDIA RTX IO plugs into Microsoft’s upcoming DirectStorage API which is a next-generation storage architecture designed specifically for state-of-the-art NVMe SSD-equipped gaming PCs and the complex workloads that modern games require. Together, streamlined and parallelized APIs specifically tailored for games allow dramatically reduced IO overhead and maximize performance/bandwidth from NVMe SSDs to your RTX IO-enabled GPU.

Specifically, NVIDIA RTX IO brings GPU-based lossless decompression, allowing reads through DirectStorage to remain compressed and delivered to the GPU for decompression. This removes the load from the CPU, moving the data from storage to the GPU in a more efficient, compressed form, and improving I/O performance by a factor of two.

GeForce RTX GPUs will deliver decompression performance beyond the limits of even Gen4 SSDs, offloading potentially dozens of CPU cores’ worth of work to ensure maximum overall system performance for next-generation games. Lossless decompression is implemented with high-performance compute kernels that are asynchronously scheduled. This functionality leverages the DMA and copy engines of Turing and Ampere, as well as the advanced instruction set and architecture of these GPUs' SMs.

The advantage of this is that the enormous compute power of the GPU can be used for burst or bulk loading (at level load, for example), when GPU resources can act as a high-performance I/O processor, delivering decompression performance well beyond the limits of Gen4 NVMe. During streaming scenarios, bandwidths are a tiny fraction of the GPU capability, further leveraging the advanced asynchronous compute capabilities of Turing and Ampere. Microsoft is targeting a developer preview of DirectStorage for Windows for game developers next year, and NVIDIA Turing & Ampere gamers will be able to take advantage of RTX IO enhanced games as soon as they become available.

NVLINK For GeForce RTX 3090 And Titan Class Products Only!

NVIDIA has said farewell to their SLI (Scalable Link Interface) interconnect for consumer graphics cards. They will now be using the NVLINK interconnect which has already been featured on their Turing GPUs. The reason is that SLI simply could not provide enough bandwidth to feed Ampere GPUs.

A single x8 NVLINK channel provides 25 GB/s peak bandwidth. There are 4 x4 links on the GA102 GPU. The GA102 GPU features 50 GB/s of bandwidth in parallel and 100 GB/s bandwidth bi-directionally. Using NVLINK on high-end cards would be beneficial in high-resolution gaming but there's a reason NVIDIA still restricts users from doing 3 and 4 way SLI.

Multi-GPU still isn't optimized so you won't see many benefits unless you are running the highest-end graphics cards. That's another reason why the RTX 3080 & RTX 3070 are deprived of NVLINK connectors. The NVLINK connectors cost $79 US each and are sold separately.


The NVIDIA GeForce RTX 3070 is a force to be reckoned with. It takes the throne of the fastest PC gaming graphics card with nothing coming even close to it. It's surprisingly much faster than the GeForce RTX 2080 Ti which is its Turing based predecessor.

NVIDIA designed the GeForce RTX 3070 to be the gaming champ, powering the next generation of AAA gaming titles with superb visuals and insane fluidity. It's not just the FPS that matters these days, it's visuals and a smoother frame rate too, and this is exactly what the GeForce RTX 30 series is made to excel at. There's a lot to talk about regarding NVIDIA's flagship Ampere gaming graphics cards so let's start off with the specifications.

Marvels of NVIDIA Ampere Architecture - 2nd Generation RTX
Enabling the blistering performance of the new RTX 30 Series GPUs and the NVIDIA Ampere architecture are cutting-edge technologies and over two decades of graphics R&D, including:

NVIDIA GeForce RTX 3070 Graphics Card Specifications - GA104 GPU & 8 GB GDDR6 Memory

At the heart of the NVIDIA GeForce RTX 3070 graphics card lies the GA104 GPU. The GA104 is one of the many Ampere GPUs that we will be getting in the gaming segment. The GA104 GPU is the second-fastest Ampere chip in the stack. The GPU is based on Samsung's 8nm (8N) process node. The GPU measures 392.5mm2 and features 17.4 billion transistors, which is almost 93% of the transistors featured on the TU102 GPU. At the same time, the GA104 GPU is almost half the size of the TU102 GPU, which is an insane amount of density.
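The density comparison can be checked with the transistor counts and die sizes listed for GA104 and TU102 in the breakdown table earlier in this review. A minimal sketch:

```python
# Transistor density comparison using figures from the breakdown table.
def density_mtr_per_mm2(transistors_billion: float, die_mm2: float) -> float:
    """Million transistors per square millimetre."""
    return transistors_billion * 1000 / die_mm2


ga104 = density_mtr_per_mm2(17.4, 392.5)  # Samsung 8N
tu102 = density_mtr_per_mm2(18.6, 754.0)  # TSMC 12 nm FFN

print(f"GA104: {ga104:.1f} MTr/mm2")
print(f"TU102: {tu102:.1f} MTr/mm2")
print(f"Transistor ratio: {17.4 / 18.6:.1%}")
print(f"Area ratio:       {392.5 / 754.0:.1%}")
print(f"Density gain:     {ga104 / tu102:.2f}x")
```

By these figures, GA104 carries about 94% of the TU102 transistor budget in roughly 52% of the area, which is a density gain of about 1.8x.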

For the GeForce RTX 3070, NVIDIA has enabled a total of 46 SM units on its flagship which results in a total of 5888 CUDA cores. In addition to the CUDA cores, NVIDIA's GeForce RTX 3070 also comes packed with next-generation RT (Ray-Tracing) cores, Tensor cores, and brand new SM or streaming multi-processor units.

In terms of memory, the GeForce RTX 3070 features 8 GB of GDDR6 memory running at 14 Gbps. That, along with a full uncut 256-bit bus interface, delivers a cumulative bandwidth of 448 GB/s. The NVIDIA GeForce RTX 3070 has a TGP of 220W.
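Memory bandwidth follows from the bus width and the per-pin data rate: divide the bus width by 8 to get bytes per transfer, then multiply by the data rate in Gbps to get GB/s. A short sketch using the figures from the specification tables in this review:

```python
# Memory bandwidth (GB/s) = bus width in bits / 8 * per-pin data rate in Gbps.
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits / 8 * data_rate_gbps


for card, bus, rate in [("RTX 3070", 256, 14.0),
                        ("RTX 3080", 320, 19.0),
                        ("RTX 3090", 384, 19.5)]:
    print(f"{card}: {memory_bandwidth_gb_s(bus, rate):.0f} GB/s")
```

This reproduces the 448, 760, and 936 GB/s figures for the three launch cards.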

NVIDIA GeForce RTX 30 Series 'Ampere' Graphics Card Specifications:

| Graphics Card Name | NVIDIA GeForce RTX 3060 | NVIDIA GeForce RTX 3060 Ti | NVIDIA GeForce RTX 3070 | NVIDIA GeForce RTX 3080 | NVIDIA GeForce RTX 3090 |
| --- | --- | --- | --- | --- | --- |
| GPU Name | Ampere GA106-300 | Ampere GA104-200 | Ampere GA104-300 | Ampere GA102-200 | Ampere GA102-300 |
| Process Node | Samsung 8nm | Samsung 8nm | Samsung 8nm | Samsung 8nm | Samsung 8nm |
| Die Size | TBC | 392.5mm2 | 392.5mm2 | 628.4mm2 | 628.4mm2 |
| Transistors | TBC | 17.4 Billion | 17.4 Billion | 28 Billion | 28 Billion |
| CUDA Cores | 3584 | 4864 | 5888 | 8704 | 10496 |
| TMUs / ROPs | 112 / 64 | 152 / 80 | 184 / 96 | 272 / 96 | 328 / 112 |
| Tensor / RT Cores | 112 / 28 | 152 / 38 | 184 / 46 | 272 / 68 | 328 / 82 |
| Base Clock | 1320 MHz | 1410 MHz | 1500 MHz | 1440 MHz | 1400 MHz |
| Boost Clock | 1780 MHz | 1665 MHz | 1730 MHz | 1710 MHz | 1700 MHz |
| FP32 Compute | 13 TFLOPs | 16 TFLOPs | 20 TFLOPs | 30 TFLOPs | 36 TFLOPs |
| RT TFLOPs | 25 TFLOPs | 32 TFLOPs | 40 TFLOPs | 58 TFLOPs | 69 TFLOPs |
| Tensor-TOPs | 101 TOPs | 192 TOPs | 163 TOPs | 238 TOPs | 285 TOPs |
| Memory Capacity | 12 GB GDDR6 | 8 GB GDDR6 | 8 GB GDDR6 | 10 GB GDDR6X | 24 GB GDDR6X |
| Memory Bus | 192-bit | 256-bit | 256-bit | 320-bit | 384-bit |
| Memory Speed | 16 Gbps | 14 Gbps | 14 Gbps | 19 Gbps | 19.5 Gbps |
| Bandwidth | 384 GB/s | 448 GB/s | 448 GB/s | 760 GB/s | 936 GB/s |
| TGP | 170W | 175W | 220W | 320W | 350W |
| Price (MSRP / FE) | $329 US | $399 US | $499 US | $699 US | $1499 US |
| Launch (Availability) | 25th February 2021 | 2nd December 2020 | 29th October 2020 | 17th September 2020 | 24th September 2020 |

NVIDIA GeForce RTX 3070 Graphics Card Cooling & Design - Next-Gen NVTTM Founders Edition Design

Unlike the new front and back cooling system that the GeForce RTX 3090 and GeForce RTX 3080 incorporate, the NVIDIA GeForce RTX 3070 makes use of a dual-fan cooling system which blows air towards the central heatsink.

The Founders Edition cooling makes use of a full aluminum alloy heatsink which is coated with a nano-carbon coating and should do a really good job of keeping the temperatures under control. The design is interesting in the sense that it goes all out with a fin and heat pipe design.

NVIDIA GeForce RTX 3070 Graphics Card PCB & Power - Designed To Be Overclocked!

The GeForce RTX 30 series Founders Edition cards including the GeForce RTX 3070 will be featuring the 12-pin Micro-Fit 3.0 power connectors. These connectors don't require a power supply upgrade as the cards will ship with bundled 2x 8-pin to 1x 12-pin connectors so you can run your latest graphics card without any compatibility issues.

The placement of the 12-pin connector on the PCB is also noteworthy. It is placed in a standard horizontal position but right in the middle of the shroud, which does help with better electrical signaling to the GPU, and judging by the PCB design, we can tell why NVIDIA moved to a single 12-pin plug instead of the standard dual 8-pin design. There's limited room on the PCB and as such, it was necessary to go for a smaller, more compact power input.

NVIDIA GeForce RTX 3070 Graphics Card Price & Availability - Both Custom & Reference Designs at Launch

The NVIDIA GeForce RTX 3070 is being announced today and will be launching to consumers on 29th October. The first wave of graphics cards to hit the market would be the reference Founders Edition variant which will cost $499 US. The NVIDIA GeForce RTX 3070 will feature a price of $499 (MSRP) however custom models will vary depending on their design and the extra horse-power that they have to offer.

There aren't any performance numbers that NVIDIA is sharing right now but from what has been showcased, the GeForce RTX 3070 is faster than an RTX 2080 Ti, the RTX 3080 is a good bit ahead of the RTX 2080 Ti and the RTX 3090 is about as much as 50% faster than the RTX 2080 Ti which is very impressive for the full lineup stack.

The GeForce RTX 3070 is an insane little beast that features a 60% performance uplift over its RTX 2070 predecessor, powering next-generation AAA titles at smooth frame rates even at the demanding 4K resolution.

Presentation has been key for NVIDIA's Founders Edition cards since the 700 series, and the GeForce RTX 3070 is no different. Carrying over the design cues from the packaging of the previous 30 Series launches, we are greeted by the GeForce RTX 3070 Founders Edition lying flat, ready to be moved to your system.

The card itself is very similar in size and design to the last-generation RTX 2070, but with the flow-through cooler heatsink and fans of this generation. Also, the card doesn't light up. I'm not happy about that; it really should light up.

The outward-facing side of the RTX 3070 carries the design signatures of the bigger GA102-based Ampere cards, even holding on to the 12-pin connector despite its adapter only having a single 8-pin. Again, the GeForce RTX logo does not light up on the GeForce RTX 3070.

It's easy to see that the open section of the heatsink is reinforced with small pipes throughout in addition to the heatpipes coming from the main GPU section of the card. The heatsink is dense, yet sparse enough to allow for as little noise as possible.

I did attempt to dismantle the card and take a look at the board, but after breaking a piece of the retention clip for the rear fan ribbon cable (look closely and you can see the notch I broke), I gave up and figured I'd leave this endeavor to the guys over at Gamers Nexus. Do your thing Steve, do your thing. But I did at least want to share a look at the back of the PCB and how the card looks with the backplate removed.

We used the following test system for comparison between the different graphics cards. The latest drivers that were available at the time of testing were used from AMD and NVIDIA on an updated version of Windows 10. All games that were tested were patched to the latest version for better performance optimization for NVIDIA and AMD GPUs.

Test System

| Components | X570 |
| --- | --- |
| CPU | Ryzen 9 3900X 4.3GHz All Core Lock (disable one CCD for 3600X Results) |
| Memory | 32GB Hyper X Predator DDR4 3600 |
| Motherboard | ASUS TUF Gaming X570 Plus-WiFi |
| Storage | TeamGroup Cardea 1TB NVMe PCIe 4.0 |
| PSU | Cooler Master V1200 Platinum |
| Windows Version | Latest version of Windows at the time of testing |
| Hardware-Accelerated GPU Scheduling | On if supported by GPU and driver |

Graphics Cards Tested

| GPU | Architecture | Core Count | Clock Speed (MHz) | Memory Capacity | Memory Speed |
| --- | --- | --- | --- | --- | --- |
| NVIDIA RTX 3070 FE | Ampere | 5888 | 1500/1730 | 8GB | 14Gbps |
| NVIDIA RTX 3080 FE | Ampere | 8704 | 1440/1710 | 10GB | 19Gbps |
| NVIDIA RTX 2080ti FE | Turing | 4352 | 1350/1635 | 11GB GDDR6 | 14Gbps |
| NVIDIA RTX 2080 SUPER FE | Turing | 3072 | 1650/1815 | 8GB GDDR6 | 15.5Gbps |
| NVIDIA RTX 2070 SUPER FE | Turing | 2560 | 1605/1770 | 8GB GDDR6 | 14Gbps |
| NVIDIA GTX 1080 FE | Pascal | 2560 | 1607/1733 | 8GB GDDR5X | 10Gbps |
| NVIDIA GTX 1070 FE | Pascal | 1920 | 1506/1683 | 8GB GDDR5 | 8Gbps |
| AMD Radeon RX 5700XT | Navi 10 | 2560 | 1605/1755/1905 | 8GB GDDR6 | 14Gbps |
| AMD RX Vega 64 | Vega 10 | 4096 | 1247/1546 | 8GB HBM2 | 945Mbps |
| Sapphire RX 5500 XT 4GB | Navi 14 | 1408 | 1737/1845 | 4GB GDDR6 | 14Gbps |
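The memory speeds listed for the tested cards only become bandwidth once the bus width is factored in. The sketch below shows that arithmetic for three of the cards. The bus widths (256, 320 and 352 bits) are taken from the cards' published specifications and are not listed in this review, so treat them as assumptions; the function name is made up for the example.

```python
def memory_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin data rate times bus width, bits to bytes."""
    return data_rate_gbps * bus_width_bits / 8


# Memory speed comes from the table; bus widths are assumed from published specs.
CARDS = {
    "RTX 3070 FE": (14.0, 256),
    "RTX 3080 FE": (19.0, 320),
    "RTX 2080 Ti FE": (14.0, 352),
}

if __name__ == "__main__":
    for name, (rate, width) in CARDS.items():
        print(f"{name}: {memory_bandwidth_gbs(rate, width):.0f} GB/s")
```

With these inputs the RTX 3070 works out to 448 GB/s, against 616 GB/s for the RTX 2080 Ti and 760 GB/s for the RTX 3080.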

Drivers Used

| Drivers | Version |
| --- | --- |
| Radeon Settings | 20.10.1 |
| GeForce | 456.96 |

  • All games were tested at 2560x1440 (1440p), 3440x1440 (ultrawide 1440p) and 3840x2160 (4K) resolutions.
  • Image Quality and graphics configurations are provided with each game description.
  • The "reference" cards are the stock configs.
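To put the three test resolutions in perspective, the short sketch below works out how many pixels each one pushes per frame relative to 2560x1440. The dictionary keys and function names are just labels chosen for this example.

```python
# The three test resolutions listed above.
RESOLUTIONS = {
    "1440p": (2560, 1440),
    "UW 1440p": (3440, 1440),
    "4K": (3840, 2160),
}


def pixel_count(width, height):
    """Pixels rendered per frame."""
    return width * height


def relative_load(name, baseline="1440p"):
    """Pixels per frame relative to the baseline resolution."""
    return pixel_count(*RESOLUTIONS[name]) / pixel_count(*RESOLUTIONS[baseline])


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        print(f"{name}: {pixel_count(w, h):,} pixels, {relative_load(name):.2f}x 1440p")
```

Ultrawide 1440p is about 1.34x the pixels of standard 1440p, and 4K is 2.25x, which is worth keeping in mind when comparing frame rates across the three result sets.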

Firestrike

Firestrike runs on the DX11 API and is still a good measure of GPU scaling performance. In this test we ran the Extreme and Ultra versions of Firestrike, which run at 1440p and 4K respectively, and recorded the Graphics Score only, since the Physics and Combined scores are not pertinent to this review.

3DMark Firestrike Extreme Graphics

| GPU | Score |
| --- | --- |
| RTX 3070 | 16k |
| RTX 3080 | 21k |
| RTX Titan | 17.5k |
| RTX 2080Ti | 16k |
| RTX 2080 | 13k |
| RX 5700XT | 12.8k |
| GTX 1080Ti | 13.4k |
| RTX 2070 | 11k |
| RX Vega 64 | 10.9k |
| GTX 1080 | 10.1k |
| GTX 1070 | 8.2k |

3DMark Firestrike Ultra Graphics

| GPU | Score |
| --- | --- |
| RTX 3070 | 8.3k |
| RTX 3080 | 10.9k |
| RTX Titan | 8.8k |
| RTX 2080Ti | 8.1k |
| RTX 2080 | 6.4k |
| RX 5700XT | 6.5k |
| GTX 1080Ti | |
| RTX 2070 | 4.3k |
| RX Vega 64 | 5.4k |
| GTX 1080 | 4.9k |
| GTX 1070 | 4.1k |

Time Spy

Time Spy runs on the DX12 API, and we used it in the same manner as Firestrike Extreme: we only recorded the Graphics Score, as the CPU score measures CPU performance and isn't important to the testing we are doing here.

3DMark Time Spy Graphics

| GPU | Score |
| --- | --- |
| RTX 3070 | 13.4k |
| RTX 3080 | 17.7k |
| RTX Titan | 14.9k |
| RTX 2080Ti | 13.8k |
| RTX 2080 | 10.9k |
| RX 5700XT | 9.2k |
| GTX 1080Ti | 9.3k |
| RTX 2070 | 9.1k |
| RX Vega 64 | 7.1k |
| GTX 1080 | 7.1k |
| GTX 1070 | 5.8k |

3DMark Time Spy Extreme Graphics

| GPU | Score |
| --- | --- |
| RTX 3070 | 6.7k |
| RTX 3080 | 8.9k |
| RTX Titan | 7.1k |
| RTX 2080Ti | 6.6k |
| RTX 2080 | 5.1k |
| RX 5700XT | 4.2k |
| GTX 1080Ti | 4.4k |
| RTX 2070 | 4.3k |
| RX Vega 64 | 3.5k |
| GTX 1080 | 3.2k |
| GTX 1070 | 2.7k |
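For readers who want to turn the graphics scores above into generational uplift figures, here is a minimal sketch. It uses three of the Time Spy Graphics scores as printed in the chart; those values are rounded to the nearest 0.1k, so the percentages are approximate. The function name is an illustrative choice.

```python
# Time Spy Graphics scores in thousands, as read from the chart (rounded values).
TIME_SPY = {"RTX 3070": 13.4, "RTX 2080Ti": 13.8, "RTX 2070": 9.1}


def uplift_percent(new_score, old_score):
    """Percentage by which new_score exceeds old_score (negative if it is lower)."""
    return (new_score / old_score - 1.0) * 100.0


if __name__ == "__main__":
    for rival in ("RTX 2070", "RTX 2080Ti"):
        gain = uplift_percent(TIME_SPY["RTX 3070"], TIME_SPY[rival])
        print(f"RTX 3070 vs {rival}: {gain:+.1f}%")
```

In this particular test the RTX 3070 lands roughly 47% ahead of the RTX 2070 entry and about 3% behind the RTX 2080Ti, so any single headline uplift number depends heavily on the workload.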

Port Royal

Port Royal is another great tool in the 3DMark suite, but this one is 100% targeting Ray Tracing performance. It loads up ray traced shadows, reflections, and global illumination to really tax the performance of the graphics cards that either have hardware-based or software-based ray tracing support.

3DMark Port Royal Score

| GPU | Score |
| --- | --- |
| RTX 3070 | 8.2k |
| RTX 3080 | 11.3k |
| RTX Titan | 9.3k |
| RTX 2080Ti | 8.6k |
| RTX 2080 | 6.5k |
| RX 5700XT | 0 |
| GTX 1080Ti | 2k |
| RTX 2070 | 5.4k |
| GTX 1080 | 1.5k |
| GTX 1070 | 1.2k |

Thermals

Thermals were measured on our open test bench after running Time Spy graphics test 2 on loop for 30 minutes, recording the highest temperatures reported. The room was climate controlled and kept at a constant 22°C throughout the testing.

Temperatures (22°C Ambient)

| GPU | Load (°C) | Idle (°C) |
| --- | --- | --- |
| RTX 3070 | 73 | 30 |
| RTX 3080 | 77 | 33 |
| RTX Titan | 74 | 31 |
| RTX 2080Ti | 76 | 30 |
| RTX 2080 | 70 | 29 |
| RX 5700XT | 81 | 37 |
| GTX 1080Ti | 84 | 33 |
| RTX 2070 | 74 | 27 |
| RX Vega 64 | 84 | 37 |
| GTX 1080 | 84 | 32 |
| GTX 1070 | 79 | 32 |
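Because the room was held at a fixed ambient temperature, the readings above can be expressed as a rise over ambient, which makes them easier to compare with results taken in a warmer or cooler room. A small sketch with a few of the chart values follows; the helper name is made up for the example.

```python
AMBIENT_C = 22  # ambient temperature stated in the thermal test description

# (load, idle) temperatures in degrees C, copied from the chart above.
TEMPS_C = {
    "RTX 3070": (73, 30),
    "RTX 3080": (77, 33),
    "RTX 2080Ti": (76, 30),
}


def delta_over_ambient(temp_c, ambient_c=AMBIENT_C):
    """Temperature rise above the room temperature."""
    return temp_c - ambient_c


if __name__ == "__main__":
    for gpu, (load, idle) in TEMPS_C.items():
        print(f"{gpu}: +{delta_over_ambient(load)} C load, +{delta_over_ambient(idle)} C idle")
```

The RTX 3070, for instance, sits 51 degrees over ambient under load and 8 degrees over at idle.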

Forza Horizon 4

Forza Horizon 4 carries on the open-world racing tradition of the Horizon series. The latest DX12-powered entry is beautifully crafted, amazingly well executed, and a great showcase of DX12 games. We use the benchmark run with all of the settings set to non-dynamic and an uncapped framerate to gather these results.

Forza Horizon 4 1440p Ultra

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 160 | 132 |
| RTX 3080 | 174 | 145 |
| RTX Titan | 160 | 133 |
| RTX 2080Ti | 154 | 131 |
| RTX 2080 | 139 | 121 |
| RX 5700XT | 117 | 100 |
| GTX 1080Ti | 110 | 91 |
| RTX 2070 | 111 | 95 |
| RX Vega 64 | 88 | 73 |
| GTX 1080 | 92 | 78 |
| GTX 1070 | 79 | 68 |
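The game charts report two figures per card, AVG FPS and 1% Percentile. The review does not spell out how these are derived, so the sketch below shows one common approach as an assumption: the average is total frames over total time, and the 1% figure is the frame rate at the 99th-percentile frame time (nearest-rank method). The function names and the sample frame times are invented for illustration.

```python
def average_fps(frame_times_ms):
    """Average FPS: number of frames divided by total capture time."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def one_percent_low_fps(frame_times_ms):
    """FPS at the 99th-percentile frame time (nearest-rank method)."""
    ordered = sorted(frame_times_ms)
    # ceil(0.99 * n) using integer arithmetic to avoid float rounding surprises
    rank = (99 * len(ordered) + 99) // 100
    return 1000.0 / ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical capture: mostly 10 ms frames with a few 25 ms stutters.
    capture = [10.0] * 196 + [25.0] * 4
    print(f"avg: {average_fps(capture):.1f} FPS")
    print(f"1% low: {one_percent_low_fps(capture):.1f} FPS")
```

Other tools define the 1% figure differently, for example as the average of the slowest 1% of frames, so numbers from different capture utilities are not always directly comparable.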

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job of putting DX12 to use, delivering higher performance than its DX11 mode, so we test this title in DX12. I use the second segment of the benchmark run to gather these numbers, as its heavy foliage is more indicative of in-game scenarios.

Shadow of the Tomb Raider 1440p DX12 Highest

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 112 | 104 |
| RTX 3080 | 145 | 130 |
| RTX Titan | 119 | 110 |
| RTX 2080Ti | 111 | 102 |
| RTX 2080 | 92 | 87 |
| RX 5700XT | 79 | 74 |
| GTX 1080Ti | 80 | 76 |
| RTX 2070 | 76 | 72 |
| RX Vega 64 | 62 | 56 |
| GTX 1080 | 60 | 57 |
| GTX 1070 | 50 | 47 |

Rainbow 6 Siege

Rainbow 6 Siege has maintained a massive following since its launch and is consistently among Steam's top ten games by player count. In a tactical yet fast-paced competitive title, a higher framerate is always better, so we include it despite its ludicrously high framerates. We use the Vulkan Ultra preset with the High Definition Texture Pack and gather our results from the built-in benchmarking tool.

Rainbow 6 Siege 1440p Vulkan Ultra

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 362 | 299 |
| RTX 3080 | 439 | 359 |
| RTX Titan | 377 | 302 |
| RTX 2080Ti | 346 | 278 |
| RTX 2080 | 289 | 237 |
| RX 5700XT | 258 | 212 |
| GTX 1080Ti | 244 | 192 |
| RTX 2070 | 260 | 209 |
| RX Vega 64 | 158 | 128 |
| GTX 1080 | 189 | 149 |
| GTX 1070 | 161 | 128 |

DOOM Eternal

DOOM Eternal brings hell to earth with the Vulkan-powered idTech 7. We test this game using the Ultra Nightmare Preset and follow our in-game benchmark run to stay as consistent as possible.

DOOM Eternal 1440p Ultra Nightmare

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 234 | 162 |
| RTX 3080 | 313 | 225 |
| RTX Titan | 232 | 169 |
| RTX 2080Ti | 219 | 159 |
| RTX 2080 | 177 | 132 |
| RX 5700XT | 156 | 112 |
| GTX 1080Ti | 146 | 102 |
| RTX 2070 | 143 | 109 |
| RX Vega 64 | 134 | 98 |
| GTX 1080 | 116 | 78 |
| GTX 1070 | 93 | 65 |

Gears Tactics

Gears Tactics is the latest in the Gears franchise and takes things in a completely different direction with the gameplay design. It is built on a DX12-based Unreal Engine 4 build. We used the Maximum settings allowed but refrained from enabling Variable Rate Shading, as not all cards support this feature.

Gears Tactics 1440p Maximum

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 99 | 84 |
| RTX 3080 | 119 | 98 |
| RTX Titan | 125 | 98 |
| RTX 2080Ti | 96 | 80 |
| RTX 2080 | 89 | 76 |
| RX 5700XT | 78 | 67 |
| GTX 1080Ti | 86 | 75 |
| RTX 2070 | 81 | 70 |
| RX Vega 64 | 65 | 54 |
| GTX 1080 | 61 | 56 |
| GTX 1070 | 53 | 44 |

Ghost Recon Breakpoint

Ghost Recon Breakpoint is powered by the latest iteration of the Anvil Next 2.0 game engine. This is the same engine that was used in Assassin's Creed Odyssey but in Breakpoint has been updated to support the Vulkan API. We performed our tests using the High Preset with the Vulkan API.

Ghost Recon Breakpoint 1440p Vulkan High

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 120 | 86 |
| RTX 3080 | 153 | 107 |
| RTX Titan | 125 | 87 |
| RTX 2080Ti | 120 | 91 |
| RTX 2080 | 96 | 57 |
| RX 5700XT | 86 | 67 |
| GTX 1080Ti | 85 | 64 |
| RTX 2070 | 83 | 63 |
| RX Vega 64 | 72 | 57 |
| GTX 1080 | 67 | 52 |
| GTX 1070 | 47 | 38 |

Call of Duty Modern Warfare (2019)

Call of Duty Modern Warfare is back, this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows; those results are in the ray tracing section. We tested in the 'Fog of War' mission, the same run we use for our RT performance testing. At 1440p we set the settings all to High.

Call of Duty Modern Warfare 1440p Highest

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 128 | 113 |
| RTX 3080 | 165 | 138 |
| RTX Titan | 138 | 119 |
| RTX 2080Ti | 118 | 106 |
| RTX 2080 | 107 | 96 |
| RX 5700XT | 103 | 87 |
| GTX 1080Ti | 96 | 85 |
| RTX 2070 | 94 | 80 |
| RX Vega 64 | 87 | 76 |
| GTX 1080 | 75 | 66 |
| GTX 1070 | 63 | 55 |

Resident Evil 3

The Resident Evil 3 Remake has surpassed the RE2 Remake in visuals and is the latest use of the RE Engine. While it does have DX12 support, the DX11 implementation is far superior, so we will be sticking to DX11 for this title. We use the cutscene where Jill and Carlos enter the subway car for the first time and take a 2-minute capture from that point.

Resident Evil 3 1440p DX11 Maximum

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 135 | 109 |
| RTX 3080 | 181 | 147 |
| RTX Titan | 146 | 119 |
| RTX 2080Ti | 135 | 110 |
| RTX 2080 | 105 | 86 |
| RX 5700XT | 88 | 76 |
| GTX 1080Ti | 91 | 75 |
| RTX 2070 | 87 | 70 |
| RX Vega 64 | 68 | 57 |
| GTX 1080 | 67 | 54 |
| GTX 1070 | 54 | 45 |

Borderlands 3

Borderlands 3 has made its way into the test lineup thanks to strong demand from gamers and simply delivering MORE Borderlands. This game is rather intensive past the Medium preset, but since we're testing the 'Ultimate 1440p' card, High it is. We tested using the built-in benchmark utility.

Borderlands 3 1440p High

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 101 | 75 |
| RTX 3080 | 122 | 92 |
| RTX Titan | 102 | 72 |
| RTX 2080Ti | 97 | 65 |
| RTX 2080 | 78 | 57 |
| RX 5700XT | 65 | 52 |
| GTX 1080Ti | 75 | 57 |
| RTX 2070 | 65 | 55 |
| RX Vega 64 | 44 | 34 |
| GTX 1080 | 57 | 44 |
| GTX 1070 | 46 | 40 |

Total War Saga: Troy

Total War Saga: Troy is powered by the TW Engine 3 (Total War Engine 3), and in this iteration the developers have stuck to a strictly DX11 release. We tested the game with the built-in benchmark using the Dynasty model, which represents a battle with many soldiers interacting at once and is more representative of normal gameplay.

Total War Saga: Troy 1440p

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 81 | 64 |
| RTX 3080 | 107 | 83 |
| RTX Titan | 90 | 69 |
| RTX 2080Ti | 84 | 64 |
| RTX 2080 | 65 | 52 |
| RX 5700XT | 50 | 39 |
| GTX 1080Ti | 66 | 53 |
| RTX 2070 | 57 | 43 |
| RX Vega 64 | 36 | 28 |
| GTX 1080 | 51 | 40 |
| GTX 1070 | 42 | 32 |

Forza Horizon 4

Forza Horizon 4 carries on the open-world racing tradition of the Horizon series. The latest DX12-powered entry is beautifully crafted, amazingly well executed, and a great showcase of DX12 games. We use the benchmark run with all of the settings set to non-dynamic and an uncapped framerate to gather these results.

Forza Horizon 4 UW 1440p Ultra

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 140 | 120 |
| RTX 3080 | 162 | 142 |
| RTX Titan | 140 | 120 |
| RTX 2080Ti | 135 | 117 |
| RTX 2080 | 117 | 103 |
| RX 5700XT | 100 | 87 |
| GTX 1080Ti | 98 | 82 |
| RTX 2070 | 96 | 84 |
| RX Vega 64 | 75 | 65 |
| GTX 1080 | 80 | 68 |
| GTX 1070 | 68 | 59 |

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job of putting DX12 to use, delivering higher performance than its DX11 mode, so we test this title in DX12. I use the second segment of the benchmark run to gather these numbers, as its heavy foliage is more indicative of in-game scenarios.

Shadow of the Tomb Raider UW 1440p DX12 Highest

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 91 | 86 |
| RTX 3080 | 121 | 112 |
| RTX Titan | 98 | 91 |
| RTX 2080Ti | 91 | 85 |
| RTX 2080 | 74 | 69 |
| RX 5700XT | 63 | 59 |
| GTX 1080Ti | 64 | 61 |
| RTX 2070 | 61 | 58 |
| RX Vega 64 | 50 | 45 |
| GTX 1080 | 48 | 46 |
| GTX 1070 | 40 | 39 |

Rainbow 6 Siege

Rainbow 6 Siege has maintained a massive following since its launch and is consistently among Steam's top ten games by player count. In a tactical yet fast-paced competitive title, a higher framerate is always better, so we include it despite its ludicrously high framerates. We use the Vulkan Ultra preset with the High Definition Texture Pack and gather our results from the built-in benchmarking tool.

Rainbow 6 Siege UW 1440p Vulkan Ultra

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 301 | 253 |
| RTX 3080 | 370 | 309 |
| RTX Titan | 307 | 249 |
| RTX 2080Ti | 284 | 235 |
| RTX 2080 | 232 | 193 |
| RX 5700XT | 204 | 170 |
| GTX 1080Ti | 197 | 159 |
| RTX 2070 | 209 | 174 |
| RX Vega 64 | 123 | 101 |
| GTX 1080 | 149 | 120 |
| GTX 1070 | 127 | 104 |

DOOM Eternal

DOOM Eternal brings hell to earth with the Vulkan-powered idTech 7. We test this game using the Ultra Nightmare Preset and follow our in-game benchmark run to stay as consistent as possible.

DOOM Eternal UW 1440p Ultra Nightmare

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 186 | 140 |
| RTX 3080 | 256 | 182 |
| RTX Titan | 186 | 140 |
| RTX 2080Ti | 173 | 129 |
| RTX 2080 | 131 | 98 |
| RX 5700XT | 121 | 93 |
| GTX 1080Ti | 114 | 80 |
| RTX 2070 | 109 | 82 |
| RX Vega 64 | 106 | 80 |
| GTX 1080 | 88 | 63 |
| GTX 1070 | 72 | 52 |

Gears Tactics

Gears Tactics is the latest in the Gears franchise and takes things in a completely different direction with the gameplay design. It is built on a DX12-based Unreal Engine 4 build. We used the Maximum settings allowed but refrained from enabling Variable Rate Shading, as not all cards support this feature.

Gears Tactics UW 1440p Maximum

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 77 | 67 |
| RTX 3080 | 88 | 73 |
| RTX Titan | 89 | 75 |
| RTX 2080Ti | 78 | 68 |
| RTX 2080 | 73 | 63 |
| RX 5700XT | 64 | 56 |
| GTX 1080Ti | 64 | 56 |
| RTX 2070 | 63 | 55 |
| RX Vega 64 | 53 | 46 |
| GTX 1080 | 52 | 48 |
| GTX 1070 | 42 | 38 |

Ghost Recon Breakpoint

Ghost Recon Breakpoint is powered by the latest iteration of the Anvil Next 2.0 game engine. This is the same engine that was used in Assassin's Creed Odyssey but in Breakpoint has been updated to support the Vulkan API. We performed our tests using the High Preset with the Vulkan API.

Ghost Recon Breakpoint UW 1440p Vulkan High

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 96 | 71 |
| RTX 3080 | 124 | 89 |
| RTX Titan | 100 | 73 |
| RTX 2080Ti | 94 | 73 |
| RTX 2080 | 76 | 57 |
| RX 5700XT | 67 | 53 |
| GTX 1080Ti | 68 | 52 |
| RTX 2070 | 66 | 49 |
| RX Vega 64 | 55 | 44 |
| GTX 1080 | 52 | 40 |
| GTX 1070 | 37 | 29 |

Call of Duty Modern Warfare (2019)

Call of Duty Modern Warfare is back, this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows; those results are in the ray tracing section. We tested in the 'Fog of War' mission, the same run we use for our RT performance testing. At UW 1440p we set the settings all to High.

Call of Duty Modern Warfare UW 1440p Highest

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 103 | 90 |
| RTX 3080 | 137 | 120 |
| RTX Titan | 112 | 99 |
| RTX 2080Ti | 102 | 91 |
| RTX 2080 | 86 | 76 |
| RX 5700XT | 82 | 69 |
| GTX 1080Ti | 76 | 67 |
| RTX 2070 | 70 | 63 |
| RX Vega 64 | 70 | 58 |
| GTX 1080 | 59 | 52 |
| GTX 1070 | 49 | 42 |

Resident Evil 3

The Resident Evil 3 Remake has surpassed the RE2 Remake in visuals and is the latest use of the RE Engine. While it does have DX12 support, the DX11 implementation is far superior, so we will be sticking to DX11 for this title. We use the cutscene where Jill and Carlos enter the subway car for the first time and take a 2-minute capture from that point.

Resident Evil 3 UW 1440p DX11 Maximum

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 108 | 90 |
| RTX 3080 | 147 | 121 |
| RTX Titan | 117 | 98 |
| RTX 2080Ti | 108 | 91 |
| RTX 2080 | 84 | 54 |
| RX 5700XT | 69 | 61 |
| GTX 1080Ti | 72 | 61 |
| RTX 2070 | 69 | 58 |
| RX Vega 64 | 61 | 51 |
| GTX 1080 | 52 | 44 |
| GTX 1070 | 43 | 37 |

Borderlands 3

Borderlands 3 has made its way into the test lineup thanks to strong demand from gamers and simply delivering MORE Borderlands. This game is rather intensive past the Medium preset, but since we're testing the 'Ultimate UW 1440p' card, High it is. We tested using the built-in benchmark utility.

Borderlands 3 UW 1440p High

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 81 | 63 |
| RTX 3080 | 105 | 75 |
| RTX Titan | 82 | 64 |
| RTX 2080Ti | 77 | 59 |
| RTX 2080 | 61 | 50 |
| RX 5700XT | 53 | 43 |
| GTX 1080Ti | 59 | 52 |
| RTX 2070 | 51 | 42 |
| RX Vega 64 | 33 | 27 |
| GTX 1080 | 44 | 37 |
| GTX 1070 | 36 | 31 |

Total War Saga: Troy

Total War Saga: Troy is powered by the TW Engine 3 (Total War Engine 3), and in this iteration the developers have stuck to a strictly DX11 release. We tested the game with the built-in benchmark using the Dynasty model, which represents a battle with many soldiers interacting at once and is more representative of normal gameplay.

Total War Saga: Troy UW 1440p

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 61 | 50 |
| RTX 3080 | 82 | 65 |
| RTX Titan | 68 | 54 |
| RTX 2080Ti | 63 | 50 |
| RTX 2080 | 50 | 40 |
| RX 5700XT | 38 | 30 |
| GTX 1080Ti | 50 | 43 |
| RTX 2070 | 43 | 33 |
| RX Vega 64 | 28 | 22 |
| GTX 1080 | 38 | 31 |
| GTX 1070 | 31 | 25 |

Forza Horizon 4

Forza Horizon 4 carries on the open-world racing tradition of the Horizon series. The latest DX12-powered entry is beautifully crafted, amazingly well executed, and a great showcase of DX12 games. We use the benchmark run with all of the settings set to non-dynamic and an uncapped framerate to gather these results.

Forza Horizon 4 4K Ultra

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 117 | 103 |
| RTX 3080 | 143 | 124 |
| RTX Titan | 114 | 99 |
| RTX 2080Ti | 107 | 93 |
| RTX 2080 | 93 | 81 |
| RX 5700XT | 78 | 67 |
| GTX 1080Ti | 80 | 69 |
| RTX 2070 | 74 | 64 |
| RX Vega 64 | 59 | 49 |
| GTX 1080 | 63 | 55 |
| GTX 1070 | 53 | 46 |

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job of putting DX12 to use, delivering higher performance than its DX11 mode, so we test this title in DX12. I use the second segment of the benchmark run to gather these numbers, as its heavy foliage is more indicative of in-game scenarios.

Shadow of the Tomb Raider 4K DX12 Highest

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 62 | 58 |
| RTX 3080 | 84 | 78 |
| RTX Titan | 67 | 63 |
| RTX 2080Ti | 62 | 58 |
| RTX 2080 | 50 | 47 |
| RX 5700XT | 41 | 38 |
| GTX 1080Ti | 43 | 41 |
| RTX 2070 | 41 | 39 |
| RX Vega 64 | 34 | 31 |
| GTX 1080 | 32 | 29 |
| GTX 1070 | 26 | 20 |

Rainbow 6 Siege

Rainbow 6 Siege has maintained a massive following since its launch and is consistently among Steam's top ten games by player count. In a tactical yet fast-paced competitive title, a higher framerate is always better, so we include it despite its ludicrously high framerates. We use the Vulkan Ultra preset with the High Definition Texture Pack and gather our results from the built-in benchmarking tool.

Rainbow 6 Siege 4K Vulkan Ultra

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 206 | 180 |
| RTX 3080 | 259 | 224 |
| RTX Titan | 221 | 189 |
| RTX 2080Ti | 197 | 168 |
| RTX 2080 | 156 | 134 |
| RX 5700XT | 131 | 113 |
| GTX 1080Ti | 131 | 109 |
| RTX 2070 | 138 | 118 |
| RX Vega 64 | 77 | 65 |
| GTX 1080 | 96 | 79 |
| GTX 1070 | 83 | 70 |

DOOM Eternal

DOOM Eternal brings hell to earth with the Vulkan-powered idTech 7. We test this game using the Ultra Nightmare Preset and follow our in-game benchmark run to stay as consistent as possible.

DOOM Eternal 4K Ultra Nightmare

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 111 | 93 |
| RTX 3080 | 176 | 137 |
| RTX Titan | 130 | 99 |
| RTX 2080Ti | 120 | 92 |
| RTX 2080 | 87 | 66 |
| RX 5700XT | 68 | 51 |
| GTX 1080Ti | 78 | 58 |
| RTX 2070 | 71 | 54 |
| RX Vega 64 | 62 | 47 |
| GTX 1080 | 52 | 39 |
| GTX 1070 | 48 | 38 |

Gears Tactics

Gears Tactics is the latest in the Gears franchise and takes things in a completely different direction with the gameplay design. It is built on a DX12-based Unreal Engine 4 build. We used the Maximum settings allowed but refrained from enabling Variable Rate Shading, as not all cards support this feature.

Gears Tactics 4K Maximum

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 52 | 47 |
| RTX 3080 | 64 | 59 |
| RTX Titan | 67 | 58 |
| RTX 2080Ti | 56 | 51 |
| RTX 2080 | 48 | 44 |
| RX 5700XT | 46 | 41 |
| GTX 1080Ti | 44 | 40 |
| RTX 2070 | 40 | 36 |
| RX Vega 64 | 34 | 28 |
| GTX 1080 | 33 | 30 |
| GTX 1070 | 29 | 26 |

Ghost Recon Breakpoint

Ghost Recon Breakpoint is powered by the latest iteration of the Anvil Next 2.0 game engine. This is the same engine that was used in Assassin's Creed Odyssey but in Breakpoint has been updated to support the Vulkan API. We performed our tests using the High Preset with the Vulkan API.

Ghost Recon Breakpoint 4K Vulkan High

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 66 | 50 |
| RTX 3080 | 88 | 65 |
| RTX Titan | 70 | 52 |
| RTX 2080Ti | 65 | 51 |
| RTX 2080 | 51 | 39 |
| RX 5700XT | 44 | 35 |
| GTX 1080Ti | 46 | 36 |
| RTX 2070 | 44 | 34 |
| RX Vega 64 | 38 | 30 |
| GTX 1080 | 34 | 27 |
| GTX 1070 | 25 | 20 |

Call of Duty Modern Warfare (2019)

Call of Duty Modern Warfare is back, this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows; those results are in the RT section. We tested in the 'Fog of War' mission. At 4K we set the settings all to High.

Call of Duty Modern Warfare 4K Highest

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 74 | 66 |
| RTX 3080 | 101 | 90 |
| RTX Titan | 80 | 73 |
| RTX 2080Ti | 73 | 67 |
| RTX 2080 | 60 | 53 |
| RX 5700XT | 57 | 50 |
| GTX 1080Ti | 53 | 47 |
| RTX 2070 | 51 | 45 |
| RX Vega 64 | 48 | 39 |
| GTX 1080 | 40 | 36 |
| GTX 1070 | 33 | 29 |

Resident Evil 3

The Resident Evil 3 Remake has surpassed the RE2 Remake in visuals and is the latest use of the RE Engine. While it does have DX12 support, the DX11 implementation is far superior, so we will be sticking to DX11 for this title. We use the cutscene where Jill and Carlos enter the subway car for the first time and take a 2-minute capture from that point.

Resident Evil 3 4K DX11 Maximum

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 70 | 59 |
| RTX 3080 | 97 | 81 |
| RTX Titan | 76 | 64 |
| RTX 2080Ti | 70 | 59 |
| RTX 2080 | 54 | 45 |
| RX 5700XT | 43 | 39 |
| GTX 1080Ti | 45 | 39 |
| RTX 2070 | 44 | 37 |
| RX Vega 64 | 32 | 27 |
| GTX 1080 | 33 | 29 |
| GTX 1070 | 27 | 20 |

Borderlands 3

Borderlands 3 has made its way into the test lineup thanks to strong demand from gamers and simply delivering MORE Borderlands. This game is rather intensive past the Medium preset, but since we're testing the 'Ultimate UW 1440p' card, High it is. We tested using the built-in benchmark utility.

Borderlands 3 4K High

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 56 | 48 |
| RTX 3080 | 77 | 63 |
| RTX Titan | 58 | 49 |
| RTX 2080Ti | 53 | 46 |
| RTX 2080 | 41 | 37 |
| RX 5700XT | 35 | 24 |
| GTX 1080Ti | 40 | 34 |
| RTX 2070 | 34 | 30 |
| RX Vega 64 | 21 | 18 |
| GTX 1080 | 30 | 25 |
| GTX 1070 | 24 | 21 |

Total War Saga: Troy

Total War Saga: Troy is powered by the TW Engine 3 (Total War Engine 3), and in this iteration the developers have stuck to a strictly DX11 release. We tested the game with the built-in benchmark using the Dynasty model, which represents a battle with many soldiers interacting at once and is more representative of normal gameplay.

Total War Saga: Troy 4K

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 43 | 33 |
| RTX 3080 | 57 | 43 |
| RTX Titan | 47 | 37 |
| RTX 2080Ti | 43 | 33 |
| RTX 2080 | 34 | 27 |
| RX 5700XT | 26 | 20 |
| GTX 1080Ti | 35 | 28 |
| RTX 2070 | 29 | 22 |
| RX Vega 64 | 19 | 14 |
| GTX 1080 | 26 | 20 |
| GTX 1070 | 21 | 17 |

Shadow of the Tomb Raider

Shadow of the Tomb Raider, unlike its predecessor, does a good job of putting DX12 to use, delivering higher performance than its DX11 mode, so we test this title in DX12. I use the second segment of the benchmark run to gather these numbers, as its heavy foliage is more indicative of in-game scenarios. SotTR features Ray Traced Shadows as well as DLSS, and we used both in the benchmarks, with the game set to the 'Highest' preset and RT Shadows at Ultra with DLSS enabled.

Shadow of the Tomb Raider 1440p 'Highest', RT Shadows Ultra, DLSS Enabled

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 71 | 62 |
| RTX 3080 | 94 | 85 |
| RTX Titan | 77 | 68 |
| RTX 2080Ti | 73 | 65 |
| RTX 2080 | 57 | 51 |
| RTX 2070 | 45 | 40 |

Modern Warfare

Call of Duty Modern Warfare is back, this time on a new engine running DX12 to allow for some sick DXR Ray Traced Shadows. We tested in the 'Fog of War' mission, which serves as our RT performance run. At 1440p we set the settings all to High with ray-traced shadows enabled.

Call of Duty Modern Warfare 1440p 'High' RT Shadows Enabled

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 97 | 81 |
| RTX 3080 | 126 | 104 |
| RTX Titan | 100 | 86 |
| RTX 2080Ti | 93 | 79 |
| RTX 2080 | 77 | 65 |
| RTX 2070 | 62 | 52 |

Control

Control is powered by Remedy's Northlight Storytelling Engine, severely pumped up to support multiple ray-traced effects. We ran this through our test run in the cafeteria with all ray tracing functions on high and the game set to high. DLSS was enabled for this title in the Quality setting.

Control 1440p 'High', RT Reflections, RT Shadows, DLSS Quality

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 87 | 74 |
| RTX 3080 | 108 | 91 |
| RTX Titan | 87 | 75 |
| RTX 2080Ti | 82 | 69 |
| RTX 2080 | 66 | 57 |
| RTX 2070 | 56 | 49 |
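Several of the ray tracing results here rely on DLSS, which renders internally at a lower resolution and upscales to the output resolution. The sketch below estimates that internal resolution for a 1440p output. The per-axis scale factors (about 2/3 for Quality, 0.58 for Balanced, 0.5 for Performance) are the commonly cited values for DLSS 2.0 and are an assumption here, not figures from this review; individual games may differ, and the names in the code are illustrative.

```python
# Per-axis render scale per DLSS mode (commonly cited values; assumed here).
DLSS_SCALE = {"quality": 2 / 3, "balanced": 0.58, "performance": 0.5}


def render_resolution(width, height, mode):
    """Approximate internal render resolution for a given output size and mode."""
    scale = DLSS_SCALE[mode]
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    for mode in DLSS_SCALE:
        w, h = render_resolution(2560, 1440, mode)
        print(f"{mode}: {w}x{h}")
```

Under these assumptions, Quality mode at 1440p shades roughly a 1707x960 image, well under half the pixels of the native output, which goes some way toward explaining how the heavier RT titles stay playable.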

Battlefield V

Battlefield V was one of the earlier games in the RTX 20 Series lifecycle to receive a DXR update. It was tested on the opening sequence of the Tirailleur war story, as it has consistently been one of the more demanding scenes for the ray traced reflections featured in this game. DLSS was enabled for this game.

Battlefield V 1440p 'Ultra' RT Reflections, DLSS Enabled

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 77 | 64 |
| RTX 3080 | 88 | 75 |
| RTX Titan | 74 | 62 |
| RTX 2080Ti | 72 | 60 |
| RTX 2080 | 64 | 56 |
| RTX 2070 | 54 | 48 |

Metro Exodus

Metro Exodus is the third entry in the Metro series, and as Artyom ventures away from the Metro he, and you, are able to explore the world with impressive RT Global Illumination. RTGI has proven to be quite an intense feature to run. Metro Exodus also supports DLSS, so it was used in our testing. Advanced PhysX was left disabled, but Hairworks was left on.

Metro Exodus 1440p 'Ultra' Ray Tracing 'Ultra' DLSS Enabled

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 84 | 62 |
| RTX 3080 | 101 | 79 |
| RTX Titan | 92 | 68 |
| RTX 2080Ti | 89 | 67 |
| RTX 2080 | 74 | 48 |
| RTX 2070 | 61 | 41 |

Quake 2 RTX

Quake II RTX is much like Minecraft with RTX in the sense that it is fully path-traced, so no rasterization here. This one, however, doesn't support DLSS, so you're going to have to brute force it to acceptable framerates. Thankfully, if these numbers don't do it for you, you can always adjust the resolution slider and enjoy a healthy performance boost.

Quake II RTX 1440p

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 64 | 59 |
| RTX 3080 | 78 | 70 |
| RTX Titan | 58 | 53 |
| RTX 2080Ti | 54 | 50 |
| RTX 2080 | 41 | 37 |
| RTX 2070 | 37 | 33 |

Boundary

Boundary is a multiplayer tactical shooter...in space. It's not out yet, so treat this one as more of a synthetic benchmark, as there are likely to be quite a few improvements, but for now we had access to the benchmark and it's a doozy to run. The benchmark features full ray tracing effects as well as DLSS, which we ran in Quality mode.

Boundary 1440p RT Enabled, DLSS Quality

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 53 | 31 |
| RTX 3080 | 77 | 42 |
| RTX Titan | 54 | 31 |
| RTX 2080Ti | 51 | 29 |
| RTX 2080 | 40 | 22 |
| RTX 2070 | 36 | 19 |

Bright Memory

Bright Memory is an action shooter that is currently in early access on Steam and will be called Bright Memory Infinite when it fully releases. A one-man team has turned this game into a showstopper, and it now features RT reflections as well as DLSS. We ran it at the High preset with DLSS set to Balanced for our testing.

Bright Memory 1440p RT High, DLSS Balanced

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 68 | 49 |
| RTX 3080 | 92 | 62 |
| RTX Titan | 64 | 38 |
| RTX 2080Ti | 61 | 37 |
| RTX 2080 | 50 | 33 |
| RTX 2070 | 39 | 23 |

Amid Evil

Amid Evil is a high-energy old-school shooter that seems like an unlikely recipient of RT features, but here we are with insane DXR support in a modern retro shooter. With RT Reflections, RT Shadows, and NVIDIA's DLSS on offer, we had to put this one through its paces and see how things went. The RTX version of this game is still in beta but publicly available for those who want to try it. We tested with all RT features on and DLSS enabled.

Amid Evil 1440p RT Reflections, RT Shadows, Lights 100%, DLSS Enabled

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 184 | 141 |
| RTX 3080 | 223 | 171 |
| RTX Titan | 185 | 140 |
| RTX 2080Ti | 180 | 132 |
| RTX 2080 | 147 | 111 |
| RTX 2070 | 126 | 91 |

Death Stranding

Sam Porter Bridges has delivered one of PS4's most anticipated games to the PC community and opened a whole new world of possibilities. This was the first game to feature the Decima Engine on PC and unarguably did it the best. Death Stranding may not feature ray tracing effects, but it does showcase that DLSS can be used effectively even when RT isn't around. We tested this one just like we did in our launch coverage, with DLSS enabled.

Death Stranding 1440p Highest Settings DLSS Quality

| GPU | AVG FPS | 1% Percentile |
| --- | --- | --- |
| RTX 3070 | 156 | 132 |
| RTX 3080 | 165 | 144 |
| RTX Titan | 155 | 130 |
| RTX 2080Ti | 152 | 129 |
| RTX 2080 | 130 | 110 |
| RTX 2070 | 116 | 101 |

Graphics cards and power draw have always gone hand in hand, in terms of how much performance a card puts out for the power it takes in. Measuring this accurately has not always been straightforward for reviewers and end-users. NVIDIA has developed its PCAT system, or Power Capture Analysis Tool, to capture direct power consumption from ALL graphics cards that plug into the PCIe slot, so that you can get a very clear barometer of actual power usage without relying on hacked-together methods.

The Old Way

The old method, for most anyway, was to simply use something along the lines of a Kill-A-Watt wall meter for power capture. This isn't the worst way, but as stated in our reviews, it doesn't quite capture the amount of power that the graphics card alone is using. This results in some mental gymnastics: figuring the system idle, the CPU load, and the GPU load, then estimating about where the graphics card lands. Not very accurate, to say the least.
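To make the wall-meter guesswork concrete, here is a rough sketch of the kind of estimate it involves. Everything in it is an assumption for illustration: the PSU efficiency, the idle GPU draw, the sample wattages and the function name are hypothetical and do not come from this review's testing.

```python
def estimate_gpu_power_w(wall_gpu_load_w, wall_idle_w,
                         gpu_idle_w=15.0, psu_efficiency=0.90):
    """Very rough GPU board power estimate from two wall-meter readings.

    The rise at the wall between idle and a GPU-heavy load is scaled by an
    assumed PSU efficiency to get DC-side power, then an assumed idle GPU
    draw is added back. Extra CPU load during the test is ignored, which is
    one reason this method is imprecise.
    """
    dc_delta_w = (wall_gpu_load_w - wall_idle_w) * psu_efficiency
    return dc_delta_w + gpu_idle_w


if __name__ == "__main__":
    # Hypothetical readings: 80 W at idle, 350 W during a GPU benchmark.
    print(f"{estimate_gpu_power_w(350.0, 80.0):.0f} W (estimate)")
```

Each assumed input shifts the answer by tens of watts, which is exactly the inaccuracy described above.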

Another way is to use GPU-Z. This is the least reliable method, as you have to rely entirely on the software reading from the graphics card, and graphics cards vary in how they report power usage to software. Some will only report what the GPU core itself is using and not consider what the memory or any other component is drawing.

The last way I'll mention is to use a multimeter amperage clamp on the PCIe slot, by way of a riser cable with separate cables, plus more clamps on all the PCIe power cables going into the graphics card. This method is very accurate for graphics card power but is also very cumbersome, and it typically results in you having to watch the numbers and document them as you see them rather than plotting them across a spreadsheet.

The PCAT Way

This is where PCAT (Power Capture Analysis Tool) comes into play. NVIDIA has developed quite a robust tool for measuring graphics card power at the hardware level and taking the guesswork out of the equation. The tool is quite simple to set up and get going. As far as components go, there are: a riser board for the GPU with a 4-pin Dupont cable, the PCAT module itself that everything plugs into with an OLED screen attached, 3 PCIe cables for when a card calls for more than 2x 8-pin connectors, and a Micro-USB cable that allows you to capture the data on the system you're hooked up to or on a secondary monitoring system.

Well, that's what it looks like when all hooked up on a test bench; you're not going to want to run this one in a case, for sure. Before anyone gets worried, performance is not affected at all by this and the riser board is fully compliant with PCIe Gen 4.0. I'm not so certain about those exposed power points, however, so I will be getting the hot glue gun out soon for that. Now, what does this do at this point? Well, there are two options: plug it into the computer that it's all running on and let FrameView include the metrics, but that's for NVIDIA cards only so that's a pass, or (what we do) plug it into a separate monitoring computer to observe and capture during testing scenarios.

The PCAT Power Profile Analyzer is the software tool provided to capture and monitor power readings across the PCI Express power profile. The breadth of this tool is exceptionally useful for us here on the site to really explore what we can monitor. The most useful feature to me is the ability to monitor power across all sources: the PCIe power cables (individually) and the PCIe slot itself.

Those who would rather pull long-form spreadsheets to make their own charts are fully able to do so, and can even quickly form performance per watt metrics. We've found a very fun metric to monitor is actually Watts per frame: how many watts does it take for the graphics card to produce one frame at a locked 60FPS in various games? We'll get into that next.
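As a rough sketch of the Watts per frame idea, assuming the metric is simply the card's power draw at the locked frame rate divided by that frame rate (the function name and the rounding-down step are our own illustration, not part of PCAT or FrameView):

```python
import math


def watts_per_frame(power_at_cap_w: float, fps_cap: int = 60) -> float:
    """Power drawn while holding the frame cap, divided by the cap.

    Strictly speaking, watts divided by frames-per-second is joules per frame.
    """
    if fps_cap <= 0:
        raise ValueError("fps_cap must be positive")
    return power_at_cap_w / fps_cap


# Example inputs taken from the Control (RT High, DLSS On) 1440p60 power figures.
for name, watts in [("RTX 3070", 125), ("RTX Titan", 227)]:
    exact = watts_per_frame(watts)
    # Assumption: the Watts-Per-FPS charts appear to show only the whole-number part.
    print(f"{name}: {exact:.2f} W per frame (chart-style: {math.floor(exact)})")
```

Keeping the unrounded value around is handy, since two cards that both chart as "3" can still be a fair distance apart.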

Control Power

Control was the first game that we wanted to take a look at, running at 1440p with RT and DLSS on, and then again with RT and DLSS off. This is the game that NVIDIA used when showcasing the performance per watt improvements of Ampere and, well, they were right in that claim.

Control 1440p 'High' RT High, DLSS On (watts)
GPU: GPU Full Load / Total System / 1440p60 Power Load
RTX 3070: 230 / 353 / 125
RTX 3080: 320 / 475 / 123
RTX Titan: 280 / 426 / 227
RTX 2080Ti: 258 / 391 / 221
RTX 2080: 218 / 350 / 214

Control RT Watts-Per-FPS
RTX 3070: 2
RTX 3080: 2
RTX Titan: 3
RTX 2080Ti: 3
RTX 2080: 3

 

Control 1440p 'High' No RT, DLSS Off (watts)
GPU: GPU Full Load / Total System / 1440p60 Power Load
RTX 3070: 225 / 345 / 130
RTX 3080: 322 / 468 / 132
RTX Titan: 278 / 412 / 184
RTX 2080Ti: 262 / 389 / 220
RTX 2080: 225 / 352 / 222

Control non-RT Watts-Per-FPS
RTX 3070: 2
RTX 3080: 2
RTX Titan: 3
RTX 2080Ti: 3
RTX 2080: 3

These results for Control show that NVIDIA's measurements and claims of improvements were accurate, but that's not always the case. We tested a spot in Forza Horizon 4 the same way, this time at 4K, looking at power draw when we target a locked 4K60 in that scene.

 

Forza Horizon 4 4K Ultra (watts)
GPU: GPU Idle / GPU Full Load / Total System / 4K60 Power Load
RTX 3070: 10 / 227 / 360 / 160
RTX 3080: 10 / 316 / 475 / 190
RTX Titan: 14 / 280 / 427 / 206
RTX 2080Ti: 15 / 260 / 405 / 200
RTX 2080: 16 / 220 / 350 / 170
RX 5700XT: 12 / 220 / 345 / 192
GTX 1080Ti: 14 / 243 / 377 / 203
RTX 2070: 10 / 192 / 303 / 160
RX Vega 64: 23 / 315 / 432 / 270
GTX 1080: 8 / 180 / 296 / 170

Forza Horizon 4 Watts-Per-FPS
RTX 3070: 2
RTX 3080: 3
RTX Titan: 3
RTX 2080Ti: 3
RTX 2080: 2
RX 5700XT: 3
GTX 1080Ti: 3
RTX 2070: 2
RX Vega 64: 4
GTX 1080: 2

Overclocking the GeForce RTX 3070 went much as it did for the RTX 3080. But instead of being given a power budget that allowed for a 115% power limit, we were treated to only a 109% power limit since we only had a single 8-pin to work with. That allowed us to get an additional 50MHz over the stock clock, which resulted in a stable 1950-1995MHz core clock. Memory fared much, much better: we were able to see upwards of +1250MHz on the memory but settled on a much more comfortable +1000MHz, making the effective speed 16Gbps with a memory bandwidth of 512GB/s, up from the stock 14Gbps and 448GB/s.
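For anyone wanting to check those bandwidth figures, here is a minimal sketch of the arithmetic, assuming the RTX 3070's 256-bit memory bus and the usual per-pin data rate times bus width divided by 8 bits per byte (the function name is ours, purely for illustration):

```python
def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int = 256) -> float:
    """Peak memory bandwidth in GB/s from per-pin data rate (Gbps) and bus width (bits)."""
    if data_rate_gbps <= 0 or bus_width_bits <= 0:
        raise ValueError("data rate and bus width must be positive")
    return data_rate_gbps * bus_width_bits / 8


print(memory_bandwidth_gbs(14))  # stock 14Gbps -> 448.0
print(memory_bandwidth_gbs(16))  # +1000MHz offset, 16Gbps effective -> 512.0
```

The same formula shows why memory overclocks pay off so directly on a fixed bus width: every extra 1Gbps of data rate adds 32GB/s of peak bandwidth on a 256-bit card.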

We went a step further this time and included undervolting here rather than in a separate article like we did for the RTX 3080; check that article if you want a guide on how to undervolt Ampere for hidden efficiency potential. We settled with the core at 925mV (not as much downward momentum as the RTX 3080) and a core clock of 1900MHz that ran at around 1925MHz in-game. We set the memory to the same +1000MHz for an effective rate of 16Gbps and 512GB/s of bandwidth. Looking at the results overall, I think it's safe to say the undervolted core plus overclocked memory is the way to go. Time to tune it up, folks!

Firestrike

Firestrike runs on the DX11 API and is still a good measure of GPU scaling performance. In this test we ran the Ultra version of Firestrike, which runs at 4K, and we recorded the Graphics Score only since the Physics and Combined scores are not pertinent to this review.

3DMark Firestrike Ultra Graphics (Score)
RTX 3070 OC: 8.6k
RTX 3070: 8.3k
RTX 3070 UV: 8.6k

Time Spy

Time Spy runs on the DX12 API and we used it in the same manner as Firestrike Ultra, recording only the Graphics Score, as the CPU score measures CPU performance and isn't important to the testing we are doing here.

3DMark Time Spy Extreme Graphics (Score)
RTX 3070 OC: 7.1k
RTX 3070: 6.7k
RTX 3070 UV: 6.9k

Forza Horizon 4

Forza Horizon 4 carries on the open-world racing tradition of the Horizon series. The latest DX12-powered entry is beautifully crafted, amazingly well executed, and a great showcase for DX12 games. We use the benchmark run with all of the settings set to non-dynamic and an uncapped framerate to gather these results.

Forza Horizon 4 4K Ultra
GPU: AVG FPS / 1% Percentile
RTX 3070 OC: 122 / 107
RTX 3070: 117 / 103
RTX 3070 UV: 116 / 102

Rainbow 6 Siege

Rainbow 6 Siege has maintained a massive following since its launch and is consistently among Steam's top ten games by player count. Because a higher framerate is essential in a tactical yet fast-paced competitive title, we include this one despite its ludicrously high framerates. We use the Vulkan Ultra preset with the High Definition Texture Pack as well and gather our results from the built-in benchmarking tool.

Rainbow 6 Siege 4K Vulkan Ultra
GPU: AVG FPS / 1% Percentile
RTX 3070 OC: 210 / 188
RTX 3070: 206 / 180
RTX 3070 UV: 212 / 185

Resident Evil 3

The Resident Evil 3 Remake has surpassed the RE2 Remake in visuals and is the latest use of the RE Engine. While it does have DX12 support, the DX11 implementation is far superior and because of that, we will be sticking to DX11 for this title. We use the cutscene where Jill and Carlos enter the subway car for the first time and take a 2-minute capture from that point.

Resident Evil 3 4K DX11 Maximum
GPU: AVG FPS / 1% Percentile
RTX 3070 OC: 75 / 63
RTX 3070: 70 / 59
RTX 3070 UV: 72 / 61

Thermals

Thermals were measured on our open test bench after running Time Spy graphics test 2 on loop for 30 minutes, recording the highest temperatures reported. The room was climate controlled and kept at a constant 22°C throughout the testing.

Temperatures (22°C Ambient)
GPU: Load / Idle
RTX 3070 OC: 74 / 30
RTX 3070: 73 / 30
RTX 3070 UV: 71 / 30

Power Consumption While Overclocked

 

Overclocked Power Draw (watts)
GPU: GPU Idle / GPU Full Load / Total System
RTX 3070 OC: 12 / 240 / 385
RTX 3070: 10 / 221 / 360
RTX 3070 UV: 10 / 177 / 315
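To put a rough number on the undervolting payoff, the sketch below divides each configuration's Time Spy Extreme graphics score by its GPU full-load power. This rests on an assumption the testing above doesn't confirm: that the full-load power figures are broadly representative of the Time Spy run. The scores are also rounded to the nearest hundred in the chart, so treat the output as a ballpark only. The names in the code are ours.

```python
# Approximate figures from the results above:
# (Time Spy Extreme graphics score, GPU full-load power in watts).
RESULTS = {
    "RTX 3070 OC": (7100, 240),
    "RTX 3070": (6700, 221),
    "RTX 3070 UV": (6900, 177),
}


def points_per_watt(score: float, watts: float) -> float:
    """Benchmark points delivered per watt of GPU power."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return score / watts


def rank_by_efficiency(results: dict) -> list:
    """Configuration names sorted from most to least efficient."""
    return sorted(results, key=lambda name: points_per_watt(*results[name]), reverse=True)


for name in rank_by_efficiency(RESULTS):
    score, watts = RESULTS[name]
    print(f"{name}: {points_per_watt(score, watts):.1f} points per watt")
```

Under those assumptions the undervolted configuration lands at roughly 39 points per watt against about 30 for both the stock and overclocked runs.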

That's what NVIDIA said regarding what to expect from the GeForce RTX 3070 and, well, they were right. It's been a somewhat expected trend over past launches: the GTX 970 was similar to the 780Ti, the GTX 1070 was similar to the GTX 980Ti, and so forth, but they never quite matched the previous-gen top dog Ti part until now. The GeForce RTX 3070 delivers on the promise that NVIDIA made, and manages to do it on the smaller-tier GA104 die and with lower memory bandwidth to boot.

The card itself comes in a fair bit more compact package, fitting that 2080Ti performance into an RTX 2070-sized overall package, but with a much smaller PCB design that allows the flow-through heatsink design to be carried over from the bigger GA102-based RTX 3080 and RTX 3090. Can I point out one major complaint? The GeForce logo doesn't light up. Seriously, come on guys.

The new cooler design does allow the card to run very cool, much cooler than expected given the unusually high 220W TDP for a 70-class card. The GeForce RTX 3070 does retain the now expected 12-pin connector but only needs a single 8-pin to be plugged in. I did ask NVIDIA if there was any harm in using a dual 8-pin to 12-pin adapter, because most modular PSU cables plug into dual 8-pin on the PSU itself, and they assured me there was no issue. On a side note, I did borrow a dual 8-pin to 12-pin adapter and found myself able to push the core to the 2100MHz mark with ease, so I imagine there's quite some headroom on those overbuilt aftermarket cards you'll see reviews go up for soon enough.

The performance was absolutely solid. Want to pick the GeForce RTX 3070 for 1440p gaming? It'll kill it. Ultrawide 1440p? It'll kill that too. 4K? I would recommend stepping up a class, but the 3070 will get you in the game if you're okay with dialing back some settings. You're going to find RTX-enabled games running very well at the 1440p mark when paired with DLSS, especially if the game is one of the ever-growing list of titles that support DLSS 2.0, which is quite killer. Upcoming games like Cyberpunk 2077 and Watch Dogs: Legion are titles you'll want to take advantage of the RTX suite with.

Features like AV1 might not be fully mainstream yet, but the GeForce RTX 3070 carries support for it. Other features like the NVIDIA Broadcast suite are absolutely undeniable in the value department; that one is available across a large swath of GeForce cards, but it's still worth mentioning.

The elephant in the room here is the 8GB of VRAM. It's an understandable concern, as this marks the 3rd generation where 8GB was the available VRAM for the 70-class card. I'm sure many were hoping to see more here and I don't blame them. But so far we haven't seen it be an issue at 1440p or even 4K, as the results show. In many cases where performance appeared to be VRAM-limited, we found it more likely to be a bandwidth-limited situation, and the gap closed once the memory was overclocked.

Speaking of overclocking, while the core might not have moved much, the memory on these cards is absolutely insane for overclocking. I had no issues pushing this card to +1250MHz on the memory, resulting in a memory bandwidth of 528GB/s. While that's still less than the RTX 2080Ti's 616GB/s, it's impressive for a 256-bit bus card, especially considering its stock bandwidth is the usual 448GB/s. But looking at the results, it's clear the best performance comes from a combination of GPU core undervolting and memory overclocking.

The NVIDIA GeForce RTX 3070 has raised the bar on what to expect in terms of performance at $499, for now. It's able to deliver high frame rate gaming at 1440p and ultrawide 1440p while still being able to punch its way into the 4K class. Its solid RTX feature support and all-around performance at 1440p keep it firmly in a strong position in that highly coveted 1440p high-refresh market.
