DLSS 3 vs DLSS 2 vs Native – GeForce RTX 4090’s Ace?

Alessio Palumbo

When NVIDIA unveiled the GeForce RTX 4000 Series graphics cards as the big announcement of the GTC 2022 GeForce Beyond special broadcast, it was immediately clear that DLSS 3 played an important role in achieving the unprecedented generational performance jump (2x-4x) claimed by NVIDIA.

Almost all of the benchmarks shared by the manufacturer included the new DLSS 3 technology, and the few that didn't showed performance improvements over the GeForce RTX 3000 Series that were more in line with what we have come to expect from a new generation of graphics cards.


Now that the GeForce RTX 4090, the flagship GPU (at least until the inevitable Ti model) and also the first from the brand new Ada Lovelace architecture to launch, has been in reviewers' hands for a while, we've been able to verify just how much DLSS 3 supercharges performance. First things first, though, let's take a look at what's under the hood.

The new GeForce RTX graphics cards are equipped with fourth-generation Tensor Cores, which include a new 8-Bit Floating Point (FP8) Tensor Engine, increasing throughput by up to 5X to an estimated 1.32 Tensor-petaFLOPS on the RTX 4090.

With DLSS 3, however, NVIDIA is going one step beyond DLSS Super Resolution. There is now a new DLSS Frame Generation convolutional autoencoder that generates an entire frame on its own, based on optical flow fields calculated with the Optical Flow Accelerator.

Optical Flow Accelerators have been available in NVIDIA GPUs since the Turing architecture. However, as previously explained by VP of Applied Deep Learning Research Bryan Catanzaro, the new graphics cards are equipped with a significantly faster and more advanced version of the OFA, which is why DLSS 3 is currently an exclusive of GeForce RTX 4000 graphics cards.

The generated frame sits in-between frames reconstructed with DLSS Super Resolution. As such, NVIDIA claims that in every two frames, only one-eighth of the displayed pixels were rendered normally, while the rest were reconstructed between Super Resolution and Frame Generation, delivering massive frame rate improvements.
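The one-eighth figure can be reproduced with a little arithmetic. The sketch below is only an illustration: it assumes the Super Resolution Performance preset renders at half the output width and height (and the Quality preset at roughly two-thirds), which are commonly cited scale factors rather than numbers stated in this article, and the function name is made up for the example.

```python
from fractions import Fraction


def rendered_pixel_fraction(axis_scale: Fraction, generated_per_rendered: int = 1) -> Fraction:
    """Share of displayed pixels that were rendered the traditional way.

    axis_scale: assumed render scale per axis (e.g. 1/2 for the Performance preset).
    generated_per_rendered: frames produced by Frame Generation per rendered frame.
    """
    # Super Resolution renders only axis_scale^2 of the pixels in each rendered frame.
    per_rendered_frame = axis_scale * axis_scale
    # Frame Generation adds whole frames in which no pixel is traditionally rendered.
    return per_rendered_frame / (1 + generated_per_rendered)


performance = rendered_pixel_fraction(Fraction(1, 2))  # assumed Performance preset scale
quality = rendered_pixel_fraction(Fraction(2, 3))      # assumed Quality preset scale

print(performance)  # 1/8
print(quality)      # 2/9
```

Under these assumptions, the Quality preset used for the tests below would still leave only about two-ninths of the displayed pixels rendered normally.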

To account for the increased latency caused by Frame Generation, NVIDIA has embedded its latency-lowering Reflex technology to ensure responsiveness would remain optimal.

Our Hassan has been able to test the GeForce RTX 4090 with all the DLSS 3 compatible games that NVIDIA shared with reviewers. He chose the Quality preset (at 4K resolution, obviously) because he felt that the new graphics card already ran most games fast enough that it wouldn't make sense to drop the base rendering resolution by lowering DLSS presets.

Marvel's Spider-Man Remastered

Marvel's Spider-Man Remastered (Quality) 4K (Maxed)

Configuration     Avg FPS   Min FPS   Latency (ms)
4090 DLSS 3       171       158       8.2
4090 DLSS 2       112       94        9.5
3090 Ti DLSS 2    90        74        10.8
4090 SMAA         92        80        10.5
3090 Ti TAA       55        42        12.2


Cyberpunk 2077

Next up is CD Projekt RED's Cyberpunk 2077, the last game to use the studio's in-house REDengine before the switch to Unreal Engine 5. Do note that the tested build did not include the upcoming Ray Tracing Overdrive Mode, which was also announced during the GeForce Beyond broadcast. Overdrive Mode will add advanced, taxing ray-traced techniques like RTX Direct Illumination, full-resolution reflections, and multi-bounce indirect lighting. NVIDIA estimates it will reduce performance by around 51 FPS at 4K with DLSS 3, though DLSS 3 may be able to absorb the hit better than DLSS 2.

With the current game, though, DLSS 3 only improved average FPS by 16.1% and the 1% low frame rate by 15.3% over DLSS 2.

Cyberpunk 2077 (Quality)

Configuration     Avg FPS   Min FPS   Latency (ms)
4090 DLSS 3       170       153       5
4090 DLSS 2       141       125       7
3090 Ti DLSS 2    85        78        11
4090 Native       88        80        11
3090 Ti Native    52        45        14


A Plague Tale: Requiem

Next is one of the first games to be publicly released with DLSS 3 support, Asobo Studio's A Plague Tale: Requiem (due next week - look forward to our review shortly). Powered by Asobo's in-house engine, A Plague Tale: Requiem features updated tech that can support a much higher number of rats compared to the original game, as well as improved dynamic lighting. The final version will also include some form of ray tracing, but the tested build did not.

In this case, DLSS 3 provides a 29% performance increase over DLSS 2 in average FPS and a 39.1% improvement in the 1% low frame rate. The boost will likely be greater once ray tracing is enabled, though.
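For readers who want to check the percentages quoted throughout this article, the uplift is simply the ratio of the two frame rates minus one. Here is a minimal sketch using the A Plague Tale: Requiem results for the RTX 4090 (142 vs 110 average FPS, 128 vs 92 for the 1% low); the helper name is ours, not part of any tool.

```python
def uplift_percent(new_fps: float, baseline_fps: float) -> float:
    """Relative frame rate gain of new_fps over baseline_fps, in percent."""
    return (new_fps / baseline_fps - 1.0) * 100.0


# RTX 4090, DLSS 3 vs DLSS 2, A Plague Tale: Requiem (Quality preset)
avg_gain = uplift_percent(142, 110)
low_gain = uplift_percent(128, 92)

print(f"Average FPS uplift: {avg_gain:.1f}%")  # Average FPS uplift: 29.1%
print(f"1% low uplift: {low_gain:.1f}%")       # 1% low uplift: 39.1%
```

The same helper works for any pair of results, for example DLSS 3 against native rendering.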

A Plague Tale: Requiem (Quality)

Configuration     Avg FPS   1% Low FPS   Latency (ms)
4090 DLSS 3       142       128          7
4090 DLSS 2       110       92           10
3090 Ti DLSS 2    64        52           15
4090 Native       74        60           15
3090 Ti Native    41        30           19


F1 22

Codemasters' F1 22, powered by the EGO Engine 4.0, is by far the least taxing out of all the games tested, delivering the highest frame rate even with its ray tracing option enabled.

As such, in this year's edition of the officially licensed Formula 1 game, DLSS 3 can only further boost average FPS by 20.5% and minimum FPS by 22.4%.

F1 22 (Quality) 4K (Maxed RT)

Configuration     Avg FPS   Min FPS   Latency (ms)
4090 DLSS 3       170       153       5
4090 DLSS 2       141       125       7
3090 Ti DLSS 2    85        78        11
4090 Native       88        80        11
3090 Ti Native    52        45        14


Microsoft Flight Simulator

The real power of DLSS 3 can be seen in Microsoft Flight Simulator. Whereas DLSS 2 could not improve performance in CPU-bound games in any meaningful way, the key component of DLSS 3, Frame Generation, is completely independent of any CPU bottleneck.

As such, there is a massive 106% increase in average FPS and an even greater 115% improvement in minimum FPS over the DLSS 2 implementation.
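To see why upscaling alone cannot help here while Frame Generation can, consider a deliberately simplified model. It assumes the rendered frame rate is capped by whichever of the CPU or GPU is slower, and that Frame Generation adds one generated frame per rendered frame at no cost. Both are idealizations, and the numbers below are hypothetical, not measurements.

```python
def displayed_fps(cpu_fps: float, gpu_fps: float, frame_generation: bool = False) -> float:
    """Toy model: rendered frames are limited by the slower of CPU and GPU.

    Frame Generation is modeled as doubling the displayed frames without
    touching the CPU; its own GPU cost is ignored (an idealization).
    """
    rendered = min(cpu_fps, gpu_fps)
    return rendered * 2 if frame_generation else rendered


CPU_LIMIT = 60.0  # hypothetical CPU-bound simulation rate

native = displayed_fps(CPU_LIMIT, gpu_fps=90.0)     # GPU already outpaces the CPU
upscaled = displayed_fps(CPU_LIMIT, gpu_fps=150.0)  # upscaling speeds up the GPU only
generated = displayed_fps(CPU_LIMIT, gpu_fps=150.0, frame_generation=True)

print(native, upscaled, generated)  # 60.0 60.0 120.0
```

In this toy scenario the upscaled result is stuck at the CPU limit, while Frame Generation roughly doubles it, which is the same pattern the Microsoft Flight Simulator results show.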

Microsoft Flight Simulator (Quality) 4K (Maxed)

Configuration     Avg FPS   Min FPS   Latency (ms)
4090 DLSS 3       128       112       9
4090 DLSS 2       62        52        19
3090 Ti DLSS 2    32        24        30
4090 TAA          58        50        20
3090 Ti TAA       31        22        33


Unity Enemies Demo

The last DLSS 3 test provided by NVIDIA was the gorgeous Unity Engine Enemies tech demo, originally showcased at GDC 2022. In this case, though, we could not make a direct comparison with DLSS 2 as it was not available as an option in the demo. Compared to native rendering, DLSS 3 provides a 235% average FPS uplift and a 319% boost in the 1% low frame rate.

Unity Engine (Enemies) Demo 4K

Configuration     Avg FPS   1% Low FPS   Latency (ms)
4090 DLSS 3       94        88           10.8
4090 Native       28        21           35.4


Summary

As NVIDIA noted during its presentation of the technology, DLSS 3 can really supercharge performance during CPU-bound scenarios like Microsoft Flight Simulator as well as in the most advanced ray traced games. As such, its true potential will be unlocked with tomorrow's games.

When tested in titles that already run at very high frame rates, its boost compared to regular DLSS 2 is more limited (at least when using the Quality preset - I reckon the Performance and Ultra Performance presets may widen the gap). That's mostly because the RTX 4090 is a beast of its own, delivering substantial performance gains over the previous generation's top cards even when using DLSS 2 or native rendering. If you've ever wanted to play games at 4K and 144+ FPS with all graphics settings turned to the max, the RTX 4090 and DLSS 3 can easily deliver that.

As first noted in Digital Foundry's initial hands-on with the technology, the Frame Generation component can sometimes introduce artifacts. However, those are really hard to notice during regular gameplay. It's also possible that the Frame Generation algorithm will be improved over time to diminish these glitches, much like NVIDIA did with DLSS Super Resolution.

Last but not least, I must admit that I was most impressed by the latency measurements. During press presentations, NVIDIA engineers had kind of hinted that the lowest latency would be obtained by a combination of DLSS 2 and Reflex rather than DLSS 3 due to its Frame Generation component. However, the data shows DLSS 3 coming out on top in all cases, sometimes with a meaningful difference over DLSS 2 + Reflex. More testing will be required, but it seems like RTX 4000 Series owners may not have a reason to turn off Frame Generation.
