NVIDIA DLSS Explained – Much Higher Quality Than TAA or Much Faster Performance, Delivered by NVIDIA NGX

Sep 15, 2018

The lengthy Turing architecture whitepaper published by NVIDIA provides deep technical dives into all of the architecture’s improvements. In this article, we’ll focus on NVIDIA DLSS and the underlying NVIDIA NGX software architecture.

NGX, which stands for Neural Graphics Acceleration, is a new deep learning-based technology stack that is part of the RTX platform. Here’s a brief description from NVIDIA:

NGX utilizes deep neural networks (DNNs) and a set of Neural Services to perform AI-based functions that accelerate and enhance graphics, rendering, and other client-side applications. NGX employs the Turing Tensor Cores for deep learning-based operations and accelerates delivery of NVIDIA deep learning research directly to the end-user. Note that NGX does not work on GPU architectures before Turing.

NVIDIA NGX is ‘tightly’ integrated with the drivers and hardware. An NGX API (described as thin and easy for developers to use) provides access to multiple AI-based features, all pre-trained by NVIDIA.

All NVIDIA NGX features will be managed via GeForce Experience if you own a GeForce GPU, or via Quadro Experience (now available in tech preview) if you have a Quadro GPU installed. The software looks for a Turing GPU and, upon finding one in the system, downloads the NVIDIA NGX Core package as well as the deep neural network models available for the installed games and applications.
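The update flow described above can be sketched roughly as follows. This is only an illustration of the steps as the article describes them, not NVIDIA's implementation: the `Gpu` class, the `plan_ngx_downloads` function, the package naming, and the catalog contents are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    architecture: str  # e.g. "Turing" or "Pascal"


def plan_ngx_downloads(gpus, installed_apps, model_catalog):
    """Return the packages a hypothetical updater would fetch.

    model_catalog maps an application name to the DNN model names
    published for it (made-up data in this sketch).
    """
    # Step 1: NGX needs Turing, so do nothing when no Turing GPU is present.
    if not any(gpu.architecture == "Turing" for gpu in gpus):
        return []
    # Step 2: the core package is fetched once a Turing GPU is found.
    downloads = ["NGX Core"]
    # Step 3: add the models available for each installed game or application.
    for app in installed_apps:
        for model in model_catalog.get(app, []):
            downloads.append(f"{app}/{model}")
    return downloads


if __name__ == "__main__":
    catalog = {"Example Game": ["DLSS"]}  # hypothetical catalog
    gpus = [Gpu("GeForce RTX 2080", "Turing")]
    print(plan_ngx_downloads(gpus, ["Example Game", "Unlisted App"], catalog))
```

Applications with no entry in the catalog simply contribute nothing, which mirrors the idea that only models "available for the installed games and applications" are downloaded.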

These DNN models interface with DirectX, Vulkan and CUDA 10, the latest version of NVIDIA’s GPU computing toolkit. Furthermore, the DNN models and services are accelerated by Turing’s Tensor Cores and take advantage of TensorRT, NVIDIA’s high-performance inference optimizer, which delivers low latency and high throughput.

NVIDIA DLSS (Deep Learning Super Sampling) is the specific DNN model devised to solve the inherent issues of TAA (Temporal Anti-Aliasing), such as blurring and artifacts with transparency. Here, NVIDIA leveraged the demonstrated image processing capabilities of deep learning networks. DLSS can be used in two ways: at the same number of input samples, it delivers much higher quality than TAA; at a lower input sample count, it delivers much faster performance, inferring a result of similar quality to TAA while doing roughly half the shading work.
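To put the “half the shading work” figure in perspective, here is a small back-of-the-envelope calculation, assuming a 4K (3840x2160) output. The 0.5 work fraction is taken from the description above; the “equivalent resolution” is purely illustrative and is not a documented DLSS internal render resolution. Function names are our own.

```python
import math


def shaded_samples(width, height, work_fraction=1.0):
    """Number of shaded samples per frame for a given fraction of full work."""
    return round(width * height * work_fraction)


def equivalent_resolution(width, height, work_fraction):
    """Resolution with the same aspect ratio that has work_fraction of the pixels.

    Each axis is scaled by sqrt(work_fraction) so the pixel count scales
    by work_fraction. Illustrative only.
    """
    scale = math.sqrt(work_fraction)
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    full = shaded_samples(3840, 2160)        # one shaded sample per 4K pixel
    half = shaded_samples(3840, 2160, 0.5)   # "basically half the shading work"
    print(f"full: {full:,} samples, half: {half:,} samples")
    print("equivalent resolution:", equivalent_resolution(3840, 2160, 0.5))
```

In other words, halving the shading work at 4K removes a little over four million shaded samples per frame, which is where the performance headroom comes from.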

For example, at 4K resolution, DLSS delivered twice the performance of TAA in Epic’s Unreal Engine 4 Infiltrator demo. Of course, the prerequisite is a training process in which the DNN learns how to produce the desired result from a ‘large number of super high-quality examples’. In NVIDIA’s words:

To train the network, we collect thousands of “ground truth” reference images rendered with the gold standard method for perfect image quality, 64x supersampling (64xSS). 64x supersampling means that instead of shading each pixel once, we shade at 64 different offsets within the pixel, and then combine the outputs, producing a resulting image with ideal detail and anti-aliasing quality. We also capture matching raw input images rendered normally. Next, we start training the DLSS network to match the 64xSS output frames, by going through each input, asking DLSS to produce an output, measuring the difference between its output and the 64xSS target, and adjusting the weights in the network based on the differences, through a process called backpropagation. After many iterations, DLSS learns on its own to produce results that closely approximate the quality of 64xSS, while also learning to avoid the problems with blurring, disocclusion, and transparency that affect classical approaches like TAA.
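The two ideas in that description, 64x supersampled reference images and training against them, can be illustrated with a toy example. This is a minimal sketch under heavy assumptions, not DLSS itself: the “scene” is a hypothetical hard diagonal edge, the “network” is a single linear layer over a 3x3 neighbourhood of raw samples, and training uses plain full-batch gradient descent (for a single layer, backpropagation reduces to this gradient computation). All names are our own.

```python
def scene(x, y):
    """Hypothetical 'shader': white above a diagonal edge, black below."""
    return 1.0 if y > 0.37 * x + 1.3 else 0.0


def shade_1x(px, py):
    """Normal rendering: one sample at the pixel centre (the raw input)."""
    return scene(px + 0.5, py + 0.5)


def shade_64x(px, py, grid=8):
    """64x supersampling: shade at 8 x 8 = 64 offsets in the pixel, then average."""
    total = 0.0
    for i in range(grid):
        for j in range(grid):
            total += scene(px + (i + 0.5) / grid, py + (j + 0.5) / grid)
    return total / (grid * grid)


def neighbourhood(px, py):
    """The 3x3 block of raw 1x samples centred on the pixel (9 features)."""
    return [shade_1x(px + dx, py + dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def build_dataset(width=16, height=16):
    """Pairs of (raw input features, 64xSS ground-truth value)."""
    return [(neighbourhood(x, y), shade_64x(x, y))
            for y in range(height) for x in range(width)]


def predict(weights, bias, features):
    return bias + sum(w * f for w, f in zip(weights, features))


def mean_squared_error(weights, bias, data):
    return sum((predict(weights, bias, f) - t) ** 2 for f, t in data) / len(data)


def identity_weights():
    """Starting point: output equals the raw centre sample."""
    weights = [0.0] * 9
    weights[4] = 1.0
    return weights


def train(data, epochs=300, lr=0.04):
    """Full-batch gradient descent on the mean squared error."""
    weights, bias, n = identity_weights(), 0.0, len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * 9, 0.0
        for features, target in data:
            # Difference between the model output and the 64xSS target.
            error = predict(weights, bias, features) - target
            for k, f in enumerate(features):
                grad_w[k] += 2.0 * error * f / n
            grad_b += 2.0 * error / n
        # Adjust the weights against the gradient.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


if __name__ == "__main__":
    data = build_dataset()
    print("error of raw input:", mean_squared_error(identity_weights(), 0.0, data))
    weights, bias = train(data)
    print("error after training:", mean_squared_error(weights, bias, data))
```

Even this tiny model learns to use neighbouring samples to estimate how much of an edge pixel is covered, which is the same principle, at a vastly smaller scale, as a deep network learning to approximate 64xSS output from normally rendered frames.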

There’s also a DLSS 2X mode, which focuses entirely on image quality rather than performance. DLSS 2X provides quality ‘almost indistinguishable’ from that of a 64x supersampled image, which would be far too expensive to render in real time.

As we can see in the image below, DLSS 2X delivers far superior image clarity when compared to TAA. That said, we suspect the ‘performance mode’ will be the main use of NVIDIA DLSS for the time being.

While Turing comes with a variety of performance-oriented shading improvements like Mesh Shading, Variable Rate Shading, and Texture-space Shading, so far DLSS is the one seeing widespread adoption, with 25 games already confirmed to support it and developers like Phoenix Labs speaking positively of its benefits.

It’s promising, to say the least. Cross-referencing NVIDIA’s own benchmarks, the GeForce RTX 2080 with DLSS enabled should reach 57.6 FPS in Shadow of the Tomb Raider, almost catching up with the base 59 FPS registered by the RTX 2080 Ti. The RTX 2080 Ti, in turn, could soar to well over 70 FPS at 4K resolution with DLSS enabled. Shadow of the Tomb Raider isn’t even the best game to demonstrate the technology according to NVIDIA’s benchmarks: other titles like Final Fantasy XV and ARK: Survival Evolved showed much bigger gains with DLSS.
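For readers who want to check the arithmetic, the short calculation below uses only the figures quoted above (57.6 FPS, 59 FPS, and 70 FPS as a lower bound for “well over 70”). The helper names are our own, and the results are rough estimates rather than measured benchmarks.

```python
RTX_2080_DLSS_FPS = 57.6     # estimated RTX 2080 result with DLSS
RTX_2080_TI_BASE_FPS = 59.0  # RTX 2080 Ti result without DLSS
RTX_2080_TI_DLSS_FLOOR = 70.0  # lower bound for "well over 70 FPS"


def relative_gap(slower, faster):
    """Fraction by which `slower` trails `faster`."""
    return (faster - slower) / faster


def implied_uplift(base, with_feature):
    """Speed-up factor implied by going from `base` to `with_feature`."""
    return with_feature / base


if __name__ == "__main__":
    gap = relative_gap(RTX_2080_DLSS_FPS, RTX_2080_TI_BASE_FPS)
    uplift = implied_uplift(RTX_2080_TI_BASE_FPS, RTX_2080_TI_DLSS_FLOOR)
    print(f"RTX 2080 + DLSS trails the base RTX 2080 Ti by {gap:.1%}")
    print(f"RTX 2080 Ti needs at least a {uplift - 1:.1%} uplift to pass 70 FPS")
```

So the RTX 2080 with DLSS would sit within a few percent of the stock RTX 2080 Ti, and a DLSS gain of a little under 20% would already carry the RTX 2080 Ti past 70 FPS in this title.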

We look forward to learning much more about DLSS in the coming weeks.