NVIDIA Ada GPU - Next-Gen Display Engine, AV1, HDMI 2.1, RTX IO & More

With each new generation of graphics cards, NVIDIA updates its range of display technologies. This generation is no different, and we see some significant updates to the display engine and the graphics interconnect. Faster GDDR6X memory provides higher bandwidth, and together with faster compression and more cache, it lets gaming applications run at higher resolutions with more detail on the display.


The Ada Display Engine supports two new display technologies, HDMI 2.1 and DisplayPort 1.4a with DSC 1.2a. HDMI 2.1 allows up to 48 Gbps of total bandwidth and up to 4K 240Hz HDR and 8K 60Hz HDR.

DisplayPort 1.4a allows for up to 8K resolution at a 60Hz refresh rate and includes VESA's Display Stream Compression (DSC) 1.2a technology, which is visually lossless. You can run up to two 8K displays at 60 Hz using two cables, one for each display. In addition, Ada supports HDR processing natively, with tone mapping added to the HDR pipeline.
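To see why DSC matters at these resolutions, here is a rough back-of-the-envelope sketch in Python. It is not from NVIDIA's material: the function names are made up for illustration, the payload rates are taken from the DisplayPort HBR3 (8b/10b) and HDMI 2.1 FRL (16b/18b) line-coding overheads, a 10-bit RGB pixel format (30 bits per pixel) is assumed for HDR, and blanking intervals are ignored, so real requirements are somewhat higher.

```python
# Nominal link rates in Gbit/s, reduced by line-coding overhead.
# These constants come from the interface specs, not from the article.
DP14_PAYLOAD_GBPS = 32.4 * 8 / 10      # DisplayPort HBR3, 8b/10b coding
HDMI21_PAYLOAD_GBPS = 48.0 * 16 / 18   # HDMI 2.1 FRL, 16b/18b coding


def uncompressed_gbps(width, height, refresh_hz, bits_per_pixel):
    """Active-pixel data rate in Gbit/s, ignoring blanking intervals."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def min_compression_ratio(required_gbps, payload_gbps):
    """Smallest compression ratio that fits the stream into the link."""
    return max(1.0, required_gbps / payload_gbps)


# Assumption: HDR is carried as 10 bits per channel RGB (30 bits per pixel).
rate_8k60 = uncompressed_gbps(7680, 4320, 60, 30)
rate_4k240 = uncompressed_gbps(3840, 2160, 240, 30)

for label, rate in [("8K 60Hz", rate_8k60), ("4K 240Hz", rate_4k240)]:
    dp = min_compression_ratio(rate, DP14_PAYLOAD_GBPS)
    hdmi = min_compression_ratio(rate, HDMI21_PAYLOAD_GBPS)
    print(f"{label}: {rate:.2f} Gbit/s uncompressed, "
          f"needs {dp:.2f}:1 on DP 1.4a, {hdmi:.2f}:1 on HDMI 2.1")
```

Under these assumptions, 4K at 240Hz and 8K at 60Hz carry the same pixel rate, and both exceed the uncompressed payload of either link, which is where stream compression comes in.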

Ada GPUs take streaming and video content to the next level, incorporating support for AV1 video encoding in the Ada eighth-generation dedicated hardware encoder (known as NVENC). Prior generation Ampere GPUs supported AV1 decoding but not encoding. Ada’s AV1 encoder is 40% more efficient than the H.264 encoder used in GeForce RTX 30 Series GPUs. With AV1, users who stream at 1080p today can raise their stream resolution to 1440p at the same bitrate and quality. Viewers with 1080p displays also benefit, since the stream will look closer to 1440p quality.

Ada GPUs are also equipped with dual NVENC encoders. This enables video encoding at 8K/60 for professional video editing, or four 4K/60 streams. (Game streaming services can also take advantage of this to enable more simultaneous sessions, for instance.) Blackmagic Design’s DaVinci Resolve, the popular Voukoder plugin for Adobe Premiere Pro, and Jianying (the top video editing app in China) are all enabling AV1 support, as well as dual-encoder support through encode presets. Dual-encoder and AV1 support for these apps will be available in October. NVIDIA is also working with the popular video-effects app Notch to enable AV1, as well as with Topaz to enable support for AV1 and the dual encoders.


In addition to NVENC, Ada GPUs also include the fifth-generation hardware decoder (known as NVDEC) that was first launched with Ampere. NVDEC supports hardware-accelerated video decoding of the MPEG-2, VC-1, H.264 (AVCHD), H.265 (HEVC), VP8, VP9, and AV1 video formats. 8K/60 decoding is also fully supported. NVIDIA is also working to enable high-quality video production using AI in the future.

NVIDIA RTX IO - Blazing Fast Read Speeds With GPU Utilization

As storage sizes have grown, so has storage performance. Gamers are increasingly turning to SSDs to reduce game load times: while hard drives are limited to 50-100 MB/sec throughput, the latest M.2 PCIe Gen4 SSDs deliver up to 7 GB/sec. With the traditional storage model, game data is read from the disk, then passed through system memory and the CPU before reaching the GPU.

Historically, games have read files from the hard disk, using the CPU to decompress the game image. Developers have used lossless compression to reduce install sizes and improve I/O performance. However, as storage performance has increased, traditional file systems and storage APIs have become a bottleneck. For example, decompressing game data from a 100 MB/sec hard drive takes only a few CPU cores, but decompressing data from a 7 GB/sec PCIe Gen4 SSD can consume more than twenty AMD Ryzen Threadripper 3960X CPU cores!

Using the traditional storage model, game decompression can consume all 24 cores on a Threadripper CPU. Modern game engines have exceeded the capability of traditional storage APIs, and a new generation of I/O architecture is needed. (In NVIDIA's chart, the gray bars show data transfer rates and the black/blue blocks show the CPU cores required.)
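The scaling problem can be illustrated with a deliberately crude linear model. This is only a sketch: it assumes the cores required grow in proportion to drive throughput, anchored to the one data point the article gives (24 cores at 7 GB/sec). The function name and the SATA SSD entry are hypothetical additions for illustration, and the result is not meant to reproduce NVIDIA's chart.

```python
import math

# Reference point taken from the article: a 7 GB/sec Gen4 SSD can keep
# all 24 cores of a Threadripper 3960X busy with decompression.
REF_DRIVE_MBPS = 7000
REF_CORES = 24


def cores_needed(drive_mbps):
    """Estimate CPU cores needed to decompress at the drive's full read rate.

    Assumes cost scales linearly with throughput (a simplification).
    """
    return math.ceil(drive_mbps * REF_CORES / REF_DRIVE_MBPS)


# The 550 MB/sec SATA SSD figure is an illustrative value, not from the article.
for name, mbps in [("Hard drive", 100), ("SATA SSD", 550), ("PCIe Gen4 SSD", 7000)]:
    print(f"{name:14s} {mbps:5d} MB/sec -> about {cores_needed(mbps)} core(s)")
```

Even this simple model shows the CPU cost growing from roughly one core to an entire high-end desktop processor as drives get faster, which is the bottleneck the next section addresses.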

NVIDIA RTX IO is a suite of technologies that enable rapid GPU-based loading and decompression of game assets, accelerating I/O performance by up to 100x compared to hard drives and traditional storage APIs. When used with Microsoft’s new DirectStorage for Windows API, RTX IO offloads dozens of CPU cores’ worth of work to your RTX GPU, improving frame rates, enabling near-instantaneous game loading, and opening the door to a new era of large, incredibly detailed open-world games.


Object pop-in and stutter can be reduced, and high-quality textures can be streamed at incredible rates, so even if you’re speeding through a world, everything runs and looks great. In addition, with lossless compression, game download and install sizes can be reduced, allowing gamers to store more games on their SSD while also improving their performance.

NVIDIA RTX IO plugs into Microsoft’s upcoming DirectStorage API, which is a next-generation storage architecture designed specifically for state-of-the-art NVMe SSD-equipped gaming PCs and the complex workloads that modern games require. Together, streamlined and parallelized APIs specifically tailored for games allow dramatically reduced IO overhead and maximize performance/bandwidth from NVMe SSDs to your RTX IO-enabled GPU.

Specifically, NVIDIA RTX IO brings GPU-based lossless decompression, allowing reads through DirectStorage to remain compressed and delivered to the GPU for decompression. This removes the load from the CPU, moving the data from storage to the GPU in a more efficient, compressed form, and improving I/O performance by a factor of two.
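A small sketch of the arithmetic behind that factor of two: if data stays compressed until it reaches the GPU, the drive's read rate is effectively multiplied by the compression ratio. The 2:1 ratio is an assumption chosen to match the quoted figure, and the 28 GB asset size and function names are hypothetical values used only for illustration.

```python
def effective_gb_per_sec(drive_gb_per_sec, compression_ratio):
    """Uncompressed data delivered per second when reads stay compressed."""
    return drive_gb_per_sec * compression_ratio


def load_time_sec(asset_gb, drive_gb_per_sec, compression_ratio=1.0):
    """Time to deliver `asset_gb` of uncompressed assets to the GPU."""
    return asset_gb / effective_gb_per_sec(drive_gb_per_sec, compression_ratio)


# Assumptions: 7 GB/sec Gen4 SSD (from the article), 2:1 lossless compression,
# and a hypothetical 28 GB of uncompressed level data.
raw = load_time_sec(28, 7.0)
compressed = load_time_sec(28, 7.0, compression_ratio=2.0)

print(f"Effective rate: {effective_gb_per_sec(7.0, 2.0):.0f} GB/sec")
print(f"Load time: {raw:.0f} s uncompressed vs {compressed:.0f} s compressed")
```

This ignores decompression time itself, which is reasonable only if the GPU can decompress faster than the drive can deliver data, as the article goes on to claim.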

GeForce RTX GPUs will deliver decompression performance beyond the limits of even Gen4 SSDs, offloading potentially dozens of CPU cores’ worth of work to ensure maximum overall system performance for next-generation games. Lossless decompression is implemented with high-performance compute kernels that are scheduled asynchronously. This functionality leverages the DMA and copy engines of Turing and Ampere, as well as the advanced instruction set and architecture of these GPUs’ SMs.

The advantage of this is that the enormous compute power of the GPU can be used for burst or bulk loading (at level load, for example), when GPU resources can act as high-performance I/O processors and deliver decompression performance well beyond the limits of Gen4 NVMe. During streaming scenarios, the required bandwidth is a tiny fraction of what the GPU can handle, and the work is scheduled using the advanced asynchronous compute capabilities of Turing and Ampere. Microsoft is targeting a developer preview of DirectStorage for Windows for game developers next year, and NVIDIA Turing and Ampere gamers will be able to take advantage of RTX IO-enhanced games as soon as they become available.
