NVIDIA GeForce RTX 20 Series Review Ft. RTX 2080 Ti & RTX 2080 Founders Edition Graphics Cards – Turing Ray Traces The Gaming Industry
NVIDIA GeForce RTX 2080 Ti & GeForce RTX 2080
19th September, 2018
NVIDIA Turing GPU - Turing Advanced Shading Techniques
NVIDIA is also incorporating new shading models that should significantly help games process vertex, tessellation, and geometry shading.
- Mesh Shading — new shader model for vertex, tessellation, geometry shading (more objects per scene)
- Variable Rate Shading (VRS) — developer control over shading rates (to limit shading where it does not provide visual benefit)
- Texture-Space Shading — storing shading results in memory (no need to duplicate shading work)
- Multi-View Rendering (MVR) — Extends Pascal’s Single Pass Stereo to multi-views in a single pass
MESH SHADING
Mesh Shading introduces two new shader stages, Task Shaders and Mesh Shaders, that support the same functionality as the vertex, tessellation, and geometry shading stages, but with much more flexibility. The mesh shader stage produces triangles for the rasterizer, but internally, instead of using a single-thread program model, it uses a cooperative thread model similar to compute shaders.
Ahead of the mesh shader in the pipeline is the task shader. The task shader operates similarly to the hull shader stage of tessellation, in that it is able to dynamically generate work. However, like the mesh shader, it uses a cooperative thread model and instead of having to take a patch as input and tessellation decisions as output, its input and output are user-defined.
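To make the data flow concrete, here is a rough CPU-side sketch in Python. It is not GPU code: the `Meshlet` layout, the `visible` flag standing in for a user-defined culling test, and all function names are assumptions made for illustration. Real task and mesh shaders run as cooperative thread groups written in a shading language.

```python
from dataclasses import dataclass

@dataclass
class Meshlet:
    vertices: list    # (x, y, z) positions
    triangles: list   # (i, j, k) indices into `vertices`
    visible: bool     # hypothetical stand-in for a user-defined culling test

def task_stage(meshlets):
    """Task-shader analogue: decide dynamically which mesh work to launch."""
    return [m for m in meshlets if m.visible]

def mesh_stage(meshlet):
    """Mesh-shader analogue: emit triangles for the rasterizer."""
    return [tuple(meshlet.vertices[i] for i in tri) for tri in meshlet.triangles]

def run_pipeline(meshlets):
    triangles = []
    for m in task_stage(meshlets):
        triangles.extend(mesh_stage(m))
    return triangles

quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
scene = [
    Meshlet(quad, [(0, 1, 2), (0, 2, 3)], visible=True),
    Meshlet(quad, [(0, 1, 2)], visible=False),   # culled by the task stage
]
emitted = run_pipeline(scene)
print(len(emitted))  # 2
```

The point of the sketch is the split of responsibilities: the task stage has user-defined input and output and only decides how much work to generate, while the mesh stage is the one that hands triangles to the rasterizer.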
VARIABLE RATE SHADING
Turing introduces a new and dramatically more flexible capability for controlling shading rate called Variable Rate Shading (VRS). With VRS, shading rate can now be adjusted dynamically at an extremely fine level—every 16-pixel x 16-pixel region of the screen can now have a different shading rate.
This fine level of control enables developers to deploy new algorithms for optimizing shading rate and increasing performance that were not previously possible. The developer has up to seven options to choose from for each 16x16 pixel region, including having one shading result be used to color four pixels (2 x 2), or 16 pixels (4 x 4), or non-square footprints like 1 x 2 or 2 x 4.
Overall, with Turing’s VRS technology, a scene can be shaded with a mixture of rates varying between once per visibility sample (super-sampling) and once per sixteen visibility samples. The developer can specify shading rate spatially (using a texture) and using a per-primitive shading rate attribute. As a result, a single triangle can be shaded using multiple rates, providing the developer with fine-grained control.
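A little arithmetic shows what these rates mean in practice. The Python sketch below counts shading invocations per 16 x 16 tile for each footprint and the size of the tile grid for a given resolution. The full list of seven footprints in `RATES` is an assumption (the text only names 2 x 2, 4 x 4, 1 x 2 and 2 x 4 explicitly), the helper names are made up for this example, and the super-sampling rates are left out.

```python
import math

TILE = 16  # tile edge in pixels, per the text above

# Assumed set of seven footprints as (width, height) in pixels.
RATES = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 4), (4, 2), (4, 4)]

def invocations_per_tile(rate):
    """How many shading results are needed to cover one 16x16 tile."""
    w, h = rate
    return (TILE // w) * (TILE // h)

def tile_grid(width, height):
    """Number of 16x16 tiles needed to cover a render target."""
    return math.ceil(width / TILE), math.ceil(height / TILE)

for rate in RATES:
    print(rate, invocations_per_tile(rate))

print(tile_grid(1920, 1080))  # (120, 68)
```

A full-rate tile needs 256 shading results, while a 4 x 4 tile needs only 16, which is where the potential savings come from.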
CONTENT ADAPTIVE SHADING
In Content Adaptive Shading, shading rate is lowered by considering factors such as spatial and temporal (across frames) color coherence. The desired shading rate for different parts of the next frame to be rendered is computed in a post-processing step at the end of the current frame. If the amount of detail in a particular region was relatively low (sky, a flat wall, etc.), then the shading rate can be locally lowered in the next frame.
The output of the post-process analysis is a texture specifying a shading rate per 16 x 16 tile, and this texture is used to drive shading rate in the next frame. A developer can implement content-based shading rate reduction without modifying their existing pipeline, and with only small changes to their shaders.
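As an illustration of that post-process step, the sketch below builds a per-tile rate texture from a luminance image. It is a simplified assumption rather than NVIDIA's algorithm: it only looks at spatial contrast within the current frame (no temporal term), and the thresholds `low` and `high` as well as the function names are hypothetical.

```python
TILE = 16

def tile_contrast(luma, tx, ty):
    """Max minus min luminance inside one 16x16 tile (clipped at image edges)."""
    rows = luma[ty * TILE:(ty + 1) * TILE]
    values = [v for row in rows for v in row[tx * TILE:(tx + 1) * TILE]]
    return max(values) - min(values)

def rate_for_contrast(contrast, low=0.02, high=0.10):
    """Hypothetical thresholds: flatter tiles get coarser shading."""
    if contrast < low:
        return (4, 4)
    if contrast < high:
        return (2, 2)
    return (1, 1)

def build_rate_texture(luma):
    """One (w, h) rate per 16x16 tile, to drive shading in the next frame."""
    height, width = len(luma), len(luma[0])
    tiles_x = -(-width // TILE)   # ceiling division
    tiles_y = -(-height // TILE)
    return [[rate_for_contrast(tile_contrast(luma, tx, ty))
             for tx in range(tiles_x)]
            for ty in range(tiles_y)]

# 32x16 test image: a flat left tile (think sky) and a busy right tile.
flat = [0.5] * 16
busy = [0.0, 1.0] * 8
frame = [flat + busy for _ in range(16)]
rates = build_rate_texture(frame)
print(rates)  # [[(4, 4), (1, 1)]]
```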
MOTION ADAPTIVE SHADING
The second application of Variable Rate Shading exploits object motion. Our eyes are designed to track moving objects linearly so that we can see their details even when in motion. However, objects on LCD screens do not move smoothly or continuously. Rather, they jump from one location to the next with each 60 Hz frame update.
From the perspective of our eye, which is trying to smoothly track the object, it looks like it is wiggling back and forth on the retina as its location moves ahead of and behind the path the eye is tracking. The net result is that we cannot see the full detail of the object; instead, we see a somewhat lower resolution, blurred version.
The main implication of this phenomenon is that when objects are moving rapidly in the scene, it is wasteful to shade them at full resolution. It would be more efficient to shade at a reduced sampling rate, while still at a high enough rate to be visually equivalent. The savings from optimized shading can be used to deliver a higher frame rate so that the scene is easier to follow.
VRS gives developers the tools to do this optimization. In the simplest approach, developers can use the motion vectors from Temporal AA to understand motion. The direction and magnitude of motion can be used to directly select an appropriate shading rate per tile. A related approach would be to use VRS to take advantage of blur effects in applications, where both motion blur and depth of field (DOF) are sometimes explicitly rendered. An application can directly compute the degree and direction of blur of individual objects and use the extent of blur to set a per-triangle shading rate.
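The simplest approach could look something like the Python sketch below, which maps a tile's average motion vector to a shading rate by coarsening along the axis of motion. The speed thresholds `slow` and `fast` (in pixels per frame), the function name, and the rule for avoiding 4 x 1 and 1 x 4 footprints are all assumptions for illustration, not values from NVIDIA.

```python
def rate_for_motion(mv_x, mv_y, slow=2.0, fast=8.0):
    """Map a tile's average motion vector (pixels/frame) to a (w, h) rate."""
    def axis(speed):
        # Faster motion along an axis tolerates a coarser rate on that axis.
        if speed >= fast:
            return 4
        if speed >= slow:
            return 2
        return 1

    w, h = axis(abs(mv_x)), axis(abs(mv_y))
    # Assumed constraint: 4x1 and 1x4 footprints are not available,
    # so fall back to the finer 2x1 or 1x2 rate.
    if w == 4 and h == 1:
        w = 2
    if h == 4 and w == 1:
        h = 2
    return (w, h)

print(rate_for_motion(0.5, 0.2))    # (1, 1)
print(rate_for_motion(10.0, 0.0))   # (2, 1)
print(rate_for_motion(10.0, 3.0))   # (4, 2)
print(rate_for_motion(-12.0, 12.0)) # (4, 4)
```

Falling back to the finer rate is the conservative choice here: it gives up some performance rather than risk visible quality loss on the axis with little motion.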
Note that the methods of these two examples (Content Adaptive Shading and Motion Adaptive Shading) can also be used in combination, with the final shading rate for a region/triangle computed as an application-specified function of the two rates.
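What that application-specified function looks like is up to the developer. One plausible option, sketched below in Python, is to take the coarser footprint on each axis; passing `pick=min` gives the more conservative alternative. The `VALID` set of seven footprints and the function name are assumptions for this example.

```python
# Assumed set of supported footprints as (width, height).
VALID = {(1, 1), (1, 2), (2, 1), (2, 2), (2, 4), (4, 2), (4, 4)}

def combine_rates(content_rate, motion_rate, pick=max):
    """Combine two per-tile rates axis by axis.

    pick=max takes the coarser rate on each axis,
    pick=min takes the finer (more conservative) one.
    """
    combined = (pick(content_rate[0], motion_rate[0]),
                pick(content_rate[1], motion_rate[1]))
    assert combined in VALID
    return combined

print(combine_rates((2, 2), (4, 2)))            # (4, 2)
print(combine_rates((2, 2), (4, 2), pick=min))  # (2, 2)
print(combine_rates((1, 2), (2, 1)))            # (2, 2)
```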
MULTI-VIEW RENDERING
Multi-View Rendering (MVR) allows developers to efficiently draw a scene from multiple viewpoints or even draw multiple instances of a character in varying poses, all in a single pass. Turing hardware supports up to four views per pass, and up to 32 views are supported at the API level. By fetching and shading geometry only once, Turing optimally processes triangles and their associated vertex attributes while rendering multiple versions. When accessed via the D3D12 View Instancing API, the developer simply uses the variable SV_ViewID to index different transformation matrices, reference different blend weights, or control any shader behavior that varies depending on which view is being processed.
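The idea of indexing per-view data with a view ID can be mimicked on the CPU. In the Python sketch below, `view_id` plays the role that SV_ViewID plays in a shader; the matrices, the half-unit eye offsets, and the function names are hypothetical and only serve to show one vertex being fetched once and transformed per view.

```python
def transform(matrix, point):
    """Multiply a 4x4 row-major matrix by the homogeneous point (x, y, z, 1)."""
    x, y, z = point
    vec = (x, y, z, 1.0)
    return tuple(sum(matrix[r][c] * vec[c] for c in range(4)) for r in range(4))

def translation_x(tx):
    """4x4 matrix that shifts points along X by tx."""
    return [[1, 0, 0, tx],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

# One matrix per view; the offsets are made-up left/right eye shifts.
view_matrices = [translation_x(-0.5), translation_x(+0.5)]

def shade_vertex(position, view_id):
    # view_id selects the per-view transformation matrix.
    return transform(view_matrices[view_id], position)

vertex = (1.0, 2.0, 3.0)  # fetched once, reused for every view
outputs = [shade_vertex(vertex, v) for v in range(len(view_matrices))]
print(outputs)  # [(0.5, 2.0, 3.0, 1.0), (1.5, 2.0, 3.0, 1.0)]
```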
With multiple active views, each triangle can have a mix of view-dependent attributes and view-independent attributes (values that are shared across all views). A simple example of a view-dependent attribute is a reflection direction because it depends on the eye’s position, vertex position, and a normal vector. To improve efficiency, the NVIDIA compiler analyzes the input shader and produces a compiled output that executes view-independent code once, with the result shared across all output views, while view-dependent attributes are necessarily computed once per output view.
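The split the compiler performs can be written out by hand to see what is being saved. In this Python sketch, the diffuse term is view-independent and computed once, while the reflection direction is recomputed for each eye position. The function names and the choice of a diffuse term as the shared value are assumptions for illustration; the real work happens inside NVIDIA's shader compiler.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(incident, normal):
    """Reflect a unit incident vector about a unit normal."""
    d = dot(incident, normal)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))

def shade_vertex_multiview(position, normal, light_dir, eye_positions):
    # View-independent: computed once and shared by every view.
    n = normalize(normal)
    diffuse = max(0.0, dot(n, normalize(light_dir)))

    # View-dependent: recomputed per view because it uses the eye position.
    reflections = []
    for eye in eye_positions:
        incident = normalize(tuple(p - e for p, e in zip(position, eye)))
        reflections.append(reflect(incident, n))
    return diffuse, reflections

diffuse, reflections = shade_vertex_multiview(
    position=(0.0, 0.0, 0.0),
    normal=(0.0, 0.0, 1.0),
    light_dir=(0.0, 0.0, 1.0),
    eye_positions=[(-1.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
)
print(diffuse)
print(reflections)
```

With two eyes placed symmetrically about the normal, the two reflection vectors mirror each other in X, while the diffuse value is the same single result for both views.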
Turing’s MVR is an expansion of the Simultaneous Multi-Projection (SMP) functionality introduced in the Pascal architecture. SMP was designed specifically to accelerate stereo and surround rendering cases. With SMP, the developer can specify two views, where view-dependent attributes are limited to the vertex X coordinate and the viewport(s) used for rasterization. Each view can then be multicast to a set of up to 16 pre-configured projections (or viewports) to support use cases such as Lens Matched Shading. Turing removes the limitations on allowed view-dependent attributes and increases the number of views supported while continuing to support up to 16 projections per view.