VXAO Explained – Highest Quality AO, Available on All DX11 GPUs; Supports DX12 and OpenGL 4.5, Too
VXAO (Voxel Ambient Occlusion) was announced ahead of Rise of the Tomb Raider's launch and then added in a recent update, alongside support for DirectX 12. It's part of the GameWorks 3.1 update, along with Volumetric Lighting and HFTS (Hybrid Frustum Traced Shadows).
This is the first real-world application of VXAO, which NVIDIA considers the "next step in ambient occlusion technology" beyond HBAO+. Alexey Panteleev, Senior Developer Technology Engineer at NVIDIA, was the lead engineer on VXGI (Voxel Global Illumination), from which VXAO is derived. He also presented VXAO during his GDC 2016 talk, Advanced Ambient Occlusion Methods for Modern Games.
Unfortunately the slides haven't been published yet, but Panteleev divulged lots of information in a blog post on the official NVIDIA developer website.
He began by reiterating how VXAO was designed.
One of these features has been left with too little attention: the ambient occlusion mode, or VXAO as we call it. The idea is simple - we remove the lighting part [from VXGI] and keep only the occlusion part. Obviously, it is much less resource intensive: computing VXAO for a frame can be 2-10x faster than computing full global illumination solution, depending on settings. At the same time, VXAO is 3-4x slower than HBAO+, while its results are much better than HBAO+.
If you have worked with screen-space ambient occlusion algorithms, you know the primary issues that they come with. These are:
- Dark halos or lack of occlusion behind foreground objects;
- Unstable results near screen borders;
- Locality, which means that only a small volume around a surface contributes to AO;
- Blurriness, which comes from a blur filter required because computing a complete solution for every pixel would be too expensive.
VXAO has none of these issues because it is based on a different principle. Instead of relying on screen space data, it gathers information from a world space voxel representation of the scene, which covers a large area around the viewer. It doesn't matter if some object is not visible to the viewer - it can be behind something else, or even behind the viewer - it's still there, and it still contributes to ambient occlusion. VXAO uses voxel cone tracing, so objects that are relatively far from the surface under consideration can still contribute, and taking them into account is not as expensive as it would be in a screen space algorithm.
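To make the cone tracing idea more concrete, here is a minimal CPU sketch in Python. It is not NVIDIA's implementation: the dense grid, the averaged mip levels, the nearest-voxel lookup, the step size and every function name (`build_mips`, `sample`, `trace_cone`) are assumptions chosen for illustration. It only shows the principle that a widening cone reads coarser and coarser voxel levels as it travels, so a distant object still adds occlusion after just a handful of samples.

```python
import math

def build_mips(grid, n):
    """Return [level0, level1, ...]; each coarser voxel averages its 2x2x2 children."""
    levels = [grid]
    while n > 1:
        n //= 2
        prev = levels[-1]
        cur = [[[sum(prev[2 * x + dx][2 * y + dy][2 * z + dz]
                     for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)) / 8.0
                 for z in range(n)] for y in range(n)] for x in range(n)]
        levels.append(cur)
    return levels

def sample(levels, level, p):
    """Nearest-voxel lookup at a mip level; p is given in level-0 voxel units."""
    grid = levels[level]
    size = len(grid)
    scale = 1 << level
    x, y, z = (int(c // scale) for c in p)
    if 0 <= x < size and 0 <= y < size and 0 <= z < size:
        return grid[x][y][z]
    return 0.0  # outside the voxelized region: treated as empty

def trace_cone(levels, origin, direction, half_angle, max_dist):
    """March one cone front to back; return accumulated occlusion in [0, 1]."""
    occlusion = 0.0
    t = 1.0  # start one voxel away to skip the surface's own voxel
    while t < max_dist and occlusion < 0.99:
        diameter = max(1.0, 2.0 * t * math.tan(half_angle))
        # Wider cone footprint -> coarser mip level.
        level = min(len(levels) - 1, int(math.log2(diameter)))
        p = tuple(o + d * t for o, d in zip(origin, direction))
        occlusion += (1.0 - occlusion) * sample(levels, level, p)
        t += diameter * 0.5  # step grows with the cone, keeping the sample count low
    return occlusion

def empty_grid(n):
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]

N = 16
scene = empty_grid(N)
for x in range(N):
    for y in range(N):
        scene[x][y][8] = scene[x][y][9] = 1.0  # a slab well above the surface point

origin, up = (8.0, 8.0, 0.5), (0.0, 0.0, 1.0)
ao_with_slab = trace_cone(build_mips(scene, N), origin, up, math.radians(30), 16.0)
ao_empty = trace_cone(build_mips(empty_grid(N), N), origin, up, math.radians(30), 16.0)
print(ao_with_slab, ao_empty)
```

In this toy scene the slab sits eight voxels away from the surface point, yet the cone still picks it up through the coarse levels, while the empty grid yields no occlusion at all. A real implementation would trace several cones per pixel on the GPU and filter the voxel lookups.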
Let's take a look at this image comparison.
As highlighted by Panteleev, the most obvious differences can be found in the following places:
- Ground under the tank: no occlusion from HBAO+, some occlusion from VXAO.
- Bottom part of the tank tracks: no occlusion from HBAO+, significant occlusion from VXAO.
- Metal stand on the left side: a lot of occlusion from HBAO+, almost none from VXAO.
- Barrel behind the fire hydrant: halo around the hydrant from HBAO+, no halo from VXAO.
According to him, this new Ambient Occlusion technology handles dynamic scenes quite well.
Most of the voxel data can be preserved between frames, unless there are lots of moving objects or the camera moves quickly. And even if no voxel data can be preserved, voxelizing geometry again is not too expensive. Voxelization of a typical, high-detail game scene with a few million triangles can be done in about 3-5 milliseconds on a modern GPU like GeForce GTX 980.
Overall, there are three major passes in the VXAO algorithm: voxelization, voxel post-processing, and cone tracing. Voxelization is performed by rendering the triangle meshes into a 3D texture, and as such, its performance highly depends on the total number of triangles, size of these triangles, and the number of draw calls required to render them. Post-processing combines passes like clearing, filtering and downsampling voxels, and its performance depends on the total number of voxels produced during voxelization. Typical post-processing time is 0.5 – 1.5 ms. And finally, cone tracing is performed in screen space, so its performance depends on the screen resolution and shading rate [...].
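The first two passes can be sketched on the CPU as well. The Python snippet below is only an illustration under stated assumptions: it samples points on each triangle instead of rasterizing into a 3D texture on the GPU, it stores occupied voxels in a set, and the names `voxelize` and `downsample` are hypothetical. Point sampling can miss voxels when triangles are large relative to the sample density, which is one reason a real implementation rasterizes instead.

```python
def voxelize(triangles, voxel_size, samples=8):
    """Mark every voxel hit by sample points on each triangle.

    CPU stand-in for rendering meshes into a 3D texture; the cost grows with
    the number of triangles, mirroring the dependency described in the text.
    """
    occupied = set()
    for a, b, c in triangles:
        for i in range(samples + 1):
            for j in range(samples + 1 - i):
                u, v = i / samples, j / samples
                w = 1.0 - u - v  # barycentric weights sum to 1
                p = tuple(u * a[k] + v * b[k] + w * c[k] for k in range(3))
                occupied.add(tuple(int(q // voxel_size) for q in p))
    return occupied

def downsample(occupied):
    """Post-processing step: next coarser level, storing per-voxel coverage.

    Coverage is the fraction of the 8 child voxels that are occupied; the cost
    grows with the number of voxels produced by voxelization.
    """
    counts = {}
    for x, y, z in occupied:
        key = (x // 2, y // 2, z // 2)
        counts[key] = counts.get(key, 0) + 1
    return {key: n / 8.0 for key, n in counts.items()}

# A flat 4x4 quad made of two triangles, floating at z = 0.5.
quad = [
    ((0.0, 0.0, 0.5), (4.0, 0.0, 0.5), (0.0, 4.0, 0.5)),
    ((4.0, 0.0, 0.5), (4.0, 4.0, 0.5), (0.0, 4.0, 0.5)),
]
fine = voxelize(quad, voxel_size=1.0)
coarse = downsample(fine)
print(len(fine), "fine voxels,", len(coarse), "coarse voxels")
```

Because the quad is a single voxel thick, each coarse voxel it fully covers ends up half filled, which is the kind of partial opacity a cone tracing pass would then read from the coarser levels.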
Developers can rest easy in terms of memory management too. The requirements are much lower than those of VXGI, which can range anywhere from 500MB to 7GB at the highest settings, while VXAO generally needs between 6 and 100MB. Panteleev then proceeded to clarify that both VXGI and VXAO are far from exclusive to Maxwell or even NVIDIA GPUs.
Finally, if you saw the original announcement of VXGI at Maxwell launch, you may think it works only on Maxwell. That's not true. Maxwell does have some useful hardware features, but the only one relevant to VXAO is pass-through geometry shaders, which improve voxelization performance by approximately 30%, and they can be safely replaced with regular geometry shaders. So VXGI in general and VXAO in particular can work on all DX11 class GPUs, including ones made by NVIDIA competitors, but Maxwell GPUs deliver the best performance. It’s not limited to DX11 either: DX12 and OpenGL 4.5 are also supported.
Going back to Rise of the Tomb Raider, the first implementation of this technique in an actual game, Panteleev explained that the particular art and lighting solution employed in this title doesn't allow VXAO to shine equally in all situations.
The game has separate channels for ambient lighting and ambient occlusion, and how exactly they are used is determined by materials. The ambient occlusion channel is often applied on top of direct lighting as well, and because VXAO is not a local effect and tends to add occlusion to large surfaces, some lights become dimmer. So we had to apply VXAO to the ambient light channel instead and keep HBAO+ in the ambient occlusion channel in order to achieve the best look. It became clear that VXAO is mostly a long-range effect, and it’s useful to combine it with some short-range SSAO technique to highlight small features which cannot be adequately represented by voxels. For this reason, VXAO now includes an optional screen-space occlusion pass so that you don’t have to work with a separate SSAO library.
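The channel split Panteleev describes can be illustrated with a toy shading model in Python. This is not the game's actual lighting code: the `shade` function, its parameters and all the numbers are hypothetical, and occlusion values are expressed as visibility (1.0 means unoccluded). The sketch only shows why a long-range term placed in a channel that also scales direct lighting dims the lights, and why moving it to the ambient light channel avoids that.

```python
def shade(direct, ambient, ao_channel, ambient_scale=1.0):
    """Toy model: the AO channel scales both direct and ambient light.

    ambient_scale is an extra factor applied to the ambient light channel only.
    """
    return direct * ao_channel + ambient * ambient_scale * ao_channel

# Hypothetical surface point: little short-range occlusion (HBAO+),
# but strong long-range occlusion (VXAO), e.g. under an overhang.
direct, ambient = 1.0, 0.3
hbao_visibility, vxao_visibility = 0.9, 0.5

# Option A: VXAO placed in the ambient occlusion channel, so it also dims direct light.
vxao_in_ao_channel = shade(direct, ambient, ao_channel=vxao_visibility)

# Option B: VXAO scales the ambient light channel, HBAO+ stays in the AO channel.
split_channels = shade(direct, ambient, ao_channel=hbao_visibility,
                       ambient_scale=vxao_visibility)

print(vxao_in_ao_channel, split_channels)
```

With these made-up numbers, option A cuts the direct light in half, while option B keeps the direct term at 90% and confines the long-range darkening to the ambient contribution.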
Developers using Unreal Engine 4 can enable VXAO easily: set the console variable “r.VXGI.AmbientOcclusionMode” to 1 and enable VXGI Diffuse Tracing in the Post-Process Volume settings. If you're using any other engine, you'll have to integrate VXAO into it yourself, though Panteleev promises that the process is not too complicated.
The distribution package, alongside the samples and an integration tutorial, can be found here.