Wolfenstein II – Deferred Rendering and GPU Culling Performance Impact
Wolfenstein II: The New Colossus has been out for a few days now, and the people playing it are loving it – including our own Chris. There’s been a fair bit of talk about performance, but so far very few sources have put together comprehensive performance reviews, since Vulkan isn’t exactly the easiest of APIs to test. Thanks to a couple of tips, we were able to get started on our performance review after some issues getting OCAT to function properly, but we felt this may be a more useful direction for people.
We’re going to start our testing rounds by first taking a look at a couple of options in the video settings that seem to be confusing a fair number of people: Deferred Rendering and GPU Culling. We’re not going to go very deep into explaining these two settings beyond making sure everyone has a basic understanding of what each one means and what is going on with it.
For those wanting a deeper explanation of Deferred Rendering, I urge you to read this article here. It’s likely you’ve heard of this before under the name Deferred Shading. Basically, the traditional method of Forward Rendering pushes geometry straight through the pipeline and shades each object as it is drawn.
Deferred Rendering, by contrast, withholds the final lighting shaders until all of the geometry has been processed, then applies them to the completed image, as shown in the graphic below from gamedevelopment.tutsplus.com.
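The difference is easiest to see as a toy cost model. The Python sketch below is purely illustrative (the function names and all the numbers are mine, not anything from the game's engine): forward rendering pays a lighting cost for every fragment of every object, including fragments that are later overdrawn, while deferred rendering pays the lighting cost once per visible screen pixel after the geometry has been resolved.

```python
# Toy cost model contrasting forward and deferred shading.
# All names and numbers are illustrative, not from the actual renderer.

def forward_shading_ops(objects, lights, pixels_per_object):
    # Forward: every object is lit by every light as it is rasterized,
    # including fragments that end up overdrawn by closer geometry.
    return objects * lights * pixels_per_object

def deferred_shading_ops(objects, lights, screen_pixels, gbuffer_cost=1):
    # Deferred: a geometry pass writes material data to the G-buffer once,
    # then lighting runs once per screen pixel per light on the final image.
    geometry_pass = objects * gbuffer_cost
    lighting_pass = screen_pixels * lights
    return geometry_pass + lighting_pass

# A scene with many lights: 500 objects, 50 lights, 1080p screen.
forward = forward_shading_ops(500, 50, 10_000)
deferred = deferred_shading_ops(500, 50, 1920 * 1080)
```

With many lights in the scene, the deferred total comes out well under the forward total in this model, which is the usual motivation for deferring the lighting work.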
GPU Culling is simple to understand. Essentially, it means rendering only what is in the player’s field of view. If an object is not in view, or about to come into view, it is simply not sent through the pipeline. Whether the culling is done preemptively or in hardware, the result is the same; the video below shows an exaggeration of what happens during GPU Culling. Of course, you wouldn’t actually see the culling take place, since it happens off screen.
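In code, the idea reduces to a visibility test run before draw calls are submitted. Below is a minimal 2D Python sketch with made-up names; a real engine tests bounding volumes against a 3D view frustum (and in this game's case does the work on the GPU), but the principle is the same: anything outside the view cone, minus a small margin for objects about to appear, is skipped.

```python
import math

def in_view(cam_pos, cam_dir_deg, fov_deg, obj_pos, margin_deg=10.0):
    # Keep an object if the angle between the camera's facing direction
    # and the object is within half the FOV, plus a small margin so
    # objects "about to come into view" are not culled prematurely.
    dx = obj_pos[0] - cam_pos[0]
    dy = obj_pos[1] - cam_pos[1]
    angle = math.degrees(math.atan2(dy, dx))
    delta = abs((angle - cam_dir_deg + 180) % 360 - 180)
    return delta <= fov_deg / 2 + margin_deg

def cull(cam_pos, cam_dir_deg, fov_deg, objects):
    # Only objects that pass the visibility test enter the pipeline.
    return [o for o in objects if in_view(cam_pos, cam_dir_deg, fov_deg, o)]

# Camera at the origin looking along +x with a 90-degree FOV:
visible = cull((0, 0), 0.0, 90.0, [(10, 0), (0, 10), (-10, 0)])
```

Here only the object directly in front of the camera survives; the ones beside and behind it are never drawn.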
Test Setup and Methodology
Testing in Vulkan is not much different from testing in DX11; it just needs a different set of tools. Using a combination of OCAT and a spreadsheet developed by a friend of the site, we are able to extract the average FPS, the 1% lows, and the 0.1% lows so that they can be charted on a graph. Testing was done at the beginning of the New Orleans mission, where you start outside in a fairly lighting- and fog-intensive area and move just inside, a short but effective and predictable testing environment. The settings were an interesting bunch, as they differ slightly based on which card you use. For these tests, we used the GeForce GTX 1060 6GB and the RX 480 8GB and applied an additional power limit to both cards to stabilize the clock rate, so that the only variation between test runs was the settings and not clock fluctuations from GPU Boost or PowerTune.
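For reference, here is one common way to turn a capture of per-frame times into the three numbers we chart. This Python sketch follows the usual definitions (average of the slowest 1% and 0.1% of frames, converted to FPS); the spreadsheet we actually used may compute them slightly differently.

```python
def fps_metrics(frametimes_ms):
    # frametimes_ms: per-frame render times in milliseconds,
    # as captured by a tool like OCAT.
    n = len(frametimes_ms)
    avg_fps = 1000.0 / (sum(frametimes_ms) / n)

    worst_first = sorted(frametimes_ms, reverse=True)

    def low(pct):
        # Average the slowest pct% of frames, then convert to FPS.
        k = max(1, int(n * pct / 100))
        return 1000.0 / (sum(worst_first[:k]) / k)

    return avg_fps, low(1), low(0.1)
```

As an example, a run of 99 frames at 10 ms plus one 20 ms hitch reports an average near 99 FPS but 1% and 0.1% lows of 50 FPS, which is exactly the kind of stutter an average alone hides.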
X370 Test Bench
|CPU||Ryzen 7 1700 @ 3.9GHz|
|Memory||16GB G.Skill Flare X DDR4 3200|
|Motherboard||MSI X370 XPower Gaming Titanium|
|Storage||Adata SU800 128GB / 2TB Seagate SSHD|
|PSU||Cooler Master V1200 Platinum|
Graphics Cards Tested
|Card||Architecture||Shaders||Clock Speed (MHz)||Memory Capacity||Memory Speed|
|NVIDIA GTX 1060 FE 6GB||Pascal||1280||1708||6GB GDDR5||8Gbps|
|XFX RX 480 8GB||Polaris 10||2304||1266||8GB GDDR5||8Gbps|
Testing was done post-patch on 10/31/2017.
Baseline Performance with Deferred Rendering Off
The Ultra preset at 1920×1080 is where we’ll base all of our baseline performance comparisons; refer to this chart to see the delta between settings on each card across the four charts included in this article. Something very important to note here, however, is that this preset has GPU Culling enabled by default on Radeon cards and disabled by default on NVIDIA cards. What we are exploring here, more than a direct comparison of the cards, is the change in performance based on these two settings. In the chart below, Deferred Rendering is disabled for both cards by default in the Ultra preset.
Deferred Rendering Enabled
We can see that by enabling the Deferred Rendering option the RX 480 took a slight hit to performance across the board, while the GTX 1060 saw a benefit on the lower end (the 1% and 0.1% lows) while still gaining some performance on average. The visual impact was not as noticeable as I had expected when enabling this feature.
GPU Culling Disabled
Those worried that we would not see how each card runs with both Deferred Rendering and GPU Culling disabled (since the Ultra preset has GPU Culling enabled on one card) should be put at ease here. This chart is representative of both cards with neither GPU Culling nor Deferred Rendering enabled. While the GTX 1060 is unchanged from the initial results, we see a measurable drop in performance for the RX 480, showing why the developer suggests it be enabled for Radeon.
GPU Culling Enabled
Now to flip the script, as they say: we need to see how performance responds with GPU Culling enabled on both cards. It is here that we see why they recommend GPU Culling be disabled for NVIDIA cards, since there is no added benefit. In fact, it’s a detriment of a couple of FPS; small but still measurable, so there’s no need to bother with this setting on GeForce cards.
Why not do a full rundown of graphics card performance like normal with this game? Easy: after seeing that this game can run on an APU (seriously, it can, albeit at low settings and 720p), the performance of this title is a non-issue and really just a measuring contest. What we wanted to do was explore a few settings that seem to raise the most questions. I could have included Async Compute, but that would have just muddied the water; if you’d like us to dig into that a bit, just let us know. Basically, at the end of the day the developers may have left out some useful information on what these ‘mystery settings’ actually do, but I give them credit for simplifying who needs to do what with them. And it’s nice to see that GeForce owners may be able to squeeze out a little extra performance by enabling Deferred Rendering. Radeon owners may not want to touch any of it and just let the game fly.