Forspoken Devs Demonstrate First DirectStorage Implementation and Several AMD Features

Alessio Palumbo

Yesterday, Luminous Productions (Final Fantasy XV) showcased the cutting-edge technologies they're implementing for their next game, Forspoken, starting with Microsoft's DirectStorage API (available now as a public SDK release).

During a GDC 2022 presentation titled Breaking Down the World of Athia: The Technologies of Forspoken, Teppei Ono, Technical Director of the Luminous Engine project, discussed the world-first implementation of DirectStorage in a PC game.


The stated goal for Forspoken is to reach the astoundingly low loading time of one second on NVMe M.2 SSDs capable of over 5,000 MB/s. While NVMe M.2 SSDs can already improve loading times in existing PC games, DirectStorage can truly leverage their hardware potential, as evidenced by a slide from the presentation.

With DirectStorage, multiple queues (such as loading and decompression) can be created and executed in parallel, and multiple read requests can be synchronized at once. The API is also optimized for asynchronous streaming data transfers of file chunks from NVMe M.2 SSDs with low GPU overhead.
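The idea of separate loading and decompression queues running in parallel can be sketched in a few lines of Python. This is only an analogy and not the DirectStorage API: the thread pools, the zlib codec, and the names read_chunk, decompress_chunk, and load_scene are all assumptions made for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def read_chunk(chunk_id: int) -> bytes:
    """Hypothetical stand-in for an asynchronous file read.

    Returns a zlib-compressed 4 KiB chunk instead of touching a real disk.
    """
    payload = bytes([chunk_id % 256]) * 4096
    return zlib.compress(payload)


def decompress_chunk(blob: bytes) -> bytes:
    """Decompress one chunk (zlib is an assumption, not the game's codec)."""
    return zlib.decompress(blob)


def load_scene(chunk_ids, io_workers=4, decompress_workers=4):
    """Submit all reads at once, then decompress chunks as they arrive."""
    with ThreadPoolExecutor(io_workers) as io_queue, \
            ThreadPoolExecutor(decompress_workers) as decompress_queue:
        # Queue 1: every read request is issued up front.
        reads = [io_queue.submit(read_chunk, cid) for cid in chunk_ids]
        # Queue 2: each decompression task waits only for its own read,
        # so decompression overlaps with the reads still in flight.
        decompressed = [
            decompress_queue.submit(lambda f=f: decompress_chunk(f.result()))
            for f in reads
        ]
        # Collect the results in the original request order.
        return [f.result() for f in decompressed]


if __name__ == "__main__":
    chunks = load_scene(range(8))
    print(len(chunks), "chunks loaded,", len(chunks[0]), "bytes each")
```

Keeping the two stages in separate pools means a slow decompression step never blocks new read requests from being issued, which is the property the paragraph above describes.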

According to the figures shown in the presentation, DirectStorage unlocks the file I/O speed of an NVMe M.2 SSD, which is nearly doubled compared to the Win32 API. By comparison, the file I/O speed of a SATA SSD is only marginally improved. However, in the actual loading time of a game scene from Forspoken, the SATA SSD demonstrates a larger improvement (0.8 seconds vs 0.2 seconds) than the NVMe M.2 SSD.

The reason is that file I/O speed is no longer a bottleneck for loading times with DirectStorage. Analyzing the Forspoken data, Luminous Productions noticed that the new bottlenecks are decompression and asset initialization.
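A toy model helps show why faster file I/O stops paying off once other stages dominate. The sketch below assumes the three stages run one after another, which is a simplification, and every number in it is hypothetical: none of the figures come from the presentation, and load_time_s is an illustrative helper.

```python
def load_time_s(size_mb: float, io_mb_per_s: float,
                decompress_s: float, init_s: float) -> float:
    """Total load time under a simple serial model (an assumption).

    Only the first term depends on drive speed; decompression and
    asset initialization are fixed costs in this model.
    """
    return size_mb / io_mb_per_s + decompress_s + init_s


if __name__ == "__main__":
    # Hypothetical scene: 1000 MB of data, 0.5 s of decompression,
    # 0.3 s of asset initialization. These values are made up.
    for io_speed in (500, 2500, 5000):
        total = load_time_s(1000, io_speed, decompress_s=0.5, init_s=0.3)
        print(f"{io_speed:>5} MB/s -> {total:.2f} s")
```

In this model, doubling the drive speed from 2500 MB/s to 5000 MB/s shaves off only a fraction of a second, because the fixed decompression and initialization costs are untouched.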

Ono-san said that both need to be optimized more than ever to further improve loading times. Additionally, the future introduction of GPU decompression (such as the promised NVIDIA RTX IO) will be pivotal to reducing CPU processing and improving efficiency. Still, even with these bottlenecks, the goal to attain a one-second loading time in Forspoken has been achieved in some scenes thanks to DirectStorage.

The other half of the presentation was handled by Aurelien Serandour (Senior Developer Technology Engineer at AMD), who revealed that the collaboration between Luminous Productions and AMD began in July 2021. The goal is to ensure the correctness of the implementation of the many AMD features available in Forspoken.

The game will include:

  • AMD FidelityFX Downsampler
  • AMD FidelityFX Ambient Occlusion
  • AMD FidelityFX Denoiser
  • AMD FidelityFX Screen Space Reflections
  • AMD FidelityFX Variable Shading
  • AMD Hybrid Shadows
  • AMD FidelityFX Super Resolution

The SPD or Single Pass Downsampler offers a good performance improvement over multiple dispatches or draw calls, according to AMD. It is extensively used in the Luminous Engine to downsample depth buffer for screen space reflections, color buffer, water refraction, and so on.
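To get a feel for how much work a single pass replaces, the sketch below counts the mip levels of a 4K buffer, since a conventional approach would issue roughly one dispatch or draw call per downsampled level. It assumes the usual convention of halving each dimension and rounding down, with a minimum of one pixel; mip_chain is an illustrative helper, not part of AMD's library.

```python
def mip_chain(width: int, height: int) -> list:
    """List every mip level size, from full resolution down to 1x1.

    Each level halves both dimensions, rounding down and clamping to 1.
    """
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


if __name__ == "__main__":
    levels = mip_chain(3840, 2160)
    # Level 0 is the source image; every later level needs a downsample.
    print(f"{len(levels) - 1} downsampled levels below 3840x2160")
    for i, (w, h) in enumerate(levels):
        print(f"mip {i}: {w}x{h}")
```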

Forspoken supports AMD's CACAO (Combined Adaptive Compute Ambient Occlusion), which Luminous Productions chose because of its sharpness. It can be used in conjunction with RTAO (ray traced ambient occlusion) to further improve the quality of the ambient occlusion effect. The implementation of RTAO takes 2.3 ms of frame time to render at 4K resolution with a Radeon RX 6900 XT graphics card.

Forspoken also supports stochastic screen space reflections (SSSR). The Luminous Engine already featured support for regular SSR, but AMD's version resolved several existing issues thanks to its occluder rejection.

Variable Rate Shading is available, too, to leverage the DirectX 12 hardware feature that helps reduce the load on the pixel shader.

Forspoken supports hybrid ray traced shadows, which are only ray traced where it matters most (in penumbra regions). This process takes 3.3 milliseconds to render at 4K resolution on a Radeon RX 6900 XT graphics card, though that is before optimization.

Last but not least, Serandour talked about AMD FidelityFX Super Resolution 1.0. Forspoken will support FSR 2.0, but the implementation is still a work in progress, even though it should take less than a week overall. Meanwhile, FSR 1.0 already provides 21% faster performance in Ultra Quality mode (1.3x scale) and 26% faster performance in Quality mode (1.5x scale).
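The scale factors quoted above apply per dimension, so the internal render resolution for a given output can be worked out directly. The sketch below is a small illustration; fsr_render_resolution is a hypothetical helper, and rounding to the nearest pixel is an assumption.

```python
def fsr_render_resolution(out_w: int, out_h: int, scale: float) -> tuple:
    """Internal render size for a per-dimension upscaling factor.

    Rounding to the nearest whole pixel is an assumption of this sketch.
    """
    return round(out_w / scale), round(out_h / scale)


if __name__ == "__main__":
    modes = (("Ultra Quality", 1.3), ("Quality", 1.5))
    for name, scale in modes:
        w, h = fsr_render_resolution(3840, 2160, scale)
        # Pixel count shrinks by the square of the per-dimension scale.
        share = (w * h) / (3840 * 2160)
        print(f"{name}: {w}x{h} ({share:.0%} of 4K pixels)")
```

Because the factor applies to both width and height, a 1.5x scale means the GPU shades well under half of the output pixels before upscaling.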

All of those AMD FidelityFX features are available on both PC and PlayStation 5, by the way. It is unclear whether that includes FSR 2.0, but we'll inquire with AMD to find out.

Meanwhile, Luminous Productions published a short video to recap and showcase some of the technologies discussed above. As a reminder, Forspoken has been recently delayed to October 11th.
