NVIDIA Improves Path Tracing Performance By 3x With Enhanced ReSTIR Algorithms, Prepped For Next-Gen Gaming

Apr 20, 2026 at 09:30am EDT

NVIDIA has shared a new and improved ReSTIR algorithm, which improves Path Tracing performance by 2-3x, setting the stage for next-gen gaming.

Ray Tracing Is Cool, But Path Tracing Is Cooler & NVIDIA Is Making PT Faster by 3x With Its New ReSTIR Algorithms

PC games are rapidly adopting Path Tracing to deliver next-generation visual fidelity. As with Ray Tracing, NVIDIA was the first to pave the way for Path Tracing on PCs. And just like Ray Tracing in its early days, Path Tracing faces one big challenge: it demands faster hardware. As we have seen with several PT titles, even the mighty RTX 5090 only manages 30-40 FPS and leans heavily on DLSS upscaling and frame generation to deliver a playable framerate.

The same was the case with Ray Tracing, which arrived on PCs first and now runs decently on modern hardware. Even consoles have started implementing RT in big ways, though it is mostly tied to Quality presets, which run at 30 FPS (or 60 FPS in a few rare cases).

With that said, NVIDIA, a pioneer of PC graphics, is now set to take Path Tracing to the next step. In a new research paper titled "ReSTIR PT Enhanced: Algorithmic Advances for Faster and More Robust ReSTIR Path Tracing", the company proposes a new set of ReSTIR (Reservoir-based Spatiotemporal Importance Resampling) algorithms that can deliver a 2-3x boost in performance while eliminating visual anomalies seen with current RT/PT methods.

NVIDIA's solution to Path Tracing is said to be near "Production Ready" and halves the shift mapping cost of spatial reuse. The enhanced ReSTIR PT algorithms also improve performance and quality through optimizations that unify direct and global illumination, while applying existing techniques to reduce color noise and disocclusion noise. The full list of enhancements includes:

Halving shift mapping costs in spatial reuse by reciprocal neighbor selection
New ray footprint thresholds that adapt to the scene and materials
Reducing correlation artifacts by sample duplication maps
Improving quality and cost by unifying ReSTIR for direct and indirect light
Other optimizations that boost performance and improve robustness by reducing color and disocclusion noise
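
To make the first item in the list above more concrete, here is a toy sketch of why reciprocal neighbor selection can halve the number of shift mappings. It rests on an assumption that the article does not spell out: that spatial reuse between a pixel and its neighbor needs a shift in both directions (the pixel's sample into the neighbor's domain and vice versa). The function names and the pairing scheme are hypothetical, and the model ignores spatial locality entirely.

```python
import random

def shifts_independent(num_pixels, rng):
    """Each pixel picks its own random neighbor (toy model, no locality)."""
    shifts = set()
    for q in range(num_pixels):
        r = rng.randrange(num_pixels - 1)
        if r >= q:
            r += 1  # skip q itself
        shifts.add((q, r))  # q's sample shifted into r's domain
        shifts.add((r, q))  # r's sample shifted into q's domain
    return len(shifts)

def shifts_reciprocal(num_pixels, rng):
    """Pixels are matched in pairs, so each pixel is its partner's neighbor."""
    order = list(range(num_pixels))
    rng.shuffle(order)
    shifts = set()
    for a, b in zip(order[0::2], order[1::2]):
        shifts.add((a, b))  # shared by both pixels in the pair
        shifts.add((b, a))
    return len(shifts)

rng = random.Random(0)
n = 1024
independent = shifts_independent(n, rng)
reciprocal = shifts_reciprocal(n, rng)
print(independent, reciprocal)
```

In this toy model the reciprocal scheme needs exactly one shift per pixel, while independent selection needs close to two, which is where the "halving" would come from under the stated assumption.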

The researchers describe the results in the paper as follows:

Table 1 shows performance of our techniques, with each row adding one new feature/optimization on top of a baseline of Lin et al.’s [2022] public source code. We first measure the speedup from our cost-reduction techniques, which provide an average 2.74× speedup across the four tested scenes. These scenes were chosen to reflect a range of geometry and material complexity. Results for individual scenes are provided in the supplemental material.

To provide further insight into the effect of our low-level GPU optimizations, we profiled Opera House using Nsight Graphics. The profiler data indicate that the optimizations in Sections 6.2.1–6.2.3 reduce thread divergence and improve GPU computation efficiency. Specifically:

SM warp occupancy increases from 22.4% → 31.1%

Active threads per warp increase from 15.3 → 19.9

Warp latency decreases from 347k → 241k cycles

All of this occurs without changing sampler behavior. Applying Russian roulette (Section 6.2.4) further improves these metrics to:

34.9% occupancy

20.6 active threads per warp

82k cycles latency
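
For readers unfamiliar with Russian roulette, the sketch below shows the general technique on a toy path: a path is randomly terminated once it carries little energy, and survivors are reweighted so the average result stays the same. This is the textbook form, not the paper's specific criterion, which the excerpt above does not describe. The survival heuristic, the function name, and the constant albedo and emission values are all assumptions for illustration.

```python
import random

def trace_with_roulette(albedo, emission, rng, max_bounces=64):
    """Toy path: every bounce emits `emission` and attenuates by `albedo`."""
    throughput = 1.0
    radiance = 0.0
    for _ in range(max_bounces):
        radiance += throughput * emission
        throughput *= albedo
        # Hypothetical heuristic: survive with probability equal to throughput.
        survive_p = min(1.0, throughput)
        if rng.random() >= survive_p:
            break  # path terminated early, saving further bounces
        throughput /= survive_p  # reweight survivors to keep the mean unchanged
    return radiance

rng = random.Random(1)
samples = [trace_with_roulette(0.5, 1.0, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
print(mean)  # should land close to the analytic value emission / (1 - albedo) = 2
```

Shorter paths mean less work per pixel, which is consistent with the drop in warp latency reported above, although the exact link depends on implementation details not covered here.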

Our method also reduces storage relative to the baseline through two changes: compressing the ReSTIR PT reservoir and unifying the reservoirs for direct and indirect lighting. Because each ReSTIR pass requires two sets of reservoirs to support temporal reuse, these changes reduce per-pixel storage from 2 × (88 + 16) bytes in the baseline implementation (which uses 16-byte reservoirs for ReSTIR DI) to 2 × 64 bytes. With a 1920×1080 render resolution, this lowers memory consumption from 431 MB to 265 MB.
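
The memory figures quoted above can be checked with a few lines of arithmetic. The helper name is hypothetical, and the sketch assumes 1 MB = 10^6 bytes, which is the convention that reproduces the quoted numbers.

```python
def reservoir_memory_mb(width, height, bytes_per_reservoir, sets=2):
    """Total reservoir storage in MB (assumes 1 MB = 10**6 bytes)."""
    return width * height * sets * bytes_per_reservoir / 1e6

# Baseline: 88-byte ReSTIR PT reservoir plus 16-byte ReSTIR DI reservoir.
baseline_mb = reservoir_memory_mb(1920, 1080, 88 + 16)
# Enhanced: one unified, compressed 64-byte reservoir.
enhanced_mb = reservoir_memory_mb(1920, 1080, 64)
saving = 1.0 - enhanced_mb / baseline_mb

print(round(baseline_mb), round(enhanced_mb))  # 431 265
print(f"{saving:.0%}")                         # 38%
```

Because the cost is a fixed number of bytes per pixel, the same roughly 38% saving would carry over to other render resolutions.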

GPU Optimization Results Compared to Lin et al. [2022]

Technique / Stage	SM Warp Occupancy (%)	Active Threads per Warp	Warp Latency (cycles)	Speedup vs. Baseline	Notes
Baseline (Lin et al. [2022])	22.4	15.3	347k	1.0×	Public source code baseline
Low-level GPU optimizations (Sec. 6.2.1–6.2.3)	31.1	19.9	241k	2.74× (avg across 4 scenes)	Reduced thread divergence, improved efficiency
+ Russian roulette (Sec. 6.2.4)	34.9	20.6	82k	—	Further efficiency gains
+ New thresholds (Sec. 4, 5, 6)	—	—	—	—	Scene-independent reconnection criteria, improves shift mapping quality
All improvements (decorrelation, noise reduction)	—	—	—	2.30×	Adds 19% cost vs. fastest version, but still faster than
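
As a quick sanity check on the table, the sketch below relates the two speedup figures to the "19% cost" note and turns the warp latency numbers into relative reductions. It assumes render time is inversely proportional to speedup. Note that the latency figures come from a single scene (Opera House) while the speedups are averages over four scenes, so the two should not be mixed.

```python
fastest_speedup = 2.74  # cost-reduction techniques only
full_speedup = 2.30     # all improvements enabled

# Render time scales as 1 / speedup, so the relative extra cost is:
extra_cost = fastest_speedup / full_speedup - 1.0
print(f"{extra_cost:.1%}")  # 19.1%

# Warp latency in cycles for the Opera House profile.
latency = {"baseline": 347_000, "low_level": 241_000, "roulette": 82_000}
latency_drop = {
    stage: 1.0 - cycles / latency["baseline"]
    for stage, cycles in latency.items()
}
print(f"{latency_drop['low_level']:.1%}")  # 30.5%
print(f"{latency_drop['roulette']:.1%}")   # 76.4%
```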

It's great to see NVIDIA improving Path Tracing performance. The technology has become increasingly relevant since the launch of the RTX 40 and RTX 50 GPUs. Looking ahead, NVIDIA wants to use Neural Rendering techniques and AI algorithms to further tune the performance of its gaming hardware and accelerate next-gen visual capabilities.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
