AMD RDNA 2 GPUs Have Much Better Memory Latency Versus NVIDIA’s Ampere GPU Architecture

Hassan Mujtaba • Apr 19, 2021 at 06:12am EDT

The memory latency of AMD's RDNA 2 and NVIDIA's Ampere GPU architectures has been tested by Chips and Cheese. The tech outlet measured cache and memory latency on the latest GPU architectures from team red and team green and found some interesting results.

AMD's RDNA 2 GPUs Feature Superior Memory Latency Performance Compared To NVIDIA's Ampere GPU Architecture

On the CPU side, measuring cache and memory latency has become increasingly important with the growing use of multi-chiplet designs, in which the IO sits either on the same die as the cores or, in recent instances, on a separate die (as with AMD's Zen chiplets). GPUs also rely on multi-level cache hierarchies to bridge the gap between compute and memory performance. The source used OpenCL-based pointer-chasing benchmarks to measure cache and memory latency on current-gen GPUs such as the NVIDIA Ampere and AMD RDNA 2 architectures.
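For readers unfamiliar with the technique, the idea behind pointer chasing can be sketched in a few lines. The source's tests are OpenCL kernels that run on the GPU; the Python below is only a CPU-side illustration, and the function names (build_chain, chase, ns_per_access) are hypothetical, not taken from Chips and Cheese. Each load depends on the result of the previous one, so the accesses cannot overlap and the average time per step approximates the latency of whichever cache level the working set fits in.

```python
import random
import time

def build_chain(n, seed=0):
    """Build a random single-cycle chain: chain[i] holds the next index to visit."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    chain = [0] * n
    # Link each slot to the next one in the shuffled order, wrapping at the end,
    # so that following the chain touches every slot once before repeating.
    for k in range(n):
        chain[order[k]] = order[(k + 1) % n]
    return chain

def chase(chain, steps, start=0):
    """Follow the chain for `steps` dependent loads and return the final index."""
    idx = start
    for _ in range(steps):
        idx = chain[idx]  # the next address is unknown until this load completes
    return idx

def ns_per_access(n, steps=200_000):
    """Average time per dependent access, in nanoseconds, for a chain of n slots."""
    chain = build_chain(n)
    t0 = time.perf_counter_ns()
    chase(chain, steps)
    return (time.perf_counter_ns() - t0) / steps

if __name__ == "__main__":
    # Sweep the working-set size; a real benchmark would plot latency against size.
    for n in (1 << 10, 1 << 14, 1 << 18):
        print(f"{n:>8} slots: {ns_per_access(n):.1f} ns per access")
```

In Python the interpreter overhead dominates the absolute numbers, so this only shows the shape of the method. A real test would run the same dependent-load loop in a GPU kernel and sweep the array size across each cache level.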

NVIDIA Ampere GPU vs AMD RDNA 2 GPU cache and latency performance measured. (Image Credits: Chips and Cheese)

In the benchmarks, the AMD Radeon RX 6800 XT (RDNA 2 GPU) and the NVIDIA GeForce RTX 3090 (Ampere GPU) were pitted against each other. The cache and memory benchmark shows that AMD's RDNA 2 architecture fared far better than NVIDIA's Ampere GPU, delivering lower latency at every cache level. Its VRAM latency is about the same as Ampere's, despite having to check two more levels of cache on the way to memory. The Infinity Cache adds only about 20 ns over an L2 hit and still has lower latency than Ampere's L2.

The reason stated is that the NVIDIA Ampere-based GA102 is simply a much larger GPU. While it uses a more conventional GPU memory subsystem with only two cache levels, getting around the large die takes a lot of cycles, and going from the SM-private L1 to L2 takes over 100 ns. On RDNA 2, by contrast, L2 is only about 66 ns away from L0, even with an L1 cache between them. Do note that the AMD Navi 21 GPU is smaller and features a 4 MB L2 cache, while the NVIDIA GA102 GPU features a 6 MB L2 cache for the whole chip. The NVIDIA A100 Ampere GPU for HPC features a massive 40 MB L2 cache.
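To put the quoted figures side by side, here is a rough back-of-the-envelope calculation using only the numbers reported by the source. It assumes the roughly 20 ns Infinity Cache penalty can simply be added to the roughly 66 ns L2 figure, and it treats Ampere's "over 100 ns" as a lower bound, so the result is an approximation rather than a measured value.

```python
# Approximate latencies quoted by the source, in nanoseconds.
RDNA2_L2_NS = 66               # RDNA 2: L0 to L2, with an L1 in between
INFINITY_CACHE_EXTRA_NS = 20   # RDNA 2: added cost of an Infinity Cache hit over an L2 hit
AMPERE_L2_MIN_NS = 100         # Ampere: L1 to L2 is "over 100 ns", so this is a lower bound

# Assumption: the Infinity Cache penalty stacks directly on top of the L2 latency.
rdna2_infinity_cache_ns = RDNA2_L2_NS + INFINITY_CACHE_EXTRA_NS
min_margin_ns = AMPERE_L2_MIN_NS - rdna2_infinity_cache_ns

print(f"RDNA 2 Infinity Cache hit: ~{rdna2_infinity_cache_ns} ns")
print(f"Ampere L2 hit: over {AMPERE_L2_MIN_NS} ns")
print(f"RDNA 2's third cache level leads by at least ~{min_margin_ns} ns")
```

Under these assumptions an Infinity Cache hit lands at roughly 86 ns, which is consistent with the source's statement that it has lower latency than Ampere's L2.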

Following is a note on the performance from Chips and Cheese:

RDNA 2’s cache is fast and there’s a lot of it. Compared to Ampere, latency is low at all levels. Infinity Cache only adds about 20 ns over a L2 hit and has lower latency than Ampere’s L2. Amazingly, RDNA 2’s VRAM latency is about the same as Ampere’s, even though RDNA 2 is checking two more levels of cache on the way to memory.

In contrast, Nvidia sticks with a more conventional GPU memory subsystem with only two levels of cache and high L2 latency. Going from Ampere’s SM-private L1 to L2 takes over 100 ns. RDNA’s L2 is ~66 ns away from L0, even with a L1 cache between them. Getting around GA102’s massive die seems to take a lot of cycles.

This could explain AMD’s excellent performance at lower resolutions. RDNA 2’s low latency L2 and L3 caches may give it an advantage with smaller workloads, where occupancy is too low to hide latency. Nvidia’s Ampere chips in comparison require more parallelism to shine.

via Chips and Cheese

Compared to older Pascal and Maxwell chips, the Ampere architecture delivers much improved latency on far larger GPUs. AMD, on the other hand, has shown some impressive gains over its older GCN and VLIW architecture-based chips. These numbers will make for an interesting comparison once the new round of chiplet-based GPUs hits the gaming segment in the coming years.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
