NVIDIA GeForce RTX 4090 Is The First Gaming Graphics Card To Deliver 100 TFLOPs of Compute Performance

Hassan Mujtaba

NVIDIA's GeForce RTX 4090 is the first gaming graphics card to achieve over 100 TFLOPs of compute performance. You can also read our full review of the card here.

Breaking The 100 TFLOPs Barrier! NVIDIA GeForce RTX 4090 Becomes The Fastest Gaming Graphics Card For Compute & Fastest Gaming Graphics Card, Period!

Breaking the 100 TFLOPs barrier is no easy feat. Before today, NVIDIA's fastest gaming graphics card, the GeForce RTX 3090 Ti, delivered only 40 TFLOPs of compute horsepower. With the launch of the GeForce RTX 4090, we get close to the 100 TFLOPs barrier, but not officially: NVIDIA states that the GeForce RTX 4090 Founders Edition offers 83 TFLOPs at default settings, which leaves the card 17 TFLOPs shy of the 100 TFLOPs mark.

So we decided it was time to test how far we could push the NVIDIA GeForce RTX 4090 Founders Edition with some overclocking. To get to 100 TFLOPs, we first pushed the power limit and temperature limit sliders all the way to the max and raised the core and memory clocks by +275 MHz and +1100 MHz, respectively. This wasn't enough, as the card was being limited by its power design. That is when we got our hands on MSI's latest Afterburner, which allowed us to raise the core voltage. At +100%, we saw some performance regression, so we settled on +55%, which gave us good results.

With the overclock applied, our NVIDIA GeForce RTX 4090 graphics card hit a maximum GPU core clock of 3150 MHz on the AD102 Ada GPU and a maximum power draw of 547W, while temperatures peaked at 69C. All of this was done on air; no exotic liquid cooling, chillers, or LN2 were used.

And behold, we saw the magical number of not 100 but almost 101 TFLOPs right in front of our eyes. To put things into perspective, this is a 22% compute boost over the stock RTX 4090 and a 2.5x compute performance boost over the RTX 3090 Ti. The AD102 GPU also ripped apart the data-center-focused Hopper H100 GPUs by offering over 50% better FP32 performance. Ada Lovelace is truly a game changer, and we can definitely see it becoming a popular compute and AI graphics card when Quadro variants of the chip launch as the RTX 6000 ADA and L60.
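The figures above can be sanity-checked with a little arithmetic. The sketch below assumes the usual convention for theoretical peak FP32 throughput (each CUDA core retires one fused multiply-add, counted as 2 FLOPs, per clock); the function and variable names are illustrative and not taken from any tool used in the test. The 16,384 CUDA core count and 2520 MHz boost clock come from the specifications further down.

```python
def fp32_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput, assuming 2 FLOPs (one FMA) per core per clock."""
    return 2 * cuda_cores * clock_mhz * 1e6 / 1e12


CUDA_CORES_4090 = 16_384

# Theoretical peaks at the rated boost clock and at the maximum observed OC clock.
stock_peak = fp32_tflops(CUDA_CORES_4090, 2520)
oc_peak = fp32_tflops(CUDA_CORES_4090, 3150)

# Figures quoted in the article.
oc_reported = 101.0   # "almost 101 TFLOPs"
stock_rated = 83.0    # NVIDIA's rating for the Founders Edition
rtx_3090_ti = 40.0

gain_over_stock_pct = (oc_reported / stock_rated - 1) * 100
ratio_vs_3090_ti = oc_reported / rtx_3090_ti

# Core clock that would correspond to the reported figure under the same formula.
implied_clock_mhz = oc_reported * 1e12 / (2 * CUDA_CORES_4090) / 1e6

print(f"Stock peak at 2520 MHz: {stock_peak:.1f} TFLOPs")
print(f"OC peak at 3150 MHz:    {oc_peak:.1f} TFLOPs")
print(f"Gain over stock:        {gain_over_stock_pct:.0f}%")
print(f"Ratio vs RTX 3090 Ti:   {ratio_vs_3090_ti:.1f}x")
print(f"Implied clock for 101:  {implied_clock_mhz:.0f} MHz")
```

Under this formula, the 3150 MHz maximum clock would give a slightly higher ceiling than the reported figure, which would be consistent with the reading being taken at a clock somewhat below the observed peak.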

[Chart: FP32 Compute Horsepower Comparisons (Higher is Better). Compute power of the RTX 4090 OC, RTX 4090 Stock, RTX 3090 Ti, RX 6900 XTX, Xbox Series X, and PlayStation 5.]

NVIDIA GeForce RTX 4090 'Official' Specifications - $1599 US Pricing

The NVIDIA GeForce RTX 4090 will use 128 of the 144 SMs on the full AD102 die for a total of 16,384 CUDA cores. The GPU will come packed with 72 MB of L2 cache and a total of 176 ROPs, which is simply insane.
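As a quick check of the SM math, here is a small sketch. It assumes 128 FP32 CUDA cores per Ada SM, which is what the 128 SM and 16,384 core figures above imply; the constant names are illustrative.

```python
CORES_PER_SM = 128  # assumption: FP32 CUDA cores per Ada SM, implied by 16,384 / 128

enabled_sms = 128   # SMs enabled on the RTX 4090
full_die_sms = 144  # SMs on the full AD102 die

enabled_cores = enabled_sms * CORES_PER_SM
full_die_cores = full_die_sms * CORES_PER_SM
enabled_fraction = enabled_sms / full_die_sms

print(f"RTX 4090 CUDA cores: {enabled_cores}")
print(f"Full AD102 cores:    {full_die_cores}")
print(f"Share of die in use: {enabled_fraction:.1%}")
```

In other words, roughly one ninth of the die's SMs are left disabled on this card.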

As for memory specs, the GeForce RTX 4090 will feature 24 GB of GDDR6X memory clocked at 21 Gbps across a 384-bit bus interface. This will provide up to 1 TB/s of bandwidth, the same as the existing RTX 3090 Ti graphics card. As far as power consumption is concerned, the TBP is rated at 450W. The card will be powered by a single 16-pin connector which delivers up to 600W of power. Custom models will offer higher TBP targets.
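The bandwidth figure follows directly from the per-pin data rate and the bus width. A minimal sketch, with a hypothetical helper name, that also cross-checks the other two cards using the memory speeds and bus widths listed in the spec table below:

```python
def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin data rate times bus width, divided by 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# (memory speed in Gbps, bus width in bits), as listed in the article's spec table.
cards = {
    "RTX 4090": (21.0, 384),
    "RTX 4080": (23.0, 256),
    "RTX 4070 Ti": (21.0, 192),
}

bandwidths = {name: memory_bandwidth_gbs(*spec) for name, spec in cards.items()}

for name, gbs in bandwidths.items():
    print(f"{name}: {gbs:.0f} GB/s")
```

For the RTX 4090 this works out to 1008 GB/s, which is the "up to 1 TB/s" quoted above.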

The NVIDIA GeForce RTX 4090 GPU officially hits retail shelves tomorrow when NVIDIA and custom card partners' designs become available to the public. You can check out our review here.

NVIDIA GeForce RTX 40 Series Official Specs:

| Graphics Card Name | NVIDIA GeForce RTX 4090 | NVIDIA GeForce RTX 4080 | NVIDIA GeForce RTX 4070 Ti |
| --- | --- | --- | --- |
| GPU Name | Ada Lovelace AD102-300 | Ada Lovelace AD103-300 | Ada Lovelace AD104-400 |
| Die Size | 608 mm² | 378.6 mm² | 294.5 mm² |
| Transistors | 76 Billion | 45.9 Billion | 35.8 Billion |
| CUDA Cores | 16384 | 9728 | 7680 |
| TMUs / ROPs | 512 / 176 | 320 / 112 | 240 / 80 |
| Tensor / RT Cores | 512 / 128 | 304 / 76 | 240 / 60 |
| Base Clock | 2230 MHz | 2210 MHz | 2310 MHz |
| Boost Clock | 2520 MHz | 2510 MHz | 2610 MHz |
| FP32 Compute | 83 TFLOPs | 49 TFLOPs | 40 TFLOPs |
| Tensor-TOPs | 1321 TOPs | 780 TOPs | 641 TOPs |
| Memory Capacity | 24 GB GDDR6X | 16 GB GDDR6X | 12 GB GDDR6X |
| Memory Bus | 384-bit | 256-bit | 192-bit |
| Memory Speed | 21.0 Gbps | 23.0 Gbps | 21.0 Gbps |
| Bandwidth | 1008 GB/s | 736 GB/s | 504 GB/s |
| Price (MSRP / FE) | $1599 US | $1199 US | TBD |
| Launch (Availability) | 12th October 2022 | 16th November 2022 | 5th January 2023 |