NVIDIA GeForce RTX 4090 Ti / GeForce RTX 4090 (2022)
Expected Price: $1999 - $1499 US
Expected Release Date: 2022
NVIDIA GeForce RTX 4090 Ti & RTX 4090 graphics cards are going to be the next-gen flagships for the green team, ushering in performance levels never before seen in the PC gaming segment, and here's everything you need to know about their specs, price, and performance.
NVIDIA GeForce RTX 4090 Ti & RTX 4090 - The Next-Generation BFGPUs For The Ultimate Gamer
The NVIDIA GeForce RTX 3090 series proved that the green team can go to extreme lengths to secure their lead in the PC graphics segment. Labeled as 'BFGPU', a new breed of enthusiast & ultimate graphics card, these provide the best performance possible with the best possible PC gaming features in a package that's second to none.
NVIDIA's direction with the BFGPU was to design a graphics card not just for the ultimate gamer but also for professional content creators who want the best graphics performance at hand, and to power the next generation of AAA gaming titles with superb visuals and insane fluidity. It's not just the FPS that matters these days; it's visuals and a smoother frame rate too, and this is exactly what the GeForce RTX 30 series is made to excel at.
We should expect similar things with the next-generation flagship too but an important factor to consider is that GPUs are becoming more power-hungry and more pricey. It is a trend that might continue into the future as we get better products but in return, there's always a cost to pay for end consumers. So starting with what we know so far, first we should take a look at the brand new Ada Lovelace or AD10* class GPUs that will be powering the next-gen GeForce RTX 40 series cards.
NVIDIA's AD102 'Ada Lovelace' GPU - The Next-Gen Powerhouse
Starting with the GPU configuration, Kopite7kimi compares the top AD102 GPU to various other GPUs from the green team. These include the gaming-focused Ampere GA102 and Turing TU102, along with the HPC-focused Hopper GH100 and Ampere GA100. I'll only compare the AD102 to its gaming predecessors since the HPC-focused designs are vastly different from consumer-centric offerings. The GPU is said to measure around 600 mm² and will utilize the TSMC 4N process node, which is an optimized version of TSMC's 5nm (N5) node designed for the green team.
The NVIDIA Ada Lovelace AD102 GPU is expected to feature up to 12 GPCs (Graphics Processing Clusters). This is a 1.7x increase over the Ampere GA102 GPU, which features 7 GPCs. Each GPC will consist of 6 TPCs with 2 SMs each, which is the same configuration as the existing chip. Each SM (Streaming Multiprocessor) will house four sub-cores, which is also the same as the GA102 GPU. What's changed is the FP32 & INT32 core configuration. Each SM will include 128 FP32 units, but the combined FP32+INT32 count will go up to 192. This is because the FP32 units are not shared with the INT32 units: the 128 FP32 cores are separate from the 64 INT32 cores.
So in total, each sub-core will consist of 32 FP32 plus 16 INT32 units for a total of 48 units. Each SM will have a total of 128 FP32 units plus 64 INT32 units for a total of 192 units. And since there are a total of 144 SM units (12 per GPC), we are looking at 18,432 FP32 units and 9,216 INT32 units for a total of 27,648 cores. Each SM will also include two warp schedulers (32 threads/clk) for 64 warps per SM. This is a 50% increase in cores (FP32+INT32) and a 33% increase in warps/threads vs the GA102 GPU.
NVIDIA AD102 'Ada Lovelace' Gaming GPU 'SM' Block Diagram (Image Credits: Kopite7kimi):
|Unit||AD102||vs GA102||vs TU102||vs GA100||vs GH100|
|GPC||12 (Per GPU)||1.7x||2x||1.5x||1.5x|
|TPC||6 (Per GPC)||Same||Same||0.75x||0.67x|
|SM||2 (Per TPC)||Same||Same||Same||Same|
|Sub-Core||4 (Per SM)||Same||Same||Same||Same|
|FP32||128 (Per SM)||Same||2x||2x||Same|
|FP32+INT32||192 (Per SM)||1.5x||1.5x||1.5x||Same|
|Warps||64 (Per SM)||1.33x||2x||Same||Same|
|Threads||2048 (Per SM)||1.33x||2x||Same||Same|
|L1 Cache||192 KB (Per SM)||1.5x||2x||Same||0.75x|
|L2 Cache||96 MB (Per GPU)||16x||16x||2.4x||1.6x|
|ROPs||32 (Per GPC)||2x||2x||2x||2x|
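The per-unit figures in the table can be multiplied out to whole-GPU totals. The snippet below is only a rough sketch of that arithmetic; the constant and function names are our own, and every input is a rumored value copied from the table rather than a confirmed spec.

```python
# Per-unit figures copied from the leaked AD102 table (rumored, not confirmed).
GPCS_PER_GPU = 12
TPCS_PER_GPC = 6
SMS_PER_TPC = 2
FP32_PER_SM = 128
ROPS_PER_GPC = 32


def ad102_totals() -> dict:
    """Multiply the GPC > TPC > SM hierarchy out to whole-GPU totals."""
    sms = GPCS_PER_GPU * TPCS_PER_GPC * SMS_PER_TPC
    return {
        "sm": sms,                            # streaming multiprocessors
        "fp32": sms * FP32_PER_SM,            # FP32 (CUDA) cores
        "rops": GPCS_PER_GPU * ROPS_PER_GPC,  # render output units
    }


if __name__ == "__main__":
    for name, value in ad102_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions, the full chip works out to the 144 SMs, 18,432 FP32 cores, and 384 ROPs quoted for the top configuration.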
Moving over to the cache, this is another segment where NVIDIA has given a big boost over the existing Ampere GPUs. The Ada Lovelace GPUs will pack 192 KB of L1 cache per SM, an increase of 50% over Ampere. With 144 SMs, that's a total of 27 MB of L1 cache on the top AD102 GPU. The L2 cache will be increased to 96 MB as mentioned in the leaks. This is a 16x increase over the Ampere GPU that hosts just 6 MB of L2 cache. The cache will be shared across the GPU.
Finally, we have the ROPs which are also increased to 32 per GPC, an increase of 2x over Ampere. You are looking at up to 384 ROPs on the next-gen flagship versus just 112 on the fastest Ampere GPU, the RTX 3090 Ti. There are also going to be the latest 4th Generation Tensor and 3rd Generation RT (Raytracing) cores infused on the Ada Lovelace GPUs which will help boost DLSS & Raytracing performance to the next level. Overall, the Ada Lovelace AD102 GPU will offer:
- 1.7x GPCs (Versus Ampere)
- 50% More Cores (Versus Ampere)
- 50% More L1 Cache (Versus Ampere)
- 16x More L2 Cache (Versus Ampere)
- Double The ROPs (Versus Ampere)
- 4th Gen Tensor & 3rd Gen RT Cores
NVIDIA AD102 'Ada Lovelace' Gaming GPU Block Diagram Mock-Up (Image Credits: SemiAnalysis):
Do note that clock speeds, which are said to be in the 2-3 GHz range, aren't factored in here, so they will also play a major role in improving the per-core performance versus Ampere.
NVIDIA GeForce RTX 40 Series Graphics Card Lineup (Rumored):
|Graphics Card||GPU||PCB Variant||SM Units / Cores||Memory / Bus||Memory Clock / Bandwidth||TGP||Power Connectors||Launch|
|NVIDIA Titan A?||AD102-400?||TBD||144 / 18432?||48 GB / 384-bit||24 Gbps / 1.15 TB/s||~900W||2x 16-pin||TBD|
|NVIDIA GeForce RTX 4090 Ti||AD102-350?||TBD||144 / 18432?||24 GB / 384-bit||24 Gbps / 1.15 TB/s||~600W||1x 16-pin||TBD|
|NVIDIA GeForce RTX 4090||AD102-300?||PG137/139 SKU330||128 / 16384?||24 GB / 384-bit||21 Gbps / 1.00 TB/s||~450W||1x 16-pin||Q4 2022|
|NVIDIA GeForce RTX 4080||AD103-300?||PG13*/139 SKU360||80 / 10240?||16 GB / 256-bit||18 Gbps / 576 GB/s||~420W||1x 16-pin||Q4 2022|
|NVIDIA GeForce RTX 4070||AD104-275?||PG141-310 SKU341||56 / 7168?||10 GB / 160-bit||18 Gbps / 360 GB/s||~300W||1x 16-pin||Q4 2022|
|NVIDIA GeForce RTX 4060||AD106-***?||TBD||>36 / 4608?||8 GB / 128-bit||TBD||~200W||1x 16-pin||Q1 2023|
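The bandwidth column in the lineup table follows from the memory speed and bus width: per-pin data rate times bus width, divided by 8 bits per byte. Here is a minimal sketch of that calculation; the function name and dictionary are ours, and the inputs are the rumored figures from the table.

```python
def bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) x bus width (bits) / 8."""
    return data_rate_gbps * bus_width_bits / 8


# (per-pin data rate in Gbps, bus width in bits), copied from the rumored table.
RUMORED_MEMORY = {
    "RTX 4090 Ti": (24, 384),
    "RTX 4090": (21, 384),
    "RTX 4080": (18, 256),
    "RTX 4070": (18, 160),
}

if __name__ == "__main__":
    for card, (rate, width) in RUMORED_MEMORY.items():
        print(f"{card}: {bandwidth_gb_s(rate, width):.0f} GB/s")
```

The RTX 4060 is left out because its memory clock is still listed as TBD.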
NVIDIA GeForce RTX 4090 Ti & RTX 4090 Graphics Cards Specifications
The NVIDIA GeForce RTX 4090 Ti & RTX 4090 are expected to be the only two cards powered by the top AD102 GPU, which has been detailed above. As we saw with the RTX 3090 Ti and RTX 3090, both will feature different SKUs of the same chip.
NVIDIA GeForce RTX 4090 Ti 'Expected' Specifications
The NVIDIA GeForce RTX 4090 Ti is going to be the full-fat configuration with all 144 SMs enabled for a total of 18,432 CUDA cores. The GPU will come packed with 96 MB of L2 cache and a total of 384 ROPs, which is simply insane. The clock speeds are not confirmed yet, but considering that the TSMC 4N process is being used, we are expecting clocks in the 2.0-3.0 GHz range.
As for memory specs, the GeForce RTX 4090 Ti is expected to rock 24 GB of GDDR6X memory that might run at faster 24 Gbps speeds across a 384-bit bus interface. This will provide up to 1.152 TB/s of bandwidth. All these boosted specifications will result in higher power draw too, and the flagship is expected to operate at a TBP of around 600W. A single 16-pin Gen 5 connector should be enough for 600W, but most custom variants will likely end up utilizing dual Gen 5 connectors, since AIBs don't necessarily stay within spec and even the slightest factory overclock will push the TBP above 600W, which is the limit of a single Gen 5 power connector.
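The connector reasoning above boils down to a ceiling division against the 600W per-connector limit. The sketch below makes one simplifying assumption that mirrors the text: all board power is drawn through the 16-pin connectors, with PCIe slot power ignored. The function name and the 650W overclocked example are hypothetical.

```python
import math

# Limit of a single 16-pin Gen 5 connector, as quoted in the text.
CONNECTOR_LIMIT_W = 600


def connectors_needed(board_power_w: float) -> int:
    """Smallest number of 16-pin connectors that covers the given board power.

    Assumes every watt comes through the connectors (slot power is ignored).
    """
    return math.ceil(board_power_w / CONNECTOR_LIMIT_W)


if __name__ == "__main__":
    # 600 W: rumored RTX 4090 Ti; 650 W: hypothetical factory-overclocked card.
    for watts in (450, 600, 650, 900):
        print(f"{watts} W -> {connectors_needed(watts)} connector(s)")
```

A card sitting exactly at 600W still fits on one connector, while anything above it tips over to two, which is why custom models with factory overclocks are the likely candidates for dual connectors.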
We have also seen an alleged NVIDIA GeForce RTX 4090 Ti heatsink and cooler shroud which hints at the use of a beefier cold plate that provides coverage for both the GPU and memory dies along with an overall larger structure. The leaked cooler is a Founders Edition design and judging by how big it looks, the AIB models will end up being vastly bigger and we may even end up with quad-slot designs from all partners.
NVIDIA GeForce RTX 4090 'Expected' Specifications
The NVIDIA GeForce RTX 4090 will use 128 of the 144 SMs for a total of 16,384 CUDA cores. The GPU will come packed with 96 MB of L2 cache and a total of 384 ROPs, which is simply insane. The clock speeds are not confirmed yet, but considering that the TSMC 4N process is being used, we are expecting clocks in the 2.0-3.0 GHz range.
As for memory specs, the GeForce RTX 4090 is expected to rock 24 GB GDDR6X capacities that will be clocked at 21 Gbps speeds across a 384-bit bus interface. This will provide up to 1 TB/s of bandwidth. This is the same bandwidth as the existing RTX 3090 Ti graphics card and as far as the power consumption is concerned, the TBP is said to be rated at 450W which means that TGP may end up lower than that. The card will be powered by a single 16-pin connector which delivers up to 600W of power. It is likely that we may get 500W+ custom designs as we saw with the RTX 3090 Ti.
As for their feature set, the NVIDIA GeForce RTX 4090 Ti and RTX 4090 graphics cards will rock all the modern NVIDIA features such as the latest 4th Gen Tensor Cores, 3rd Gen RT cores, the latest NVENC encoder and NVDEC decoder, and support for the latest APIs. They will pack all the modern RTX features such as DLSS, Reflex, Broadcast, Resizable-BAR, Freestyle, Ansel, Highlights, Shadowplay, and G-SYNC support too.
NVIDIA GeForce RTX 4090 Ti & RTX 4090 'Preliminary' Specs:
|Graphics Card Name||NVIDIA GeForce RTX 4090 Ti||NVIDIA GeForce RTX 4090||NVIDIA GeForce RTX 3090 Ti||NVIDIA GeForce RTX 3090|
|GPU Name||Ada Lovelace AD102-350?||Ada Lovelace AD102-300?||Ampere GA102-350||Ampere GA102-300|
|Process Node||TSMC 4N||TSMC 4N||Samsung 8nm||Samsung 8nm|
|Transistors||TBD||TBD||28 Billion||28 Billion|
|TMUs / ROPs||TBD / 384||TBD / 384||336 / 112||328 / 112|
|Tensor / RT Cores||TBD / TBD||TBD / TBD||336 / 84||328 / 82|
|Base Clock||TBD||TBD||1560 MHz||1400 MHz|
|Boost Clock||~2800 MHz||~2600 MHz||1860 MHz||1700 MHz|
|FP32 Compute||~103 TFLOPs||~90 TFLOPs||40 TFLOPs||36 TFLOPs|
|RT TFLOPs||TBD||TBD||74 TFLOPs||69 TFLOPs|
|Tensor-TOPs||TBD||TBD||320 TOPs||285 TOPs|
|Memory Capacity||24 GB GDDR6X||24 GB GDDR6X||24 GB GDDR6X||24 GB GDDR6X|
|Memory Speed||24.0 Gbps||21.0 Gbps||21.0 Gbps||19.5 Gbps|
|Bandwidth||1152 GB/s||1008 GB/s||1008 GB/s||936 GB/s|
|Price (MSRP / FE)||$1999 US?||$1499 US?||$1999 US||$1499 US|
|Launch (Availability)||July 2022?||July 2022?||29th March 2022||24th September 2020|
NVIDIA GeForce RTX 4090 Ti & RTX 4090 Graphics Cards Performance
As for the performance of these monster GPUs, we can only use theoretical numbers here since the launch is still some way off, but based on what we know, the RTX 4090 series cards might be the first gaming cards to hit the 100 TFLOPs compute mark.
Just for comparison's sake:
- NVIDIA GeForce RTX 4090 Ti: ~103 TFLOPs (FP32) (Assuming 2.8 GHz clock)
- NVIDIA GeForce RTX 4090: ~90 TFLOPs (FP32) (Assuming 2.8 GHz clock)
- NVIDIA GeForce RTX 3090 Ti: 40 TFLOPs (FP32) (1.86 GHz Boost clock)
- NVIDIA GeForce RTX 3090: 36 TFLOPs (FP32) (1.69 GHz Boost clock)
Based on a theoretical clock speed of 2.8 GHz, you get up to 103 TFLOPs of compute performance, and the rumors are suggesting even higher boost clocks. These definitely sound like peak clocks, similar to AMD's peak frequencies which are higher than the average 'Game' clock. A 100+ TFLOPs compute figure means more than double the horsepower of the 3090 Ti flagship. One should keep in mind that compute performance doesn't necessarily indicate overall gaming performance, but it will still be a huge upgrade for gaming PCs and an 8.5x increase over the current fastest console, the Xbox Series X.
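For those who want to check the math, theoretical FP32 throughput is usually estimated as cores x 2 operations per clock (one fused multiply-add) x clock speed. The sketch below applies that to the figures above. The function and variable names are ours, the Ampere core counts are derived from the SM counts (84 and 82) at 128 FP32 cores per SM, and the 12.15 TFLOPs Xbox Series X figure is the commonly quoted number rather than something taken from the leaks.

```python
def fp32_tflops(fp32_cores: int, clock_ghz: float) -> float:
    """Theoretical FP32 TFLOPs: cores x 2 ops per clock (one FMA) x GHz / 1000."""
    return fp32_cores * 2 * clock_ghz / 1000


# (FP32 cores, clock in GHz). RTX 40 entries use the rumored 2.8 GHz assumption.
CARDS = {
    "RTX 4090 Ti": (18432, 2.80),
    "RTX 4090": (16384, 2.80),
    "RTX 3090 Ti": (84 * 128, 1.86),  # 84 SMs x 128 FP32 cores
    "RTX 3090": (82 * 128, 1.69),     # 82 SMs x 128 FP32 cores
}

# Commonly quoted console figure; an outside assumption, not from the leaks.
XBOX_SERIES_X_TFLOPS = 12.15

if __name__ == "__main__":
    for card, (cores, clock) in CARDS.items():
        print(f"{card}: {fp32_tflops(cores, clock):.1f} TFLOPs")
    flagship = fp32_tflops(*CARDS["RTX 4090 Ti"])
    print(f"vs RTX 3090 Ti: {flagship / fp32_tflops(*CARDS['RTX 3090 Ti']):.2f}x")
    print(f"vs Xbox Series X: {flagship / XBOX_SERIES_X_TFLOPS:.2f}x")
```

The results land within roughly two TFLOPs of the rounded figures in the list, and they shift linearly with whatever clock speed the final cards actually ship with.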
This will be a 2x compute performance uplift for each graphics card versus its predecessor, and this is without even factoring in the RT and Tensor core performance, which are expected to get major lifts too in their respective departments. FLOPs aren't necessarily reflective of graphics or gaming performance, but they do provide a metric that can be used for comparison. A 2-2.5x gain over the RTX 3090 & RTX 3090 Ti would be very disruptive, and it explains why NVIDIA is going so hard with higher power limits on their cards.
Gamers should expect 4K gaming to be buttery smooth on these graphics cards and with DLSS, we might even see playable 60 FPS at 8K resolution which is something that NVIDIA has been trying to achieve with its RTX 3090 series BFGPUs for a while now.
NVIDIA GeForce RTX 4090 Ti & RTX 4090 Graphics Cards Price & Availability
Now coming to the prices, the NVIDIA GeForce RTX 3090 Ti & RTX 3090 graphics cards are without a doubt the most expensive single-chip GPUs to date. Starting at $1999 & $1499 US, respectively, the pricing has been catered towards the ultra-enthusiast and Pro segment. Once again, we are going to see some really high prices, and a little price bump here and there can also be expected. For the green team, it would be wise to keep the prices set where they are right now, and as ridiculous as they might be, there are people willing to pay huge sums of cash if the performance is there.
The difference between the RTX 4090 Ti and RTX 4090 is way bigger this time around too. It looks like NVIDIA didn't get as positive a response to the RTX 3090 Ti as it had hoped, so instead of removing the card entirely from its lineup, it could have decided to cut down the specs of the non-Ti variant more. Since the spec downgrade will lead to a bigger performance gap between the RTX 4090 & RTX 4090 Ti than between the RTX 3090 & RTX 3090 Ti, it might just be worth getting the higher-end variant for a 15-20% performance gain instead of the single-digit gain you are getting right now.
The NVIDIA GeForce RTX 40 series graphics cards are rumored for a mid-July launch, and while we have seen cooler shrouds of the RTX 4090 Ti leak out earlier, NVIDIA could still release the non-Ti variant first with the RTX 4090 Ti variant hitting the market much later. But this wouldn't be the first time that NVIDIA releases a high-end SKU at the very start of its next generation. The RTX 2080 Ti flagship was launched with the rest of the lineup even though its predecessor, the GTX 1080 Ti, appeared months after the launch of the initial lineup. The RTX 3090 launched with the initial line of RTX 30 series cards, but the 3090 Ti came more than a year later. This time, NVIDIA could launch the entire family from the start and go for a mid-cycle refresh later on, but that remains to be seen.
NVIDIA GeForce GPU Segment/Tier Prices
|Titan Tier||Titan X (Maxwell)||Titan X (Pascal)||Titan Xp (Pascal)||Titan V (Volta)||Titan RTX (Turing)||GeForce RTX 3090||GeForce RTX 3090 Ti, GeForce RTX 3090|
|Price||$999 US||$1199 US||$1199 US||$2999 US||$2499 US||$1499 US||$1999 US|
|Ultra Enthusiast Tier||GeForce GTX 980 Ti||GeForce GTX 980 Ti||GeForce GTX 1080 Ti||GeForce RTX 2080 Ti||GeForce RTX 2080 Ti||GeForce RTX 3080 Ti||GeForce RTX 3080 Ti|
|Price||$649 US||$649 US||$699 US||$999 US||$999 US||$1199 US||$1199 US|
|Enthusiast Tier||GeForce GTX 980||GeForce GTX 1080||GeForce GTX 1080||GeForce RTX 2080||GeForce RTX 2080 SUPER||GeForce RTX 3080 10 GB||GeForce RTX 3080 12 GB|
|Price||$549 US||$549 US||$549 US||$699 US||$699 US||$699 US||$999 US|
|High-End Tier||GeForce GTX 970||GeForce GTX 1070||GeForce GTX 1070||GeForce RTX 2070||GeForce RTX 2070 SUPER||GeForce RTX 3070 Ti, GeForce RTX 3070||GeForce RTX 3070 Ti 16 GB|
|Price||$329 US||$379 US||$379 US||$499 US||$499 US||$599|
|Mainstream Tier||GeForce GTX 960||GeForce GTX 1060||GeForce GTX 1060||GeForce GTX 1060||GeForce RTX 2060 SUPER, GeForce RTX 2060, GeForce GTX 1660 Ti, GeForce GTX 1660 SUPER, GeForce GTX 1660||GeForce RTX 3060 Ti, GeForce RTX 3060 12 GB||GeForce RTX 3060 Ti, GeForce RTX 3060 12 GB|
|Price||$199 US||$249 US||$249 US||$249 US||$399 US|
|Entry Tier||GTX 750 Ti||GTX 950||GTX 1050 Ti||GTX 1050 Ti||GTX 1650 SUPER||GTX 1650 SUPER|
|Price||$149 US||$139 US|
As of now, the rumors point to a mid-July launch, so we have to wait two more months to see how well that goes!