NVIDIA Working On An RTX 3080 Ti With 9984 CUDA Cores And 34 TFLOPs
A leaker who predicted the specifications of the RTX 3000 series months in advance has just revealed that NVIDIA is working on an RTX 3080 Ti. The Twitter user in question has a stellar track record, and we have no reason to doubt this information. That said, this card appears to be in the very early stages, and considering there are already rumors of NVIDIA dropping the 20 GB variant of the RTX 3080, it might be wise to take this with a grain of salt in case Jensen changes his mind again.
NVIDIA RTX 3080 Ti with 9984 cores and 34 TFLOPs baking in Jensen's oven right now
According to Kopite, the RTX 3080 Ti will be based on the GA102 GPU and will have specifications between those of the RTX 3080 and the RTX 3090. The exact chip nomenclature is GA102-250-A1, and it will feature a 384-bit bus with GDDR6X memory. The bus width means NVIDIA will be using either a 12 GB buffer or a 24 GB one. Considering Microsoft Flight Simulator 2020 is already bottlenecked by the 11 GB of VRAM on the RTX 2080 Ti, it would be disappointing to see NVIDIA ship another powerful GPU with a small memory buffer. With RTX IO and asset streaming (as seen in the Unreal Engine demo for next-generation consoles) becoming a thing in the next year or so, every bit of buffer will help in this paradigm shift.
Jesus. A new spec of GA102, between 3080 and 3090.
— kopite7kimi4virgil (@kopite7kimi) October 27, 2020
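The step from a 384-bit bus to "12 GB or 24 GB" can be sketched in a few lines. This is only an illustration under assumptions that are not in the leak: each GDDR6X chip is taken to have a 32-bit interface and 1 GB (8 Gb) of capacity, and the larger option assumes two chips share each 32-bit channel (clamshell mode). The function name is hypothetical.

```python
from typing import List

# Assumptions (not from the leak): 32-bit interface and 1 GB per GDDR6X chip.
CHIP_INTERFACE_BITS = 32
CHIP_CAPACITY_GB = 1


def capacity_options_gb(bus_width_bits: int) -> List[int]:
    """Return the [single-sided, clamshell] VRAM capacities for a bus width."""
    channels = bus_width_bits // CHIP_INTERFACE_BITS
    single_sided = channels * CHIP_CAPACITY_GB
    clamshell = 2 * single_sided  # two chips per 32-bit channel
    return [single_sided, clamshell]


print(capacity_options_gb(384))  # [12, 24]
print(capacity_options_gb(320))  # [10, 20]
```

Under the same assumptions, the 320-bit case gives the 10 GB of the RTX 3080 and the 20 GB of its rumored variant.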
It is also unclear at this point how the revelation of this new GPU fits into the rumors about NVIDIA planning a move back to TSMC. It does, however, lend credence to the belief that the RTX 3000 series, at least for now, is staying on the Samsung 8nm process. The company initially faced less-than-ideal yields and supply constraints at launch, but those are expected to improve significantly as we enter the new year.
GA102-250-A1, 9984FP32, 384bits GD6X
— kopite7kimi4virgil (@kopite7kimi) October 27, 2020
This is going to be an insanely powerful card with 34 TFLOPs of FP32 compute. That said, the current API, driver, and application infrastructure cannot fully take advantage of all this raw power. This is the actual reason why NVIDIA's insanely powerful cards don't scale linearly with TFLOPs. The company essentially made cards that are ahead of their time when compared to the surrounding software ecosystem.
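The 34 TFLOPs figure is consistent with the usual peak-throughput estimate of two floating-point operations (one fused multiply-add) per FP32 core per clock. The sketch below is a rough check, not NVIDIA's own calculation: the leak gives no clock speed, so the 1700 MHz boost clock is an assumption, and the function name is hypothetical.

```python
def fp32_tflops(fp32_cores: int, boost_clock_mhz: float) -> float:
    """Peak FP32 throughput in TFLOPs, assuming 2 FLOPs per core per clock."""
    flops_per_second = 2 * fp32_cores * boost_clock_mhz * 1e6
    return flops_per_second / 1e12


# 9984 FP32 cores is from the leak; the 1700 MHz boost clock is an assumption.
print(round(fp32_tflops(9984, 1700), 1))  # 33.9
```

This is a theoretical ceiling only. As the article argues, delivered performance depends on how well software can keep those cores busy.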
The hard evidence for this lies in the fact that the RTX 3000 series has been experiencing non-linear positive scaling when going down the stack. An RTX 3070, which has slightly more cores than the RTX 2080 Ti, beats the former flagship. That puts to rest any rumors or allegations that the CUDA cores in the Ampere series are not as strong as those in the Turing series, as well as the misplaced blame on the architecture's design for lacking INT performance.
AMD's Big Navi series drops later today, and it remains to be seen how the GPU market shapes up as we enter the holiday season. NVIDIA's pricing is great, but the company needs to work with developers to fix the API and driver stacks so that they properly take advantage of the raw performance offered by the Ampere series (fine wine of the highest order). In the meantime, AMD is going to churn out budget-friendly performance cards with what appears to be ample supply. AMD has also taken a lot of steps to make sure that the bot scalping that happened with NVIDIA does not happen at its Radeon RX 6000 series launch.
NVIDIA GeForce RTX 30 Series 'Ampere' Graphics Card Specifications:
|Graphics Card Name||NVIDIA GeForce RTX 3050||NVIDIA GeForce RTX 3050 Ti||NVIDIA GeForce RTX 3060||NVIDIA GeForce RTX 3060 Ti||NVIDIA GeForce RTX 3070||NVIDIA GeForce RTX 3070 Ti||NVIDIA GeForce RTX 3080||NVIDIA GeForce RTX 3080 Ti||NVIDIA GeForce RTX 3090|
|GPU Name||Ampere GA107||Ampere GA107||Ampere GA106-300||Ampere GA104-200||Ampere GA104-300||Ampere GA104-400||Ampere GA102-200||Ampere GA102-225||Ampere GA102-300|
|Process Node||Samsung 8nm||Samsung 8nm||Samsung 8nm||Samsung 8nm||Samsung 8nm||Samsung 8nm||Samsung 8nm||Samsung 8nm||Samsung 8nm|
|Transistors||TBA||TBA||TBA||17.4 Billion||17.4 Billion||17.4 Billion||28 Billion||28 Billion||28 Billion|
|TMUs / ROPs||64 / 40||80 / 48||112 / 64||152 / 80||184 / 96||192 / 96||272 / 96||320 / 112||328 / 112|
|Tensor / RT Cores||64 / 16||80 / 20||112 / 28||152 / 38||184 / 46||192 / 48||272 / 68||320 / 80||328 / 82|
|Base Clock||TBA||TBA||1320 MHz||1410 MHz||1500 MHz||1575 MHz||1440 MHz||1365 MHz||1400 MHz|
|Boost Clock||TBA||TBA||1780 MHz||1665 MHz||1730 MHz||1770 MHz||1710 MHz||1665 MHz||1700 MHz|
|FP32 Compute||TBA||TBA||12.7 TFLOPs||16.2 TFLOPs||20 TFLOPs||22 TFLOPs||30 TFLOPs||34 TFLOPs||36 TFLOPs|
|RT TFLOPs||TBA||TBA||25.4 TFLOPs||32.4 TFLOPs||40 TFLOPs||42 TFLOPs||58 TFLOPs||67 TFLOPs||69 TFLOPs|
|Tensor-TOPs||TBA||TBA||101 TOPs||129.6 TOPs||163 TOPs||174 TOPs||238 TOPs||273 TOPs||285 TOPs|
|Memory Capacity||4 GB GDDR6?||4 GB GDDR6?||12 GB GDDR6||8 GB GDDR6||8 GB GDDR6||8 GB GDDR6X||10 GB GDDR6X||12 GB GDDR6X||24 GB GDDR6X|
|Memory Speed||TBA||TBA||15 Gbps||14 Gbps||14 Gbps||19 Gbps||19 Gbps||19 Gbps||19.5 Gbps|
|Bandwidth||TBA||TBA||360 GB/s||448 GB/s||448 GB/s||608 GB/s||760 GB/s||912 GB/s||936 GB/s|
|Price (MSRP / FE)||$149 US?||$199 US?||$329 US||$399 US||$499 US||$599 US||$699 US||$1199 US||$1499 US|
|Launch (Availability)||2021?||2021?||February 2021||December 2020||29th October 2020||10th June 2021||17th September 2020||3rd June 2021||24th September 2020|
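The bandwidth figures in the table follow from the memory bus width and the per-pin memory speed. Dividing by 8 converts bits to bytes, so the result is in GB/s. The sketch below reproduces three of the entries. The 384-bit bus for the RTX 3080 Ti comes from the leak. The 320-bit and 384-bit widths used for the RTX 3080 and RTX 3090 are not listed in the table and are taken from their public specifications. The function name is hypothetical.

```python
def bandwidth_gb_per_s(bus_width_bits: int, speed_gbps: float) -> float:
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * speed_gbps / 8  # 8 bits per byte


print(bandwidth_gb_per_s(320, 19))    # 760.0 (RTX 3080)
print(bandwidth_gb_per_s(384, 19))    # 912.0 (RTX 3080 Ti)
print(bandwidth_gb_per_s(384, 19.5))  # 936.0 (RTX 3090)
```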