AMD’s Radeon MI60 AI Resnet Benchmark Had NVIDIA Tesla V100 GPU Operating Without Tensor Enabled

Usman Pirzada • Nov 9, 2018 at 09:28am EST

AMD's Vega 20 - the GPU powering the Radeon Instinct MI60 - is the world's first 7nm GPU.

Footnotes are very important. They can reveal information that is vital to interpreting the metrics on display, and sometimes they reveal caveats hidden in plain sight. AMD recently launched the world's first 7nm GPU, the Radeon Instinct MI60, and it is a milestone in the ongoing transformation of AMD's professional GPU side. The specifications are great and the performance spectacular, but the efforts put in by its engineers might be overshadowed by something hidden in the footnotes: in the ResNet 50 benchmark, NVIDIA's Tesla V100 GPU was gimped by running without its Tensor cores.

AMD Next Horizon ResNet 50 AI benchmark caveat: NVIDIA's Tesla V100 was running at roughly one-third of its peak performance because Tensor mode was not used

See, the company had claimed inference performance comparable to NVIDIA's flagship Tesla V100 GPU. I remembered seeing ResNet 50 results before and could distinctly remember them being in the 1000s, so I looked through the footnotes and found the cause: the test was conducted in FP32 mode. The Tesla V100 has a significantly larger die and dedicated Tensor cores (the GCN architecture, by contrast, is hard-limited to 4096 stream processors), and those cores can accelerate inference and training several times over. In fact, if you use Tensor mode, the performance of the V100 is roughly three and a half times that of the Radeon Instinct MI60.

I did not have an NVIDIA Tesla V100 lying around, so I reached out to NVIDIA and they quickly sent me the data for that particular benchmark running in Tensor mode (the usual caution about first-party benchmarks applies here too, but this result can be, and has been, replicated by third parties). According to AMD's own testing, the Radeon Instinct MI60 yields about 334 images per second, while the NVIDIA Tesla V100 yields a maximum of 1189 images per second - roughly a 3.6x speedup. This figure is for the PCIe card, by the way: moving to SXM2 widens the gap even further.
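The ratio is easy to check from the two throughput figures quoted above. The snippet below is only a sanity check on that arithmetic; the variable and function names are illustrative, and the inputs are the first-party numbers from AMD and NVIDIA rather than independent measurements.

```python
# Throughput figures quoted in the article, in images per second.
MI60_FP32 = 334      # AMD's own ResNet 50 result for the Radeon Instinct MI60
V100_TENSOR = 1189   # NVIDIA-supplied Tesla V100 (PCIe) result in Tensor mode


def speedup(fast: float, slow: float) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return fast / slow


ratio = speedup(V100_TENSOR, MI60_FP32)
print(f"V100 (Tensor) vs MI60 (FP32): {ratio:.2f}x")  # prints 3.56x
```

The result lands a little below the 3.7x in NVIDIA's statement, which may reflect a different configuration or rounding on their side.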

That's not all: NVIDIA's Tesla T4 can yield 395 images per second in Tensor mode as well. NVIDIA had the following to say about the issue:

"The 70W Tesla T4 with Turing Tensor Cores delivers more training performance than 300W Radeon Instinct MI60. And Tesla V100 can deliver 3.7x more training performance using Tensor Cores and mixed precision (FP16 compute / FP32 accumulate), allowing faster time to solution while converging neural networks to required levels of accuracy." - NVIDIA

GPUs take a long time to design and develop, and it is clear that AMD got blindsided in the Tensor department. That said, while Tensor cores can and do speed up certain calculations, they do not work in every case, and FP32 is still a very important metric of performance. So yes, the MI60 has performance comparable to the Tesla V100, but only in FP32 mode. Overall training performance is vastly superior on the V100. If you are someone who uses Tensor cores to accelerate inference, then the T4 is going to be more of a competitor than the V100.
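To put the T4 comparison in perspective, the quoted throughput can be divided by the board power named in NVIDIA's statement. This is a back-of-the-envelope sketch only: it assumes the 70W and 300W ratings stand in for actual power draw during the run, which neither company confirmed, and the dictionary layout and names are purely illustrative.

```python
# Figures quoted in the article. The wattages are the board ratings named in
# NVIDIA's statement, NOT measured draw during the benchmark (an assumption).
cards = {
    "Radeon Instinct MI60 (FP32)": {"images_per_s": 334, "watts": 300},
    "Tesla T4 (Tensor)": {"images_per_s": 395, "watts": 70},
}


def images_per_watt(images_per_s: float, watts: float) -> float:
    """Throughput per watt of rated board power."""
    return images_per_s / watts


efficiency = {name: images_per_watt(**spec) for name, spec in cards.items()}
for name, value in efficiency.items():
    print(f"{name}: {value:.2f} images/s per watt")

t4_vs_mi60 = efficiency["Tesla T4 (Tensor)"] / efficiency["Radeon Instinct MI60 (FP32)"]
print(f"T4 advantage per rated watt: {t4_vs_mi60:.1f}x")
```

On these numbers the T4 comes out at roughly five times the MI60's throughput per rated watt, though the two results use different precisions, so it is not a like-for-like comparison.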

AMD's point of view

Now, I reached out to AMD as well to give them a chance to reply and they had the following to say about it:

"Regarding the comparison – our footnotes for that slide clearly noted the modes so no issues there. Rationale is that FP32 training is used in most cases for FaceID to have 99.99%+ accuracy, for example in banking and other instances that require high levels of accuracy." - AMD

I have to admit I am not familiar with FaceID and other mission-critical training sets, so I will not go into a detailed deconstruction of this statement. It is possible that the use of FP16 inputs makes a difference to the final result that I'm not aware of. I'm willing to give AMD the benefit of the doubt on this unless my better-informed peers prove otherwise, but even if that is the case, the fact remains that this was an instance of cherry-picked benchmarks, and it is somewhat of a disappointment coming from a company that usually holds the moral high ground in these things.

No one expects marketing material to be perfect, and that is something I am painfully aware of considering the recent spate of bad press that seems to plague the PC triumvirate. It is also worth noting that AMD's statement does not seem to agree with what NVIDIA says. We know that Tensor cores are essentially mixed precision (FP16 multiply/FP32 accumulate), and NVIDIA claims you should be able to get to the "required level of accuracy" using those anyway.
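For readers wondering why the accumulate precision matters, here is a toy illustration of FP16 multiply with FP16 versus FP32 accumulation. It is a minimal sketch using Python's standard `struct` module to round values to half and single precision; it does not model how Tensor cores are actually implemented, and it says nothing about whether a given training workload (FaceID or otherwise) reaches its accuracy target. The helper names `to_fp16`, `to_fp32` and `dot` are made up for this example.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, round_acc):
    """Dot product of FP16 inputs; `round_acc` sets the accumulator precision."""
    total = 0.0
    for x, y in zip(a, b):
        product = to_fp16(x) * to_fp16(y)  # FP16 operands, product held in a double
        total = round_acc(total + round_acc(product))
    return total


n = 4096
a = [0.01] * n
b = [1.0] * n

exact = n * to_fp16(0.01)       # reference, using the FP16-rounded input
fp16_acc = dot(a, b, to_fp16)   # FP16 multiply, FP16 accumulate
fp32_acc = dot(a, b, to_fp32)   # FP16 multiply, FP32 accumulate

print(f"reference        : {exact:.4f}")
print(f"FP16 accumulate  : {fp16_acc:.4f}")
print(f"FP32 accumulate  : {fp32_acc:.4f}")
```

In this example the FP16 accumulator stalls once the running sum grows large enough that adding 0.01 no longer changes it, while the FP32 accumulator stays close to the reference. That is the intuition behind keeping the accumulate step in FP32, but real networks are a different matter from a toy sum.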

About the author: PC Hardware and Technology Enthusiast, Blood of Silicon (1 nm),
