Yesterday, AMD showed off the first real-time benchmarks of the Radeon Vega graphics card against the NVIDIA Pascal-based Tesla P100 in deep learning workloads. In its first attempt, the RTG-developed GPU gave NVIDIA's best compute card from last year a good beating, but there's more to the benchmarks.
AMD Radeon Vega Vs NVIDIA Pascal Tesla P100 Deep Learning Performance Detailed
NVIDIA launched the Tesla P100 based on Pascal GP100 back in early 2016. Since then, it has been the fastest compute solution available. NVIDIA kicked off 2017 with the announcement of the next chapter in GPU deep learning, the Tesla V100 based on Volta GV100, at GTC 2017. We already know the specs of these high-performance compute cards.
The Tesla P100 is a cut-down configuration and features 3584 cores for 10.6 TFLOPs (FP32) and 21.2 TFLOPs (FP16). Moving on, the Radeon Vega Frontier Edition will have 4096 cores for 13.0 TFLOPs (FP32) and 25 TFLOPs (FP16). NVIDIA's Tesla V100 is also a cut-down configuration like the Tesla P100 but has a vast number of cores. The card has 5120 cores enabled, while the full GPU die carries 5376 cores.
The chip delivers an astonishing amount of compute rated at 15 TFLOPs (FP32) and 120 Tensor TFLOPs (FP16) with the new Tensor Cores. The Tensor cores are dedicated units inside the Volta chip which are used for deep learning training and deliver up to 6 times higher FP16 output than GP100 or any GPU of its caliber.
| GPU Family | AMD Vega | AMD Navi | NVIDIA Pascal | NVIDIA Volta |
| --- | --- | --- | --- | --- |
| Flagship GPU | Vega 10 | Navi 10 | NVIDIA GP100 | NVIDIA GV100 |
| GPU Process | 14nm FinFET | 7nm FinFET | TSMC 16nm FinFET | TSMC 12nm FinFET |
| GPU Transistors | 15-18 Billion | TBC | 15.3 Billion | 21.1 Billion |
| GPU Cores (Max) | 4096 SPs | TBC | 3840 CUDA Cores | 5376 CUDA Cores |
| Peak FP32 Compute | 13.0 TFLOPs | TBC | 12.0 TFLOPs | >15.0 TFLOPs (Full Die) |
| Peak FP16 Compute | 25.0 TFLOPs | TBC | 24.0 TFLOPs | 120 Tensor TFLOPs |
| VRAM | 16 GB HBM2 | TBC | 16 GB HBM2 | 16 GB HBM2 |
| Memory (Consumer Cards) | HBM2 | HBM3 | GDDR5X | GDDR6 |
| Memory (Dual-Chip Professional/HPC) | HBM2 | HBM3 | HBM2 | HBM2 |
| HBM2 Bandwidth | 484 GB/s (Frontier Edition) | >1 TB/s? | 732 GB/s (Peak) | 900 GB/s |
| Graphics Architecture | Next Compute Unit (Vega) | Next Compute Unit (Navi) | 5th Gen Pascal CUDA | 6th Gen Volta CUDA |
| Successor of (GPU) | Radeon RX 500 Series | Radeon RX 600 Series | GM200 (Maxwell) | GP100 (Pascal) |
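As a rough sanity check on the quoted specs, peak FP32 throughput is usually modeled as cores × 2 × clock, since each core can retire one fused multiply-add (two floating-point operations) per cycle. The short Python sketch below works backwards from the core counts and TFLOPs figures quoted in this article to the clock speed each rating implies. The function and variable names are ours, and the result is an implied figure under that model, not an official clock specification.

```python
# Peak throughput model: each shader core retires one fused multiply-add
# (2 floating-point operations) per clock cycle.
FLOPS_PER_CORE_PER_CLOCK = 2


def implied_clock_ghz(cores: int, peak_tflops: float) -> float:
    """Clock speed in GHz implied by a core count and a peak FP32 rating."""
    flops = peak_tflops * 1e12
    return flops / (cores * FLOPS_PER_CORE_PER_CLOCK) / 1e9


# (enabled cores, peak FP32 TFLOPs) as quoted in the article.
cards = {
    "Tesla P100": (3584, 10.6),
    "Radeon Vega Frontier Edition": (4096, 13.0),
    "Tesla V100": (5120, 15.0),
}

implied = {name: implied_clock_ghz(c, t) for name, (c, t) in cards.items()}

for name, ghz in implied.items():
    print(f"{name}: ~{ghz:.2f} GHz implied clock")
```

Under this model, Vega's higher FP32 rating comes from both more cores and a higher implied clock than the P100, while the V100 reaches its figure mostly through its much larger core count.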
AMD Radeon Vega Frontier Edition and NVIDIA Tesla P100 DeepBench Benchmarks
In the benchmarks, the first slide shows that the NVIDIA Tesla P100 takes 122 ms to complete DeepBench. The NVIDIA M40 trails it by 166 ms, and the Intel Knights Landing takes 569 ms to complete the bench. So the NVIDIA P100 has been the indisputable champion in DeepBench for over a year. But does that change with Vega? The Radeon Vega Frontier Edition was shown in the very next slide, completing the bench in 88 ms, well below the Tesla P100's 122 ms. This is a really good indication of Vega 10's performance when it comes to compute-intensive workloads.
But let's take a quick look at the footnotes that AMD shared in their presentation. In the first slide, the Tesla P100 scores better because it was tested on a more recent version of NVIDIA's cuDNN. Vega's result, however, was evaluated against a Tesla P100 that was using an older version, which is why the larger difference is seen here. Since these are internal benchmarks, AMD used the absolute best-case scenario for its Vega cards to show more impressive numbers. And that's not only the case in the comparison against the P100.
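To put the slide numbers in perspective, the Python sketch below converts the quoted completion times into throughput relative to the Tesla P100. It assumes that "trails it by 166 ms" describes a gap, which would put the M40 at 122 + 166 = 288 ms; that reading is our assumption, as are the variable names.

```python
# DeepBench completion times in milliseconds as quoted above (lower is better).
p100_ms = 122
vega_ms = 88
knl_ms = 569
# Assumption: the M40 "trails by 166 ms" is read as a gap to the P100,
# so its time is the P100's time plus 166 ms.
m40_ms = p100_ms + 166

times_ms = {
    "Radeon Vega Frontier Edition": vega_ms,
    "Tesla P100": p100_ms,
    "Tesla M40": m40_ms,
    "Intel Knights Landing": knl_ms,
}

# Relative throughput: how many times faster than the P100 each part finishes.
relative = {name: p100_ms / ms for name, ms in times_ms.items()}

for name, ratio in relative.items():
    print(f"{name}: {times_ms[name]} ms, {ratio:.2f}x the P100")
```

On these figures Vega finishes roughly 1.4 times faster than the P100, a lead that should be read alongside AMD's footnotes about the software versions used in the test.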
AMD Radeon Vega Frontier Edition vs NVIDIA Titan Xp Performance Numbers, What's Going on Here?
In another comparison, AMD tested the Radeon Vega Frontier Edition against NVIDIA's top graphics card, the Titan Xp. In Catia, the NVIDIA Titan Xp scores 107.29 compared to Vega's 135.75. In the SolidWorks benchmark, the Titan Xp scores 67.75 compared to Vega's 114.88. Here, Vega proves to be much faster than the NVIDIA Titan Xp, which is the Pascal flagship. But how close to the truth are these benchmarks?
First of all, it's surprising that AMD isn't comparing their Vega Frontier Edition to the Quadro P6000. The Quadro P6000 has essentially the same specifications as the Titan Xp but is configured for the professional market. The AMD Radeon Vega Frontier Edition aims at the same professional market, but for some unknown reason AMD decided to pit it against an NVIDIA enthusiast gaming solution. The Quadro P6000 and Titan Xp use different drivers. The Quadro series drivers are specifically designed to handle pro workloads, just as AMD has optimized its drivers for the Frontier Edition card, which is not a gaming-focused product.
If we take a look at some benchmarks posted by PCPerspective a while ago, the Quadro P5000 (a GTX 1080 equivalent card) manages to score 152.92 in Catia. Vega scores 135.75 in comparison, which is close to the P4000, a cut-down GP104 SKU. In the SolidWorks test, the Quadro P5000 scores 168.11 versus 114.88 on Vega. Vega's score is even lower than that of the Quadro P2000, which is based on a cut-down GP106 die.
As you can see, the Quadro cards have the best optimization for pro workloads, so AMD once again picked the best-case scenario for themselves by comparing their best pro card to NVIDIA's best non-Quadro card. You can see in the same benchmarks that the Vega architecture does give them a big boost coming from a Radeon Pro Duo (Fiji GPU). Many users have been asking when Vega will launch for gaming desktop PCs. AMD made it clear that the Radeon Vega Frontier Edition will be available in late June, so we can expect more information at Computex 2017 regarding the desktop gaming products.
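The gap between the two comparisons is easier to see as percentages. The Python sketch below uses only the scores quoted in this article and a small helper of our own, `percent_faster`, to show how far Vega leads the Titan Xp and how far the Quadro P5000 in turn leads Vega. Note that the Quadro figures come from a separate PCPerspective test, so treating them as directly comparable to AMD's internal numbers is an assumption.

```python
def percent_faster(score: float, baseline: float) -> float:
    """Percentage by which `score` exceeds `baseline` (higher score is better)."""
    return (score / baseline - 1.0) * 100.0


# Scores as quoted in the article (higher is better).
scores = {
    "Catia": {"Titan Xp": 107.29, "Vega FE": 135.75, "Quadro P5000": 152.92},
    "SolidWorks": {"Titan Xp": 67.75, "Vega FE": 114.88, "Quadro P5000": 168.11},
}

# Vega's lead over the Titan Xp, and the Quadro P5000's lead over Vega.
vega_vs_titan = {
    test: percent_faster(s["Vega FE"], s["Titan Xp"]) for test, s in scores.items()
}
p5000_vs_vega = {
    test: percent_faster(s["Quadro P5000"], s["Vega FE"]) for test, s in scores.items()
}

for test in scores:
    print(
        f"{test}: Vega FE is {vega_vs_titan[test]:.1f}% ahead of the Titan Xp, "
        f"Quadro P5000 is {p5000_vs_vega[test]:.1f}% ahead of Vega FE"
    )
```

On these numbers, Vega's lead over the Titan Xp turns into a deficit once a Quadro card with pro-optimized drivers is used as the reference point.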