NVIDIA Claims Ampere A100 Offers Up To 2x Higher Performance & 2.8x Efficiency Versus AMD Instinct MI250 GPUs


In a new technical blog, NVIDIA has finally shared some numbers comparing its existing Ampere A100 accelerator to the AMD Instinct MI250 GPUs.

NVIDIA Claims 2x Higher Performance & Almost 3x Efficiency For Ampere A100 GPUs Versus AMD's Instinct MI250

NVIDIA has already announced its next-generation H100 GPU, based on the Hopper GPU architecture, which will ship to customers later this year. The Hopper GPU is estimated to deliver a 26x increase in performance over the Pascal P100, which was released six years ago, and that's 3x faster than the trajectory offered by Moore's Law.
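As a rough sanity check on that "3x Moore's Law" figure, the arithmetic can be sketched as below. This assumes the common reading of Moore's Law as a doubling every two years, which the source does not spell out; the function names are hypothetical and used only for illustration.

```python
def moores_law_factor(years, doubling_period_years=2.0):
    """Growth factor expected if performance doubles every `doubling_period_years`.

    The two-year doubling period is an assumption, not a figure from NVIDIA.
    """
    return 2.0 ** (years / doubling_period_years)


def speedup_vs_moore(claimed_speedup, years, doubling_period_years=2.0):
    """How many times larger the claimed speedup is than the Moore's Law factor."""
    return claimed_speedup / moores_law_factor(years, doubling_period_years)


# 26x over six years (P100 to H100, per the article) against an 8x Moore's Law factor.
print(moores_law_factor(6))      # 8.0
print(speedup_vs_moore(26, 6))   # 3.25
```

Under that assumption, six years of doubling gives 8x, and 26x is roughly 3x beyond it, which lines up with the claim.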


For the performance tests, NVIDIA ran the Ampere A100 GPU in both single and multi-GPU configurations. The same configurations were used for the Instinct MI250 from AMD. Some of the most popular data center workloads, such as LAMMPS, NAMD, OpenMM, GROMACS & AMBER, were used for the tests.

NVIDIA's single Ampere A100 GPU turned out to be up to 1.9x faster than the AMD Instinct MI250 GPU accelerator while the quad-GPU solution showed up to a 2.1x gain for the Ampere system. In energy efficiency, the quad-GPU solution provided 2.8x higher perf/watt.

The excellent performance and power efficiency of the NVIDIA A100 GPU is the result of many years of relentless software-hardware co-optimization to maximize application performance and efficiency. For more information about the NVIDIA Ampere architecture, see the NVIDIA A100 Tensor Core GPU whitepaper.

A100 also presents as a single processor to the operating system, requiring that only one MPI rank be launched to take full advantage of its performance. And, A100 delivers excellent performance at scale thanks to the 600-GB/s NVLink connections between all GPUs in a node.

Following are the notes from the testing:

Efficiency ratio of A100 to MI250 shown – higher is better for NVIDIA. Geomean over multiple datasets (varies) per application. Efficiency is Performance / Power consumption (Watts) as measured for the GPUs using NVIDIA SMI and equivalent functionality in ROCm.

AMD MI250 measured on a GIGABYTE M262-HD5-00 with (2) AMD EPYC 7763 with 4x AMD Instinct™ MI250 OAM (128 GB HBM2e) 500W GPUs with AMD Infinity Fabric™ technology. NVIDIA runs on ProLiant XL645d Gen10 Plus using dual EPYC 7713 CPUs and 4x A100 (80 GB) SXM4.

LAMMPS develop_db00b49 (AMD) develop_2a35ec2 (NVIDIA) datasets ReaxFF/c, Tersoff, Lennard-Jones, SNAP | NAMD 3.0alpha9 dataset STMV_NVE | OpenMM 7.7.0 Ensemble runs for datasets: amber20-stmv, amber20-cellulose, apoa1pme, pme

GROMACS 2021.1 (AMD) 2022 (NVIDIA) datasets ADH-Dodec (h-bond), STMV (h-bond) | AMBER 20.xx_rocm_mr_202108 (AMD) and 20.12-AT_21.12 (NVIDIA) datasets Cellulose_NVE, STMV_NVE | 1x MI250 has 2x GCD

via NVIDIA
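The efficiency metric described in the testing notes (performance divided by measured GPU power, with a geometric mean taken over the datasets for each application) can be sketched roughly as follows. This is only an illustration of the stated formula, not NVIDIA's actual tooling; the function names are hypothetical and the numbers in the usage lines are made-up placeholders, not measured results.

```python
import math


def geomean(values):
    """Geometric mean of strictly positive numbers."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geomean needs at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def efficiency(performance, watts):
    """Efficiency as defined in the notes: performance / power draw in watts."""
    return performance / watts


def efficiency_ratio(runs_a, runs_b):
    """Geomean of per-dataset efficiency ratios of system A over system B.

    Each argument is a list of (performance, watts) tuples, one per dataset,
    with the datasets in the same order on both sides.
    """
    if len(runs_a) != len(runs_b):
        raise ValueError("both systems need results for the same datasets")
    ratios = [
        efficiency(perf_a, watts_a) / efficiency(perf_b, watts_b)
        for (perf_a, watts_a), (perf_b, watts_b) in zip(runs_a, runs_b)
    ]
    return geomean(ratios)


# Placeholder inputs only: two datasets, (performance, watts) per system.
system_a = [(200.0, 100.0), (800.0, 100.0)]
system_b = [(100.0, 100.0), (100.0, 100.0)]
print(efficiency_ratio(system_a, system_b))
```

A geometric mean is the usual choice for averaging ratios like these, since a single dataset with an outsized gain skews it less than it would an arithmetic mean.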

Note that the AMD Instinct MI250 used here isn't the full configuration, which is reserved for the MI250X, but based on these results, the A100 should still be very competitive against AMD's CDNA 2 offerings. With Hopper coming soon, NVIDIA will push these numbers even further, and that's where AMD's Instinct MI300 comes in with its brand-new APU-like design.
