AMD Instinct MI200 CDNA 2 ‘Aldebaran’ MCM HPC GPU Accelerator Launching Later This Year, Will Compete Against Intel Ponte Vecchio & NVIDIA Ampere
AMD's CEO has confirmed that the successor to the CDNA architecture-powered Instinct HPC GPU accelerator is on track to launch later this year.
AMD Instinct MI200 With CDNA 2 MCM GPU Architecture Landing Later This Year, Will Power HPC Workloads
The confirmation came during JPMorgan's 49th Annual Global Technology Communications and Media conference. AMD's CEO, Lisa Su, stated that the company will launch the next generation of its CDNA architecture later this year. The following is an excerpt from the conference transcript (Source: Seeking Alpha).
Last year, we talked about our first-generation CDNA architecture. This year, as I said, we’re putting together our next-generation CDNA architecture. This is actually a key component that enabled us to win the largest supercomputer bids in the US around the Frontier Oak Ridge National Labs installment as well as the Lawrence Livermore National Labs installment with El Capitan and many others.
But it’s a coherent interconnect between CPUs and GPUs that allow us to fully optimize for HPC and for AI and ML applications. And we will be launching the next generation of that architecture, actually, later this year. We’re very excited about it. I think it’s progressed extremely well. It’s the next big step in sort of innovation around the data center architectures.
Dr. Lisa Su (AMD CEO)
Here's Everything We Know About AMD's CDNA 2 Architecture Powered Instinct Accelerators
The AMD CDNA 2 architecture will power the next-generation AMD Instinct HPC accelerators. We know that one of those accelerators will be the MI200, which will feature the Aldebaran GPU. It's going to be a very powerful chip and possibly the first GPU to feature an MCM (multi-chip module) design. The Instinct MI200 is going to compete against Intel's 7nm Ponte Vecchio and NVIDIA's refreshed Ampere parts. Intel and NVIDIA are also following the MCM route with their next-generation HPC accelerators, but it looks like Ponte Vecchio won't be available until 2022, and the same can be said for NVIDIA's next-gen HPC accelerator, as NVIDIA's own roadmap confirms.
A previous Linux patch revealed that the AMD Instinct MI200 'Aldebaran' GPU will feature HBM2E memory support. NVIDIA was the first to adopt the HBM2E standard, which offers a nice boost over the standard HBM2 configuration used on the Arcturus-based MI100 GPU accelerator. HBM2E allows up to 16 GB of memory capacity per stack, so with four stacks we can expect up to 64 GB of HBM2E memory at blisteringly fast speeds for Aldebaran.
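The capacity figures can be sanity-checked with a short sketch. It assumes the standard 1024-bit interface per HBM2/HBM2E stack and the 16 GB per-stack ceiling quoted above; the function and constant names are hypothetical and exist only for illustration. Under those assumptions, a 4096-bit bus corresponds to four stacks, while the rumored 8192-bit bus listed in the table below would correspond to eight.

```python
# Assumption: each HBM2/HBM2E stack exposes a 1024-bit interface.
HBM2E_STACK_BUS_BITS = 1024
# Per-stack capacity ceiling quoted in the article.
HBM2E_MAX_GB_PER_STACK = 16


def stacks_for_bus(bus_width_bits: int) -> int:
    """Number of HBM stacks implied by a total memory bus width."""
    if bus_width_bits % HBM2E_STACK_BUS_BITS != 0:
        raise ValueError("bus width must be a multiple of 1024 bits")
    return bus_width_bits // HBM2E_STACK_BUS_BITS


def max_capacity_gb(stacks: int) -> int:
    """Upper bound on capacity when every stack is fully populated."""
    return stacks * HBM2E_MAX_GB_PER_STACK


for bus in (4096, 8192):
    stacks = stacks_for_bus(bus)
    print(f"{bus}-bit bus: {stacks} stacks, up to {max_capacity_gb(stacks)} GB")
```

These are upper bounds only; actual products may ship with smaller stacks than the per-stack maximum.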
The latest Linux kernel patch revealed that the GPU carries 16 KB of L1 cache per CU, which adds up to 2 MB of total L1 cache given that the GPU will pack 128 Compute Units. The GPU also carries 8 MB of shared L2 cache but has 14 CUs per Shader Engine (SE), compared to 16 CUs per SE in the previous Instinct lineup. Regardless, each CU on Aldebaran GPUs is said to deliver significantly higher compute output.
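The L1 cache total follows directly from the per-CU figure. A minimal sketch of that arithmetic, using the 16 KB and 128 CU numbers reported from the patch (the function name is hypothetical, and 1 MB is taken as 1024 KB):

```python
KB_PER_MB = 1024  # binary convention, as is usual for cache sizes


def total_l1_mb(l1_kb_per_cu: int, compute_units: int) -> float:
    """Aggregate L1 cache across all compute units, in MB."""
    return l1_kb_per_cu * compute_units / KB_PER_MB


# Figures reported from the Linux kernel patch: 16 KB per CU, 128 CUs.
print(f"Total L1: {total_l1_mb(16, 128)} MB")
```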
Other features listed include SDMA (System Direct Memory Access) support, which will allow data transfers over the PCIe and XGMI/Infinity Fabric subsystems. As far as Infinity Cache is concerned, it looks like that won't be coming to HPC GPUs. Note that AMD's CDNA 2 GPUs will be fabricated on a brand-new process node and are confirmed to feature the 3rd Generation AMD Infinity Architecture, which extends to exascale by allowing up to 8-way coherent GPU connectivity.
AMD Radeon Instinct Accelerators 2020
| Accelerator Name | AMD Radeon Instinct MI6 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI60 | AMD Instinct MI100 | AMD Instinct MI200 | AMD Instinct MI300 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPU Architecture | Polaris 10 | Fiji XT | Vega 10 | Vega 20 | Vega 20 | Arcturus (CDNA 1) | Aldebaran (CDNA 2) | TBA (CDNA 3) |
| GPU Process Node | 14nm FinFET | 28nm | 14nm FinFET | 7nm FinFET | 7nm FinFET | 7nm FinFET | Advanced Process Node | Advanced Process Node |
| GPU Dies | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 2 (MCM) | 4 (MCM)? |
| GPU Clock Speed | 1237 MHz | 1000 MHz | 1500 MHz | 1725 MHz | 1800 MHz | ~1500 MHz | TBA | TBA |
| FP16 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 24.6 TFLOPs | 26.5 TFLOPs | 29.5 TFLOPs | 185 TFLOPs | TBA | TBA |
| FP32 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 12.3 TFLOPs | 13.3 TFLOPs | 14.7 TFLOPs | 23.1 TFLOPs | TBA | TBA |
| FP64 Compute | 384 GFLOPs | 512 GFLOPs | 768 GFLOPs | 6.6 TFLOPs | 7.4 TFLOPs | 11.5 TFLOPs | TBA | TBA |
| VRAM | 16 GB GDDR5 | 4 GB HBM1 | 16 GB HBM2 | 16 GB HBM2 | 32 GB HBM2 | 32 GB HBM2 | 64/128 GB HBM2E? | TBA |
| Memory Clock | 1750 MHz | 500 MHz | 945 MHz | 1000 MHz | 1000 MHz | 1200 MHz | TBA | TBA |
| Memory Bus | 256-bit | 4096-bit | 2048-bit | 4096-bit | 4096-bit | 4096-bit | 8192-bit | TBA |
| Memory Bandwidth | 224 GB/s | 512 GB/s | 484 GB/s | 1 TB/s | 1 TB/s | 1.23 TB/s | ~2 TB/s? | TBA |
| Form Factor | Single Slot, Full Length | Dual Slot, Half Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length / OAM | TBA |
| Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | TBA |
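
The memory bandwidth row in the table can be roughly reproduced from the memory clock and bus width rows. The sketch below assumes the listed HBM clocks transfer data twice per clock (double data rate) and that the listed GDDR5 clock transfers four times per clock; those multipliers, and the function name, are assumptions added for illustration rather than figures from the table.

```python
def memory_bandwidth_gbs(bus_width_bits: int, memory_clock_mhz: float,
                         transfers_per_clock: int) -> float:
    """Peak bandwidth in GB/s: bus width x per-pin data rate / 8 bits per byte."""
    data_rate_gbps = memory_clock_mhz * transfers_per_clock / 1000
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, memory clock in MHz, assumed transfers per clock)
table_rows = {
    "MI6 (GDDR5)": (256, 1750, 4),
    "MI8 (HBM1)": (4096, 500, 2),
    "MI25 (HBM2)": (2048, 945, 2),
    "MI50 (HBM2)": (4096, 1000, 2),
    "MI100 (HBM2)": (4096, 1200, 2),
}

for name, (bus, clock, multiplier) in table_rows.items():
    print(f"{name}: {memory_bandwidth_gbs(bus, clock, multiplier):.1f} GB/s")
```

Under those assumptions the results line up with the table: about 224, 512, 484, 1024 and 1229 GB/s respectively. The MI200 row is omitted because its memory clock is still listed as TBA.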