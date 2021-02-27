  ⋮  

AMD Aldebaran Rumored To Be Arcturus’s Successor, Could Be Featured on Instinct MI200 GPU Accelerator

It looks like AMD is accelerating the production of its next-generation Instinct accelerator, the MI200, which is expected to feature an MCM (multi-chip module) GPU design. The latest information dump reveals not only the GPU's codename but also a range of new specifications.

AMD Instinct MI200 Accelerator Rumored To Be Codenamed Aldebaran, Will Succeed Arcturus With MCM GPU Design & HBM2E Memory

The AMD Instinct family has been based on the CDNA architecture since 2020. The first-generation CDNA flagship, the Instinct MI100, was internally codenamed Arcturus. It followed Vega, continuing AMD's practice of naming these GPUs after giant stars. The MI100's successor, the MI200, is seemingly also going to be named after a giant star: this time, Aldebaran.

In the latest Linux patches (via Phoronix), the GPU expected to power the AMD Instinct MI200 is referred to as Aldebaran, a giant star in the constellation of Taurus with a radius of 44.13 solar radii, roughly 75% larger than Arcturus. The product name seems to suggest that the MI200 will be twice as powerful as the MI100, since the numbers in the MI accelerators' naming convention loosely track theoretical FLOPs performance. This is just speculation at this point, but given that the accelerator is expected to feature an MCM GPU design, it is plausible.

The patches also reveal that the AMD Instinct MI200 'Aldebaran' GPU will support HBM2E memory. The newer memory standard was first used by NVIDIA's Ampere GA100 GPUs and will offer a nice boost over the standard HBM2 configuration used on the Arcturus-based MI100 accelerator. HBM2E allows up to 16 GB of capacity per stack, so with four stacks we can expect up to 64 GB of HBM2E memory at blisteringly fast speeds for Aldebaran.
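The capacity estimate is simple arithmetic, and a minimal sketch makes the assumption behind it explicit. The code assumes Aldebaran keeps a 4096-bit memory bus like the MI100, which the patches do not confirm; each HBM2/HBM2E stack has a 1024-bit interface, so that bus width implies four stacks. The function names are made up for illustration.

```c
#include <assert.h>

/* Interface width of a single HBM2/HBM2E stack, in bits. */
#define HBM_STACK_BUS_BITS 1024u

/* Number of stacks implied by a total memory bus width. */
static unsigned hbm_stack_count(unsigned total_bus_bits)
{
    /* The total bus must be a whole number of stack interfaces. */
    assert(total_bus_bits % HBM_STACK_BUS_BITS == 0);
    return total_bus_bits / HBM_STACK_BUS_BITS;
}

/* Total capacity in GB for a given bus width and per-stack capacity.
 * ASSUMPTION: every stack carries the same capacity. */
static unsigned hbm_total_capacity_gb(unsigned total_bus_bits,
                                      unsigned gb_per_stack)
{
    return hbm_stack_count(total_bus_bits) * gb_per_stack;
}
```

Calling `hbm_total_capacity_gb(4096, 16)` gives the 64 GB figure quoted above, while 8 GB stacks on the same bus give the MI100's 32 GB.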

Other features listed include SDMA (System Direct Memory Access) support, which will allow data transfers over PCIe and over XGMI, AMD's Infinity Fabric-based GPU interconnect. AMD may also incorporate its new Infinity Cache design on upcoming Instinct accelerators, so we are looking at a very advanced evolution of the Vega GPU.

ARCTURUS
.asic_family = CHIP_ARCTURUS,
.asic_name = "arcturus",
.max_pasid_bits = 16,
.max_no_of_hqd = 24,
.doorbell_size = 8,
.ih_ring_entry_size = 8 * sizeof(uint32_t),
.event_interrupt_class = &event_interrupt_class_v9,
.num_of_watch_points = 4,
.mqd_size_aligned = MQD_SIZE_ALIGNED,
.supports_cwsr = true,
.needs_iommu_device = false,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
.num_xgmi_sdma_engines = 6,
.num_sdma_queues_per_engine = 8,

ALDEBARAN
.asic_family = CHIP_ALDEBARAN,
.asic_name = "aldebaran",
.max_pasid_bits = 16,
.max_no_of_hqd = 24,
.doorbell_size = 8,
.ih_ring_entry_size = 8 * sizeof(uint32_t),
.event_interrupt_class = &event_interrupt_class_v9,
.num_of_watch_points = 4,
.mqd_size_aligned = MQD_SIZE_ALIGNED,
.supports_cwsr = true,
.needs_iommu_device = false,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
.num_xgmi_sdma_engines = 3,
.num_sdma_queues_per_engine = 8,
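The only field that differs between the two listings is the XGMI SDMA engine count. A rough sketch of what that means for total engine and queue counts follows. The `struct sdma_info` type and both helper functions are hypothetical, trimmed-down stand-ins written for this comparison, not the driver's actual definitions; only the field names and values come from the listings.

```c
#include <assert.h>

/* HYPOTHETICAL stand-in for the driver's device-info struct; only the
 * SDMA-related fields from the listings above are modelled. */
struct sdma_info {
    const char *asic_name;
    unsigned num_sdma_engines;
    unsigned num_xgmi_sdma_engines;
    unsigned num_sdma_queues_per_engine;
};

static const struct sdma_info arcturus_sdma = {
    .asic_name = "arcturus",
    .num_sdma_engines = 2,
    .num_xgmi_sdma_engines = 6,
    .num_sdma_queues_per_engine = 8,
};

static const struct sdma_info aldebaran_sdma = {
    .asic_name = "aldebaran",
    .num_sdma_engines = 2,
    .num_xgmi_sdma_engines = 3,
    .num_sdma_queues_per_engine = 8,
};

/* Regular plus XGMI SDMA engines described by one entry. */
static unsigned total_sdma_engines(const struct sdma_info *info)
{
    assert(info);
    return info->num_sdma_engines + info->num_xgmi_sdma_engines;
}

/* Total SDMA queues, assuming every engine exposes the same number. */
static unsigned total_sdma_queues(const struct sdma_info *info)
{
    return total_sdma_engines(info) * info->num_sdma_queues_per_engine;
}
```

With these values the Arcturus entry describes 8 engines and 64 queues, and the Aldebaran entry 5 engines and 40 queues. If the Aldebaran entry describes a single die of an MCM package, the per-package totals would be higher, but the listing alone does not settle that.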

There's also another hint at the MCM GPU design for the AMD Instinct MI200 'Aldebaran' GPU. The patch describes a new mode known as Performance Determinism, in which the PMFW (power management firmware) maintains a sustained performance level and which can be enabled on a per-die basis. Each GPU die could therefore run this mode independently, but a maximum graphics frequency needs to be specified so that the dies don't exceed their power caps.

Do note that AMD's CDNA 2 GPUs will be fabricated on a brand-new process node and are confirmed to feature the 3rd Generation AMD Infinity Architecture, which extends to exascale by allowing up to 8-way coherent GPU connectivity.

AMD Radeon Instinct Accelerators 2020

Accelerator Name | AMD Radeon Instinct MI6 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI60 | AMD Instinct MI100 | AMD Instinct MI200
GPU Architecture | Polaris 10 | Fiji XT | Vega 10 | Vega 20 | Vega 20 | Arcturus | TBA
GPU Process Node | 14nm FinFET | 28nm | 14nm FinFET | 7nm FinFET | 7nm FinFET | 7nm FinFET | Advanced Process Node
GPU Cores | 2304 | 4096 | 4096 | 3840 | 4096 | 7680 | 7680 x 2 (MCM) ?
GPU Clock Speed | 1237 MHz | 1000 MHz | 1500 MHz | 1725 MHz | 1800 MHz | ~1500 MHz | TBA
FP16 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 24.6 TFLOPs | 26.5 TFLOPs | 29.5 TFLOPs | 185 TFLOPs | TBA
FP32 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 12.3 TFLOPs | 13.3 TFLOPs | 14.7 TFLOPs | 23.1 TFLOPs | TBA
FP64 Compute | 384 GFLOPs | 512 GFLOPs | 768 GFLOPs | 6.6 TFLOPs | 7.4 TFLOPs | 11.5 TFLOPs | TBA
VRAM | 16 GB GDDR5 | 4 GB HBM1 | 16 GB HBM2 | 16 GB HBM2 | 32 GB HBM2 | 32 GB HBM2 | TBA
Memory Clock | 1750 MHz | 500 MHz | 945 MHz | 1000 MHz | 1000 MHz | 1200 MHz | TBA
Memory Bus | 256-bit | 4096-bit | 2048-bit | 4096-bit | 4096-bit | 4096-bit | TBA
Memory Bandwidth | 224 GB/s | 512 GB/s | 484 GB/s | 1 TB/s | 1 TB/s | 1.23 TB/s | TBA
Form Factor | Single Slot, Full Length | Dual Slot, Half Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | OAM
Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling
TDP | 150W | 175W | 300W | 300W | 300W | 300W | TBA
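Most of the bandwidth and FP32 figures in the table can be reproduced from the other rows, which is handy for sanity-checking future MI200 leaks. The sketch below uses two standard peak-rate formulas. It assumes two transfers per memory clock for HBM and four for GDDR5, and counts a fused multiply-add as two FLOPs per core per clock. The function names are illustrative only.

```c
#include <assert.h>

/* Peak memory bandwidth in GB/s.
 * transfers_per_clock: 2 for HBM (double data rate), 4 for GDDR5. */
static double mem_bandwidth_gbs(unsigned bus_bits, double clock_mhz,
                                unsigned transfers_per_clock)
{
    assert(bus_bits % 8 == 0);
    double bytes_per_transfer = bus_bits / 8.0;
    double mega_transfers_per_s = clock_mhz * transfers_per_clock;
    /* bytes * MT/s gives MB/s; divide by 1000 for GB/s. */
    return bytes_per_transfer * mega_transfers_per_s / 1000.0;
}

/* Peak FP32 throughput in TFLOPs.
 * ASSUMPTION: each core retires one FMA (2 FLOPs) per clock. */
static double fp32_tflops(unsigned cores, double clock_mhz)
{
    return cores * 2.0 * clock_mhz / 1e6;
}
```

For the MI100, `mem_bandwidth_gbs(4096, 1200, 2)` works out to about 1228.8 GB/s, matching the 1.23 TB/s entry, and `fp32_tflops(7680, 1500)` gives roughly 23.0 TFLOPs, close to the listed 23.1 given that the clock is approximate. Should the MI200 really pair two 7680-core dies at a similar clock, the same formula would put it near 46 TFLOPs of FP32, though none of those inputs are confirmed.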

