AMD Radeon Instinct MI100 With Arcturus GPU Spotted – 32 GB HBM2 Memory, 200W TDP In Early Prototype


AMD's upcoming Radeon Instinct MI100 HPC accelerator, which would feature the Arcturus GPU, has been spotted by Komachi. The existence of the AMD Arcturus GPU was confirmed all the way back in 2018, and two years later we are finally starting to get details regarding the specifications of AMD's next HPC/AI accelerator.

AMD Arcturus GPU Powered Radeon Instinct MI100 HPC / AI Accelerator Features 32 GB HBM2, 200W TDP In Early Prototypes

The "Arcturus" codename comes from the red giant star that is the brightest in the constellation of Boötes and among the brightest stars in the night sky. As with Vega and Navi, both of which are also named after some of the brighter stars visible in the night sky, the naming scheme dates back to the creation of RTG, whose founding father, Raja Koduri (ex-AMD RTG President), put a lot of emphasis on bright stars when the group first introduced Polaris.


Previously, we saw support for the Arcturus GPU, in particular the XL variant, added to HWiNFO. To our surprise, the newly leaked variant, 'D34303', is also based on the XL die and would go on to power the Radeon Instinct MI100. The information for this part is based on a test board, so the final specifications will likely differ, but here are the key points:

  • Based on Arcturus XL GPU
  • Test Board has a TDP of 200W
  • Up To 32 GB HBM2 Memory
  • HBM2 Memory Clocks Reported Between 1000-1200 MHz

The Radeon Instinct MI100 test board has a TDP of 200W and is based on the XL variant of AMD's Arcturus GPU. The card also features 32 GB of HBM2 memory with reported memory clocks of 1.0 - 1.2 GHz. The MI60, in comparison, has 64 CUs and a TDP of 300W, with a reported base clock of 1200 MHz. Its memory operates at 1.0 GHz across a 4096-bit bus interface, pumping out 1 TB/s of bandwidth. There's a big chance that the final design of the Arcturus GPU could feature Samsung's latest HBM2E 'Flashbolt' memory, which offers 3.2 Gbps pin speeds, enough for more than 1.5 TB/s of bandwidth on a 4096-bit bus.
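The bandwidth figures quoted above follow from simple arithmetic, sketched below. The function names are made up for illustration, and the sketch assumes that the prototype keeps the MI60's 4096-bit bus (the leak does not confirm this) and that HBM2 transfers data twice per memory clock.

```python
def hbm2_pin_rate_gbps(memory_clock_mhz: float) -> float:
    """Per-pin data rate in Gbps, assuming two transfers per clock (double data rate)."""
    return 2 * memory_clock_mhz / 1000


def peak_bandwidth_gb_s(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width x per-pin rate, divided by 8 bits per byte."""
    return bus_width_bits * pin_rate_gbps / 8


# MI60 bus width from the article; assumed (not confirmed) for the MI100 test board.
BUS_WIDTH_BITS = 4096

mi60 = peak_bandwidth_gb_s(BUS_WIDTH_BITS, hbm2_pin_rate_gbps(1000))
proto_low = peak_bandwidth_gb_s(BUS_WIDTH_BITS, hbm2_pin_rate_gbps(1000))
proto_high = peak_bandwidth_gb_s(BUS_WIDTH_BITS, hbm2_pin_rate_gbps(1200))

print(f"MI60 at 1000 MHz:       {mi60:.1f} GB/s")
print(f"Test board, low clock:  {proto_low:.1f} GB/s")
print(f"Test board, high clock: {proto_high:.1f} GB/s")
```

Under these assumptions, the 1000 MHz case reproduces the MI60's roughly 1 TB/s, and the 1200 MHz upper bound of the leaked range would land around 1.2 TB/s.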

AMD Radeon Instinct Accelerators 2020

| Accelerator Name | AMD Radeon Instinct MI6 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI60 | AMD Instinct MI100 | AMD Instinct MI200 | AMD Instinct MI300 |
|---|---|---|---|---|---|---|---|---|
| GPU Architecture | Polaris 10 | Fiji XT | Vega 10 | Vega 20 | Vega 20 | Arcturus (CDNA 1) | Aldebaran (CDNA 2) | TBA (CDNA 3) |
| GPU Process Node | 14nm FinFET | 28nm | 14nm FinFET | 7nm FinFET | 7nm FinFET | 7nm FinFET | Advanced Process Node | Advanced Process Node |
| GPU Dies | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 2 (MCM) | 4 (MCM)? |
| GPU Cores | 2304 | 4096 | 4096 | 3840 | 4096 | 7680 | 14,080? | 28,160? |
| GPU Clock Speed | 1237 MHz | 1000 MHz | 1500 MHz | 1725 MHz | 1800 MHz | ~1500 MHz | TBA | TBA |
| FP16 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 24.6 TFLOPs | 26.5 TFLOPs | 29.5 TFLOPs | 185 TFLOPs | TBA | TBA |
| FP32 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 12.3 TFLOPs | 13.3 TFLOPs | 14.7 TFLOPs | 23.1 TFLOPs | TBA | TBA |
| FP64 Compute | 384 GFLOPs | 512 GFLOPs | 768 GFLOPs | 6.6 TFLOPs | 7.4 TFLOPs | 11.5 TFLOPs | TBA | TBA |
| Memory Clock | 1750 MHz | 500 MHz | 945 MHz | 1000 MHz | 1000 MHz | 1200 MHz | TBA | TBA |
| Memory Bus | 256-bit bus | 4096-bit bus | 2048-bit bus | 4096-bit bus | 4096-bit bus | 4096-bit bus | 8192-bit | TBA |
| Memory Bandwidth | 224 GB/s | 512 GB/s | 484 GB/s | 1 TB/s | 1 TB/s | 1.23 TB/s | ~2 TB/s? | TBA |
| Form Factor | Single Slot, Full Length | Dual Slot, Half Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length / OAM | TBA |
| Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | TBA |
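As a rough sanity check on the table above, the FP32 figures can be estimated from the core counts and clock speeds. The sketch below assumes each GPU core retires one fused multiply-add (counted as 2 FLOPs) per clock, which is the usual convention for peak ratings but is not stated in the table itself. The helper name is illustrative.

```python
# (name, GPU cores, GPU clock in MHz, FP32 TFLOPs as listed in the table)
ROWS = [
    ("MI6", 2304, 1237, 5.7),
    ("MI8", 4096, 1000, 8.2),
    ("MI25", 4096, 1500, 12.3),
    ("MI50", 3840, 1725, 13.3),
    ("MI60", 4096, 1800, 14.7),
    ("MI100", 7680, 1500, 23.1),
]


def fp32_tflops(cores: int, clock_mhz: float) -> float:
    """Peak FP32 TFLOPs, assuming 2 FLOPs (one FMA) per core per clock."""
    return 2 * cores * clock_mhz * 1e6 / 1e12


for name, cores, clock_mhz, listed in ROWS:
    estimate = fp32_tflops(cores, clock_mhz)
    print(f"{name:6s} estimated {estimate:5.2f} TFLOPs, listed {listed} TFLOPs")
```

Each estimate lands within about 0.1 TFLOPs of the listed value. The small gaps are consistent with the table's clock speeds being rounded or approximate.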

It is also mentioned that the Arcturus XL GPU could be a single huge monolithic die and not a chiplet-based design like AMD's Zen 2 based Ryzen CPU lineup. The naming of the Radeon Instinct MI100 itself gives us a hint of its peak performance, which would be around 100 TOPS of INT8. That would be a 66% increase in INT8 (AI/DNN) compute horsepower. Similarly, FP16 compute would be rated at around 50 TFLOPs, FP32 at 25 TFLOPs and FP64 at 12.5 TFLOPs. The extra GPU horsepower could come from an updated graphics architecture, much higher clocks or a higher CU count, with the last being the best assumption.
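The speculated figures follow a simple halving pattern from INT8 down to FP64, which the sketch below reproduces. It also works out the baseline that a 66% uplift would imply. The starting value of 100 and the 66% figure are the article's speculation, not confirmed specifications, and the function name is illustrative.

```python
INT8_RATE = 100.0      # speculated from the "MI100" name, per the article
CLAIMED_UPLIFT = 0.66  # the 66% INT8 increase quoted in the article


def precision_ladder(int8_rate: float) -> dict:
    """Halve the throughput at each step: INT8 -> FP16 -> FP32 -> FP64."""
    rates = {}
    value = int8_rate
    for precision in ("INT8", "FP16", "FP32", "FP64"):
        rates[precision] = value
        value /= 2
    return rates


ladder = precision_ladder(INT8_RATE)
# Baseline INT8 rate implied by the claimed uplift: new = old * (1 + uplift).
implied_baseline = INT8_RATE / (1 + CLAIMED_UPLIFT)

for precision, rate in ladder.items():
    print(f"{precision}: {rate}")
print(f"Implied INT8 baseline: {implied_baseline:.1f}")
```

Read this way, the 66% claim implies a baseline of roughly 60 INT8 TOPS for the part being compared against.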


So far, we have seen only a few details, which are speculation at best, such as the GPU cache info that is part of the Virtual CRAT (vCrat) size. The GPU cache correlates with the CU count. In the case of the AMD Arcturus GPU, the cache size has been increased, and so has the CU count, from 64 to 128. That is twice as many CUs as Vega 10 and would give us 8192 stream processors if AMD is using 64 stream processors per CU, as in its current GPU designs.
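To put the rumored CU count in context, the sketch below computes the stream processor total and the clock a 128 CU part would need to reach the 25 TFLOPs of FP32 speculated earlier in the article. It assumes 64 stream processors per CU (as the article does) and 2 FLOPs per stream processor per clock. Neither is confirmed for Arcturus, and the function names are illustrative.

```python
SPS_PER_CU = 64  # per the article; an assumption for Arcturus


def stream_processors(cu_count: int) -> int:
    """Total stream processors for a given compute unit count."""
    return cu_count * SPS_PER_CU


def clock_mhz_for_fp32(target_tflops: float, sps: int) -> float:
    """Clock in MHz needed to hit a peak FP32 target, assuming 2 FLOPs per SP per clock."""
    return target_tflops * 1e12 / (2 * sps) / 1e6


sps = stream_processors(128)
needed_mhz = clock_mhz_for_fp32(25.0, sps)

print(f"Stream processors: {sps}")
print(f"Clock needed for 25 TFLOPs FP32: {needed_mhz:.0f} MHz")
```

Under these assumptions, a 128 CU part would need a clock of a little over 1.5 GHz to reach 25 TFLOPs, so the speculated figure would not require unusually high clocks.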

While Arcturus is a Vega derivative, it's also a custom design solely for the HPC segment. This way, AMD can focus on parallel developments for the gaming/consumer segment and the HPC market which consists of AI/DNN and datacenter customers.

Just a few days ago, some interesting speculation based on the new configuration for the Big Red 200 supercomputer was posted by Dylan522p, who suggests that NVIDIA's next-generation Ampere GPU based HPC parts could potentially feature up to 18 TFLOPs of FP64 compute. That would be almost a 50% lead over the Instinct MI100, but AMD has proved that it can offer more FLOPs at a competitive price, so maybe that is what Arcturus would be targeting. There's no word on when Arcturus will land, but AMD has hinted at an Instinct product later this year.