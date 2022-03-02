AMD’s Next-Gen Data Center Behemoth, The Instinct MI300 MCM ‘GFX940’ GPU, Makes Possible First Appearance In Linux Patch
It looks like AMD's next-gen Instinct MI300 GPU accelerator has made a possible first appearance in the latest Linux patch.
AMD Instinct MI300 'GFX940' GPU, Next-Gen Data Center MCM Accelerator, Makes Possible First Appearance In Linux Patch
The latest Linux patch includes a new target for an unreleased AMD 'GFX940' GPU whose ISA is similar to that of the Aldebaran 'GFX90a' GPU. It is speculated that this chip could power AMD's next-generation Instinct MI300 GPU accelerator. It supports all the data-centric features such as MFMA (Matrix Fused Multiply-Add), full-rate FP64, and packed FP32 operations. Other features include XNACK which, as Coelacanth-Dream puts it, is specific to CPU+GPU memory space integration.
The source states that although the GPU ISA is similar, the GFX940 does have a few differences when compared to Aldebaran 'CDNA 2' GPUs which are listed below:
Previous rumors have indicated that the AMD Instinct MI300 will feature a 4-GCD design based on the brand new CDNA 3 architecture. Earlier rumors pointed to 128 compute units per die for the Instinct MI200, a figure that was later revised to 110 compute units per die. Across two dies, that makes 220 Compute Units, or 14,080 cores at 64 cores per Compute Unit. If we take the per-die number and multiply it by 4 (the number of GCDs on Instinct MI300), we end up with 440 Compute Units or an insane 28,160 cores.
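The core-count math above is simple enough to check in a few lines. The sketch below assumes 64 cores per Compute Unit, which is the ratio implied by the 220 CU / 14,080 core figure, and the function name `total_cores` is purely illustrative. The 4-GCD result is a rumor-based estimate, not a confirmed spec.

```python
# Assumption: 64 cores (stream processors) per Compute Unit,
# as implied by 14,080 cores / 220 CUs in the figures above.
CORES_PER_CU = 64


def total_cores(cus_per_die, dies, cores_per_cu=CORES_PER_CU):
    """Return (total compute units, total cores) for a multi-die GPU."""
    total_cus = cus_per_die * dies
    return total_cus, total_cus * cores_per_cu


# Two dies at 110 CUs each (the Aldebaran-based figure quoted above).
print("2 dies:", total_cores(110, 2))

# Rumored 4-GCD MI300, assuming the per-die CU count stays at 110.
print("4 dies:", total_cores(110, 4))
```

If CDNA 3 changes the per-die CU count or the cores per CU, only the two inputs need to change.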
MI300 😍 https://t.co/B3qlnQBbVG
— Kepler (@Kepler_L2) March 1, 2022
MI300 will feature 4 GCDs 🧐
— Kepler (@Kepler_L2) September 7, 2021
A recent AMD ROCm Developer Tools update spotted by Komachi did confirm a maximum of 4 MCM GPUs, but those are simply 'Aldebaran' SKUs. At least four CDNA 2 powered Instinct accelerators are expected, with their respective unique IDs listed below. Note that the number represents the device itself rather than the number of dies on each device:
- 0x7408
- 0x740C
- 0x740F
- 0x7410
Now, the 28,160-core estimate would only hold if AMD made no changes whatsoever when moving from CDNA 2 to CDNA 3, and that is not expected to be the case. CDNA 3 is expected to bring a revised architecture that won't be another Vega derivative like Arcturus or Aldebaran, which makes this rumor more believable.
The GPU architecture may use a layout similar to the WGP/SE arrangement on the new RDNA 3 chips, or an entirely new design tailored towards the HPC segment. But one thing is for sure: those quad-die MCM GPUs are definitely something that we can't wait to see in action!
AMD Radeon Instinct Accelerators 2020
| Accelerator Name | AMD Instinct MI300 | AMD Instinct MI250X | AMD Instinct MI250 | AMD Instinct MI210 | AMD Instinct MI100 | AMD Radeon Instinct MI60 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI6 |
|---|---|---|---|---|---|---|---|---|---|---|
| GPU Architecture | TBA (CDNA 3) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Arcturus (CDNA 1) | Vega 20 | Vega 20 | Vega 10 | Fiji XT | Polaris 10 |
| GPU Process Node | Advanced Process Node | 6nm | 6nm | 6nm | 7nm FinFET | 7nm FinFET | 7nm FinFET | 14nm FinFET | 28nm | 14nm FinFET |
| GPU Dies | 4 (MCM)? | 2 (MCM) | 2 (MCM) | 1 (MCM) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) |
| GPU Cores | 28,160? | 14,080 | 13,312 | 6656 | 7680 | 4096 | 3840 | 4096 | 4096 | 2304 |
| GPU Clock Speed | TBA | 1700 MHz | 1700 MHz | ~1700 MHz? | ~1500 MHz | 1800 MHz | 1725 MHz | 1500 MHz | 1000 MHz | 1237 MHz |
| FP16 Compute | TBA | 383 TFLOPs | 362 TFLOPs | ~176 TFLOPs | 185 TFLOPs | 29.5 TFLOPs | 26.5 TFLOPs | 24.6 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP32 Compute | TBA | 95.7 TFLOPs | 90.5 TFLOPs | ~44 TFLOPs | 23.1 TFLOPs | 14.7 TFLOPs | 13.3 TFLOPs | 12.3 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP64 Compute | TBA | 47.9 TFLOPs | 45.3 TFLOPs | ~22 TFLOPs | 11.5 TFLOPs | 7.4 TFLOPs | 6.6 TFLOPs | 768 GFLOPs | 512 GFLOPs | 384 GFLOPs |
| VRAM | TBA | 128 GB HBM2e | 128 GB HBM2e | 64 GB HBM2e | 32 GB HBM2 | 32 GB HBM2 | 16 GB HBM2 | 16 GB HBM2 | 4 GB HBM1 | 16 GB GDDR5 |
| Memory Clock | TBA | 3.2 Gbps | 3.2 Gbps | 3.2 Gbps? | 1200 MHz | 1000 MHz | 1000 MHz | 945 MHz | 500 MHz | 1750 MHz |
| Memory Bus | TBA | 8192-bit | 8192-bit | 4096-bit | 4096-bit | 4096-bit | 4096-bit | 2048-bit | 4096-bit | 256-bit |
| Memory Bandwidth | TBA | 3.2 TB/s | 3.2 TB/s | 1.6 TB/s | 1.23 TB/s | 1 TB/s | 1 TB/s | 484 GB/s | 512 GB/s | 224 GB/s |
| Form Factor | TBA | OAM | OAM | Dual Slot Card | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Half Length | Single Slot, Full Length |
| Cooling | TBA | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling |
| TDP | TBA | 560W | 500W? | 300W? | 300W | 300W | 300W | 300W | 175W | 150W |
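As a quick sanity check on the memory figures in the table, peak bandwidth follows from the bus width and the per-pin data rate. The sketch below is a back-of-the-envelope calculation, not an official AMD formula. The helper names are hypothetical, and it assumes the HBM clocks listed in MHz are double data rate (two transfers per clock).

```python
def bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


def hbm_pin_rate_gbps(clock_mhz):
    """Assumption: HBM transfers twice per clock, so 1200 MHz gives 2.4 Gbps per pin."""
    return clock_mhz * 2 / 1000


# (name, bus width in bits, per-pin data rate in Gbps), values taken from the table
parts = [
    ("MI250X", 8192, 3.2),
    ("MI210", 4096, 3.2),
    ("MI100", 4096, hbm_pin_rate_gbps(1200)),
    ("MI25", 2048, hbm_pin_rate_gbps(945)),
]

for name, bus_bits, rate in parts:
    print(f"{name}: {bandwidth_gb_s(bus_bits, rate):.1f} GB/s")
```

The results land close to the rounded bandwidth values listed for each accelerator, which suggests the memory clock and bus width rows are at least internally consistent.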