In addition to its MI350 series, AMD is also giving us a glimpse of what to expect from its next-gen Instinct MI400 series, which launches in 2026.
AMD Instinct MI400 Features 2x More AI Compute Than MI350 Series, 50% More Memory, Almost 2.5x Bandwidth Increase With HBM4, 10x Faster Vs MI350
With the new details shared for the Instinct MI400 accelerator, it looks like AMD is once again going big on the hardware side. The official metrics now list the MI400 as a 40 PFLOPs (FP4) and 20 PFLOPs (FP8) product, which doubles the compute capability of the MI350 series that launched today.

In addition to the compute uplift, AMD is also going to leverage HBM4 memory for its Instinct MI400 series. The new chip will offer a 50% memory capacity uplift, from 288 GB of HBM3e to 432 GB of HBM4. The HBM4 memory will deliver a massive 19.6 TB/s of bandwidth, more than double the 8 TB/s of the MI350 series. Each GPU will also feature 300 GB/s of scale-out bandwidth, so some big things are coming in the next generation of Instinct.
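
As a quick sanity check on those memory figures, here is a minimal Python sketch that computes the generational ratios from the numbers quoted above. The dictionary keys and the `uplift` helper are illustrative names chosen for this example, not anything from AMD.

```python
# Memory figures quoted in the article for the MI350 series and the MI400.
mi350 = {"hbm_capacity_gb": 288, "hbm_bandwidth_tbs": 8.0}
mi400 = {"hbm_capacity_gb": 432, "hbm_bandwidth_tbs": 19.6}


def uplift(new, old):
    """Return how many times larger `new` is than `old`."""
    return new / old


capacity_ratio = uplift(mi400["hbm_capacity_gb"], mi350["hbm_capacity_gb"])
bandwidth_ratio = uplift(mi400["hbm_bandwidth_tbs"], mi350["hbm_bandwidth_tbs"])

# 432 / 288 = 1.5, i.e. the 50% capacity uplift.
print(f"Capacity:  {capacity_ratio:.2f}x ({(capacity_ratio - 1) * 100:.0f}% more)")
# 19.6 / 8 = 2.45, i.e. the "almost 2.5x" bandwidth increase.
print(f"Bandwidth: {bandwidth_ratio:.2f}x")
```

The 2.45x result is where the "almost 2.5x" bandwidth claim in the headline comes from.
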
As per previous details, the Instinct MI400 accelerator will feature up to four XCDs (Accelerator Complex Dies) per AID (Active Interposer Die), up from two XCDs per AID on the MI300. There will be two AIDs on the MI400 accelerator and, this time, separate Multimedia and I/O dies as well.
Each AID will get a dedicated MID (Multimedia and I/O Die) tile, which should offer more efficient communication between the compute units and the I/O interfaces than in previous generations. Even on the MI350, AMD uses Infinity Fabric for inter-die communication.
So, it's a big change for the MI400 accelerators, which are aimed at large-scale AI training and inference tasks. They are going to be based on the CDNA-Next architecture, which will probably be rebranded to UDNA as part of the red team's strategy to unify the RDNA and CDNA architectures.

AMD Instinct AI Accelerators:
| Accelerator Name | AMD Instinct MI500 | AMD Instinct MI400 | AMD Instinct MI350X | AMD Instinct MI325X | AMD Instinct MI300X | AMD Instinct MI250X |
|---|---|---|---|---|---|---|
| GPU Architecture | CDNA 6 | CDNA 5 | CDNA 4 | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) |
| GPU Process Node | 2nm | 2nm+3nm | 3nm | 5nm+6nm | 5nm+6nm | 6nm |
| XCDs (Chiplets) | TBD | 8 (MCM) | 8 (MCM) | 8 (MCM) | 8 (MCM) | 2 (MCM) 1 (Per Die) |
| GPU Cores | TBD | TBD | 16,384 | 19,456 | 19,456 | 14,080 |
| GPU Clock Speed (Max) | TBD | TBD | 2400 MHz | 2100 MHz | 2100 MHz | 1700 MHz |
| INT8 Compute | TBD | TBD | 5200 TOPS | 2614 TOPS | 2614 TOPS | 383 TOPS |
| FP6/FP4 Matrix | TBD | 40 PFLOPs | 20 PFLOPs | N/A | N/A | N/A |
| FP8 Matrix | TBD | 20 PFLOPs | 5 PFLOPs | 2.6 PFLOPs | 2.6 PFLOPs | N/A |
| FP16 Matrix | TBD | 10 PFLOPs | 2.5 PFLOPs | 1.3 PFLOPs | 1.3 PFLOPs | 383 TFLOPs |
| FP32 Vector | TBD | TBD | 157.3 TFLOPs | 163.4 TFLOPs | 163.4 TFLOPs | 95.7 TFLOPs |
| FP64 Vector | TBD | TBD | 78.6 TFLOPs | 81.7 TFLOPs | 81.7 TFLOPs | 47.9 TFLOPs |
| VRAM | HBM4E | 432 GB HBM4 | 288 GB HBM3e | 256 GB HBM3e | 192 GB HBM3 | 128 GB HBM2e |
| Infinity Cache | TBD | TBD | 256 MB | 256 MB | 256 MB | N/A |
| Memory Clock | TBD | TBD | 8.0 Gbps | 5.9 Gbps | 5.2 Gbps | 3.2 Gbps |
| Memory Bus | TBD | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit |
| Memory Bandwidth | TBD | 19.6 TB/s | 8 TB/s | 6.0 TB/s | 5.3 TB/s | 3.2 TB/s |
| Form Factor | TBD | TBD | OAM | OAM | OAM | OAM |
| Cooling | TBD | Passive / Liquid | Passive / Liquid | Passive Cooling | Passive Cooling | Passive Cooling |
| TDP (Max) | TBD | TBD | 1400W (355X) | 1000W | 750W | 560W |
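
The memory rows in the table are related to each other: peak bandwidth is roughly the bus width multiplied by the per-pin data rate. The Python sketch below checks that relationship for the shipping parts. It assumes the "Memory Clock" row is the per-pin data rate in Gbps and uses decimal units (1 TB/s = 1000 GB/s). The `peak_bandwidth_tbs` function name is our own.

```python
def peak_bandwidth_tbs(bus_width_bits, pin_rate_gbps):
    """Peak bandwidth in TB/s: bits per transfer x Gbps per pin / 8 bits per byte / 1000."""
    return bus_width_bits * pin_rate_gbps / 8 / 1000


# (bus width in bits, per-pin rate in Gbps, bandwidth listed in the table in TB/s)
parts = {
    "MI350X": (8192, 8.0, 8.0),
    "MI325X": (8192, 5.9, 6.0),
    "MI300X": (8192, 5.2, 5.3),
    "MI250X": (8192, 3.2, 3.2),
}

for name, (bus, rate, listed) in parts.items():
    computed = peak_bandwidth_tbs(bus, rate)
    print(f"{name}: computed {computed:.2f} TB/s, listed {listed} TB/s")
```

The computed values land slightly above the listed ones (for example, about 8.19 TB/s against the listed 8 TB/s for the MI350X), which is consistent with the table quoting rounded figures.
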