AMD has officially launched its next-gen Instinct MI350 series, which includes the MI350X and the flagship MI355X, with up to 185 billion transistors.
AMD Instinct MI350 Powers Next-Gen AI With Brand New 3nm Process Node, 20 PFLOPs of AI Compute & New Format Support
Today, AMD has officially launched its Instinct MI350 series of HPC / AI GPUs, which come equipped with a brand new CDNA 4 architecture based on TSMC's 3nm process node.

The chip itself features 185 billion transistors and comes in two flavors, the MI350X and the faster MI355X, offered in both air and liquid-cooled configurations. The new chips support the latest FP6 and FP4 AI data types and are equipped with massive HBM3e memory capacities. For comparison, NVIDIA's B300 chips based on the 4nm process node from TSMC offer up to 208 billion transistors.
The MI350 series chips pack a total of 256 compute units with 64 stream processors each, for a total of 16,384 cores. That is a lower core count than the MI325 and MI300 series, which packed 304 compute units for a maximum of 19,456 cores. The compute units are arranged across eight accelerator compute dies (XCDs), with each XCD packing 32 compute units. The XCDs are based on TSMC's N3P node, while the dual IO dies are based on TSMC's N6 node. The IOD includes 128 HBM3E channels, the Infinity Cache, and 4th Gen Infinity Fabric links.
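The quoted core count follows directly from the chiplet layout above; a quick sanity check of the arithmetic (illustrative only):

```python
# Illustrative arithmetic: MI350 series core count from the chiplet layout.
xcds = 8           # accelerator compute dies per package
cus_per_xcd = 32   # compute units per XCD
sps_per_cu = 64    # stream processors per CDNA compute unit

cores = xcds * cus_per_xcd * sps_per_cu
print(cores)  # 16384
```

The same math holds for the prior generation: 304 compute units at 64 stream processors each gives the 19,456 cores of the MI300X/MI325X.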

Starting with the AI compute uplift, AMD claims that the Instinct MI350 series offers 20 PFLOPs of FP4/FP6 compute, a 4x gen-on-gen performance uplift. HBM3e delivers faster data transfer speeds along with a super-high capacity of 288 GB on both variants, and there's also 256 MB of new Infinity Cache on the chips.
The memory sits in 8 stacks, with each 12-Hi stack packing 36 GB of capacity. The chips also adopt UBB8, a new standard for rapid AI infrastructure deployment that allows faster rollout of both air- and liquid-cooled nodes.
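The stack configuration lines up with the headline capacity; a minimal check of the numbers as reported (illustrative arithmetic only):

```python
# Illustrative arithmetic: HBM3E capacity from the stack configuration.
stacks = 8          # HBM3E stacks per package
gb_per_stack = 36   # 12-Hi stacks, 36 GB each as quoted by AMD

capacity_gb = stacks * gb_per_stack
print(capacity_gb)  # 288
```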

Coming to the competitive metrics shared by AMD for its MI355X, the chip offers 8 TB/s of aggregate memory bandwidth, 79 TFLOPs of FP64, 5 PFLOPs of FP16, 10 PFLOPs of FP8, and 20 PFLOPs of FP6/FP4 compute. These numbers are for the flagship 1400W configuration of the Instinct MI355X chip. A thing to note is that both the MI350X and MI355X utilize the same die, but the 355X comes with a higher TDP rating.

The following are the numbers compared against the competition:
MI355X vs B200:
- Memory: 1.6x Higher
- Bandwidth: 1.0x (Parity)
- FP64: 2.1x Higher
- FP16: 1.1x Higher
- FP8: 1.1x Higher
- FP6: 2.2x Higher
- FP4: 1.1x Higher
MI355X vs GB200:
- Memory: 1.6x Higher
- Bandwidth: 1.0x (Parity)
- FP64: 2.0x Higher
- FP16: 1.0x (Parity)
- FP8: 1.0x (Parity)
- FP6: 2.0x Higher
- FP4: 1.0x (Parity)

But how does the Instinct MI355X compare to the last-gen MI300 series? AMD showed a massive 35x leap in inference performance using Llama 3.1 405B (throughput).

For the full MI350 series platform, the new Instinct ecosystem will offer up to 8x MI355 series GPUs with 2.3 TB of HBM3e memory, 64 TB/s of total bandwidth, 0.63 PFLOPs of FP64, 81 PFLOPs of FP8 & 161 PFLOPs of FP6/FP4 compute performance.

A full rack with liquid cooling will house 96 to 128 Instinct MI350 series GPUs with up to 36 TB of HBM3e memory, 2.6 Exaflops of FP4 compute, and 1.3 Exaflops of FP8 compute, and will utilize the company's Zen 5-based EPYC "Turin" CPUs alongside the Pollara 400 interconnect solution.
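The rack-level figures scale almost linearly from the per-GPU specs quoted earlier; a short sketch of that scaling, using the 128-GPU configuration and AMD's per-GPU numbers (AMD rounds the results to 2.6 EF and 1.3 EF):

```python
# Illustrative arithmetic: scaling per-GPU figures to the 128-GPU
# liquid-cooled rack, using the numbers AMD quotes per MI355X.
gpus = 128
hbm_per_gpu_gb = 288       # HBM3e capacity per GPU
fp4_per_gpu_pflops = 20    # peak FP4 compute per GPU
fp8_per_gpu_pflops = 10    # peak FP8 compute per GPU

print(gpus * hbm_per_gpu_gb / 1024)      # 36.0  -> ~36 TB of HBM3e
print(gpus * fp4_per_gpu_pflops / 1000)  # 2.56  -> ~2.6 EF of FP4
print(gpus * fp8_per_gpu_pflops / 1000)  # 1.28  -> ~1.3 EF of FP8
```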
With the official metrics out of the way, we can move on to the actual performance figures across a range of AI tests presented by AMD. Once again, starting with the MI355X vs MI300X comparisons, the new chips offer anywhere from a 2.8x to 4.2x increase in AI inference performance.

There's also another metric comparing the AMD Instinct MI355X across various popular AI workloads, such as DeepSeek R1, Llama 4, and Llama 3.1, where the new chips simply decimate the MI300X series.

The Instinct MI355X is also compared to the B200 and GB200 servers from the competition and shows a 1.2-1.3x uplift. In Llama 3.1 405B with FP4, the new Instinct AI chips offer the same performance as NVIDIA's much more expensive Blackwell GB200 server, which adds to AMD's perf/$ goals.
AMD also showed how the Instinct MI350 series GPUs can generate up to 40% more tokens per dollar compared to NVIDIA's B200 solution.

AMD also confirmed that while the Instinct MI350 series launches today with availability through various partners starting in Q3 2025, the next-generation MI400 series is already in the works and is planned for launch in 2026.

AMD Instinct AI Accelerators:
| Accelerator Name | AMD Instinct MI500 | AMD Instinct MI400 | AMD Instinct MI350X | AMD Instinct MI325X | AMD Instinct MI300X | AMD Instinct MI250X |
|---|---|---|---|---|---|---|
| GPU Architecture | CDNA Next / UDNA | CDNA 5 | CDNA 4 | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) |
| GPU Process Node | TBD | TBD | 3nm | 5nm+6nm | 5nm+6nm | 6nm |
| XCDs (Chiplets) | TBD | 8 (MCM) | 8 (MCM) | 8 (MCM) | 8 (MCM) | 2 (MCM) 1 (Per Die) |
| GPU Cores | TBD | TBD | 16,384 | 19,456 | 19,456 | 14,080 |
| GPU Clock Speed (Max) | TBD | TBD | 2400 MHz | 2100 MHz | 2100 MHz | 1700 MHz |
| INT8 Compute | TBD | TBD | 5200 TOPS | 2614 TOPS | 2614 TOPS | 383 TOPS |
| FP6/FP4 Matrix | TBD | 40 PFLOPs | 20 PFLOPs | N/A | N/A | N/A |
| FP8 Matrix | TBD | 20 PFLOPs | 5 PFLOPs | 2.6 PFLOPs | 2.6 PFLOPs | N/A |
| FP16 Matrix | TBD | 10 PFLOPs | 2.5 PFLOPs | 1.3 PFLOPs | 1.3 PFLOPs | 383 TFLOPs |
| FP32 Vector | TBD | TBD | 157.3 TFLOPs | 163.4 TFLOPs | 163.4 TFLOPs | 95.7 TFLOPs |
| FP64 Vector | TBD | TBD | 78.6 TFLOPs | 81.7 TFLOPs | 81.7 TFLOPs | 47.9 TFLOPs |
| VRAM | TBD | 432 GB HBM4 | 288 GB HBM3e | 256 GB HBM3e | 192 GB HBM3 | 128 GB HBM2e |
| Infinity Cache | TBD | TBD | 256 MB | 256 MB | 256 MB | N/A |
| Memory Clock | TBD | TBD | 8.0 Gbps | 5.9 Gbps | 5.2 Gbps | 3.2 Gbps |
| Memory Bus | TBD | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit |
| Memory Bandwidth | TBD | 19.6 TB/s | 8 TB/s | 6.0 TB/s | 5.3 TB/s | 3.2 TB/s |
| Form Factor | TBD | TBD | OAM | OAM | OAM | OAM |
| Cooling | TBD | TBD | Passive / Liquid | Passive Cooling | Passive Cooling | Passive Cooling |
| TDP (Max) | TBD | TBD | 1400W (355X) | 1000W | 750W | 560W |