ARM made a massive announcement at its ARM Everywhere keynote: according to a new blog post, the firm will sell its own 'AGI CPU' for the first time.
ARM's Pivot to Direct Silicon Sales Ends NVIDIA's Dominance Over High-End Server Core IP
With agentic AI workloads, the CPU has started to become the next bottleneck for hyperscalers, which is why x86 solutions from Intel and AMD and ARM-based chips from NVIDIA have been gaining massive adoption among customers. In light of this, ARM has decided to capitalize on the hype by introducing its first-ever chip, called the ARM AGI CPU, marking a shift from an IP provider to a supplier of production silicon. According to ARM's CEO, Rene Haas, this move is targeted at fulfilling enterprise demand driven by agentic AI workloads.
"Today marks the next phase of the Arm compute platform and a defining moment for our company. With the expansion into delivering production silicon with our Arm AGI CPU, we are giving partners more choices all built on Arm’s foundation of high-performance, power-efficient computing, to support agentic AI infrastructure at global scale."
Diving into the specifics of the AGI CPU, we are looking at up to 136 Arm Neoverse V3 cores per CPU, offering 6GB/s memory bandwidth per core. The processor has 2 MB of L2 cache per core and runs at up to 3.7 GHz. As far as I/O specifications are concerned, you are looking at 96x PCIe Gen 6 lanes, along with CXL 3.0 memory expansion, allowing the processor to support "massively parallel, high-performance agentic workloads". Here's a complete rundown of the AGI CPU, based on the details disclosed:
| Category | Specification |
| --- | --- |
| Core Architecture | Arm Neoverse V3 |
| Core Count | Up to 136 cores |
| Clock Frequency | Up to 3.7GHz |
| Cache | Dedicated 2MB L2 cache per core |
| Manufacturing Process | 3nm |
| Thermal Design Power (TDP) | 300 Watts |
| Packaging / Layout | Dual chiplet design (Memory and I/O on the same die) |
| Memory Supported | Up to DDR5-8800 |
| Memory Capacity | Up to 6TB per chip |
| Memory Performance | 6GB/s memory bandwidth per core; Sub-100ns memory latency |
| I/O & Connectivity | 96x lanes PCIe Gen6 |
| Expansion & Interconnects | CXL 3.0 (for memory expansion) and AMBA CHI extension links |
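The per-core figures in the table can be rolled up into per-chip totals with simple multiplication. The short Python sketch below does that; the function name `per_chip_totals` is hypothetical, and it assumes the quoted 6GB/s per core holds with all 136 cores active and that dividing the 300W TDP by the core count is only a rough indicator, since the TDP also covers memory controllers and I/O.

```python
# Figures quoted in the spec table; the derived totals are plain products.
CORES_PER_CPU = 136           # "up to 136 cores"
BANDWIDTH_PER_CORE_GBPS = 6   # GB/s of memory bandwidth per core
L2_PER_CORE_MB = 2            # dedicated L2 cache per core
TDP_WATTS = 300               # whole-chip thermal design power


def per_chip_totals(cores=CORES_PER_CPU):
    """Return derived per-chip figures for a given core count.

    Assumption: the per-core bandwidth scales linearly to every core.
    """
    return {
        "aggregate_bandwidth_gbps": cores * BANDWIDTH_PER_CORE_GBPS,
        "total_l2_mb": cores * L2_PER_CORE_MB,
        # Rough figure only: TDP also covers uncore, memory, and I/O.
        "tdp_watts_per_core": TDP_WATTS / cores,
    }


if __name__ == "__main__":
    for name, value in per_chip_totals().items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions, a fully populated chip works out to 816GB/s of aggregate memory bandwidth and 272MB of L2 cache, with roughly 2.2W of TDP per core.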
In terms of rack-scale deployment, ARM offers ultra-thin 1OU (Open Unit) blades, a shift away from multi-unit servers. A single chassis can host up to two nodes, providing a total of 272 cores per blade. The physical rack layout can house up to 30 of these two-node blades, delivering a total of 8,160 cores (30 × 272). You are also looking at unified memory pools connected via the CXL 3.0 fabric, and each rack is rated to run at 36kW with air cooling. Given how prevalent CPU-only racks have become, ARM has designed its solution to meet market demand.
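The rack figures can be cross-checked with a few multiplications. The sketch below is an illustration, not ARM's own sizing method: it assumes one CPU per node and 30 two-node chassis per rack, which is the reading that reproduces the quoted 8,160 cores. The function name `rack_totals` is hypothetical, and the CPU power sum uses the 300W TDP from the spec table.

```python
# Figures quoted in the article; the layout interpretation is an assumption.
CORES_PER_CPU = 136
NODES_PER_BLADE = 2     # "a single chassis can host up to two nodes"
BLADES_PER_RACK = 30    # assumption: the 30 units are two-node blades
RACK_POWER_KW = 36      # air-cooled rack rating
CPU_TDP_WATTS = 300     # per-chip TDP from the spec table


def rack_totals():
    """Derive rack-level core counts and power figures."""
    cpus = NODES_PER_BLADE * BLADES_PER_RACK   # assumes one CPU per node
    cores = cpus * CORES_PER_CPU
    return {
        "cores_per_blade": NODES_PER_BLADE * CORES_PER_CPU,
        "cpus_per_rack": cpus,
        "cores_per_rack": cores,
        # Sum of CPU TDPs only; memory, NICs, fans, and PSU losses are extra.
        "cpu_tdp_kw": cpus * CPU_TDP_WATTS / 1000,
        "rack_watts_per_core": RACK_POWER_KW * 1000 / cores,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value}")
```

On these assumptions, the 60 CPUs account for 18kW of the 36kW rack rating, leaving the remainder for memory, networking, cooling, and any accelerators, and the rack budget comes to about 4.4W per core.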
ARM says its AGI CPU delivers "two times higher" performance per rack compared to modern x86 solutions. ARM doesn't compare its solution to NVIDIA's Vera, but we expect the two to land close to each other, given their similar microarchitectures. The AGI CPU also allows vendors to mix and match their rack-scale configurations, since ARM has opened up support for any accelerator (Cerebras, Groq, Meta MTIA) that fits into standard OCP server designs. This essentially means that the exclusive benefits NVIDIA previously drew from ARM's IP have likely now ended.
It will be interesting to see how ARM's venture into the standalone CPU segment evolves going forward, since the firm has been discussing the prospects of its own chips for quite some time now. At the same time, NVIDIA, a core customer of ARM's IP technology, now has a formidable competitor to Vera.
