Alibaba has unveiled its latest AI chip, the Zhenwu M890, and its latest large language model (LLM), Qwen3.7-Max, both designed for Agentic AI workloads.
As Agentic AI Rages On, Alibaba Rolls Out Its Own AI Chip & AI LLM: Meet Zhenwu M890 GPU & Qwen3.7-Max Model
The Alibaba Zhenwu M890 is based on the company's in-house PPU (Parallel Processing Unit) architecture and features a Transformer core engine.
The chip is designed for Agentic AI workloads with a focus on AI inference, offering 0.6 PFLOPS of FP16 (half-precision) compute, which is comparable to the A100 from NVIDIA and three times faster than the Hopper H20 solution. The company also states that the M890 offers 3x the compute performance of its previous-generation chip.
In terms of specifications, the Zhenwu M890 is equipped with 144 GB of HBM3 memory, a 50% increase over the Zhenwu 810E, which packed 96 GB. The interconnect bandwidth is also boosted to 800 GB/s, up 100 GB/s from the 810E. In addition, the new chip supports the FP32, FP16, FP8, and FP4 formats for AI workloads. This puts the chip on par with the capabilities of NVIDIA's Rubin and Huawei's Ascend 950 series.
The company is offering a full ecosystem with the introduction of a new interconnect chip, called ICN Switch 1.0. This chip offers 25.6 Tb/s of interconnect bandwidth with a point-to-point (P2P) latency of less than 150 ns. The higher bandwidth enables support for massive agent concurrency. There's also the Yitian Arm-based host CPU and Panmai series networking cards, which will all come together within the Panjiu AL128 Supernode Server by Alibaba Cloud.
This new server will tightly integrate 128 AI accelerators within a single rack, delivering PB/s-scale bandwidth. T-Head, Alibaba's chip unit, reports that it has shipped approximately 560,000 Zhenwu AI chips to date, with more than 400 external customers spanning 20 industries.
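
For a rough sense of what a full rack adds up to, the per-chip figures quoted above can simply be multiplied out. The sketch below assumes the AL128 is populated with 128 Zhenwu M890 chips and that peak figures sum linearly across chips; Alibaba has not published rack-level totals, so the function and variable names here are illustrative only.

```python
# Back-of-envelope rack totals from the per-chip figures in the article.
# Assumptions (not confirmed by Alibaba): all 128 accelerators are M890s,
# and peak memory/compute simply add up across chips.

ACCELERATORS_PER_RACK = 128   # Panjiu AL128 Supernode Server
HBM_PER_CHIP_GB = 144         # Zhenwu M890 HBM3 capacity
FP16_PER_CHIP_PFLOPS = 0.6    # Zhenwu M890 peak FP16 compute


def rack_totals(chips: int, hbm_gb: float, fp16_pflops: float) -> dict:
    """Return naive aggregate memory and peak FP16 compute for one rack."""
    total_hbm_gb = chips * hbm_gb
    return {
        "hbm_gb": total_hbm_gb,
        "hbm_tb": total_hbm_gb / 1000,  # decimal terabytes
        "fp16_pflops": chips * fp16_pflops,
    }


totals = rack_totals(ACCELERATORS_PER_RACK, HBM_PER_CHIP_GB, FP16_PER_CHIP_PFLOPS)
print(f"Aggregate HBM: {totals['hbm_gb']} GB (~{totals['hbm_tb']:.1f} TB)")
print(f"Aggregate peak FP16: {totals['fp16_pflops']:.1f} PFLOPS")
```

These are theoretical peaks; sustained throughput in a real deployment would depend on the interconnect, the software stack, and the workload.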
| Release Time | Model | Core Highlights | Change vs. Previous Generation |
|---|---|---|---|
| 2024 Q2 | Zhenwu 810E | Easy-to-use all-in-one AI chip for training & inference; 96GB memory; 700GB/s interconnect bandwidth | Baseline (first generation) |
| 2026 Q2 | Zhenwu M890 | Fully upgraded self-developed parallel computing architecture; 3× performance; 144GB memory; 800GB/s interconnect bandwidth | Overall performance ~3× boost; memory 96GB → 144GB (+50%); bandwidth 700 → 800GB/s (+14%); architecture fully upgraded |
| 2027 Q3 | Zhenwu V900 | Deep iteration of self-developed parallel computing architecture; 3× performance; 216GB memory; 1200GB/s interconnect bandwidth | Performance boosted another 3×; memory 144GB → 216GB (+50%); bandwidth 800 → 1200GB/s (+50%); architecture deeply iterated |
| 2028 Q3 | Zhenwu J900 | Breakthrough innovation in self-developed parallel computing architecture; continuous performance leap | Architecture achieves breakthrough innovation; performance expected to continue major leaps, aiming at international top AI chip level |
Looking ahead, Alibaba Cloud is working on a series of Zhenwu chips following the M890.
In Q3 of next year, the company plans to introduce the V900, which will feature an updated architecture delivering a 3x performance boost, 216 GB of memory, and 1200 GB/s of interconnect bandwidth. The follow-up, the Zhenwu J900, will arrive in Q3 2028 with further architectural and performance updates.
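
The generation-over-generation percentages in the roadmap can be reproduced from the raw memory and bandwidth figures. The following is a minimal sketch using only the numbers quoted in the roadmap; the cumulative performance figure assumes the two "3x" claims refer to the same metric and compound, which Alibaba has not confirmed. The J900 is left out because no figures were given for it.

```python
# Roadmap figures as quoted: (model, memory in GB, interconnect bandwidth in GB/s).
ROADMAP = [
    ("Zhenwu 810E", 96, 700),
    ("Zhenwu M890", 144, 800),
    ("Zhenwu V900", 216, 1200),
]


def pct_change(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100


def generation_deltas(roadmap):
    """Compare each chip with its predecessor in the list."""
    rows = []
    for (_, prev_mem, prev_bw), (name, mem, bw) in zip(roadmap, roadmap[1:]):
        rows.append((name,
                     round(pct_change(prev_mem, mem), 1),
                     round(pct_change(prev_bw, bw), 1)))
    return rows


deltas = generation_deltas(ROADMAP)
for name, mem_pct, bw_pct in deltas:
    print(f"{name}: memory +{mem_pct}%, bandwidth +{bw_pct}%")

# Assumption: the 3x claims for the M890 and V900 multiply together.
v900_vs_810e = 3 * 3
print(f"V900 vs. 810E (if the 3x claims compound): {v900_vs_810e}x")
```

Memory grows by the same 50% in each step, while the bandwidth increase is much larger for the V900 than for the M890.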
Besides the chips, Alibaba Cloud is also launching its latest LLM, Qwen3.7-Max. The model is focused on advanced agentic coding, complex reasoning, and long-horizon task execution, and will be available to developers and enterprises soon.
The model delivers exceptional agent capabilities across diverse domains. As a frontier-level coding assistant, it supports coding tasks from rapid frontend prototyping to complex, multi-file software engineering. To enhance office work productivity, it reliably orchestrates multi-agent workflows to tackle sophisticated operations. Notably, Qwen3.7-Max can autonomously execute long-horizon agentic tasks, sustaining continuous operation for up to 35 hours and managing over 1,000 tool calls without performance degradation.
Deeply optimized for leading agent frameworks including OpenClaw, Hermes Agent, Claude Code, Qwen Paw and Qoder, it serves as a reliable backbone for different agent systems. The model achieves top-tier results across major benchmarks in coding, general-purpose agents, general capabilities and multilingualism, making it competitive with leading frontier models. It will be soon accessible through Alibaba's model service platform Model Studio for global developers.
Alibaba Cloud
| Benchmark | Opus-4.6 Max | K2.6 Thinking | GLM-5.1 Thinking | DS-V4-Pro Max | Qwen3.6-Plus | Qwen3.7-Max |
|---|---|---|---|---|---|---|
| **Coding Agent** | | | | | | |
| Terminal Bench 2.0-Terminus | 65.4 | 66.7 | 63.5 | 67.9 | 61.6 | 69.7 |
| SWE-Verified | 80.8 | 80.2 | -- | 80.6 | 78.8 | 80.4 |
| SWE-Pro | 57.3 | 59.5 | 58.8 | 59.0 | 56.6 | 60.6 |
| SWE-Multilingual | 77.5 | 76.7 | -- | 76.2 | 73.8 | 78.3 |
| NL2repo | 47.6 | 42.8 | 41.0 | 35.5 | 34.4 | 47.2 |
| SciCode | 51.9 | 52.2 | 45.1 | -- | 41.4 | 53.5 |
| QwenWebDev | 1617 | -- | 1564 | 1570 | 1500 | 1568 |
| QwenSVG | 1541 | 1325 | 1605 | 1506 | 1432 | 1608 |
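
To see how the coding-agent benchmark figures published by Alibaba Cloud break down, the scores can be tallied by which model posts the highest number on each test. This is a small sketch over the eight coding-agent rows only; it assumes higher is better on every benchmark and treats the "--" entries as unreported rather than as zero.

```python
from collections import Counter

MODELS = ["Opus-4.6 Max", "K2.6 Thinking", "GLM-5.1 Thinking",
          "DS-V4-Pro Max", "Qwen3.6-Plus", "Qwen3.7-Max"]

# Scores in the same column order as MODELS; None marks a "--" entry.
SCORES = {
    "Terminal Bench 2.0-Terminus": [65.4, 66.7, 63.5, 67.9, 61.6, 69.7],
    "SWE-Verified":                [80.8, 80.2, None, 80.6, 78.8, 80.4],
    "SWE-Pro":                     [57.3, 59.5, 58.8, 59.0, 56.6, 60.6],
    "SWE-Multilingual":            [77.5, 76.7, None, 76.2, 73.8, 78.3],
    "NL2repo":                     [47.6, 42.8, 41.0, 35.5, 34.4, 47.2],
    "SciCode":                     [51.9, 52.2, 45.1, None, 41.4, 53.5],
    "QwenWebDev":                  [1617, None, 1564, 1570, 1500, 1568],
    "QwenSVG":                     [1541, 1325, 1605, 1506, 1432, 1608],
}


def leaders(scores: dict, models: list) -> dict:
    """Map each benchmark to the model with the highest reported score."""
    result = {}
    for bench, row in scores.items():
        reported = [(score, model) for score, model in zip(row, models)
                    if score is not None]
        result[bench] = max(reported, key=lambda pair: pair[0])[1]
    return result


best = leaders(SCORES, MODELS)
tally = Counter(best.values())
for model, wins in tally.most_common():
    print(f"{model}: leads {wins} of {len(SCORES)} coding-agent benchmarks")
```

By this count Qwen3.7-Max leads five of the eight coding-agent benchmarks and Opus-4.6 Max leads the other three, although several of the margins are under one point.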
