Huawei’s Ascend 950PR AI Chip Just Won Over Chinese Customers By Mimicking CUDA Through CANN Next, Threatening NVIDIA’s Moat

Mar 27, 2026 at 07:32pm EDT

Huawei's newest AI chip, the Ascend 950PR, may not match NVIDIA's compute performance in the eyes of domestic hyperscalers, but it brings one major upgrade: closer compatibility with CUDA.

Huawei's Ascend 950PR Sees Massive Interest, Mainly With CUDA-Like Programming Brought In With CANN Next

The Chinese computing industry has been trying to challenge NVIDIA's market dominance, but its focus on upgrading architecture and onboard features has largely fallen short. Reports suggest that Chinese hyperscalers remain strongly inclined toward NVIDIA's hardware, and the compute gap isn't the only reason; CUDA also plays a significant role. Huawei has tried to 'crack' CUDA with its native CANN offering, but that hasn't worked out yet. With the Ascend 950PR, the goal is to serve as a direct replacement for NVIDIA hardware in training and inference workloads.


This time around, tech firms intend to use the new 950PR more extensively, much happier now that the chip is more compatible with Nvidia's CUDA software system and has better response speeds, said the two people and a third person with knowledge of those plans.

- Reuters

We'll dive into what the Ascend 950PR chip brings to the table in a bit, but let's first talk about CUDA compatibility, which is Huawei's major achievement with this launch. Huawei's CANN Next software stack has undergone a major upgrade, adding a SIMT programming model with features such as thread blocks, warps, and kernel launches, similar to CUDA. The idea with CANN Next isn't to provide developers with a translation layer; it's to offer near-drop-in equivalents of CUDA constructs, treating CUDA as a language standard while leveraging the strengths of the Ascend ecosystem.
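For readers unfamiliar with the SIMT model, here is a minimal sketch of how thread blocks, warps, and kernel launches fit together. It is a plain-Python emulation of CUDA-style indexing, not CANN Next code; the article doesn't describe the CANN Next API, so every name here (`launch`, `vector_add`, `warps_per_block`) is hypothetical, and the warp width of 32 is CUDA's value, assumed for illustration.

```python
WARP_SIZE = 32  # CUDA's warp width; the width CANN Next uses is not stated in the article


def launch(kernel, grid_dim, block_dim, *args):
    """Emulate a 1-D kernel launch: call `kernel` once per (block, thread) pair.

    Real hardware runs these logical threads in parallel; this loop runs them
    one after another, which is enough to show the indexing scheme.
    """
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


def vector_add(block_idx, thread_idx, block_dim, a, b, out):
    """Each logical thread handles one element, as in a typical vector-add kernel."""
    i = block_idx * block_dim + thread_idx  # global thread index
    if i < len(out):  # guard: the last block may have threads past the end of the data
        out[i] = a[i] + b[i]


def warps_per_block(block_dim, warp_size=WARP_SIZE):
    """Number of warps needed to cover one thread block (rounded up)."""
    return (block_dim + warp_size - 1) // warp_size


n = 1000
a = list(range(n))
b = [2 * x for x in a]
out = [0] * n

block_dim = 256                              # threads per block
grid_dim = (n + block_dim - 1) // block_dim  # blocks needed to cover n elements
launch(vector_add, grid_dim, block_dim, a, b, out)
```

The developer picks the block size and derives the grid size from the problem size; a stack that keeps this contract intact lets existing kernels carry over with few changes, even if the best block size differs from one chip to another.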

CANN Next is optimized for compute on Ascend at scale, meaning parameters such as thread counts and block sizes are tuned for Huawei's own chips, which allows the software and hardware to be co-designed and scaled together. In layman's terms, Huawei isn't trying to replace CUDA at all; rather, it wants developers to feel like they are writing CUDA, while the resulting performance is optimized for, and scales on, Ascend hardware. CANN Next is one of the reasons the Ascend 950PR is seen as a much more attractive solution than previous offerings.

As for the Ascend 950PR chip itself, it is reported that hyperscalers like ByteDance and Alibaba plan to place orders soon, and that Huawei is set to produce 750,000 chips this year. In terms of specifications, you are looking at support for low-precision data formats such as FP8 and FP4, with 1 PFLOPS of FP8 compute and 2 PFLOPS of FP4. The chip will be equipped with an interconnect bandwidth of 2 TB/s and will carry the firm's first "self-built HBM," called HiBL 1.0, featuring a capacity of 128GB and a bandwidth of 1.6 TB/s. The in-house HBM is meant to keep memory supply from constraining Huawei's production ramp.
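To put those figures in perspective, here is a rough back-of-envelope calculation of how much compute the chip offers per byte of memory traffic. It uses only the peak numbers quoted above and assumes decimal units (1 PFLOPS = 10^15 FLOP/s, 1 TB = 10^12 bytes); sustained real-world throughput will be lower, and the function name `balance_point` is simply our own label.

```python
# Peak figures as reported in the article; decimal units are an assumption.
FP8_FLOPS = 1e15        # 1 PFLOPS of FP8 compute
FP4_FLOPS = 2e15        # 2 PFLOPS of FP4 compute
HBM_BANDWIDTH = 1.6e12  # 1.6 TB/s of HBM bandwidth, in bytes per second
HBM_CAPACITY = 128e9    # 128GB of HBM, in bytes


def balance_point(peak_flops, bandwidth):
    """FLOPs a kernel must perform per byte read to keep the compute units busy."""
    return peak_flops / bandwidth


fp8_intensity = balance_point(FP8_FLOPS, HBM_BANDWIDTH)
fp4_intensity = balance_point(FP4_FLOPS, HBM_BANDWIDTH)

# Time to stream the entire HBM once at peak bandwidth, in milliseconds.
full_sweep_ms = HBM_CAPACITY / HBM_BANDWIDTH * 1000

print(f"FP8 balance point: {fp8_intensity:.0f} FLOPs per byte")
print(f"FP4 balance point: {fp4_intensity:.0f} FLOPs per byte")
print(f"One full pass over HBM: {full_sweep_ms:.0f} ms")
```

Under these assumptions, the ratios work out to 625 FLOPs per byte for FP8 and 1,250 for FP4, and a single pass over the full 128GB takes about 80 ms. Workloads that do less math per byte than that will be limited by memory bandwidth rather than by peak compute.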

China has been in need of alternatives to NVIDIA's compute offerings, particularly for hyperscalers. Dealing with the regulatory overhead of sourcing chips like the H200 has been a 'pain', which is why hyperscalers have resorted to renting compute offshore or turning to domestic options. With CANN Next and the Ascend 950PR, Huawei is looking to step up its influence within the Chinese AI industry; the only constraints holding it back are chip volume and whether customers are ready for mass deployment.

About the author: Muhammad Zuhair is a hardware and technology reporter for Wccftech, specializing in the semiconductor industry and the complex interplay between technology, manufacturing, and geopolitics. His coverage focuses on the corporate strategies and technological roadmaps of industry giants like TSMC, NVIDIA, Samsung, and Intel. Zuhair's expertise lies in deconstructing complex topics such as fabrication nodes (e.g., 2nm process), the economic impact of policies like the CHIPS Act, and the strategic development of AI infrastructure from NVIDIA, AMD and Intel.
