M3 Ultra Runs DeepSeek R1 With 671 Billion Parameters Using 448GB Of Unified Memory, Delivering High Bandwidth Performance At Under 200W Power Consumption, With No Need For A Multi-GPU Setup

Ali Salman
Apple's M3 Ultra chip performance on the DeepSeek R1 model with 671 billion parameters on Mac Studio

Apple launched the new Mac Studio earlier this week, housing its most powerful chip, the M3 Ultra, which surpassed the company's own previous performance benchmarks. The chip comes with up to a 32-core CPU and up to an 80-core GPU, which deliver better computational and graphics performance than the M2 Ultra. The M3 Ultra has also proven far more capable than prior iterations of Apple silicon at running DeepSeek R1, a massive 671-billion-parameter model.

Apple's new M3 Ultra chip inside the Mac Studio performed surprisingly well when handling the DeepSeek R1 model with 671 billion parameters

The DeepSeek R1 model with 671 billion parameters weighs in at a hefty 404GB and demands high-bandwidth memory of the kind usually found only as GPU VRAM. Thanks to Apple's unified memory architecture, the M3 Ultra chip has a distinct advantage here, delivering impressive results with minimal power usage. The details were shared by the YouTube channel Dave2D, which compared how the M3 Ultra handles the DeepSeek R1 model against prior Apple chips.
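To put the 404GB figure in context, the arithmetic below is a rough sketch relating parameter count to memory footprint. The article describes the model as a 4-bit quantized version; the 16-bit baseline, the helper names, and the reading of "GB" as 10^9 bytes are assumptions made for illustration, not details from the source.

```python
# Back-of-the-envelope sizing using the figures quoted in the article.
# Assumption: "GB" means 10**9 bytes.
PARAMS = 671e9          # parameter count quoted in the article
MODEL_SIZE_GB = 404     # model size quoted in the article


def footprint_gb(params: float, bits_per_param: float) -> float:
    """Weight storage in GB for a given number of bits per parameter."""
    return params * bits_per_param / 8 / 1e9


def effective_bits(params: float, size_gb: float) -> float:
    """Average bits per parameter implied by a file of size_gb."""
    return size_gb * 1e9 * 8 / params


fp16_gb = footprint_gb(PARAMS, 16)   # hypothetical 16-bit baseline
q4_gb = footprint_gb(PARAMS, 4)      # exactly 4 bits per parameter
implied_bits = effective_bits(PARAMS, MODEL_SIZE_GB)

print(f"16-bit weights:  {fp16_gb:.1f} GB")
print(f"pure 4-bit:      {q4_gb:.1f} GB")
print(f"implied by 404GB: {implied_bits:.2f} bits per parameter")
```

The quoted size implies slightly more than 4 bits per parameter. One plausible reading, which the source does not confirm, is that the quantization format stores some tensors at higher precision and carries scaling metadata alongside the weights.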


Given the sheer size of the R1 model, running it efficiently normally requires a powerful GPU setup with a large amount of VRAM. A conventional PC would need multiple GPUs, driving power consumption to extreme levels, but the M3 Ultra chip managed to run the model far more efficiently. Its unified memory architecture provides a shared pool of high-bandwidth memory that the GPU can access directly, so AI models can use it much as they would use VRAM.


Note that smaller AI models run consistently and smoothly without using the machine's full resources, but the DeepSeek R1 model with 671 billion parameters requires Apple's highest configuration of the M3 Ultra chip, with 512GB of unified memory. In addition, macOS limits by default how much of that memory can be allocated as VRAM, so Dave Lee had to raise the limit manually through the Terminal, bumping it up to 448GB.
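The memory budget described above can be sketched as simple arithmetic. The article does not say which Terminal command was used; the `iogpu.wired_limit_mb` key in the snippet is an assumption based on a setting commonly reported for recent macOS versions on Apple silicon, and the conversion of 1GB to 1024MB is likewise an assumption. The script only prints a command string and does not change any system setting.

```python
# Memory budget using the figures quoted in the article.
TOTAL_GB = 512   # unified memory in the top Mac Studio configuration
LIMIT_GB = 448   # raised VRAM allocation limit
MODEL_GB = 404   # size of the quantized DeepSeek R1 model

# Assumption: the limit is expressed in MB with 1 GB = 1024 MB.
limit_mb = LIMIT_GB * 1024

left_for_system_gb = TOTAL_GB - LIMIT_GB   # memory outside the GPU limit
headroom_gb = LIMIT_GB - MODEL_GB          # room left beyond the weights

# Assumption: the sysctl key name is NOT from the article. Verify it for
# your macOS version before running anything like this.
command = f"sudo sysctl iogpu.wired_limit_mb={limit_mb}"

print(f"Reserved for the rest of the system: {left_for_system_gb} GB")
print(f"Headroom beyond the model weights:   {headroom_gb} GB")
print(f"Illustrative command (not executed): {command}")
```

Under these figures, 64GB stays outside the raised limit, and roughly 44GB inside it remains for whatever the runtime needs beyond the weights themselves.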

The DeepSeek R1 model ran successfully and smoothly on the M3 Ultra Mac Studio. Although it is a 4-bit quantized version that sacrifices some precision, the model retained all 671 billion parameters and performed surprisingly well. Competing systems can match this performance with multiple GPUs, but the M3 Ultra chip has the upper hand in power consumption: the entire system drew less than 200W while running the hefty DeepSeek R1 model, a fraction of what a PC with comparable performance would need. Dave mentions that a traditional multi-GPU configuration would have required 10 times more power than the M3 Ultra chip.
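As a rough illustration of what that gap means in energy terms, the sketch below applies the two quoted figures (under 200W, and a 10x multiplier for a multi-GPU setup) to a hypothetical eight-hour session. The session length and the helper name are assumptions, and 200W is treated as an upper bound rather than a measured average.

```python
# Energy comparison using the power figures quoted in the article.
M3_ULTRA_WATTS = 200      # upper bound on whole-system draw
MULTI_GPU_FACTOR = 10     # multiplier attributed to Dave Lee
HOURS = 8                 # hypothetical session length (assumption)


def energy_kwh(watts: float, hours: float) -> float:
    """Energy in kilowatt-hours for a constant power draw."""
    return watts * hours / 1000


multi_gpu_watts = M3_ULTRA_WATTS * MULTI_GPU_FACTOR
mac_kwh = energy_kwh(M3_ULTRA_WATTS, HOURS)
multi_gpu_kwh = energy_kwh(multi_gpu_watts, HOURS)

print(f"Mac Studio: at most {mac_kwh:.1f} kWh over {HOURS} hours")
print(f"Multi-GPU:  about {multi_gpu_kwh:.1f} kWh over {HOURS} hours")
```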


Surprisingly, the R1 model with 671 billion parameters performed better than the smaller 70-billion-parameter version, which could be due to architectural efficiencies such as R1's mixture-of-experts design, which activates only a subset of its parameters for each token. All in all, Apple's new M3 Ultra chip can run models that punch well above its weight. We will share more details on the chip's performance and efficiency, so be sure to keep an eye out.


About the author: Ali Salman is a technology reporter for Wccftech's mobile section with a specialized focus on Apple and the intellectual property that drives mobile innovation. He has cultivated a unique expertise in analyzing and deconstructing complex technology patents, translating dense legal and technical documents into clear, insightful reports on future products.
