M3 Ultra Runs DeepSeek R1 With 671 Billion Parameters Using 448GB Of Unified Memory, Delivering High Bandwidth Performance At Under 200W Power Consumption, With No Need For A Multi-GPU Setup

Ali Salman • Mar 12, 2025 at 06:40am EDT

Apple's M3 Ultra chip performance on the DeepSeek R1 model with 671 billion parameters on Mac Studio

Apple launched the new Mac Studio earlier this week, housing its most powerful chip yet, the M3 Ultra, which surpassed the company's own earlier performance benchmarks. The chip comes with up to a 32-core CPU and up to an 80-core GPU, delivering higher computational and graphics performance than the M2 Ultra chip. The M3 Ultra chip has also proven quite capable of running DeepSeek R1, a massive 671-billion-parameter model, compared to prior iterations of Apple silicon.

Apple's new M3 Ultra chip inside the Mac Studio performed surprisingly well when handling the DeepSeek R1 model with 671 billion parameters

The DeepSeek R1 model with 671 billion parameters weighs in at a hefty 404GB and demands high-bandwidth memory, which is typically the preserve of GPU VRAM. Thanks to Apple's unified memory architecture, the M3 Ultra chip offers a unique advantage in the segment, showcasing impressive results with minimal power usage. The details were shared by the YouTube channel Dave2D, which compared the chip's performance on the DeepSeek R1 model against prior Apple chips.

Given the sheer size of the R1 model, powerful GPU setups with a significant amount of VRAM are normally required to run it efficiently. A conventional PC setup would require multiple GPUs, driving power consumption to extreme levels, but the M3 Ultra chip managed to run the model far more efficiently. The unified memory architecture of the M3 Ultra chip provides a shared pool of high-bandwidth memory, which AI models can use much like VRAM.

Note that smaller AI models run consistently and smoothly without needing the chip's full resources, but the DeepSeek R1 model with 671 billion parameters requires Apple's highest configuration of the M3 Ultra chip, with a whopping 512GB of unified memory. However, macOS limits by default how much of that memory can be allocated as VRAM, so Dave Lee had to raise the limit manually through the Terminal, bumping it up to 448GB.
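To make the memory figures above concrete, here is a minimal sketch of the budget they imply. The 512GB, 448GB, and 404GB values come from the article; the function name `memory_budget` and the GB-to-MiB conversion are illustrative assumptions. The article does not name the Terminal command used, so the `iogpu.wired_limit_mb` sysctl key printed at the end is only the setting commonly reported for recent macOS releases, not a confirmed detail of Dave Lee's setup.

```python
# Figures from the article, in GB. Everything else is illustrative.
TOTAL_UNIFIED_MEMORY_GB = 512  # top Mac Studio M3 Ultra configuration
GPU_LIMIT_GB = 448             # VRAM limit after the manual Terminal change
MODEL_SIZE_GB = 404            # size of the 671B DeepSeek R1 model


def memory_budget(total_gb: int, gpu_limit_gb: int, model_gb: int) -> dict:
    """Split unified memory into GPU headroom and the share left for the system."""
    if gpu_limit_gb > total_gb:
        raise ValueError("GPU limit exceeds installed memory")
    if model_gb > gpu_limit_gb:
        raise ValueError("model does not fit under the GPU memory limit")
    return {
        # Left on the GPU side for context, KV cache, and activations.
        "gpu_headroom_gb": gpu_limit_gb - model_gb,
        # Left for macOS and other applications.
        "system_reserved_gb": total_gb - gpu_limit_gb,
        # Fraction of unified memory the GPU may claim.
        "gpu_share": gpu_limit_gb / total_gb,
    }


budget = memory_budget(TOTAL_UNIFIED_MEMORY_GB, GPU_LIMIT_GB, MODEL_SIZE_GB)
print(budget)

# Assumption: the limit is given in MiB, treating 1 GB as 1024 MiB.
limit_mb = GPU_LIMIT_GB * 1024
# Assumption: this sysctl key is the commonly reported one, not taken from the article.
print(f"sudo sysctl iogpu.wired_limit_mb={limit_mb}")
```

Under these assumptions the model leaves roughly 44GB of GPU-addressable headroom and 64GB for the rest of the system, which illustrates why the 512GB configuration is needed.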

The DeepSeek R1 model ran successfully and smoothly on the M3 Ultra Mac Studio, and even though it is a 4-bit quantized version that sacrifices some precision, the model still retained its 671 billion parameters and performed surprisingly well. While the competition can achieve the same performance with multiple GPUs, the M3 Ultra chip has the upper hand when it comes to power consumption. The entire system drew less than 200W while running the hefty DeepSeek R1 model, a fraction of what PCs with comparable performance would need. Dave mentions that a traditional multi-GPU configuration would have required 10 times more power than the M3 Ultra chip.
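As a rough illustration of why quantization matters here, the sketch below estimates the weight footprint of a 671-billion-parameter model at a few bit widths and derives the average bits per weight implied by the 404GB file. It assumes decimal gigabytes (1 GB = 10^9 bytes) and counts weights only; the helper names are hypothetical.

```python
PARAMS = 671e9       # parameter count from the article
MODEL_SIZE_GB = 404  # reported size of the 4-bit quantized model

BYTES_PER_GB = 1e9   # assumption: decimal gigabytes


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Size of the weights alone at a uniform bit width, in GB."""
    return params * bits_per_weight / 8 / BYTES_PER_GB


def effective_bits(params: float, size_gb: float) -> float:
    """Average bits per weight implied by a given file size."""
    return size_gb * BYTES_PER_GB * 8 / params


for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weights_gb(PARAMS, bits):.1f} GB")

avg_bits = effective_bits(PARAMS, MODEL_SIZE_GB)
print(f"404 GB implies about {avg_bits:.2f} bits per weight")
```

A uniform 4-bit encoding would come to about 335.5GB, so the reported 404GB works out to nearly 5 bits per weight on average. One plausible reading is that some tensors are stored at higher precision and the file carries scaling metadata, though the article does not say. At 16 bits the weights alone would need about 1,342GB, well beyond the 512GB configuration.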

Surprisingly, the R1 model with 671 billion parameters performed better than the smaller 70-billion-parameter version, which could be due to architectural efficiencies in the larger model. All in all, Apple's new M3 Ultra chip punches well above its weight when running large models. We will share more details on the chip's performance and efficiency, so be sure to keep an eye out.

About the author: Ali Salman is a technology reporter for Wccftech's mobile section with a specialized focus on Apple and the intellectual property that drives mobile innovation. He has cultivated a unique expertise in analyzing and deconstructing complex technology patents, translating dense legal and technical documents into clear, insightful reports on future products.
