Apple launched the new Mac Studio earlier this week with the M3 Ultra, its most powerful chip, which beat the company's own previous performance benchmarks. The chip comes with up to a 32-core CPU and up to an 80-core GPU, delivering higher compute and graphics performance than the M2 Ultra. The M3 Ultra has also proven quite capable of running DeepSeek R1, a massive 671-billion-parameter model, and handles it better than prior iterations of Apple silicon.
Apple's new M3 Ultra chip inside the Mac Studio performed surprisingly well when handling the DeepSeek R1 model with 671 billion parameters
The DeepSeek R1 model with 671 billion parameters weighs in at a hefty 404GB and demands high-bandwidth memory, which is usually available only as GPU VRAM. Thanks to Apple's unified memory architecture, the M3 Ultra has a distinct advantage here, delivering impressive results with minimal power usage. The details come from the YouTube channel Dave2D, which compared the M3 Ultra's performance on DeepSeek R1 against prior Apple chips.
Given the sheer size of the R1 model, running it efficiently requires a powerful GPU setup with a large amount of VRAM. A conventional PC would need multiple GPUs, driving power consumption to extreme levels, but the M3 Ultra managed to run the model far more efficiently. Its unified memory architecture provides a shared pool of high-bandwidth memory that the GPU can address directly, so AI models can use it much as they would use VRAM.
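To put the 404GB figure in context, here is a rough back-of-the-envelope sketch of how weight storage scales with parameter count and bits per weight. It assumes decimal gigabytes and the 4-bit quantization mentioned later in this article; the function names are illustrative only. The gap between the raw 4-bit estimate and the reported size is likely down to some tensors being stored at higher precision plus quantization metadata, though the video does not break this down.

```python
# Rough weight-storage estimate for a quantized model.
# Figures come from the article; decimal GB (1e9 bytes) is an assumption.
PARAMS = 671e9            # DeepSeek R1 parameter count
REPORTED_SIZE_GB = 404    # model size quoted in the article


def weight_size_gb(params: float, bits_per_weight: float) -> float:
    """Raw weight storage in decimal gigabytes, ignoring any metadata."""
    return params * bits_per_weight / 8 / 1e9


def effective_bits(params: float, size_gb: float) -> float:
    """Average bits per weight implied by a given file size."""
    return size_gb * 1e9 * 8 / params


raw_4bit = weight_size_gb(PARAMS, 4)     # pure 4-bit weights
raw_16bit = weight_size_gb(PARAMS, 16)   # unquantized 16-bit weights
eff = effective_bits(PARAMS, REPORTED_SIZE_GB)

print(f"4-bit estimate:  {raw_4bit:.1f} GB")
print(f"16-bit estimate: {raw_16bit:.1f} GB")
print(f"Implied bits per weight at {REPORTED_SIZE_GB} GB: {eff:.2f}")
```

By this estimate, a 16-bit copy of the same model would not fit in 512GB of memory at all, which is why the quantized version is the one being tested.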

Note that smaller AI models run smoothly without using the machine's full resources, but the DeepSeek R1 model with 671 billion parameters requires Apple's highest M3 Ultra configuration, with a whopping 512GB of unified memory. Even then, macOS caps how much of that memory the GPU can use by default, so Dave Lee, the channel's host, had to raise the limit manually through the Terminal, bumping it up to 448GB.
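The memory budget described above can be sketched as simple arithmetic. The three sizes are the article's figures, treated here as the same unit for simplicity. The exact Terminal command is not given in the article; the `iogpu.wired_limit_mb` sysctl key shown below is an assumption based on how this limit is commonly adjusted on recent macOS versions, and the command is only printed, never executed.

```python
# Memory budget check using the article's figures (all treated as GB).
TOTAL_UNIFIED_GB = 512   # top M3 Ultra configuration
GPU_LIMIT_GB = 448       # raised GPU allocation limit from the video
MODEL_GB = 404           # reported DeepSeek R1 model size


def gb_to_mb(gb: int) -> int:
    """Convert GB to MB using binary multiples (1 GB = 1024 MB)."""
    return gb * 1024


fits = MODEL_GB <= GPU_LIMIT_GB
headroom_gb = GPU_LIMIT_GB - MODEL_GB               # left for context/cache
left_for_system_gb = TOTAL_UNIFIED_GB - GPU_LIMIT_GB  # left for macOS and apps
limit_mb = gb_to_mb(GPU_LIMIT_GB)

# ASSUMPTION: sysctl key name; not stated in the article. Printed only.
command = f"sudo sysctl iogpu.wired_limit_mb={limit_mb}"

print(f"Model fits under GPU limit: {fits}")
print(f"Headroom under limit: {headroom_gb} GB")
print(f"Left for the rest of the system: {left_for_system_gb} GB")
print(f"Hypothetical command: {command}")
```

The takeaway is that the raised limit leaves only a few tens of gigabytes of slack on either side, so the 512GB configuration is close to the minimum for this model.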
The DeepSeek R1 model ran smoothly on the M3 Ultra Mac Studio. Although it was a 4-bit quantized version, which sacrifices some precision, the model retained all 671 billion parameters and performed surprisingly well. The competition can achieve the same performance with multiple GPUs, but the M3 Ultra has the upper hand when it comes to power consumption: the entire system drew less than 200W while running the hefty model, a fraction of what PCs with comparable performance would use. Dave mentions that a traditional multi-GPU configuration would have required 10 times more power than the M3 Ultra.
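As a rough illustration of what that power gap means in practice, the sketch below turns the two figures from the video into energy use over a working session. The 200W value is the upper bound quoted above, the multi-GPU figure simply applies the 10x factor, and the 8-hour session length is a hypothetical value chosen for the example.

```python
# Energy comparison from the article's power figures.
MAC_STUDIO_WATTS = 200   # upper bound for the whole system, per the video
MULTI_GPU_FACTOR = 10    # "10 times more power", per the video
SESSION_HOURS = 8        # ASSUMPTION: hypothetical session length


def energy_kwh(watts: float, hours: float) -> float:
    """Energy in kilowatt-hours for a constant power draw."""
    return watts * hours / 1000


multi_gpu_watts = MAC_STUDIO_WATTS * MULTI_GPU_FACTOR
mac_kwh = energy_kwh(MAC_STUDIO_WATTS, SESSION_HOURS)
gpu_kwh = energy_kwh(multi_gpu_watts, SESSION_HOURS)

print(f"Multi-GPU draw (estimated): {multi_gpu_watts} W")
print(f"Mac Studio over {SESSION_HOURS} h: {mac_kwh:.1f} kWh")
print(f"Multi-GPU over {SESSION_HOURS} h:  {gpu_kwh:.1f} kWh")
```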

Surprisingly, the R1 model with 671 billion parameters performed better than the smaller 70-billion-parameter version, which could be due to architectural efficiencies. All in all, Apple's new M3 Ultra chip can punch well above its weight when running large models. We will share more details on the chip's performance and efficiency, so be sure to keep an eye out.