Testing a Dual Mac Studio Setup: DeepSeek-R1 Running at Full Size Achieves 11 tokens/s

EXO Labs, a developer of software for running AI models across multiple machines, has reported test results from two Mac Studios, each equipped with an M3 Ultra chip and 512GB of RAM. Together, the pair can run the full DeepSeek-R1 model at 11 tokens/s.

The M3 Ultra is well suited to running AI at home thanks to its large unified memory and high memory bandwidth, and this generation also adds Thunderbolt 5 with bandwidth increased to 120Gb/s. Apple itself prominently advertises the machine's LLM performance.

At a typical average of around four characters per token, 11 tokens/s works out to roughly 40-50 characters per second, which should be sufficient for general chatting. However, for reasoning models like R1 that think before responding, the wait feels longer because the thinking tokens are generated before any answer appears.
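The characters-per-second figure follows directly from typical tokenizer ratios. A quick back-of-the-envelope check; the ~3.5-4.5 characters-per-token range is an assumption for English text, not a figure from the article:

```python
# Rough conversion from token throughput to "typing speed".
# Assumption: an English-language token averages ~3.5-4.5 characters,
# depending on the tokenizer and the text.
tokens_per_second = 11
chars_per_token_low, chars_per_token_high = 3.5, 4.5

low = tokens_per_second * chars_per_token_low    # lower estimate, chars/s
high = tokens_per_second * chars_per_token_high  # upper estimate, chars/s
print(f"~{low:.0f}-{high:.0f} characters per second")
```

With these assumptions the range comes out to about 39-50 characters per second, matching the article's estimate.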

Alex Cheema of EXO Labs suggests the speed could theoretically reach 20 tokens/s, after which they may pursue other optimizations, such as expert parallelism, which could push it to 40 tokens/s. Additionally, quantizing the model to Q6_K would trim it to about 500GB, small enough to run on a single Mac Studio (though nearly filling its RAM). Cheema says they will continue testing.
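The ~20 tokens/s theoretical ceiling is consistent with a memory-bandwidth-bound estimate of decode speed. A sketch of that arithmetic, assuming the M3 Ultra's 819 GB/s memory bandwidth, DeepSeek-R1's ~37B active parameters per token (it is a mixture-of-experts model), and 8-bit weights; none of these figures appear in the article:

```python
# Token generation on large LLMs is usually memory-bandwidth bound:
# each token requires reading every active weight once, so
# tokens/s <= bandwidth / bytes_read_per_token.
# Assumed figures (not stated in the article):
bandwidth_bytes_s = 819e9   # M3 Ultra unified memory bandwidth, ~819 GB/s
active_params = 37e9        # DeepSeek-R1 active MoE parameters per token
bytes_per_param = 1         # 8-bit weights

bytes_per_token = active_params * bytes_per_param
theoretical_tps = bandwidth_bytes_s / bytes_per_token
print(f"~{theoretical_tps:.0f} tokens/s upper bound")
```

Under these assumptions the bound is roughly 22 tokens/s per machine, in line with the 20 tokens/s target; real-world overhead (attention cache reads, inter-machine communication) explains the gap down to 11.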

Source – @alexocheema

TLDR: EXO Labs tested two Mac Studios with the M3 Ultra chip, achieving 11 tokens/s running the full DeepSeek-R1 model, with potential for further optimization.
