Unveiling DeepSeek – A Chinese AI Company Pushing Boundaries Beyond High-Level Chip Export Bans

The current hot topic in the tech world is DeepSeek, a Chinese artificial intelligence company that has developed the R1 model. R1 reasons step by step and has outperformed OpenAI’s o1 in multiple tests. Its key strengths are a much lower training cost and its open-source release, which could reshape AI development not only in China but also globally.

Founded in July 2023 by Liang Wenfeng, a graduate of Zhejiang University, DeepSeek is based in Hangzhou and is funded by the High-Flyer fund. Liang established High-Flyer in 2015 with the goal of developing artificial general intelligence (AGI) that rivals human capabilities. The fund invested substantially in NVIDIA A100 GPUs before the US banned exports of these chips to China. This gave DeepSeek considerable processing power, though still less than American tech companies have.

Liang said in mid-2024 that constraints on high-end chips have significantly raised the engineering cost of AI development. Achieving similar results takes two to four times as much processing, prompting companies to look for ways to optimize model training. These include reducing data redundancy and minimizing supervised fine-tuning in favor of reinforcement learning techniques. The same hardware limitations help explain why Chinese tech companies often opt for open-source AI models, which encourage data sharing and collaborative development.

DeepSeek says that training the R1 model cost $5.6 million, far below the training costs of AI companies in the US, which can reach $100 million or even billions of dollars. This cost gap could have far-reaching implications for hardware manufacturers and for AI companies that have already invested or are planning future investments.

Nevertheless, The Wall Street Journal reported that Liang recently met with Chinese Premier Li Qiang and told him that the ban on chip exports to China continues to hinder AI development. Jim Fan, a researcher at NVIDIA, considers DeepSeek’s approach to model development distinctive. It uses strategies such as learning from scratch, as AlphaZero did, reducing reward levels, and training for precise outcomes to streamline processing. He believes that a 10-fold reduction in training costs could make existing resources roughly 10 times more efficient, accelerating the spread of AI capabilities to everyone.

TLDR: DeepSeek, a Chinese AI company, has developed the R1 model, which has outperformed OpenAI’s o1 in multiple tests at a much lower training cost than US counterparts. Liang Wenfeng’s High-Flyer fund invested heavily in NVIDIA GPUs, but export bans still create challenges for AI development in China. Innovative training strategies and lower costs may change how AI is developed globally.
