Today, DeepSeek is one of the only major AI companies in China that does not rely on funding from tech giants such as Baidu, Alibaba, and ByteDance.
A group of young geniuses eager to prove themselves
According to Liang, when he assembled DeepSeek's research team, he was not looking for experienced engineers to build consumer-facing products. Instead, he focused on PhD students from China's top universities, including Peking University and Tsinghua University. Many had published in top journals and won awards at international academic conferences, but they had no industry experience, according to the Chinese tech publication QBitAI.
“Our core technical positions are mostly filled by people who graduated this year or in the past one or two years,” Liang told 36Kr in 2023. The hiring approach helped build a collaborative culture in which researchers were free to pursue unorthodox research projects. That is a stark contrast to established Internet companies in China, where teams often compete for resources. (A recent example: ByteDance accused a former intern, a prestigious academic-award winner, of sabotaging colleagues' work in order to hoard more computing resources for his team.)
Liang said that students can be a better fit for high-investment, low-profit research. “Most people, when they are young, can devote themselves completely to a mission,” he explained. His pitch to prospective hires was that DeepSeek was created to “solve the hardest questions in the world.”
Experts note that these young researchers were educated almost entirely in China. “This younger generation also embodies a sense of patriotism, particularly as they navigate U.S. restrictions and choke points in critical hardware and software technologies,” says Zhang. “Their determination to overcome these barriers reflects not only personal ambition but also a broader commitment to advancing China's position as a global innovation leader.”
Innovation born of a crisis
In October 2022, the U.S. government began imposing export controls that severely restricted Chinese AI companies' access to cutting-edge chips such as Nvidia's H100. This move presented a problem for DeepSeek. The company had started out with a stockpile of 10,000 H100s, but it needed more to compete with firms like OpenAI and Meta. “The problem we are facing has never been funding, but the export control on advanced chips,” Liang said in a second interview with 36Kr in 2024.
DeepSeek had to come up with more efficient ways to train its models. “They optimized their model architecture using a battery of engineering tricks: custom communication schemes between chips, shrinking data fields to save memory, and innovative use of the mixture-of-experts approach,” says Chang, an analyst at the Mercator Institute for China Studies. “Many of these approaches aren't new ideas, but combining them successfully to produce a cutting-edge model is a remarkable feat.”
DeepSeek has also made significant progress on multi-head latent attention and mixture-of-experts, two technical designs that make its models more cost-effective by requiring fewer computing resources to train. In fact, DeepSeek's latest models were so efficient that they required only a fraction of the computing power needed to train Meta's comparable Llama 3.1 model, according to the research institution Epoch AI.
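To give a sense of why mixture-of-experts reduces training cost, here is a minimal, illustrative sketch of top-k expert routing in NumPy. This is an assumption-laden toy, not DeepSeek's actual implementation: real systems embed learned gating inside transformer layers and add load-balancing objectives. The point it shows is that each token activates only a few experts, so most of the model's parameters sit idle on any given token.

```python
# Toy sketch of Mixture-of-Experts top-k routing (illustrative only;
# not DeepSeek's implementation). Each token is routed to top_k of
# n_experts feed-forward "experts", so only a fraction of parameters
# does work per token -- the source of MoE's compute savings.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Each expert is a small feed-forward weight matrix.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
# The router scores every token against every expert.
router = rng.normal(size=(d_model, n_experts))

def moe_layer(x):
    """Route each token to its top_k experts and mix their outputs."""
    scores = x @ router                            # (tokens, n_experts)
    top = np.argsort(scores, axis=-1)[:, -top_k:]  # chosen expert indices
    out = np.zeros_like(x)
    for t, token in enumerate(x):
        chosen = scores[t, top[t]]
        weights = np.exp(chosen) / np.exp(chosen).sum()  # softmax over top_k
        for w, e in zip(weights, top[t]):
            out[t] += w * (token @ experts[e])     # only top_k experts run
    return out

tokens = rng.normal(size=(3, d_model))
y = moe_layer(tokens)
print(y.shape)
```

With 4 experts and top-2 routing, each token touches half the expert parameters; production MoE models push this ratio much further (dozens of experts, a handful active), which is how total parameter count can grow without a proportional rise in training compute.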
DeepSeek's willingness to share these innovations with the public has earned it considerable goodwill within the global AI research community. For many Chinese AI companies, developing open-source models is the only way to catch up with their Western counterparts, because doing so attracts more users and contributors. “They've now demonstrated that cutting-edge models can be built using less money, and that the current norms of model-building leave plenty of room for optimization,” says Chang. “We are sure to see many more attempts in this direction going forward.”
This news may spell trouble for current U.S. export controls, which focus on creating bottlenecks in computing resources. “Existing estimates of how much AI computing power China has, and what it can achieve with it, could be upended,” says Chang.