The AI world was shaken last week when DeepSeek, a Chinese AI startup, announced its latest language model, DeepSeek-R1. The announcement wiped out nearly $200 billion of NVIDIA’s market value and triggered a broad market selloff, igniting a fierce debate about the future of AI development.
The narrative that immediately emerged suggested that DeepSeek had fundamentally upended the economics of building advanced AI systems, supposedly achieving for $6 million what had cost US companies billions of dollars. That interpretation sent shockwaves through companies like OpenAI, Anthropic, and Google, which have staked their competitive advantage on massive investments in computing infrastructure.
But amid the turbulent and breathless headlines, Dario Amodei, co-founder of Anthropic and one of the pioneering researchers behind today’s large language models (LLMs), published a detailed analysis offering a more nuanced perspective on DeepSeek’s achievements. His blog post cuts through the hysteria to deliver several crucial insights about what DeepSeek actually accomplished and what it means for the future of AI development.
Here are the four key insights from Amodei’s analysis that reshape our understanding of DeepSeek’s announcement.
1. The “$6 million model” narrative misses crucial context
According to Amodei, DeepSeek’s reported development costs must be viewed through a wider lens. He directly challenges the popular interpretation:
“DeepSeek does not ‘do for $6M what cost US AI companies billions,’” he writes. “I can only speak for Anthropic, but Claude 3.5 Sonnet is a mid-sized model that cost a few $10M’s to train (I won’t give an exact number). Also, 3.5 Sonnet was not trained in any way that involved a larger or more expensive model (contrary to some rumors).”
This striking revelation fundamentally shifts the narrative around DeepSeek’s cost efficiency. Considering that Sonnet was trained 9 to 12 months earlier and still outperforms DeepSeek’s model on many tasks, the achievement looks less like a revolutionary breakthrough and more like the natural progression of falling AI development costs.
Timing and context also matter. AI development costs have historically declined by roughly 4x per year, and measured against that trend, DeepSeek’s cost structure appears largely in line with the curve rather than dramatically ahead of it.
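To make the trend concrete, here is a minimal back-of-the-envelope sketch in Python (our illustration, not a calculation from Amodei’s post; the $40 million starting cost and roughly one-year gap are hypothetical placeholders):

```python
# Back-of-the-envelope projection of training costs along a smooth decline.
# Assumption (hedged): costs fall roughly 4x per year, per the trend cited above.

def expected_cost(past_cost_usd: float, years_elapsed: float, decline_rate: float = 4.0) -> float:
    """Project an older model's training cost forward along the cost-decline curve."""
    return past_cost_usd / (decline_rate ** years_elapsed)

# Hypothetical: a model that cost $40M to train about 0.9 years earlier.
print(f"${expected_cost(40e6, 0.9):,.0f}")  # roughly $11M, the same order of magnitude as DeepSeek's ~$6M
```

On this simple curve, a training cost in the single-digit millions today is roughly what the trend predicts, which is Amodei’s point about timing.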
2. DeepSeek-V3, not R1, was the real technical achievement
While markets and media fixated on DeepSeek’s R1 model, Amodei points out that the company’s more significant innovation came earlier.
“DeepSeek-V3 was actually the real innovation and what should have made people take notice a month ago (we certainly did),” Amodei writes. “As a pretrained model, it appears to come close to the performance of state-of-the-art US models on some important tasks, while costing substantially less to train.”
The distinction between V3 and R1 is crucial for understanding DeepSeek’s true technical advance. V3 represented genuine engineering innovation, particularly in how it manages the model’s “key-value cache” and pushes the mixture-of-experts (MoE) method further than before.
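For readers unfamiliar with the technique, the sketch below is a deliberately simplified, framework-free illustration of mixture-of-experts routing (our toy example, not DeepSeek’s actual architecture): a small router scores each token against a pool of expert networks, and only the top-scoring experts are evaluated, which is where MoE models save compute.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_layer(token, router_w, experts, top_k=2):
    """Route one token vector through only the top_k highest-scoring experts.

    token: (d,) input vector; router_w: (n_experts, d) router weights;
    experts: list of callables mapping (d,) -> (d,). Evaluating only
    top_k of them is the source of MoE's per-token compute savings.
    """
    gates = softmax(router_w @ token)              # gating probabilities over experts
    chosen = np.argsort(gates)[-top_k:]            # indices of the top_k experts
    weights = gates[chosen] / gates[chosen].sum()  # renormalize over the chosen few
    return sum(w * experts[i](token) for i, w in zip(chosen, weights))

# Toy usage with random linear "experts" (all sizes hypothetical).
rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda x, W=rng.standard_normal((d, d)) / d: W @ x for _ in range(n_experts)]
router_w = rng.standard_normal((n_experts, d))
print(moe_layer(rng.standard_normal(d), router_w, experts).shape)  # (8,)
```

Production MoE layers route whole batches, balance load across experts, and sit inside transformer blocks, but the compute-saving principle is the same one DeepSeek pushed further.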
This insight helps explain why the market’s dramatic reaction to R1 may have been misplaced. R1 essentially added reinforcement learning capabilities on top of V3’s foundation, a step multiple companies are currently taking with their own models.
3. Total corporate investment reveals a different picture
Perhaps the most revealing aspect of Amodei’s analysis concerns DeepSeek’s total investment in AI development.
“It has been reported (we can’t be certain it is true) that DeepSeek actually had 50,000 Hopper generation chips,” Amodei writes. “Those 50,000 Hopper chips cost on the order of ~$1B. Thus, DeepSeek’s total spend as a company (as distinct from spend to train an individual model) is not vastly different from US AI labs.”
This revelation dramatically reframes the narrative around DeepSeek’s resource efficiency. The company may have achieved impressive results with individual model training runs, but its overall investment in AI development appears roughly comparable to that of its American counterparts.
The distinction between model training costs and total corporate investment underscores the continued importance of substantial resources in AI development. Engineering efficiency can improve, but maintaining AI competitiveness still requires heavy capital investment.
4. The current “crossover point” is temporary
Amodei explains that the current moment in AI development is unique but fleeting.
“We’re therefore at an interesting ‘crossover point,’ where it is temporarily the case that several companies can produce good reasoning models,” he writes. “This will rapidly cease to be true as everyone moves further up the scaling curve on these models.”
This observation provides crucial context for understanding the current state of AI competition. The ability of multiple companies to achieve similar results with reasoning models is a temporary phenomenon rather than a new status quo.
The implications for the future of AI development are significant. As companies continue to scale up their models, particularly the resource-intensive reinforcement learning stage, the field is likely to re-differentiate based on who can invest the most in training and infrastructure. That suggests DeepSeek has achieved an impressive milestone but has not fundamentally changed the long-term economics of AI development.
The true cost of building AI: What Amodei’s analysis reveals
Amodei’s detailed analysis of DeepSeek’s achievements cuts through weeks of market speculation to expose the actual economics of building advanced AI systems. His blog post systematically dismantles both the panic and the enthusiasm that followed DeepSeek’s announcement, showing how the company’s $6 million model training cost fits within the steady march of AI development.
Markets and media gravitated toward a simple, attractive narrative: a Chinese company dramatically undercutting US AI development costs. But Amodei’s breakdown reveals a more complex reality. DeepSeek’s total investment, particularly its reported $1 billion in computing hardware, mirrors the spending of its US counterparts.
The moment of cost parity between US and Chinese AI development marks what Amodei calls the “crossover point”: a temporary window in which multiple companies can achieve similar results. His analysis suggests this window will close as AI capabilities advance and training demands intensify, and the field will likely revert to favoring the organizations with the deepest resources.
Building advanced AI remains an expensive endeavor, and Amodei’s careful examination shows why measuring its true cost requires looking at the full scope of investment. His methodical deconstruction of DeepSeek’s achievements may ultimately prove more significant than the initial announcement that caused such market turbulence.