China has released a low-cost, open-source rival to OpenAI's ChatGPT, and it has scientists excited and Silicon Valley worried.
DeepSeek, the Chinese artificial intelligence (AI) lab behind the innovation, unveiled its free large language model (LLM) DeepSeek-V3 in late December 2024 and claims it was built in just two months for only $5.58 million, a fraction of the time and cost spent by its Silicon Valley competitors.
Hot on its heels is an even newer model, DeepSeek-R1, released Monday (Jan. 20). In third-party benchmark tests, DeepSeek-V3 matched the capabilities of OpenAI's GPT-4o and Anthropic's Claude Sonnet 3.5 while outperforming others, such as Meta's Llama 3.1 and Alibaba's Qwen2.5, in problem-solving tasks.
Now, R1 has also surpassed ChatGPT's latest o1 model in many of the same tests. This impressive performance, achieved at a fraction of the cost of other models, with a semi-open-source design and trained on a significantly smaller number of graphics processing units (GPUs), has stunned AI experts and raised the prospect of Chinese AI models surpassing their U.S. counterparts.
"We should take the developments out of China very seriously," Satya Nadella, CEO of Microsoft, OpenAI's strategic partner, said Jan. 22 at the World Economic Forum in Davos, Switzerland.
Related: AI can now replicate itself, a milestone that has experts terrified
AI systems learn using training data taken from human input, which enables them to generate output based on the probabilities of different patterns cropping up in that training dataset.
For large language models, these data are text. For example, OpenAI's GPT-3.5, released in 2023, was trained on roughly 570 GB of text data from the repository Common Crawl, which amounts to roughly 300 billion words drawn from books, online articles, Wikipedia and other webpages.
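To make the idea of probability-based generation concrete, here is a minimal toy sketch (not code from any of the models mentioned): it counts which words follow which in a scrap of text, then generates output by sampling each next word in proportion to those counts.

```python
# Illustrative toy example of probability-based text generation.
import random
from collections import Counter, defaultdict

training_text = "the cat sat on the mat and the cat slept on the mat"
words = training_text.split()

# Count how often each word follows each other word in the training text.
following = defaultdict(Counter)
for prev, nxt in zip(words, words[1:]):
    following[prev][nxt] += 1

def next_word(prev):
    """Sample the next word with probability proportional to observed counts."""
    counts = following[prev]
    choices, weights = zip(*counts.items())
    return random.choices(choices, weights=weights)[0]

# Generate a short continuation starting from "the".
word, output = "the", ["the"]
for _ in range(6):
    word = next_word(word)
    output.append(word)
print(" ".join(output))
```

Real LLMs do the same thing at vastly larger scale, predicting the next token from billions of learned parameters rather than a simple frequency table.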
Reasoning models such as R1 and o1 are upgraded versions of standard LLMs that use a method called "chain of thought" to backtrack and re-evaluate their logic, which enables them to tackle more complex tasks with greater accuracy.
This has made them popular among scientists and engineers looking to integrate AI into their work.
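As a rough illustration of how such a reasoning model might be called in practice, the sketch below assumes DeepSeek exposes an OpenAI-compatible API at https://api.deepseek.com with a model ID such as "deepseek-reasoner"; the endpoint, model name and key handling are assumptions to verify against DeepSeek's own documentation.

```python
# Hedged sketch: querying a reasoning model through an OpenAI-compatible client.
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.deepseek.com")  # assumed endpoint

response = client.chat.completions.create(
    model="deepseek-reasoner",  # assumed identifier for DeepSeek-R1
    messages=[
        {"role": "user",
         "content": "A train leaves at 9:40 and arrives at 13:05. "
                    "How long is the journey? Think step by step."}
    ],
)

# Reasoning models work through intermediate steps (the "chain of thought")
# before committing to a final answer.
print(response.choices[0].message.content)
```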
However, unlike ChatGPT's o1, DeepSeek is an "open weight" model, which (although its training data remains proprietary) lets users inspect and modify its algorithm. Just as important is its reduced price for users: 27 times less than o1.
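In practice, "open weight" means the trained parameters can be downloaded and run locally. A minimal sketch follows, assuming the checkpoints are published on Hugging Face under a repository ID such as "deepseek-ai/DeepSeek-R1"; the exact ID, any trust_remote_code requirement and the (very large) hardware needs should be checked before use.

```python
# Minimal sketch of loading an open-weight model locally (assumed repository ID).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1"  # assumption; verify the actual repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Because the weights are local, users can inspect or fine-tune them,
# something a closed model served only through an API does not allow.
inputs = tokenizer("Why is the sky blue?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```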
In addition to its performance, the hype around DeepSeek comes from its cost efficiency; the model's shoestring budget is tiny compared with the tens of millions to hundreds of millions of dollars that rival companies spend to train their competitors.
Furthermore, U.S. export controls, which limit Chinese companies' access to the best AI computing chips, forced R1's developers to build smarter, more energy-efficient algorithms to compensate for their lack of computing power. ChatGPT reportedly required 10,000 Nvidia GPUs to process its training data; DeepSeek engineers say they achieved similar results with just 2,000.
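A quick back-of-envelope check shows how the reported numbers could hang together; the GPU-hour rental rate and the assumption of round-the-clock use below are illustrative guesses, not figures from DeepSeek.

```python
# Back-of-envelope estimate of training cost under stated assumptions.
gpus = 2_000                 # GPUs DeepSeek engineers say they used
days = 60                    # roughly two months of training
hours_per_day = 24           # assumes continuous utilisation (illustrative)
gpu_hours = gpus * days * hours_per_day          # ~2.88 million GPU-hours

assumed_rate_usd = 2.0       # assumed rental price per GPU-hour (not from the article)
estimated_cost = gpu_hours * assumed_rate_usd
print(f"{gpu_hours:,} GPU-hours -> ~${estimated_cost / 1e6:.2f} million")
# ~2,880,000 GPU-hours at $2/hour comes to about $5.76 million,
# in the same ballpark as the $5.58 million DeepSeek reports.
```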
How much this translates into useful scientific and technical applications, or whether DeepSeek has simply trained its model to ace benchmark tests, remains to be seen. Scientists and AI investors are watching closely.