Alibaba Group (Alibaba) has announced that its upgraded Qwen 2.5 Max model outperformed the V3 model from Chinese artificial intelligence (AI) startup DeepSeek in several third-party benchmark tests. The achievement is a notable milestone, as Qwen 2.5 Max emerged as the top-ranked non-reasoning model from China on a well-known benchmarking and ranking platform.
Qwen 2.5 Max uses a large mixture-of-experts (MoE) architecture trained on over 20 trillion tokens, according to reports from Chinese media including Securities Times and IThome. By comparison, DeepSeek's V3 model also uses an MoE architecture but was trained on 14.8 trillion tokens. This difference in training data matters because models trained on larger token counts tend to perform better.
Competitive performance against global leaders
Data released by Alibaba shows that Qwen 2.5 Max delivers performance comparable to DeepSeek's V3, Anthropic's Claude 3.5 Sonnet, and OpenAI's GPT-4.
The rise of global rankings
As reported in the South China Morning Post, data from Chatbot Arena, a testing platform developed by computer scientists in Berkeley, California, shows that Qwen 2.5 Max ranked seventh on the platform's leaderboard, the highest placement yet achieved by a non-reasoning AI model from China. Chatbot Arena rankings are determined by user votes based on output quality.
In the overall rankings, however, DeepSeek's R1 remains the highest-placed Chinese AI model, holding third position. Beyond these two companies, another Chinese AI startup, Zhipu AI, also secured a top-10 spot.
Chatbot Arena posted that four of the top 10 models come from Chinese companies, highlighting China's efforts to close the gap with AI advances in the US.