Unlike traditional LLMs, these SR models take extra time to generate a response. That extra time often improves performance on tasks involving math, physics, and science. And this latest open model is turning heads for apparently catching up to OpenAI so quickly.
For example, DeepSeek reports that R1 outperforms OpenAI's o1 on several benchmarks and tests, including AIME (a mathematical reasoning test), MATH-500 (a collection of word problems), and SWE-bench Verified (a programming evaluation tool). As we usually mention, AI benchmarks should be taken with a grain of salt, and these results have not yet been independently verified.

DeepSeek R1 benchmark results, chart created by DeepSeek.
Credit: DeepSeek
TechCrunch reports that three Chinese labs (DeepSeek, Alibaba, and Moonshot AI's Kimi) have released models they say match o1, with DeepSeek first previewing R1 in November.
However, the new DeepSeek model comes with a catch when run as the cloud-hosted version. Because of its Chinese origin, R1 will not generate responses on certain topics, such as Tiananmen Square or Taiwan's autonomy, in keeping with Internet regulations in China. This filtering comes from an additional moderation layer, so it is not an issue if the model is run locally outside of China.
Even with the potential censorship, Dean Ball, an AI researcher at George Mason University, wrote on X: "[...] hardware, far from the eyes of top-down control regimes."