
On Thursday, French AI lab Mistral AI launched Small 3, which it calls the "most efficient model of its category" and says is optimized for latency.
Mistral says Small 3 can compete with larger models such as Llama 3.3 70B and Qwen 32B, and that it is "a great open alternative to opaque proprietary models like GPT-4o mini."
Also: AI agents will match “good mid-level” engineers this year, Mark Zuckerberg says
Like Mistral's other models, the 24B-parameter Small 3 is open source, released under the Apache 2.0 license.
Designed for local use, Small 3 provides a base for building reasoning capabilities, Mistral says. "Small 3 excels in scenarios where fast, accurate responses are critical," the release continues, noting that the model has fewer layers than comparable models, which helps its speed.
The model achieved better than 81% accuracy on the MMLU benchmark and was not trained with reinforcement learning (RL) or synthetic data, which Mistral says makes it "earlier in the model production pipeline" than DeepSeek R1.
"Our instruction-tuned model performs competitively with open-weight models three times its size and with the proprietary GPT-4o mini model on code, math, general knowledge, and instruction-following benchmarks," the announcement states.
Using a third-party vendor, Mistral had human evaluators test Small 3 with more than 1,000 coding and generalist prompts. A majority of testers preferred Small 3 over Gemma-2 27B and Qwen-2.5 32B, but the numbers were more evenly split when Small 3 went up against Llama-3.3 70B and GPT-4o mini. Mistral acknowledged that the inconsistency of human judgment makes this kind of test different from standardized public benchmarks.
Also: Apple researchers reveal the secret sauce behind DeepSeek AI
Mistral recommends Small 3 for building customer-facing virtual assistants, especially because it can be fine-tuned into "highly accurate subject matter experts" for quick-turnaround needs such as financial services, legal advice, and fraud detection in healthcare, according to the release.
Small 3 can also be used for robotics and manufacturing, and it can run on a MacBook with at least 32GB of RAM, making it well suited for "hobbyists and organizations handling sensitive or proprietary information."
Mistral teased that models of various sizes with "boosted reasoning capabilities" can be expected in the coming weeks. Small 3 is available on Hugging Face.