IBM is making a claim to the top of the open source AI leaderboard with its new Granite 3.1 series released today.
The Granite 3.1 family of large language models (LLMs) gives enterprise users an expanded context length of 128K tokens, new embedding models, integrated hallucination detection, and improved performance. According to IBM, the new Granite 8B Instruct model tops similarly sized open source rivals such as Meta Llama 3.1, Qwen 2.5, and Google Gemma 2. IBM ranked the models across a set of academic benchmarks included in the OpenLLM Leaderboard.
The new models are part of an accelerated release cycle for IBM’s open source Granite models; Granite 3.0 was released just this past October. At the time, IBM claimed to have a $2 billion book of business related to generative AI. With the Granite 3.1 update, IBM is focused on packing more functionality into smaller models. The basic idea is that smaller models are easier and more cost-effective for enterprises to operate.
“We’ve also increased all the numbers; performance has improved across the board for almost everything,” David Cox, vice president of AI models at IBM Research, told VentureBeat. “We use Granite for a variety of use cases: we use it internally at IBM in products, we use it for consulting, we deliver it to our customers, we release it as open source. Everything.”
Why performance and small models matter for enterprise AI
There are many ways companies can use benchmarks to evaluate LLM performance.
The direction IBM is taking is to run the model through a full range of academic and real-world tests. Cox emphasized that IBM tested and trained the model to optimize it for enterprise use cases. Performance is not just an abstract measure of speed. Rather, it is a more nuanced measure of efficiency.
One aspect of efficiency that IBM is trying to promote is helping users get to their desired results faster.
“You should spend less time fiddling with prompts,” Cox said. “So the more powerful your model is, the less time you spend engineering prompts.”
Efficiency is also related to model size. Larger models typically require more compute and GPU resources, increasing costs.
“When people are working on something like a minimum viable prototype, they often jump to very large models, so they might end up using a 70-billion-parameter model or a 405-billion-parameter model,” Cox said. “But the reality is that many of those are not economical. So the other thing we’ve been trying to do is fit as much capacity as possible into the smallest possible package.”
Context matters for enterprise agentic AI
In addition to promising performance and efficiency improvements, IBM has significantly extended Granite’s context length.
In the initial Granite 3.0 release, context length was limited to 4k tokens. In Granite 3.1, IBM has expanded this to 128k, allowing the models to process much longer documents. The expanded context is an important upgrade for enterprise AI users, both for retrieval-augmented generation (RAG) and for agentic AI.
Agentic AI systems and AI agents often need to process and reason over longer bodies of information, such as larger documents, log traces, or extended conversations. With a 128k context length, these systems can draw on more contextual information, allowing them to better understand and respond to complex queries and tasks.
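For developers who want a feel for the longer window, the sketch below shows what long-document inference could look like through the Hugging Face transformers library. It is a minimal illustration rather than IBM’s reference code: the checkpoint name ibm-granite/granite-3.1-8b-instruct and the input file are assumptions for the example.

```python
# A minimal sketch of long-context inference with Granite 3.1, assuming the
# Hugging Face checkpoint name "ibm-granite/granite-3.1-8b-instruct".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3.1-8b-instruct"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# With a 128k-token window, a long document can travel in a single prompt
# instead of being chunked for retrieval.
long_document = open("annual_report.txt").read()  # hypothetical input file
messages = [
    {"role": "user",
     "content": f"Summarize the key risks in this report:\n\n{long_document}"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```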
IBM has also released a series of embedding models that help speed up the process of converting data into vectors. The Granite-Embedding-30M-English model can achieve a latency of 0.16 seconds per query, which IBM claims is faster than competing options such as Snowflake’s Arctic.
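As a rough illustration of where the embedding models fit, the following sketch builds a tiny retrieval step with the sentence-transformers library. The checkpoint name ibm-granite/granite-embedding-30m-english and its compatibility with sentence-transformers are assumptions, and the documents are invented for the example.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed checkpoint name; loading it through sentence-transformers is also an
# assumption based on how most Hugging Face embedding models are packaged.
embedder = SentenceTransformer("ibm-granite/granite-embedding-30m-english")

docs = [
    "Granite 3.1 extends the context window to 128K tokens.",
    "The quarterly report highlights supply-chain risk.",
]
query = "What is the new context length?"

doc_vecs = embedder.encode(docs)    # embed the document store once
query_vec = embedder.encode(query)  # embed each incoming query

# Cosine similarity surfaces the most relevant document for a RAG pipeline.
scores = util.cos_sim(query_vec, doc_vecs)
print(docs[int(scores.argmax())])
```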
How IBM improved Granite 3.1 to meet enterprise AI needs
So how was IBM able to improve the performance of Granite 3.1? It wasn’t anything specific, Cox explained, but rather a set of processes and innovations.
IBM is developing increasingly sophisticated multi-stage training pipelines, he said. This has allowed the company to squeeze more performance out of its models. Also, an important part of LLM training is data. Rather than just focusing on increasing the amount of training data, IBM is focused on improving the quality of the data used to train Granite models.
“This is not a volume game,” Cox said. “It’s not like you’re going to get 10 times more data and your model will magically improve.”
Reducing hallucinations directly in the model
A common approach to reducing the risk of hallucinations and false outputs in LLMs is to use guardrails, which are typically deployed as external capabilities alongside the LLM.
With Granite 3.1, IBM is integrating hallucination protection directly into the model. Granite Guardian 3.1 8B and 2B models now include function-call hallucination detection.
“This model can natively enforce its own guardrails, which gives developers different options for catching things,” Cox said.
He explained that performing hallucination detection in the model itself optimizes the entire process. Internal detection reduces inference calls, making models more efficient and accurate.
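Here is a hedged sketch of what an in-model check might look like in practice. The checkpoint name ibm-granite/granite-guardian-3.1-2b, the guardian_config argument, and the risk label are all assumptions drawn from the general transformers chat-template pattern rather than a confirmed interface; IBM’s model documentation is the authority on the actual calling convention.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-guardian-3.1-2b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The guardian reviews a conversation turn and answers whether a given risk is
# present: here, an assistant response that hallucinates a function call.
messages = [
    {"role": "user", "content": "What is the weather in Paris?"},
    {"role": "assistant",
     "content": '{"name": "get_stock_price", "args": {"city": "Paris"}}'},
]
inputs = tokenizer.apply_chat_template(
    messages,
    guardian_config={"risk_name": "function_call"},  # assumed risk label
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=20)

# The verdict (a Yes/No-style label) can gate an agent's next step.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```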
How businesses can use Granite 3.1 now and what’s next
All of the new Granite models are open source and available free of charge to enterprise users. They are also available through IBM’s watsonx enterprise AI service and will be integrated into IBM’s commercial products.
The company plans to maintain an aggressive pace of Granite updates. Future plans include adding multimodal functionality in Granite 3.2, which is expected to debut in early 2025.
“You’ll see us in the next few releases adding more of these types of differentiated capabilities, all the way up to what we’ll be announcing at next year’s IBM Think conference,” Cox said.