Enhancing Mathematical Precision with IBM’s New Granite 3.2 Model, Unveiling Granite Vision’s Advanced Document Image Reading Capabilities

IBM has released Granite 3.2, an 8B-parameter large language model (LLM) that updates Granite 3.0 and significantly improves its mathematical and reasoning abilities. On these tasks, the model is reported to outperform models such as GPT-4o-0513 and Claude-3.5-Sonnet.

Granite 3.2's mathematical and reasoning gains come from inference scaling, a technique that spends extra compute at inference time rather than during training. Instead of waiting for the model to produce complete answers and then selecting the best one, IBM's approach breaks the solution into smaller steps and scores each intermediate step as it is generated, using a process reward model (PRM).

This method differs from DeepSeek's long chain-of-thought technique in that Granite uses two models to cross-check answers: Granite generates the steps, and a second model, Qwen2.5-Math-PRM-7B, serves as the process reward model (PRM).
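To make the idea of step-wise scoring concrete, here is a minimal Python sketch of a search guided by a process reward model. It is not IBM's implementation: the function names (`prm_guided_search`, `toy_propose`, `toy_score`) are hypothetical, the "generator" is a hard-coded list of candidate steps, and the "reward model" is a simple arithmetic checker standing in for a neural PRM such as the one named above. It only illustrates the control flow, assuming a beam-style search in which weak partial solutions are pruned after every step.

```python
import operator
from typing import Callable, List, Sequence

Steps = List[str]
OPS = {"+": operator.add, "*": operator.mul}


def prm_guided_search(
    propose: Callable[[Sequence[str]], List[str]],
    score: Callable[[Sequence[str]], float],
    is_done: Callable[[Sequence[str]], bool],
    beam_width: int = 2,
    max_steps: int = 8,
) -> Steps:
    """Extend partial solutions one step at a time, keeping the best-scored ones."""
    beams: List[Steps] = [[]]
    for _ in range(max_steps):
        candidates: List[Steps] = []
        for partial in beams:
            if is_done(partial):
                candidates.append(list(partial))  # carry finished solutions forward
                continue
            for step in propose(partial):
                candidates.append(list(partial) + [step])
        if not candidates:
            break  # nothing left to extend
        # The reward model ranks partial solutions, not only final answers.
        candidates.sort(key=score, reverse=True)
        beams = candidates[:beam_width]
        if all(is_done(b) for b in beams):
            break
    return beams[0]


# Toy stand-in for the generator: candidate next steps for (3 + 4) * 5,
# deliberately including wrong ones.
def toy_propose(partial: Sequence[str]) -> List[str]:
    options = [
        ["3 + 4 = 8", "3 + 4 = 7"],
        ["7 * 5 = 30", "7 * 5 = 35"],
    ]
    return options[len(partial)] if len(partial) < len(options) else []


def step_is_valid(step: str) -> bool:
    lhs, rhs = step.split("=")
    a, op, b = lhs.split()
    return OPS[op](int(a), int(b)) == int(rhs)


# Toy stand-in for the PRM: fraction of steps whose arithmetic checks out.
def toy_score(partial: Sequence[str]) -> float:
    if not partial:
        return 0.0
    return sum(step_is_valid(s) for s in partial) / len(partial)


best = prm_guided_search(toy_propose, toy_score, is_done=lambda p: len(p) == 2)
print(best)
```

In this toy run the incorrect first step "3 + 4 = 8" is scored down as soon as it appears, so the search does not need to finish every candidate answer before discarding it. In a real system both callables would be model calls, which is where the extra inference-time compute goes.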

In related news, IBM has debuted Granite Vision, a vision-language model (VLM) based on the 2B-parameter Granite 3.1 model and fine-tuned to understand images. Specialized in reading documents, Granite Vision is compact and fast, and has outperformed competitors such as Microsoft Phi 3.5 Vision (phi3.5v) in several tests.

Both IBM Granite 3.2 and IBM Granite Vision models are now available on Hugging Face under the Apache 2.0 license.

TLDR: IBM introduces the Granite 3.2 LLM with enhanced mathematical and reasoning abilities, reported to outperform GPT-4o-0513 and Claude-3.5-Sonnet. IBM also unveils Granite Vision, a compact, fast VLM specialized in understanding images and documents that surpasses competitors such as Microsoft Phi 3.5 Vision in several tests. Both models are available on Hugging Face under the Apache 2.0 license.
