Enhancing Mathematical Precision with IBM’s New Granite 3.2 Model, Unveiling Granite Vision’s Advanced Document Image Reading Capabilities

IBM has released Granite 3.2, a large language model (LLM) with 8B parameters (an update from Granite 3.0) that significantly improves mathematical and reasoning abilities. IBM reports that the model outperforms GPT-4o-0513 and Claude-3.5-Sonnet on mathematical reasoning.

Granite 3.2’s mathematical and reasoning capabilities come from an inference scaling technique. Instead of waiting for the model to produce complete answers and then selecting the best one, IBM’s approach breaks the solution into smaller steps and scores each intermediate step as it is generated, using what is known as a process reward model (PRM).

This method differs from DeepSeek’s long chain-of-thought technique in that Granite uses two models to cross-check answers (the second model, which serves as the PRM, is Qwen2.5-Math-PRM-7B).
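To make the idea of step-level scoring concrete, here is a minimal sketch of a search guided by a process reward model. It is not IBM’s implementation: the article does not describe the exact search procedure, so the beam search, the use of the minimum step score to rate a partial solution, and every name in the code (`prm_guided_search`, `TOY_PROPOSALS`, `TOY_SCORES`) are assumptions made for illustration. The two lookup tables stand in for the generator model and the scoring model.

```python
from typing import Callable, List, Tuple

Path = List[str]  # a partial solution: the reasoning steps written so far


def prm_guided_search(
    propose_steps: Callable[[Path], List[str]],
    score_step: Callable[[Path, str], float],
    is_complete: Callable[[Path], bool],
    beam_width: int = 2,
    max_depth: int = 8,
) -> Tuple[Path, float]:
    """Step-level beam search that keeps the partial solutions the scorer rates highest."""
    beams: List[Tuple[Path, float]] = [([], 1.0)]
    for _ in range(max_depth):
        candidates: List[Tuple[Path, float]] = []
        for path, score in beams:
            if is_complete(path):
                candidates.append((path, score))  # carry finished solutions forward
                continue
            for step in propose_steps(path):
                # Assumption: a path is only as good as its weakest step.
                candidates.append((path + [step], min(score, score_step(path, step))))
        if not candidates:
            break  # nothing left to expand
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # prune low-scoring partial solutions early
        if all(is_complete(p) for p, _ in beams):
            break
    return beams[0]


# Hypothetical stand-in for the generator model: candidate next steps for "12*15".
TOY_PROPOSALS = {
    (): ["12*15 = 12*10 + 12*5", "12*15 = 12*10 + 15"],
    ("12*15 = 12*10 + 12*5",): ["= 120 + 60", "= 120 + 50"],
    ("12*15 = 12*10 + 15",): ["= 120 + 15"],
    ("12*15 = 12*10 + 12*5", "= 120 + 60"): ["answer: 180"],
    ("12*15 = 12*10 + 12*5", "= 120 + 50"): ["answer: 170"],
    ("12*15 = 12*10 + 15", "= 120 + 15"): ["answer: 135"],
}

# Hypothetical stand-in for the process reward model: a score per step.
TOY_SCORES = {
    "12*15 = 12*10 + 12*5": 0.95,
    "12*15 = 12*10 + 15": 0.10,
    "= 120 + 60": 0.90,
    "= 120 + 50": 0.20,
    "= 120 + 15": 0.80,
    "answer: 180": 0.95,
    "answer: 170": 0.90,
    "answer: 135": 0.90,
}

best_path, best_score = prm_guided_search(
    propose_steps=lambda path: TOY_PROPOSALS.get(tuple(path), []),
    score_step=lambda path, step: TOY_SCORES[step],
    is_complete=lambda path: bool(path) and path[-1].startswith("answer:"),
)
print(best_path, best_score)
```

In this toy run the faulty first step (`12*15 = 12*10 + 15`) receives a low score and is pruned before the search spends more generation on it. That is the practical difference from generating complete answers first and picking the best one afterwards.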

In related news, IBM has also debuted a vision-language model (VLM) called Granite Vision, based on the 2B-parameter Granite 3.1 model and fine-tuned to understand images. Specialized in reading documents, Granite Vision is compact and fast, and has outperformed competitors such as Microsoft Phi 3.5 Vision (phi3.5v) in several tests.

Both IBM Granite 3.2 and IBM Granite Vision models are now available on Hugging Face under the Apache 2.0 license.

TLDR: IBM introduces Granite 3.2 LLM with enhanced mathematical and reasoning abilities, outperforming GPT-4o-0513 and Claude-3.5-Sonnet. Additionally, IBM unveils Granite Vision VLM, a compact, fast-working model specialized in understanding images and documents, surpassing competitors like Microsoft Phi 3.5 Vision. Both models are accessible on Hugging Face under the Apache 2.0 license.
