
PaliGemma 2: Google Unveils Vision-Language Model That Reads Chemistry Formulas, Sheet Music, and X-ray Images

Google has released PaliGemma 2, an updated version of the multimodal vision-language model first unveiled at this year’s Google I/O. The new release comes in several sizes, produces more detailed image descriptions, and adds new capabilities.

The model is available in three sizes (3B, 10B, and 28B parameters), each supporting input resolutions of 224×224, 448×448, and 896×896 pixels, for 9 models in total. Their abilities range from basic image captioning to specialized reading tasks such as financial tables, notes, sheet music, and even X-ray images.
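The count of 9 comes from pairing each of the three sizes with each of the three input resolutions. A minimal Python sketch of that arithmetic follows; the `paligemma2-<size>-<resolution>` labels are a hypothetical naming scheme used only for illustration and are not official checkpoint identifiers.

```python
from itertools import product

# Sizes and input resolutions as listed in the article.
SIZES = ["3b", "10b", "28b"]
RESOLUTIONS = [224, 448, 896]


def variant_names(prefix="paligemma2"):
    """Return one label per (size, resolution) pair.

    The label format is an assumption for illustration only,
    not an official model identifier.
    """
    return [f"{prefix}-{size}-{res}" for size, res in product(SIZES, RESOLUTIONS)]


variants = variant_names()
print(len(variants))  # 3 sizes x 3 resolutions
for name in variants:
    print(name)
```

Higher resolutions generally suit tasks with fine detail, such as reading small text in documents, while the smallest size at 224×224 is the cheapest combination to run.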

PaliGemma 2 can be used for document reading, object detection, and other applications that combine text and images. The model is free to use under the Gemma terms and is supported in Hugging Face Transformers, Keras, PyTorch, JAX, and gemma.cpp.

TL;DR: Google has introduced PaliGemma 2, an upgraded multimodal model with more detailed image descriptions and new capabilities, available in multiple sizes and supporting tasks such as document reading and object detection.
