Gemma 3 QAT Released for Running on PCs – Smaller Models Through Quantization-Aware Training

Google has released compressed versions of the Gemma 3 artificial intelligence (AI) model, produced with Quantization-Aware Training (QAT). The models are quantized to the Q4_0 format, shrinking Gemma 3 27B enough to run on a graphics card with 14.1GB of memory.
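The memory figure follows roughly from simple arithmetic on bits per weight. The sketch below is a back-of-the-envelope estimate, not Google's calculation: it assumes a nominal 27 billion parameters and counts only the weights. Real model files differ somewhat because they also store quantization scales and may keep some tensors at higher precision.

```python
def weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: nominal parameter count taken from the "27B" model name.
PARAMS_27B = 27e9

bf16_gb = weight_size_gb(PARAMS_27B, 16)  # full-precision BF16 weights
int4_gb = weight_size_gb(PARAMS_27B, 4)   # plain 4-bit weights, no scale overhead

print(f"BF16 : {bf16_gb:.1f} GB")
print(f"4-bit: {int4_gb:.1f} GB")
```

The plain 4-bit estimate comes out slightly under the reported 14.1GB, which is consistent with the extra storage a real quantized file needs beyond the raw 4-bit values.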

The QAT model starts from the full BF16 model. During training, compression is simulated, and the model being compressed is taught to mimic the outputs of its own uncompressed version. This training runs for approximately 5,000 steps, producing a final model that is compressed, with only slightly lower quality than the original.
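The idea of training against the uncompressed model can be illustrated with a toy example. This is a minimal sketch, not Google's training recipe: the single linear layer, the weights, the input, and the helper names (`fake_quantize`, `kl_divergence`, `layer_logits`) are all hypothetical, and the per-row symmetric rounding is a simplification of the Q4_0 format. It only computes the loss that a training step would minimize; the optimizer loop is omitted.

```python
import math


def fake_quantize(weights, bits=4):
    """Round weights to a signed integer grid, then map them back to floats."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [max(-qmax - 1, min(qmax, round(w / scale))) * scale for w in weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student): how far the student's distribution has drifted."""
    return sum(p * math.log(p / q)
               for p, q in zip(teacher_probs, student_probs) if p > 0)


def layer_logits(weight_rows, x):
    """Toy 'model': one linear layer mapping an input vector to logits."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight_rows]


# Hypothetical weights and input, chosen only for illustration.
weights = [[0.82, -0.31, 0.05], [-0.44, 0.67, 0.12], [0.09, -0.58, 0.73]]
x = [1.0, 2.0, -1.0]

# Teacher: full-precision weights. Student: same weights seen through 4-bit rounding.
teacher = softmax(layer_logits(weights, x))
student = softmax(layer_logits([fake_quantize(row) for row in weights], x))

loss = kl_divergence(teacher, student)
print(f"distillation loss: {loss:.6f}")
```

In an actual QAT run, a loss of this kind would be minimized over many steps by adjusting the underlying weights, typically by letting gradients pass through the rounding step as if it were the identity, so that the weights settle where rounding does the least damage.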

Gemma 3 QAT is supported by frameworks including Ollama, LM Studio, MLX, Gemma.cpp, and llama.cpp. The models come in four sizes matching the full Gemma 3 lineup; the smallest is just 0.5GB, making it suitable for running on mobile phones.

TLDR: Google released QAT versions of the Gemma 3 AI model, quantized to the Q4_0 format, with support for various frameworks and a smallest variant compact enough for mobile phones.
