Microsoft Releases Phi-4: Models That Handle Audio, Images, and Text, with Scores Comparable to Gemini 2.0 Flash

Microsoft has released its latest large language model family, Phi-4, in three variants: Phi-4, Phi-4-multimodal, and Phi-4-mini. These models offer capabilities comparable to models sold through commercial APIs, with the added advantage that they can be run locally on a user's own hardware.

The base Phi-4 model has 14B parameters, similar in size to Qwen2.5-14B, but its test scores reach the level of Qwen2.5-72B. It excels at mathematical problem-solving, where it slightly outperforms Gemini 1.5 Pro. Its input is limited to 16,000 tokens.

Phi-4-multimodal is a separate model with just 5.6B parameters, yet it accepts audio, image, and text input. On image-reading tests it outperforms Gemini 2.0 Flash, its OCR quality is comparable to Gemini 2.0 Flash Lite and Claude 3.5 Sonnet, and it can accurately convert speech to text.

The smallest model, Phi-4-mini, has 3.8B parameters. It accepts and produces text only, but supports input of up to 128,000 tokens. It prioritizes efficient instruction following and tool use, so it can be embedded into other programs as an assistant.

All three models are available for download under the MIT license, which allows free use.
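Whether a downloaded model can actually run at home depends largely on how much memory its weights occupy. The following is a rough sketch of that arithmetic using the parameter counts quoted above. The bytes-per-parameter figures for each precision are standard assumptions rather than numbers from Microsoft, the function name `weight_memory_gb` is made up for this illustration, and the estimate covers weights only, ignoring the context cache, activations, and runtime overhead.

```python
# Parameter counts in billions, as quoted in this article.
PARAMS_BILLION = {
    "Phi-4": 14.0,
    "Phi-4-multimodal": 5.6,
    "Phi-4-mini": 3.8,
}

# Assumed storage cost per parameter for common weight formats.
# These are general conventions, not figures from the Phi-4 release.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    billions of parameters * bytes per parameter = billions of bytes.
    """
    return params_billion * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for name, size in PARAMS_BILLION.items():
        row = ", ".join(
            f"{precision}: {weight_memory_gb(size, precision):.1f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{name:<18} {row}")
```

Under these assumptions, the 14B base model needs roughly 28 GB for 16-bit weights, while Phi-4-mini at 4-bit precision fits in about 2 GB. Actual requirements will be higher once the runtime and a long input are loaded.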

TLDR: Microsoft has released the Phi-4 model family in three versions: Phi-4 (14B, text), Phi-4-multimodal (5.6B, audio, image, and text), and Phi-4-mini (3.8B, text with long input). All are available for download under the MIT license.
