Mistral AI, a leading French artificial intelligence company, has unveiled its latest language model, Mistral NeMo 12B, which packs 12 billion parameters and supports a context window of up to 128K tokens (versus the 8K typical of comparable models). The new model is a drop-in replacement for the earlier Mistral 7B.
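To illustrate what "drop-in replacement" means in practice, here is a minimal sketch using the Hugging Face transformers library. The repository ids below are assumptions (check the hub for current names and access terms); the point is that swapping the model id is the only change a Mistral 7B setup would need.

```python
# Minimal sketch: swapping Mistral 7B for Mistral NeMo 12B in a
# transformers workflow. Repository ids are assumptions; both repos
# may require accepting terms on the Hugging Face hub, and
# device_map="auto" requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

# model_id = "mistralai/Mistral-7B-Instruct-v0.3"  # previous setup
model_id = "mistralai/Mistral-Nemo-Instruct-2407"  # the drop-in swap

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # half precision keeps the 12B weights ~24 GB
    device_map="auto",
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```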
Designed from the start for multilingual use, Mistral NeMo 12B performs especially well in English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi, and it outperforms Llama 3 8B and Gemma 2 9B on nearly all benchmarks.
Mistral NeMo 12B also ships with a new tokenizer called "Tekken" (no relation to the game), trained on more than 100 languages and offering markedly better text compression than Mistral's previous tokenizer: some languages compress up to three times more efficiently, and source code compresses about 30% better.
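The compression claim is easy to spot-check yourself. The sketch below encodes the same samples with the older Mistral tokenizer and with Tekken (shipped with Mistral NeMo) and compares token counts; the repository ids are assumptions, and both repos may be gated on the Hugging Face hub.

```python
# Rough sketch of a tokenizer-compression comparison: fewer tokens for
# the same text means better compression. Repository ids are assumptions.
from transformers import AutoTokenizer

old_tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
new_tok = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Base-2407")

samples = {
    "english": "Large language models compress text into tokens.",
    "code": "def add(a: int, b: int) -> int:\n    return a + b\n",
}

for name, text in samples.items():
    n_old = len(old_tok.encode(text))
    n_new = len(new_tok.encode(text))
    print(f"{name}: old={n_old} tokens, new={n_new} tokens "
          f"({n_old / n_new:.2f}x compression)")
```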
In a strategic collaboration, Mistral has partnered with NVIDIA to package the NeMo model as an NVIDIA NIM inference microservice, accelerate it with NVIDIA TensorRT-LLM, and enable it to run on a single GeForce RTX 4090. The model was also trained on NVIDIA DGX Cloud.
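NIM containers expose an OpenAI-compatible HTTP API, so calling a deployed microservice looks like any OpenAI-client request. The base URL and model identifier below are assumptions for a locally hosted container; query `/v1/models` on your own deployment for the actual id.

```python
# Hedged sketch of querying a Mistral NeMo NIM microservice through its
# OpenAI-compatible endpoint. Base URL and model id are assumptions for
# a local deployment, not confirmed values.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local NIM endpoint
    api_key="not-needed-for-local",       # local deployments may ignore this
)

response = client.chat.completions.create(
    model="mistralai/mistral-nemo-12b-instruct",  # assumed model id
    messages=[{"role": "user",
               "content": "Summarize Mistral NeMo in one sentence."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```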
The NeMo model is released under the Apache 2.0 open-source license and is available for download on Hugging Face.
TLDR: Mistral AI introduces Mistral NeMo 12B, a 12-billion-parameter language model with a 128K context window, strong multilingual support, and performance optimized for NVIDIA hardware.