MyShell unveils an LLM on par with LLaMA-2 at a fraction of the cost – only 3 million baht

MyShell, an artificial intelligence company, has introduced JetMoE-8B, a model that outperforms LLaMA-2 13B while costing significantly less to train and run. JetMoE uses a Mixture-of-Experts architecture: although the model has about 8 billion parameters in total, only around 2.2B are active for any given input, which keeps its inference cost on par with Gemma-2B. Training took 96 NVIDIA H100 GPUs for two weeks, roughly 80,000 dollars or around 3 million baht, which is expected to be far cheaper than other models of similar capability. By comparison, training LLaMA-2 13B consumed 368,640 A100 GPU-hours, which could exceed 500,000 dollars at cloud prices.
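The efficiency gain comes from sparse activation: a small router network picks a few expert sub-networks per token, so only their weights participate in the computation. The snippet below is a minimal PyTorch sketch of that idea, not JetMoE's actual implementation; the layer size, expert count, and top-k value are made up for illustration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyMoELayer(nn.Module):
        # Minimal Mixture-of-Experts layer: a router scores the experts and
        # only the top-k experts run for each token, so the parameters that
        # are actually exercised per input are a fraction of the total.
        def __init__(self, dim=256, num_experts=8, top_k=2):
            super().__init__()
            self.router = nn.Linear(dim, num_experts)
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
                for _ in range(num_experts)
            )
            self.top_k = top_k

        def forward(self, x):                       # x: (num_tokens, dim)
            scores = self.router(x)                 # (num_tokens, num_experts)
            weights, chosen = torch.topk(scores, self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            for slot in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = chosen[:, slot] == e     # tokens routed to expert e
                    if mask.any():
                        out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
            return out

    layer = ToyMoELayer()
    tokens = torch.randn(4, 256)
    print(layer(tokens).shape)                      # torch.Size([4, 256])

In a full model, every Transformer block holds many such experts; because only two or so fire per token, the "active" parameter count (2.2B for JetMoE-8B) is what determines inference cost, not the total.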

The model is available for use under the Apache 2.0 license and can be tested at Lepton.ai.
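Because the license is Apache 2.0, the model can also be run locally. Assuming the released weights are published on Hugging Face under a repository id such as jetmoe/jetmoe-8b (an assumption here, check the official release) and that the installed transformers version supports the architecture, loading them would look roughly like this:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "jetmoe/jetmoe-8b"  # assumed repository id, verify against the release
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    prompt = "Mixture-of-Experts models are efficient because"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))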

TLDR: MyShell introduces JetMoE-8B, which offers higher efficiency and lower training cost than comparable models. It uses a Mixture-of-Experts architecture, is released under Apache 2.0, and can be tested at Lepton.ai.
