
MyShell innovates LLM model on par with LLaMA2 at a fraction of the cost – only 3 million baht

MyShell, an artificial intelligence company, has introduced the JetMoE-8B model, which outperforms the LLaMA-2 13B model while costing significantly less to train and run. JetMoE uses the Mixture-of-Experts architecture, so only 2.2B of its parameters are active per token at inference, making the running cost comparable to Gemma-2B. Training required 96 NVIDIA H100 GPUs for 2 weeks, totaling approximately 80,000 dollars or around 3 million baht, which is expected to be far more economical than other models of similar performance. By contrast, training LLaMA-2 13B took 368,640 A100 GPU-hours, which could exceed 500,000 dollars at cloud prices.
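The Mixture-of-Experts idea described above can be sketched in a few lines: a router scores a set of expert networks per token, and only the top-k experts are evaluated, so most parameters stay idle on any given forward pass. The dimensions, expert count, and top-k value below are illustrative assumptions, not JetMoE's actual configuration.

```python
import numpy as np

# Minimal sketch of a Mixture-of-Experts (MoE) layer, the mechanism that
# lets JetMoE-8B activate only ~2.2B of its 8B parameters per token.
# All sizes here are toy values chosen for illustration.
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1  # gating network

def moe_forward(x):
    """Route a token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]          # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over the chosen experts only
    # Only top_k of the n_experts weight matrices are touched for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # (16,)
```

With top_k = 2 out of 8 experts, only a quarter of the expert parameters participate per token, which is the same principle that keeps JetMoE-8B's inference cost near that of a much smaller dense model.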

The model is available for use under the Apache 2.0 license and can be tested at Lepton.ai.

TLDR: MyShell introduces the JetMoE-8B model, which offers higher efficiency and lower training costs than comparable models. The model uses the Mixture-of-Experts architecture and is available for testing at Lepton.ai.
