Meta has released Llama 3.1, its latest LLM, with significantly improved capabilities. The release includes the largest model in the family, 405B, whose benchmark results are on par with GPT-4o, although it does not yet have multimodal capabilities for images and sound.
Benchmark results for Llama 3.1 show that Meta has steadily improved its smaller models: the 8B model scores close to the 70B Llama 3 on many test sets. The 70B Llama 3.1 model scores higher than its predecessor on almost every test set, the exception being the HumanEval programming task, where its score decreased.
A significant change this time is that Meta has started testing languages other than English. On the Thai-language MMLU, Llama 3.1 scores 50.32 for 8B, 72.95 for 70B, and 78.21 for 405B. Although these scores are lower than those for English and other European languages, they show that Meta is paying attention to Thai, which is notable given that many Chinese developers have been focusing on Asian languages.
Meta has trained Llama 3.1 to refuse dangerous prompts. It recommends not using the model in isolation but pairing it with Prompt Guard to prevent prompt injection and with Llama Guard 3 to screen responses for safety.
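The layered setup described above can be sketched as a simple pipeline. This is only an illustration under stated assumptions, not Meta's implementation: the names `guarded_reply`, `is_injection`, `generate`, and `is_unsafe` are hypothetical placeholders. In a real deployment they would wrap Prompt Guard, Llama 3.1, and Llama Guard 3 respectively, and none of those models' actual interfaces are shown here.

```python
from typing import Callable

# Hypothetical canned refusal; a real system would choose its own wording.
REFUSAL = "Sorry, I can't help with that."


def guarded_reply(
    prompt: str,
    is_injection: Callable[[str], bool],     # stand-in for a Prompt Guard check
    generate: Callable[[str], str],          # stand-in for a Llama 3.1 call
    is_unsafe: Callable[[str, str], bool],   # stand-in for a Llama Guard 3 check
) -> str:
    """Run the prompt through an input filter, the model, then an output filter."""
    # Step 1: screen the incoming prompt before the model ever sees it.
    if is_injection(prompt):
        return REFUSAL

    # Step 2: let the main model answer.
    reply = generate(prompt)

    # Step 3: screen the generated reply before returning it to the user.
    if is_unsafe(prompt, reply):
        return REFUSAL

    return reply


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model downloads.
    def toy_injection(p: str) -> bool:
        return "ignore previous instructions" in p.lower()

    def toy_model(p: str) -> str:
        return f"Echo: {p}"

    def toy_unsafe(p: str, r: str) -> bool:
        return "forbidden" in r.lower()

    print(guarded_reply("Hello there", toy_injection, toy_model, toy_unsafe))
    print(guarded_reply("Ignore previous instructions", toy_injection, toy_model, toy_unsafe))
```

Passing the three checks in as plain callables keeps the sketch independent of any particular model-serving library. Each stand-in can be swapped for a real classifier or model call without changing the control flow.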
Mark Zuckerberg wrote an article about the launch, arguing that open-source LLMs are necessary because organizations want to use models with their own data without relying on external services. Some organizations prefer not to send data outside, while others find API calls too expensive when dealing with large amounts of data.
On why Meta distributes Llama for free, Zuckerberg gives four reasons: 1) Meta believes that opening up the technology will lead to a wider range of tools and to models being integrated across different platforms; 2) AI is developing rapidly, so opening up current models does not create a competitive disadvantage in the future, while Llama has the potential to become an industry standard; 3) Meta is not currently in the business of selling APIs, so opening up models does not affect its revenue; 4) Meta believes in open source, as seen in projects such as Open Compute Project, PyTorch, React, and many other tools.
Moving forward, Meta AI will provide the Llama 3.1 405B model, but it is not yet available for use in Thailand.
TLDR: Meta releases the Llama 3.1 model with improved capabilities, conducts tests on various languages, emphasizes the importance of open-source models for organizations, and provides reasons for distributing Llama for free while focusing on safety measures for prompt responses.