Microsoft has released a miniature LLM, phi-3-mini, with just 3.8 billion parameters. Despite its size, it achieves a 69% MMLU score and an 8.38 MT-Bench score, closely approaching GPT-3.5 (MMLU 70.0%, MT-Bench 7.94) and surpassing the recently released 8B-parameter Llama 3 model.
The Phi-3 family also includes phi-3-small with 7B parameters and phi-3-medium with 14B parameters. Quantized to 4 bits, phi-3-mini requires only about 1.8 GB of RAM and can run on an iPhone 14 at roughly 12 tokens per second.
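The ~1.8 GB figure follows from simple arithmetic: at 4-bit quantization each parameter occupies half a byte. A minimal sketch of that back-of-the-envelope check, assuming weights dominate memory and ignoring activations, KV cache, and quantization overhead:

```python
# Rough memory estimate for 4-bit quantized phi-3-mini.
# Assumption: only the weights are counted; each parameter takes
# 4 bits (0.5 bytes). Overheads (activations, KV cache, scales) are ignored.

PARAMS = 3.8e9          # phi-3-mini parameter count
BYTES_PER_PARAM = 0.5   # 4-bit quantization

total_bytes = PARAMS * BYTES_PER_PARAM
gib = total_bytes / 2**30  # convert bytes to GiB

print(f"{gib:.2f} GiB")  # ≈ 1.77 GiB, consistent with the ~1.8 GB reported
```

The same calculation shows why full-precision (16-bit) weights, at four times the size, would not fit comfortably on a phone.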
On safety, phi-3 produces harmful responses far less often than phi-2: its harmful-response rate is only 0.75%, versus 2.93% for phi-2.
Microsoft’s report notes that while phi-3 offers reasoning comparable to larger models, its limited capacity to store factual knowledge hurts performance on certain benchmarks such as TriviaQA. The training data is also largely restricted to English. Initial experiments adding multilingual data to phi-3-small have shown promising results, but Microsoft has not yet published conclusive outcomes.
Source: ArXiv
TLDR: Microsoft’s phi-3-mini LLM boasts impressive performance, outperforming larger models in some respects while revealing limitations in factual memory and multilingual capability.