Unveiling Magika: Unearthing File Types with Unparalleled Precision – A Google Project Surpassing 99% Accuracy in Replacing libmagic

Google has released Magika, a small artificial intelligence model that identifies file types (often expressed as MIME types) so that programs can handle each file correctly.

There is no single standard for identifying file types. The “file” program, which first appeared in Unix Version 4 in 1973, 51 years ago, is still in use today. The file project's revision history reaches back to 1987, when it was first tracked in RCS, a version control system that predates CVS.

Magika takes a different approach: it uses a deep learning model built with Keras and run with ONNX. The resulting model is only 1MB in size and predicts a file's type within a few milliseconds, even on a CPU. Its main advantage is accuracy: an overall F1-score of up to 99.31%, compared with 81.30% for the “file” command.
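
For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below shows the calculation for a single class. The function name and the sample values are illustrative assumptions, not figures from Google's evaluation, and a benchmark covering many file types would also average the per-type scores.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values for illustration only: a detector that is always
# right when it answers (precision 1.0) but finds only half of the files
# of a given type (recall 0.5).
example = f1_score(1.0, 0.5)
print(f"F1 = {example:.4f}")
```

Because the harmonic mean is pulled toward the smaller value, a tool cannot reach a high F1-score by being strong on precision or recall alone.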

Magika is free, open source software released under the Apache 2.0 license. It can be installed with the command “pip install magika”. An npm version is also available, but it is still experimental.
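
Once the package is installed, a minimal Python sketch of identifying in-memory content might look like the following. It assumes the `Magika` class and its `identify_bytes` method from the project's Python package. The attribute holding the predicted label has been named differently across releases, so the hypothetical helper `detect_label` tries both spellings and returns None when the package is not installed.

```python
from typing import Optional


def detect_label(data: bytes) -> Optional[str]:
    """Return Magika's predicted content-type label, or None if unavailable."""
    try:
        from magika import Magika  # requires: pip install magika
    except ImportError:
        return None  # package not installed in this environment

    result = Magika().identify_bytes(data)
    output = result.output
    # Assumption: older releases expose `ct_label`, newer ones `label`.
    label = getattr(output, "ct_label", None)
    if label is None:
        label = getattr(output, "label", None)
    return None if label is None else str(label)


if __name__ == "__main__":
    sample = b"# Example\n\nThis is a short piece of Markdown text.\n"
    print(detect_label(sample))
```

The helper keeps the model call in one place, so switching to a path-based call for files on disk would only change a single line.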

TLDR: Google’s Magika is a small AI model that identifies file types so that programs can handle files correctly. The 1MB deep learning model predicts a file's type within milliseconds and reaches an F1-score of up to 99.31%, outperforming the traditional “file” command. Magika is open source and can be installed via pip.
