At Google I/O 2024, one of the highlights was Gemini Nano running directly on Android smartphones. Entirely offline, the model could flag scam conversations, showcasing the potential of on-device model execution.
Following that, Google also released a session video titled “Android on-device AI under the hood” elaborating on how models run on Android smartphones. The advantages of on-device generative AI include local data processing, reduced latency, offline functionality, and no cloud hosting fees.
Typical uses of on-device generative AI include content summarization, suggested message replies, and text emotion classification. However, on-device models are constrained in parameter count and context window size, so fine-tuning is often needed to reach acceptable accuracy.
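Because the context window is small on-device, inputs that exceed it must be truncated or split before inference. A minimal sketch of the splitting idea (the word-based splitter and the 128-token limit are illustrative stand-ins for a real tokenizer and a real model limit):

```python
# Illustrative only: a whitespace splitter stands in for a real tokenizer,
# and 128 "tokens" stands in for a small on-device context window.

def chunk_for_context_window(text: str, max_tokens: int = 128) -> list[str]:
    """Split text into pieces that each fit the model's context window."""
    words = text.split()
    chunks = []
    for start in range(0, len(words), max_tokens):
        chunks.append(" ".join(words[start:start + max_tokens]))
    return chunks

long_input = " ".join(f"word{i}" for i in range(300))
chunks = chunk_for_context_window(long_input, max_tokens=128)
print(len(chunks))                                  # 3 chunks: 128 + 128 + 44 words
print(all(len(c.split()) <= 128 for c in chunks))   # True
```

In practice each chunk would be summarized (or classified) separately and the partial results combined, which is one reason on-device pipelines favor short, focused tasks.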
Google offers two ways to run models on Android: the Gemini Nano model prepared and distributed by Google, or external models run through the MediaPipe LLM Inference API, which is currently intended for research and experimentation.
Currently, Gemini Nano is available on the Pixel 8 Pro and the Galaxy S24 series, and only within specific Google apps: the Magic Compose feature in Google Messages, voice transcription in Recorder (English only), and chat reply suggestions in Gboard.
Google has allowed select partners to test Gemini Nano since late 2023 and plans to broaden developer access in 2024. Under the hood, Gemini Nano runs on AICore, a system service introduced in Android 14 that sets a standard for on-device AI execution.
A crucial aspect of running models on-device is fine-tuning the model for a specific task. Google employs Low-Rank Adaptation (LoRA), an efficient tuning technique that trains only a small set of additional low-rank weights while the base model stays frozen, and emphasizes the importance of diverse, concise training data.
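The core idea of LoRA can be sketched in a few lines: the base weight matrix W stays frozen, and only two small matrices B (d×r) and A (r×d) with rank r much smaller than d are trained; at inference the adapted weights are W + B·A. A toy illustration with made-up dimensions and values:

```python
# Toy LoRA sketch: train B (d×r) and A (r×d) instead of a full d×d update.
# All dimensions and values here are illustrative, not from any real model.

def matmul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_adapt(W, B, A):
    """Return the adapted weights W + B @ A."""
    delta = matmul(B, A)
    return [[W[i][j] + delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

d, r = 4, 1                            # full dimension vs. LoRA rank
W = [[0.0] * d for _ in range(d)]      # frozen base weights
B = [[1.0] for _ in range(d)]          # d×r trainable matrix
A = [[0.5] * d]                        # r×d trainable matrix

W_adapted = lora_adapt(W, B, A)
print(W_adapted[0][0])                 # 0.5 (every entry is 0.0 + 1.0 * 0.5)

full_params = d * d                    # parameters a full update would train
lora_params = d * r + r * d            # parameters LoRA actually trains
print(full_params, lora_params)        # 16 8
```

At realistic sizes the savings are far larger: for d = 4096 and r = 8, LoRA trains about 65K parameters per matrix instead of roughly 16.8M, which is what makes task-specific tuning cheap enough to consider for on-device use.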
Moreover, Google supports running external models on Android through the MediaPipe LLM Inference API, primarily for text-to-text models. Showcased models include Falcon-1B by the Technology Innovation Institute, Phi-2 by Microsoft, Stable LM 3B by Stability AI, and Google’s own Gemma.
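To see why models of roughly this size are the ones showcased for phones, a back-of-envelope memory estimate helps: at 4-bit quantization, weights cost about half a byte per parameter. The parameter counts below are approximate public figures, the 4-bit assumption is illustrative, and real on-device footprints also include activations, KV cache, and runtime overhead:

```python
# Rough weight-memory estimate, assuming 4-bit quantized weights
# (0.5 bytes per parameter). Parameter counts are approximate public
# figures for the showcased models, not exact values.

APPROX_PARAMS = {
    "Falcon-1B": 1.3e9,     # Technology Innovation Institute
    "Phi-2": 2.7e9,         # Microsoft
    "Stable LM 3B": 2.8e9,  # Stability AI
    "Gemma 2B": 2.5e9,      # Google
}

BYTES_PER_PARAM_INT4 = 0.5

def weight_memory_gb(params: float) -> float:
    """Bytes for the weights alone, converted to gigabytes."""
    return params * BYTES_PER_PARAM_INT4 / 1e9

for name, params in APPROX_PARAMS.items():
    print(f"{name}: ~{weight_memory_gb(params):.2f} GB of weights at 4-bit")
```

All four land in the roughly 0.7–1.4 GB range for weights, small enough to fit alongside the OS in a modern phone’s RAM, which is why multi-billion-parameter models remain out of reach on-device.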
TLDR: At Google I/O 2024, Google showcased the Gemini Nano model for on-device AI execution, emphasizing the advantages of local processing and the role of fine-tuning in improving model performance.