How to Install and Run LLM Locally on Android Phone
Running large language models (LLMs) locally on your Android phone means you can use AI models without relying on cloud servers or an Internet connection. This local setup protects your privacy by keeping your data on-device. With recent advances in mobile hardware, running AI models locally has become practical, and the MLC Chat app makes it easy to experience this technology right on your phone.
This article explains why you might want to run LLMs locally on an Android phone and provides step-by-step instructions for installing and running them with the MLC Chat app.
Why run LLMs on an Android phone?
LLMs typically run on cloud servers because they require significant computing power. Android phones impose real constraints on model size and speed, but running LLMs on-device opens up some interesting possibilities.
Improved Privacy: Since all computation happens on the phone, your data stays local, which matters for any sensitive information you share.
Offline Access: No continuous Internet connection is required to interact with the models. This is especially useful for users in remote areas or those with limited connectivity.
Cost Effective: Running LLMs on cloud servers incurs operational costs for processing power and storage; running them on hardware you already own avoids those fees.
Step-by-Step Guide to Install and Run MLC Chat on Android
The MLC Chat application is designed to allow users to run and interact with large language models (LLMs) locally on a variety of devices, including mobile phones, without relying on cloud-based services. Follow the steps below to run LLMs locally on an Android device.
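MLC Chat is the mobile front end of the MLC LLM project, which also publishes a Python package for desktop use. For readers curious about what the underlying engine looks like in code, here is a minimal sketch modeled on the project's documented Python quick-start; the exact API surface and model identifier can change between releases, so treat both as assumptions to verify against the current MLC LLM docs.

    from mlc_llm import MLCEngine

    # Model identifier in MLC's HF:// naming convention (an assumption;
    # verify against the model list in the MLC LLM documentation).
    model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
    engine = MLCEngine(model)

    # The engine exposes an OpenAI-style chat completions interface.
    for response in engine.chat.completions.create(
        messages=[{"role": "user", "content": "What is the meaning of life?"}],
        model=model,
        stream=True,
    ):
        for choice in response.choices:
            print(choice.delta.content, end="", flush=True)
    print()

    engine.terminate()

The Android app wraps this same engine behind a touch interface, with models compiled ahead of time for mobile GPUs.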
Step 1: Install MLC Chat application
First, you need to download the APK for the MLC Chat app (112 MB) from the link below.
Once the APK is downloaded, tap the file to start the installation. Because the app is sideloaded rather than installed from the Play Store, Android may ask you to allow installation from unknown sources.
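Alternatively, if your phone is connected to a computer with USB debugging enabled, you can sideload the APK with adb instead of tapping the file. A minimal sketch in Python, assuming adb is on your PATH and the file is saved as mlc-chat.apk (the filename is an assumption):

    import subprocess

    # Install the downloaded APK on a USB-connected phone.
    # The "-r" flag replaces the app if an older version is installed.
    subprocess.run(["adb", "install", "-r", "mlc-chat.apk"], check=True)

The same command can be run directly in a terminal as adb install -r mlc-chat.apk.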
Step 2: Download LLM
After successfully installing the app, open it and you will see a list of LLMs available for download. Models of different sizes and capabilities are available, such as Llama-3.2, Phi-3.5, and Mistral. Select a model according to your needs and tap the download icon next to it to start downloading. For example, if you are using a mid-range phone like the Redmi Note 10, choose a lightweight model such as a small Qwen2.5 variant for smoother performance.
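If you are unsure which model your phone can handle, a rough rule of thumb is that a 4-bit quantized model needs about half a gigabyte of memory per billion parameters, plus runtime overhead for activations and the chat context. The sketch below works through that estimate; the 20% overhead factor is a coarse assumption, not a measured value.

    # Back-of-the-envelope memory estimate for a quantized model:
    # bytes ~= parameters * bits_per_weight / 8, plus runtime overhead.

    def estimated_model_gb(params_billions: float, bits_per_weight: int = 4) -> float:
        weight_gb = params_billions * bits_per_weight / 8  # 1B params at 4 bits ~ 0.5 GB
        return weight_gb * 1.2  # crude 20% allowance for activations and KV cache

    for name, params in [("Qwen2.5-1.5B", 1.5), ("Llama-3.2-3B", 3.0), ("Phi-3.5-mini", 3.8)]:
        print(f"{name}: ~{estimated_model_gb(params):.1f} GB")

On a phone with around 4 GB of RAM, this suggests staying at or below the 3B-parameter range, which matches the advice above to pick a lightweight model on mid-range hardware.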
Step 3: Run the installed LLM
Once the model is downloaded, a chat icon will appear next to it. Tap the icon to load the model and open a chat session.
Once the model is ready, you can start typing prompts and interacting with the local LLM.
For example, on a device like the Redmi Note 10, running a smaller model like Qwen2.5 provides a fairly smooth experience, generating around 1.4 tokens per second. While this is slower than high-end devices like the Galaxy S23 Ultra, it is still adequate for basic tasks such as short chats and simple content creation.
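To put a figure like 1.4 tokens per second in perspective, a reply's generation time is simply its length in tokens divided by the generation speed. A quick illustration (the token counts for each reply size are assumptions about typical chat answers):

    # time to generate a reply = tokens in reply / tokens per second
    tokens_per_second = 1.4  # the Redmi Note 10 figure quoted above

    for label, tokens in [("one-line answer", 25), ("short paragraph", 100), ("long answer", 400)]:
        print(f"{label} (~{tokens} tokens): ~{tokens / tokens_per_second:.0f} s")

At this rate a short paragraph takes a bit over a minute to generate, which is why choosing a lightweight model matters so much on mid-range hardware.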
Running LLMs locally on Android devices via the MLC Chat app provides an accessible and privacy-preserving way to interact with AI models, though performance depends heavily on the phone's hardware. This setup is ideal for users who need offline access to AI models, want to experiment with LLMs hands-on, or are concerned about privacy. As mobile hardware continues to improve, the capabilities of local LLMs will only expand, making this an exciting frontier for AI technology.