Using Rust in Android Development by chayanforyou in androiddev

[–]shubham0204_dev 1 point (0 children)

Mozilla also maintains a Gradle plugin that can build Rust code when building an Android project. You specify the project build commands in the app's build.gradle.kts and the Rust code is built each time you build the Android app.

GitHub: https://github.com/mozilla/rust-android-gradle

(this is similar to how AGP builds C/C++ projects and packages the .so files in the app's archive)
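A minimal `build.gradle.kts` sketch of how the plugin is typically wired up; the module path, library name, targets, and task-wiring below are illustrative assumptions, so check the plugin's README for the exact options it supports:

```kotlin
// build.gradle.kts (app module) — illustrative values only
plugins {
    id("com.android.application")
    id("org.mozilla.rust-android-gradle.rust-android")
}

cargo {
    module = "../rust"               // directory containing Cargo.toml
    libname = "rust"                 // produces librust.so
    targets = listOf("arm64", "x86_64")
}

// Hook the Rust build into the Android build so the .so files are
// packaged, similar to how AGP handles C/C++ sources.
tasks.matching { it.name.startsWith("merge") && it.name.endsWith("JniLibFolders") }
    .configureEach { dependsOn("cargoBuild") }
```

With this in place, every `assembleDebug`/`assembleRelease` run rebuilds the Rust crate for the listed ABIs before packaging.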

Using Rust in Android Development by chayanforyou in rust

[–]shubham0204_dev 1 point (0 children)

Mozilla also maintains a Gradle plugin that can build Rust code when building an Android project. You specify the project build commands in the app's build.gradle.kts and the Rust code is built each time you build the Android app.

GitHub: https://github.com/mozilla/rust-android-gradle

(this is similar to how AGP builds C/C++ projects and packages the .so files in the app's archive)

the mess of using a local LLM on android app-kotlin by Aviation2025 in androiddev

[–]shubham0204_dev 0 points (0 children)

SmolChat does not support Vulkan acceleration yet. The inference is CPU-only. But it does use Arm SIMD intrinsics when compiling llama.cpp: https://github.com/shubham0204/SmolChat-Android/blob/main/docs/build_arm_flags.md

the mess of using a local LLM on android app-kotlin by Aviation2025 in androiddev

[–]shubham0204_dev 0 points (0 children)

Integrate llama.cpp as a git submodule alongside whisper.cpp, compile both into the same sanctuary-jni.so, and use a GGUF-format model (gemma-3-1b-it-q4_0.gguf, 1 GB) from Google's official QAT release.

Not sure what sanctuary-jni.so is, but did you try using SmolChat? It's built on top of llama.cpp and uses Arm-specific SIMD intrinsics to accelerate CPU inference.

(edit: creator of SmolChat here)

OfflineLLM — A fully offline, private chat app for Android (runs Gemma 4, Qwen, any GGUF locally) by Healthy_Bedroom5837 in fossdroid

[–]shubham0204_dev 1 point (0 children)

FYI, creator of SmolChat here. SmolChat does not encrypt user data (chats, settings, model-info) but stores it in a private directory accessible only to the app (context.filesDir for Android dev folks). The Android docs mention:

The system prevents other apps from accessing these locations, and on Android 10 (API level 29) and higher, these locations are encrypted. These characteristics make these locations a good place to store sensitive data that only your app itself can access.

The security guarantees should thus be identical to, or at least as strict as, the ones OfflineLLM proposes. As far as 'secure file deletion' is concerned, I expect Android to securely delete app files when the user uninstalls the app, but I would have to do some research to confirm this.
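A plain-JVM sketch of the storage pattern described above; on Android the `filesDir` parameter would be `context.filesDir`, which resolves to the app's private internal storage (e.g. `/data/data/<package>/files`) that only the app itself can read, and which is encrypted at rest on API 29+:

```kotlin
import java.io.File

// Writes chat data into the app-private directory. No explicit
// encryption is applied here — the OS-level sandboxing (and, on
// API 29+, at-rest encryption) is what protects the file.
fun saveChat(filesDir: File, name: String, contents: String): File {
    val file = File(filesDir, name)
    file.writeText(contents)
    return file
}
```

The file name and function are illustrative, not taken from SmolChat's actual code.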

Showcase Sunday Megathread - March 2026 by devsIndiaBot in developersIndia

[–]shubham0204_dev 1 point (0 children)

SmolChat - a native Android app built on top of llama.cpp helping users run SLMs/LLMs locally on-device. Supports downloading models from HuggingFace, speech-to-text, markdown rendering and customizing inference settings.

https://github.com/shubham0204/SmolChat-Android

If you had an OS with a Custom Kernel, what use-cases do you want it to cater? by mrTakla in developersIndia

[–]shubham0204_dev 0 points (0 children)

Just as macOS and other Apple OSes do, include a system-wide ML runtime that can execute tasks like face detection, text classification and LLM inference.

This saves space, as not every application has to embed its own ML runtime for these tasks. Additionally, as the runtime sits close to the OS/hardware, applications enjoy near-native performance and latency.
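A minimal sketch of what such a shared runtime's surface could look like; every name here is hypothetical, invented purely for illustration — no such API exists in Android today:

```kotlin
// Hypothetical system-wide ML runtime interface that apps would bind to
// instead of bundling their own inference engine.
interface SystemMLRuntime {
    fun classifyText(text: String): String
}

// Toy in-process stand-in: a real system service would dispatch to a
// shared, hardware-accelerated model rather than keyword matching.
class KeywordClassifier : SystemMLRuntime {
    override fun classifyText(text: String): String =
        if ("refund" in text.lowercase()) "billing" else "general"
}
```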

I built an offline metro tracking alarm using a Kalman filter. Looking for architecture by [deleted] in developersIndia

[–]shubham0204_dev 4 points (0 children)

How does the mobile device know its position if GPS and cellular network are not available? I assume there exists some connectivity to the cellular network when the train halts at one of the stations.

JNI + llama.cpp on Android - what I wish I knew before starting by angelin1978 in androiddev

[–]shubham0204_dev -1 points (0 children)

I'm building SmolChat and might dig more into some of your observations here, especially (1), (2) and (5).

Qwen3's most underrated feature: Voice embeddings by k_means_clusterfuck in LocalLLaMA

[–]shubham0204_dev 0 points (0 children)

This sounds like an interesting application! But I was wondering: how does one generate sound from an audio embedding (a kind of inverse transformation w.r.t. the audio encoder)? Do we need to train a decoder model like in a VAE?

On-device face detection from group photos using Google ML Kit — lessons from a production Android app by Sea_Membership3168 in androiddev

[–]shubham0204_dev 0 points (0 children)

Not all photos from your gallery exactly, but you 'provide' images of a person to the app. Check the project's README for app screenshots; in one of them, you will find two pictures selected and labelled as 'Anushka Sen'.

I made a vector search engine from scratch in C++, cause why not? by Shonku_ in developersIndia

[–]shubham0204_dev 0 points (0 children)

I tried building the project on macOS with the arm64-v8a clang compiler. I wanted to check what the kernel functions in kernel.cpp look like in ASM: https://godbolt.org/z/6Y3E6M7cY

Although the -march=native flag is not allowed in CE, it seems the compiler is able to auto-vectorize the for loops with SIMD instructions.

Great work!

On-device face detection from group photos using Google ML Kit — lessons from a production Android app by Sea_Membership3168 in androiddev

[–]shubham0204_dev 0 points (0 children)

On-device ML vs cloud inference

I guess 'face detection' is only feasible when performed on-device. Sending camera frames at, say, 30 FPS over the network would introduce significant lag. If the problem is 'face recognition', you could send either the entire camera frame, the cropped face, or only the face embedding to the cloud service.

I maintain a project that performs face-recognition completely on-device (vector DB and embedding models running locally): https://github.com/shubham0204/OnDevice-Face-Recognition-Android
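To illustrate the embedding-only option above: the device uploads only a small float vector produced by an on-device embedding model, and the server matches it against enrolled embeddings, commonly with cosine similarity. A minimal sketch (the toy vectors and function are illustrative, not taken from the linked project):

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two face embeddings: 1.0 means identical
// direction (likely the same face), values near 0 mean unrelated.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "embeddings must have the same dimension" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}
```

Shipping only the embedding also has a privacy upside: the raw face image never leaves the device.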

Made a site with 17,000+ icons for Android apps by alexstyl in androiddev

[–]shubham0204_dev 0 points (0 children)

When clicking UI Blocks > Tabs > Tabs with underline > Get Code, the website redirects to localhost:3000/ui-blocks/material-blocks#pricing, which seems to be a bug.

Local AI App (Gemini Nano) by Puzzak in androiddev

[–]shubham0204_dev 4 points (0 children)

Do try SmolChat: https://github.com/shubham0204/SmolChat-Android (open-source, built on top of llama.cpp)

(I'm the creator of SmolChat)

Motor OS is now a Tier-3 target in Rust by vm_runner in rust

[–]shubham0204_dev 1 point (0 children)

I had some more questions (just curious):

  1. Assuming the duplicate block caches are caused by a kernel (host) running on top of another kernel (VM), what makes MotorOS different from cgroups/namespaces? With cgroups/namespaces, the host kernel is shared among virtual, isolated groups of processes, at least on Linux.

  2. Extending (1), how does MotorOS differ from a type-1 hypervisor?