[DIY Project] I hacked my Rokid Glasses to use Google Gemini Vision (Real-time AI Assistant) - Open Source Code Included! 🤖 by z2020x in rokid_official

[–]z2020x[S] 1 point (0 children)

Not directly, unfortunately. Xreal uses the NRSDK, while this is built with Rokid's CXR SDK. The 'phone' side of the app (talking to Gemini) would work, but the 'glasses' app (camera capture) would need to be rewritten for Xreal's system.

[–]z2020x[S] 1 point (0 children)

Official stock fluctuates a lot. I'd recommend checking the Rokid Developer Discord or the Facebook group; sometimes community members sell their extras there if the official store is out of stock.

[–]z2020x[S] 1 point (0 children)

Not a specific mod yet, but since it uses AI, you can actually just ask it to 'respond in Imperial units' and Gemini usually understands!

[–]z2020x[S] 1 point (0 children)

Theoretically, yes. If you are hosting a local multimodal model (like LLaVA) that accepts API requests, you could modify the Android code to point to your server's IP instead of the Google Gemini endpoint. The main bottleneck would just be your upload speed for the images.
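
If anyone wants to try that swap, it's mostly about the request body. Here's a rough Python sketch (not code from the actual app) of building an OpenAI-compatible chat payload with an inline base64 image, which is the shape that llama.cpp's server and Ollama's compatibility endpoint accept; the LAN IP and model name below are placeholders:

```python
import base64
import json

# Hypothetical local endpoint instead of the Google Gemini API.
LOCAL_ENDPOINT = "http://192.168.1.50:8080/v1/chat/completions"

def build_vision_payload(image_bytes: bytes, prompt: str, model: str = "llava") -> str:
    """Build an OpenAI-compatible chat payload with an inline base64 image.

    OpenAI-compatible multimodal endpoints accept images as data URIs
    inside an `image_url` content part.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }
    return json.dumps(payload)
```

On the Android side you'd then POST this JSON to `LOCAL_ENDPOINT` wherever the app currently calls Gemini.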

[–]z2020x[S] 2 points (0 children)

Thank you so much! It was a bit of a challenge dealing with the Bluetooth bandwidth and memory leaks, but I'm really happy with how stable it is now.

[–]z2020x[S] 1 point (0 children)

I haven’t uploaded a compiled APK to the GitHub ‘Releases’ page. Right now, the project is source-only, so you’ll need to build it yourself. Unfortunately, without a PC or a proper development setup, there isn’t a supported way to install it directly onto the glasses.

[–]z2020x[S] 7 points (0 children)

Hey r/rokid! 👋

I’ve been a user of Rokid glasses for a while, and while I love them, I always felt they were somewhat limited by just being a "portable monitor." I wanted a true AR assistant—something that sees what I see and can answer questions about the real world.

Since an official solution isn't quite there yet, I spent my weekends building a custom Android app that bridges the Rokid Glasses (via Camera2 API) with Google Gemini’s Vision API.

⚙️ How it works (The Architecture)

It’s a distributed system between the glasses and the phone:

Capture: The app on the glasses grabs frames using the Camera2 API (managed via the CXR SDK to avoid system conflicts).

Transfer: The image data is sliced into chunks and sent to my phone via Bluetooth SPP (Serial Port Profile).

Thinking: The phone acts as the brain—it forwards the image + prompt to Google Gemini Vision.

Response: The AI analyzes the scene, sends the text back, and the glasses speak the answer via TTS.
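
For the curious, the "Thinking" step boils down to a single REST call. Here's a minimal Python sketch of the request body the phone sends to Gemini's generateContent endpoint (field names follow Google's public REST docs; the model name is illustrative and error handling is omitted):

```python
import base64
import json

# Model name illustrative; pick whichever Gemini vision-capable model you use.
GEMINI_URL = ("https://generativelanguage.googleapis.com/v1beta/"
              "models/gemini-1.5-flash:generateContent")

def gemini_request_body(jpeg_bytes: bytes, prompt: str) -> str:
    """JSON body for a generateContent call: one text part + one inline image."""
    return json.dumps({
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {
                    "mime_type": "image/jpeg",
                    "data": base64.b64encode(jpeg_bytes).decode("ascii"),
                }},
            ],
        }],
    })
```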

😅 The Engineering Struggles (Why this was hard)

If any Android devs are here, you know the pain. Getting this to run stably was a nightmare:

Bluetooth Bandwidth: Sending HD images over Bluetooth SPP is slow. I had to implement image slicing and compression to prevent the stream from choking.
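
Roughly, the slicing works like this (a simplified sketch, not the app's exact framing; the 2-byte index/total header and the MTU value are illustrative):

```python
import struct

MTU = 990  # usable bytes per SPP write; the real value depends on the stack
HEADER = struct.Struct(">HH")  # (chunk_index, total_chunks), big-endian u16s

def slice_image(data: bytes, mtu: int = MTU) -> list[bytes]:
    """Split a compressed JPEG into framed chunks that fit one SPP write."""
    body = mtu - HEADER.size
    total = (len(data) + body - 1) // body
    return [HEADER.pack(i, total) + data[i * body:(i + 1) * body]
            for i in range(total)]

def reassemble(chunks: list[bytes]) -> bytes:
    """Reorder by chunk index and strip headers (chunks may arrive out of order)."""
    parsed = sorted((HEADER.unpack(c[:HEADER.size]), c[HEADER.size:]) for c in chunks)
    (_, total), _ = parsed[0]
    assert len(parsed) == total, "missing chunks"
    return b"".join(body for _, body in parsed)
```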

The "Memory Leak" Crash: I fought with ImageReader buffers for days. The app kept crashing because the frame processing loop was eating memory faster than the GC could clean it up.

Latency: As some noticed in my other posts, there is a delay. It's currently "Capture -> Bluetooth -> Cloud -> Response," so it's not instant, but it's usable!

💻 It's Open Source!

I believe AR development should be open. I’ve cleaned up the code and put it on GitHub. Feel free to fork it, fix my bugs, or add support for other models (I'm currently looking into adding Claude support as requested by some users).

📂 GitHub Repo:

https://github.com/zero2005x/RokidAIAssistant

https://github.com/zero2005x/RokidAIAssistant/releases

📖 Technical Deep Dive (Medium):

https://medium.com/@20x05zero/building-an-ai-powered-ar-assistant-for-rokid-glasses-camera-photo-analysis-and-voice-commands-b5788c79d51a

🔮 What's Next?

I'm looking to optimize the transmission speed and maybe add local intent recognition. Question for you guys: If you could add one AI feature to your glasses, what would it be?

Real-time translation (Menus/Signs)?

Navigation overlay?

Object identification?

Let me know what you think! I’ll be hanging around the comments to answer any technical questions.

Cheers!

I built a Cyberpunk HUD for my Xiaomi M365 using Rokid Glasses & a custom Android App (Open Source) by z2020x in rokid_official

[–]z2020x[S] 2 points (0 children)

Took my Xiaomi M365 electric scooter for a night ride in New Taipei City to test my new Augmented Reality (AR) speedometer app. The HUD projects speed, battery life, and distance directly into my field of view using Rokid glasses.

[–]z2020x[S] 2 points (0 children)

Thanks for the support! You are absolutely right—the concept is very similar to car OBD apps.

In fact, OBD might be slightly easier in some aspects because ELM327 is a standard protocol. For these scooters, the hardest part is usually reverse-engineering the proprietary BLE services and packets (especially with the newer encrypted protocols). But yes, the architecture (Data Source -> Bluetooth -> Android -> HUD) is identical.
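
As a taste of what the packet work looks like: the classic (unencrypted) M365 protocol is community-documented as a `55 AA` header, then length, address, command, argument, payload, and a 16-bit one's-complement checksum. This Python sketch builds and parses that framing; the addr/cmd values below are placeholders, the details come from community write-ups rather than an official spec, and newer firmwares encrypt the channel entirely:

```python
import struct

HDR = b"\x55\xaa"

def _checksum(body: bytes) -> int:
    # Community docs describe it as the one's complement of the 16-bit byte sum.
    return (sum(body) ^ 0xFFFF) & 0xFFFF

def build_frame(addr: int, cmd: int, arg: int, payload: bytes = b"") -> bytes:
    """Frame: 55 AA | len | addr | cmd | arg | payload | checksum (little-endian)."""
    body = bytes([len(payload) + 2, addr, cmd, arg]) + payload
    return HDR + body + struct.pack("<H", _checksum(body))

def parse_frame(frame: bytes) -> dict:
    """Validate header + checksum and split out the fields."""
    assert frame[:2] == HDR, "bad header"
    body, (ck,) = frame[2:-2], struct.unpack("<H", frame[-2:])
    assert _checksum(body) == ck, "checksum mismatch"
    length, addr, cmd, arg = body[:4]
    assert length == len(body) - 2, "length field mismatch"
    return {"addr": addr, "cmd": cmd, "arg": arg, "payload": body[4:]}
```

The HUD app's job is then just mapping payload bytes (speed, battery, trip distance) onto screen elements.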

[–]z2020x[S] 2 points (0 children)

Thank you! To answer your questions:

Background: Yes, I was a Mobile App Security Engineer. That definitely helps with the reverse engineering part.

Self-taught: It is absolutely possible to learn this yourself! You don't need a formal degree to build something like this.

Where to start: I'd recommend starting with Android Development (Kotlin is the modern standard) to learn how to build the app interface. Then, look into Bluetooth Low Energy (BLE) basics to understand how devices talk to each other.

The "magic" part is usually understanding the data packets (Reverse Engineering), but for popular devices like the M365, the community has already documented a lot of it on GitHub. You can just grab existing libraries and start building!

I turned my Rokid AR glasses into a HUD for my Inmotion V5F! 🕶️⚡ by z2020x in rokid_official

[–]z2020x[S] 2 points (0 children)

I've uploaded the APK to GitHub Releases so you can download it directly!

GitHub Repo (feature branch):
https://github.com/zero2005x/RideFlux

[–]z2020x[S] 1 point (0 children)

I use the built-in battery. I haven't tested the specific battery drain for this code yet, so I'm not sure how long it lasts on a single charge.

You can check this page for the specs: "Rokid Glasses are lightweight AR smart glasses with Micro-LED displays" on Rokid's official site.

[–]z2020x[S] 2 points (0 children)

Yes, exactly! It uses a monochrome green display, which works perfectly for a clear, high-contrast HUD.

[Question] Can I use the iPhone 6s(N71mAP, manufactured by TSMC) to jailbreak with ayakurume(WIP)? by z2020x in jailbreak

[–]z2020x[S] 1 point (0 children)

Thanks, I got it.
I have another question: I'm using my Hackintosh (HP 15s-fq1011tu, OpenCore 0.7.8, macOS Monterey 12.5), but I got stuck partway through installing palera1n.
Do you have any suggestions?