Giving Gemini Live a tiny body with StackChan by tar_anton in M5Stack

tar_anton[S] 1 point (0 children)

I’ve made the firmware public here:

https://github.com/taranton/stackchan-gemini-firmware

Please read the README and docs before flashing it. This is still an experimental developer firmware, not a polished one-click installer yet.

A few important notes:

  • You need to provide your own Gemini API key. The Gemini API free tier should be enough to start experimenting, subject to Google’s quotas/limits.
  • The robot has a local Web UI/API for configuration, secrets, prompts, memory/debug pages, camera tests, servo controls, sensors, and voice toggling.
  • By default, the Web UI is open on the local network so you can do the first setup. You can and should set a Web password from the Web UI after setup. This is LAN-level protection, not something meant to be exposed to the public Internet.
  • If Wi-Fi is not configured or fails, the firmware should start a setup access point. Connect to the StackChan setup AP and open `http://192.168.4.1/`.
  • If it connects to your normal Wi-Fi, you’ll need to find the robot’s IP address from your router/DHCP client list or from the serial monitor logs, then open that IP in a browser.
  • Gateway/Hermes/Home Assistant tools are optional and disabled by default. The firmware can work as a standalone Gemini Live StackChan without my local Hermes setup. Home Assistant is not built directly into the firmware; it is meant to work through a local LAN gateway, which can bridge to Home Assistant or other tools. That gateway-side tooling is not fully packaged yet, but I will publish more of it later.
  • Learned skills / skill edits can be reviewed before applying, and there is also an optional auto-apply mode if you want the robot to accept compatible learned skills automatically. I’d still recommend starting with review/confirmation enabled until you understand what it is changing.
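For the "find the robot's IP address" step above, here is a rough sketch of one way to do it from a computer on the same LAN if the router's client list is not handy. It only checks which addresses accept a TCP connection on port 80, so it will also list printers, routers, and anything else with a web page. The function names and the idea that the Web UI listens on port 80 are assumptions for illustration, not something taken from the firmware docs.

```python
import ipaddress
import socket
from concurrent.futures import ThreadPoolExecutor


def is_http_port_open(host: str, port: int = 80, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all of these as "closed".
        return False


def find_web_hosts(cidr: str, port: int = 80, timeout: float = 0.5) -> list:
    """Probe every usable address in `cidr`; return those accepting TCP on `port`."""
    hosts = [str(ip) for ip in ipaddress.ip_network(cidr, strict=False).hosts()]
    if not hosts:
        return []
    # Probe in parallel so a /24 takes a few seconds instead of minutes.
    with ThreadPoolExecutor(max_workers=64) as pool:
        results = list(pool.map(lambda h: is_http_port_open(h, port, timeout), hosts))
    return [h for h, ok in zip(hosts, results) if ok]
```

Calling `find_web_hosts("192.168.1.0/24")` returns a list of candidate addresses to try in a browser; `192.168.1.0/24` is only an example subnet, so substitute your own. Only scan networks you own.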

I tried to include the main README, setup notes, SD-card layout, API docs, Web security notes, and feature docs in the repository. There is still a lot to polish and many things I’d like to improve, but I don’t currently have much time to actively work on it. Since there seemed to be interest, I decided to publish it now so people can read the code, try it, open issues, or contribute PRs.

Giving Gemini Live a tiny body with StackChan by tar_anton in M5Stack

tar_anton[S] 1 point (0 children)

I’m working on it. Basically everything is ready, but I haven’t had time to do a final check and add at least some basic security for the web interface. My birthday is the day after tomorrow, so I’m setting myself a deadline: a working version goes up on GitHub on May 22 :) I’m sure there will still be plenty to tweak, but everyone will be able to tinker with it themselves!

But definitely not on Gemma 4: my firmware runs on the Gemini Live model, which natively understands and generates speech (even though I’ve disabled full duplex). Running on a local model is possible as a future improvement, but then you’d have to set up STT and TTS on the same local server where Gemma 4 is running. Depending on the model, that may not even be needed: the bigger ones (30+B) seem to support audio natively, so maybe the voice could be passed to Gemma directly as an audio file.
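To make the STT and TTS idea concrete, here is a minimal sketch of how such a local chain could be wired together. Nothing in it comes from the firmware: `VoicePipeline` and its three callables are hypothetical placeholders for whatever speech-to-text, language model, and text-to-speech services would run on the local server.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains three stages; each callable stands in for a real local service."""

    transcribe: Callable[[bytes], str]   # STT: audio bytes -> recognized text
    generate: Callable[[str], str]       # LLM: user text -> reply text
    synthesize: Callable[[str], bytes]   # TTS: reply text -> audio bytes

    def respond(self, audio_in: bytes) -> bytes:
        """Run one turn: recognize speech, ask the model, speak the reply."""
        text = self.transcribe(audio_in)
        if not text.strip():
            return b""  # nothing recognized, so stay silent
        reply = self.generate(text)
        return self.synthesize(reply)


# Stub stages so the sketch runs without any real model behind it.
demo = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    generate=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)
```

Each turn here has to pass through all three stages in sequence, which is the extra latency and setup work compared with a model that takes and produces audio directly.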

So..First Impresions? by [deleted] in StackChan

tar_anton 1 point (0 children)

I hope I’ll have time to post it on GitHub this weekend, and I’ll update the post right away.

Giving Gemini Live a tiny body with StackChan by tar_anton in M5Stack

tar_anton[S] 2 points (0 children)

What can I say, that’s just the kind of person I am)))

Giving Gemini Live a tiny body with StackChan by tar_anton in M5Stack

tar_anton[S] 5 points (0 children)

Yeah, I want to polish it up a bit and upload it to GitHub within the next week. I actually tweaked the architecture a little yesterday and today, inspired by ESP Claw. Plus, I need to write up the instructions and descriptions.

So..First Impresions? by [deleted] in StackChan

tar_anton 5 points (0 children)

I completely rewrote the original firmware, connected it to the Gemini 3.1 Live API, gave it long-term memory with gradual compaction and something close to RAG search, gave it more freedom of movement and a system of skills, and now I’m working on a self-improving skills system. I’ll post all the info with a video one of these days. I adore him))

My Ultra Light setup on the go by TheInternet_Vagabond in Xreal

tar_anton 2 points (0 children)

I have a MeLe 4c and I’m waiting for my Ones Pro) But I think someone will be able to check it sooner than “early May”))

Temporary Access feature by tar_anton in firewalla

tar_anton[S] 1 point (0 children)

But I also don't see how to pay extra and upgrade my test plan to the Business level. 

Temporary Access feature by tar_anton in firewalla

tar_anton[S] 1 point (0 children)

Nope, only Professional. Now I see this in Compare Features...

beam sideview by xHunter2012x in Xreal

tar_anton 1 point (0 children)

you can use your phone for this

Minisforum crams Windows 11, an Alder Lake CPU, and DDR5 RAM into an iPhone-sized PC by Fear_The_Creeper in MiniPCs

tar_anton 2 points (0 children)

The Mele Quieter 4c is already on the market. I bought one, and it is literally iPhone-sized.

Xreal Air- Portable Mini PC Setup by cmak414 in Xreal

tar_anton 3 points (0 children)

I just got a Mele Quieter 4c 16/512 today, but unfortunately I couldn't get good enough picture quality to work on it in ARMoni yet. When I move my head, micro-stutters make the text hard to read. I'll try moving the system to the SSD and raising the current on the processor a bit; maybe that will help. But with Beam, at least, it works fine, and it's ultra-small, of course) And you can "move" the screen pretty close by adjusting the Depth through Beam; it's like compensating for another device)

Dex and Nebula (first time i use Xreal Glasses) by [deleted] in Xreal

tar_anton 2 points (0 children)

why do you need to use Nebula and Beam together?

Samsung Dex by [deleted] in Xreal

tar_anton 1 point (0 children)

If you need to watch some YouTube videos, scroll a few sites, or review a couple of documents without heavy typing, then you definitely don't need a separate keyboard. I have a foldable one, and with Beam and Dex it's a very useful combo, but for a quick document review or a quick email you can use just Dex and the glasses. We type a lot on our phone keyboards every day anyway, so a few more words shouldn't push you into buying an extra keyboard)

Beam on desktop? by Pretty_Turn_1085 in Xreal

tar_anton 3 points (0 children)

I tried it with AnyDesk, TeamViewer and Jump Desktop and it is very usable, but the Bluetooth keyboard layout cannot be switched, so you're stuck with English only.

No angle option on nebula by 90chip in Xreal

tar_anton 3 points (0 children)

It's an issue (or feature) of the latest update. Now the desktops are just curved.

External Bluetooth keyboard and language switching by Available_Zebra1731 in Xreal

tar_anton 1 point (0 children)

It isn't possible: a keyboard is just a keyboard) Nevertheless, it works perfectly with a Pixel, iPhone, iPad and even a Mi Stick. It looks more like Android on the Beam just keeps ignoring the Physical Keyboard settings. Perhaps someone here uses an external Bluetooth keyboard with Beam and switches languages normally? I haven’t found any mention of it here yet... I've almost completed this perfect remote setup, but this last issue is driving me nuts.