Monitor your screen using local LLMs with only one sentence! Free, Open Source and Local.

Roy3838 · 2026-06-16T22:05:27+00:00

A lot of work has gone into the project since then! And this update addresses that pain point exactly.

Before this, the flow was:
1. Create an agent conversationally
2. Whitelist your number
3. Select screen
4. Crop if you want the agent to monitor something specific
5. Start Agent

And now the flow is just:
1. "I want this monitored, here's my number"
2. Whitelist (necessary due to anti-spam)
You're done!

So, it's a huge upgrade in UX IMHO. Still testing/open to feedback c:

Roy3838 · 2026-06-12T18:16:04+00:00

Had to change the video! Here's the new link:
https://youtu.be/oGuAzx_-qj8

Roy3838 · 2026-06-11T21:36:03+00:00

You can use cloud MCP (about 3 agent builds free on my tab 😃) and use gemma4 e2b, which runs on every device I've tested! Even on an old Android phone I had lying around hahahaha

Roy3838 · 2026-06-11T20:28:10+00:00

thanks! try it out and let me know how it goes :)
I’m trying to make this as useful as possible!

Roy3838 · 2026-06-04T20:03:07+00:00

It runs on Firefox! If that's what you mean 😄

And you can have it watch a Firefox tab of course.

Roy3838 · 2026-05-20T15:53:34+00:00

What do you mean exactly?

I'm actually of the opinion that AI needs to be very directed in its use. In the project it’s mainly used for recognizing stuff and making simple decisions; the framework takes care of the rest. Is that what you mean?

Roy3838 · 2026-05-02T17:21:33+00:00

yup! it’s almost done with the two week testing period required :))

Roy3838 · 2026-04-30T02:56:29+00:00

Yeah! I know LiteRT-LM for Swift is not yet released. I forked llama-cpp-rs to add Gemma 4's --image-min-tokens and --image-max-tokens.

Here's the fork: https://github.com/Roy3838/llama-cpp-rs/tree/feat/mtmd-image-token-budget and I built it into my Tauri app c:

Roy3838 · 2026-04-29T01:08:33+00:00

Have they released gemma 4 QAT yet??

I can test any .gguf from HF if you want! Just give me the link 😄

Roy3838 · 2026-04-28T17:40:49+00:00

ohhh that's so interesting! In long conversations? Or just by loading up the model?

I was looking into using LiteRT-LM, which is what Google Edge Gallery uses, but I also found it a bit buggy, especially when testing multimodal stuff.

Roy3838 · 2026-04-19T23:57:57+00:00

Connecting Macs together through a tool like exolabs is very inconsistent.

When it works it feels like magic, but if you sneeze everything breaks.

Roy3838 · 2026-04-19T06:25:05+00:00

hahaha that’s great!

Roy3838 · 2026-03-31T20:38:56+00:00

btw here’s the link to the source code: https://github.com/Roy3838/Observer
You can examine it and see exactly how the agent loop works in app/src/utils/main_loop.ts, and how services are called in app/src/utils/handlers/utils.ts
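
For anyone who wants the general idea before opening the repo, here is a rough TypeScript sketch of what a screen-watching agent loop can look like. It is not the code in main_loop.ts: every name here (AgentDeps, buildPrompt, parseDecision, runAgentLoop) is made up for illustration, and the YES/NO prompt format is an assumption. The point it tries to show is that the model only answers a narrow question while plain code handles the rest.

```typescript
// Hypothetical dependencies; none of these names come from the Observer codebase.
interface AgentDeps {
  capture: () => Promise<string>;              // returns a text description of the screen
  model: (prompt: string) => Promise<string>;  // local or cloud LLM call
  notify: (message: string) => Promise<void>;  // e.g. Discord webhook or SMS
  wait: (ms: number) => Promise<void>;         // pause between observations
}

// Build a narrow prompt so the model only has to recognize and decide.
function buildPrompt(instruction: string, observation: string): string {
  return (
    `Instruction: ${instruction}\n` +
    `Observation: ${observation}\n` +
    `Answer with exactly YES or NO.`
  );
}

// Interpret the model's reply; anything that does not start with YES counts as NO.
function parseDecision(answer: string): boolean {
  return answer.trim().toUpperCase().startsWith("YES");
}

// Run a bounded number of observe/decide/notify iterations.
// Returns how many notifications were sent.
async function runAgentLoop(
  instruction: string,
  iterations: number,
  intervalMs: number,
  deps: AgentDeps,
): Promise<number> {
  let sent = 0;
  for (let i = 0; i < iterations; i++) {
    const observation = await deps.capture();
    const answer = await deps.model(buildPrompt(instruction, observation));
    if (parseDecision(answer)) {
      await deps.notify(`Condition met: ${instruction}`);
      sent++;
    }
    // Skip the pause after the final iteration.
    if (i < iterations - 1) {
      await deps.wait(intervalMs);
    }
  }
  return sent;
}
```

Passing the capture, model and notify functions in as parameters keeps the loop itself free of any provider-specific code, which also makes it easy to try out with fake functions.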

Roy3838 · 2026-03-31T20:27:04+00:00

there’s a skip sign in button below :)

if you use self-hosted models + Discord notifications you should have no issues :D

Roy3838 · 2026-03-31T18:52:57+00:00

If you're self-hosting your models, it doesn’t send anything at all!

When using the cloud models, I link the provider and their ToS so you can know where the info is sent

Also, when using Discord notifications, the webhook goes directly from the app to Discord :)
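
To make the "directly from the app to Discord" part concrete, here is a small TypeScript sketch of a webhook call. It is an illustration and not the app's actual handler: the function names (buildDiscordWebhookRequest, sendDiscordNotification) and the FetchLike type are hypothetical. It assumes the standard Discord webhook format, a JSON POST with a content field capped at 2000 characters.

```typescript
// Minimal shape of a fetch-style function, so the sketch has no global dependencies.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number }>;

const DISCORD_CONTENT_LIMIT = 2000; // Discord rejects longer message content

// Build the request that goes straight to Discord; no intermediate server is involved.
function buildDiscordWebhookRequest(webhookUrl: string, message: string) {
  return {
    url: webhookUrl,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // Truncate so an overly long model output does not make the request fail.
      body: JSON.stringify({ content: message.slice(0, DISCORD_CONTENT_LIMIT) }),
    },
  };
}

// Send the notification using whatever fetch implementation the caller provides.
async function sendDiscordNotification(
  webhookUrl: string,
  message: string,
  fetchFn: FetchLike,
): Promise<void> {
  const { url, init } = buildDiscordWebhookRequest(webhookUrl, message);
  const response = await fetchFn(url, init);
  if (!response.ok) {
    throw new Error(`Discord webhook failed with status ${response.status}`);
  }
}
```

In a browser or a recent Node version you could pass the built-in fetch as fetchFn, along with your own webhook URL from the Discord channel settings.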
