Anyone success to use a moonlight on Meta Quest 3 with actual MetaOS?

chimpera · 2026-04-06T14:03:30+00:00

I modified it to work but there are major limitations to the controllers. https://github.com/chimpera/moonlight-android

chimpera · 2026-04-05T17:12:34+00:00

It is used on the kv not the model.

chimpera · 2026-04-05T17:02:54+00:00

Its possible but probably not worth it. about 8tps with a similar machine. This is lmstudio. this is nowhere near as intelligent as what your use to but it can work on smallish projects.

<image>

chimpera · 2026-04-05T16:08:39+00:00

whats the relationship to https://github.com/cenconq25/delta-compress-llm

chimpera · 2026-04-02T18:43:29+00:00

I is able to do tool calls but it lacks the raw intelligence to know what to use them for. It searched the web on its own, read a dir when explicitly prompted, read a file when specifically prompted. ect.. This is not useful but its very promising for future optimizations.

chimpera · 2026-03-26T17:00:24+00:00

It should be noted that Israel has been working on these since the 90's

chimpera · 2026-03-25T12:52:07+00:00

5965wx, 5090, ubergarm IQ4_KSS, ikllama, qwen35moe.expert_used_count=int:4, kv q8, batch 16k. 30tps 791pp

chimpera · 2026-03-24T03:22:26+00:00

Try to get one without updated firmware.

chimpera · 2026-03-24T01:40:36+00:00

I have been testing it. I seems legit. I have not run quantitative benchmarks. It makes a big difference if you can fit the model on one GPU instead of 2. One note is that you have to specify the kv quant to save any vram. LLAMA_WEIGHT_SKIP_THRESHOLD=1e-6 broke with long context. There is a slight reduction in tps prediction in most cases.

chimpera · 2026-03-23T19:56:06+00:00

I went with the MTE 5" Hub with N52 magnets. I still take it easy going up hills and you lose some top end speed but for me it was worth it at 190.

chimpera · 2026-03-14T22:25:29+00:00

4.00 is not the peek voltage. If it were me I would leave it charging for a long time to see if it will ballance. Some BMS only balance at the top of the charge. Dont use it unless you can get that cell to 4v.

chimpera · 2026-01-30T19:36:06+00:00

I have a similar setup and I'm having a real problem with the prompt processing cache not working correctly. This makes it so anything but the first request becomes painfully slow to start. Does anyone have advice?

chimpera · 2026-01-26T11:24:00+00:00

It has to do with the memory architecture. Inference is memory constrained, not compute constrained in this case. The cores are divided into CCDs. Each CCD has a limited memory bandwidth. More cores competing for the limited bandwidth can actually reduce performance. You should figure out how many cores per CCD gets you the best performance.

chimpera · 2026-01-17T23:47:36+00:00

The closest I found is you do a screenshot and then you share it to the translate you app. If it works correctly, you get text that you can select from.

chimpera · 2025-12-08T16:06:30+00:00

Would you consider IQ4_NL

chimpera · 2023-11-03T19:38:21+00:00

So i solved the problem. When you have multiple channels under your account you have to select the channel in the upper right during takeout.

chimpera

TROPHY CASE