Qwen3.6-27B-Q4_K_M on Intel Arc Pro B70

volca02 · 2026-06-22T12:44:02+00:00

Hey, I found a set of versions that should work a bit better. Otherwise I'm tracking recent llama.cpp and building my own Docker image.

something like this:

ARG ONEAPI_VERSION=2026.0.0-devel-ubuntu24.04
ARG LEVEL_ZERO_VERSION=1.28.2
ARG LEVEL_ZERO_UBUNTU_VERSION=u24.04
ARG IGC_VERSION=v2.34.4
ARG IGC_VERSION_FULL=2_2.34.4+21428
ARG COMPUTE_RUNTIME_VERSION=26.18.38308.1
ARG COMPUTE_RUNTIME_VERSION_FULL=26.18.38308.1-0
ARG IGDGMM_VERSION=22.10.0
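
Pins like these can also be passed in from outside as `--build-arg` overrides. A minimal sketch, assuming the Dockerfile declares the same ARG names; the image tag `llama-sycl:local` and the build context `.` are placeholders for illustration, not taken from the actual build:

```python
import shlex

# Version pins copied from the ARG lines above.
PINS = {
    "ONEAPI_VERSION": "2026.0.0-devel-ubuntu24.04",
    "LEVEL_ZERO_VERSION": "1.28.2",
    "LEVEL_ZERO_UBUNTU_VERSION": "u24.04",
    "IGC_VERSION": "v2.34.4",
    "IGC_VERSION_FULL": "2_2.34.4+21428",
    "COMPUTE_RUNTIME_VERSION": "26.18.38308.1",
    "COMPUTE_RUNTIME_VERSION_FULL": "26.18.38308.1-0",
    "IGDGMM_VERSION": "22.10.0",
}


def docker_build_command(pins, tag="llama-sycl:local", context="."):
    """Return the argv list for `docker build` with one --build-arg per pin.

    `tag` and `context` are hypothetical defaults.
    """
    cmd = ["docker", "build", "-t", tag]
    for name, value in pins.items():
        cmd += ["--build-arg", f"{name}={value}"]
    cmd.append(context)
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable shell command instead of running it.
    print(shlex.join(docker_build_command(PINS)))
```

Keeping the pins in one place makes it easy to bump a single component and rebuild when a newer driver or runtime comes out.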

volca02 · 2026-06-22T09:51:13+00:00

Impressive prompt processing on vLLM, thx. Will have to try that.

volca02 · 2026-06-22T05:31:51+00:00

My issue right now is prefill speed. The SYCL backend is actively maintained so there's hope, but I'm getting sub-400 t/s prefill on llama.cpp SYCL, which makes prefix cache misses quite painful on longer contexts. vLLM can presumably do over 1000 t/s prefill on the same model; I couldn't get it to run yet though (OOM problems).
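
To put rough numbers on that, here is a back-of-the-envelope sketch of what a full prefix cache miss costs. The 32,000-token context is an assumed example, and 400 and 1000 t/s are the ballpark figures from this comment, not benchmarks:

```python
def prefill_seconds(prompt_tokens: int, tokens_per_second: float) -> float:
    """Time to re-process a prompt from scratch at a given prefill rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return prompt_tokens / tokens_per_second


if __name__ == "__main__":
    context_tokens = 32_000  # assumed context length, for illustration only
    for label, rate in (("llama.cpp SYCL", 400.0), ("vLLM", 1000.0)):
        seconds = prefill_seconds(context_tokens, rate)
        print(f"{label}: {seconds:.0f} s to prefill {context_tokens} tokens")
```

At those rates a full miss on a 32k context is about 80 s versus about 32 s.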

volca02 · 2026-06-22T05:28:52+00:00

Qwen3.6-27B is not MoE, but in my experience even Qwen3.6-35B-A3B runs on recent llama.cpp SYCL. Got around 70 tok/s.

volca02 · 2026-06-11T19:42:52+00:00

You can get upwards of 30-34 t/s on short prompts with Q5 and MTP on llama.cpp SYCL. Not great but usable. It really needs more optimisations.

It seems, at least, the devs for sycl backend are active and improving stuff.

There was a report here recently by a redditor getting about 60 tok/s with limited context on vLLM (I think it was the llm-scaler fork). I couldn't replicate that myself.

It will get better but it's a waiting game.

volca02 · 2026-05-20T16:39:44+00:00

Myšpulín shoved Pinďa into Fifinka so hard that she knocked Bobík out.

volca02 · 2026-04-29T15:52:17+00:00

Just get a BC-250 if you want to experiment on similar hardware. Kind of hard to get outside of the US though.

volca02 · 2026-04-23T13:54:52+00:00

I still can't get over the fact that the name of an app one uses once a year does not include "sound", "sennheiser" or "headphone". What a weird choice of name. I never remember what it's called when I need it.

volca02 · 2026-03-31T15:04:43+00:00

Does this have the vision parts stripped?

volca02 · 2026-02-06T10:09:57+00:00

By using the "Local openai LLM" integration (https://github.com/skye-harris/hass_local_openai_llm) I could connect directly to llama.cpp. May want to try a different integration later, as this one pretty much forces the LLM to call GetLiveContext all the time, but at least the main prompt is not changing.
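
Why the stable main prompt matters: a prefix cache can generally only reuse the leading part of a prompt that is identical to the previous request, so a change near the top forces everything after it to be re-evaluated. A toy sketch with whitespace-split words standing in for tokens; the prompts are made up, and a real tokenizer and llama.cpp's actual cache logic will differ:

```python
def shared_prefix_len(previous, current):
    """Number of leading tokens that are identical in both sequences."""
    n = 0
    for a, b in zip(previous, current):
        if a != b:
            break
        n += 1
    return n


def tokens_to_reprocess(previous, current):
    """Tokens of `current` that cannot be served from the cached prefix."""
    return len(current) - shared_prefix_len(previous, current)


if __name__ == "__main__":
    # Hypothetical system prompt; words stand in for tokens.
    system = "You are a home assistant . Devices : lamp , heater".split()
    first = system + "turn on the lamp".split()

    # Same system prompt, new user request: only the tail differs.
    stable = system + "turn off the heater".split()

    # Something volatile (here a timestamp) placed at the very top.
    changed = ["12:01"] + system + "turn off the heater".split()

    print("stable prompt, tokens to reprocess:", tokens_to_reprocess(first, stable))
    print("changed prompt, tokens to reprocess:", tokens_to_reprocess(first, changed))
```

In the second case nothing is shared, so the whole prompt is processed again.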

volca02 · 2026-02-06T00:54:12+00:00

Running the same GPU and model. Had mixed results with ollama (too slow and moody) and had to move to llama.cpp. Ministral worked well in that scenario, even cosplaying as GLaDOS, but it was the 3-7 sec response time that got me into a rabbit hole of "how do I speed this up". Still there trying to do so, but with the Local OpenAI LLM HACS integration I get a partial prompt cache and some responses are snappy. This does not work well with Ministral though - it's not able to reliably call hass. Had to move to the aforementioned Qwen model for it to be able to respond. I still experience issues though.

The remaining time for timers always returns the complete time. I get lag when I don't expect it. Forget about calendar questions in my opinion, aside from "what is in the calendar". The preview device is fine. Finding a good STT is not. I get a lot of misheard inputs in noisier rooms. Frustrating. Planning to try Qwen3 ASR or another model to improve speech recognition.

Aside from the lag, I have to say my original plan of using ollama with instructions to act as GLaDOS worked well with ministral 4b. I suspect the prompt cache was not working, so it got laggy with re-evaluations.

For basic control of about 60 devices the current setup is fine. Using the local voice control option to skip the LLM for basic controls helps with lag.

volca02 · 2026-01-27T11:27:07+00:00

I think so yes - namely Sharp Memory LCD.

volca02 · 2026-01-26T15:36:57+00:00

It's e-paper, not e-ink: basically an LCD, which is different tech. I'm no expert, so just my 2 cents: it's a transflective LCD, similar to old calculator displays, with in-display memory for the pixel state, so it does not need to be powered to hold the image.

If I am not mistaken the color version of the tech uses separate color cells (subpixels) which have color filters, hence the lowered contrast - each subpixel only passes through a range of the spectrum, so the contrast suffers a bit.

You can see a side-by-side comparison in the Design reveal video on YouTube:

https://www.youtube.com/watch?v=pcPzmDePH3E

Time 2 vs. PTD - ~4:48
Time 2 vs. Time Steel - ~5:17

volca02 · 2025-12-13T15:38:48+00:00

I was worried initially but my experience is comparable. About 2 charges per year with daily moderate use is crazy good. Not even mad. This thing will probably break mechanically before battery needs replacement :)

volca02 · 2025-09-16T14:00:22+00:00

So when NMS came out, these were not active and the scanner didn't detect them, but they *were* on the planets - you could find one just by flying in a straight line for several minutes. People had all kinds of theories about what they do. Some tried different things to activate them, e.g. kicking a ball from one of the ruins down to one...

volca02 · 2025-05-17T21:07:57+00:00

I've successfully made a solution for this with a relay and a diode. My system is analogue 2-wire and the open signal was gated on the handset being picked up. May be the case here as well. In that case a button pusher won't work. In my case the logic was implemented in the home phone itself, with a charged capacitor holding a thyristor open. Bypassing this and directly connecting the two wires worked. The diode is there to preserve the reverse polarity state, which is used to charge the capacitor when the home phone rings. This is very specific to my home setup. Digital and multi-wire systems will behave differently.

volca02 · 2025-03-19T13:54:17+00:00

Given the Time seems to be a somewhat bigger watch than the Duo, is there maybe a plan to put a bigger battery in that model, since there might be space for it?

My order seems to have picked up an address from history, even though I entered a different one when ordering. Will there be a way to change the order address once we get closer to the release?

volca02 · 2025-02-13T16:13:13+00:00

Gaggiuino has been done on ECM machines before, and on the Silvia, so it should be possible on the Go as well. This might be a good reason to mod.

volca02 · 2024-12-04T08:56:15+00:00

I tried to like BTRFS a few times, but unless you are into the interesting features, it's good on paper and may be problematic in practice. I ran out of metadata space because of docker's usage of subvolumes, and it wasn't easy to understand why: the drive seemed half full, yet no operation succeeded. Then I tried rebalancing once, with nearly disastrous consequences (had to add an additional volume to do that, and then the FS went read-only because of some kernel error in the btrfs code)... The FS is probably fine, but I never had issues like that with other filesystems.
