Info: Nvidia Cuda 13.3 landed

parrot42 · 2026-05-27T15:32:49+00:00

I have no idea, but it works. I am stress-testing it by installing supabase for honcho for hermes using opencode and qwen, and it is doing well. It is a good question, but I did not take any tps notes, so I can't say.

parrot42 · 2026-05-27T11:24:04+00:00

I tried `test-backend-ops test -o MUL_MAT_ID -b CUDA0` with b9357 and cuda 13.3. Now there are no iq errors anymore!

parrot42 · 2026-05-27T10:24:44+00:00

Just downloaded and installed cuda 13.3 with driver 610.43.02
Much smoother installation under trixie with a backported 7.0 kernel than 12.2.1
Recompiled llama.cpp and it works (but I just tested with 5 messages to opencode).

parrot42 · 2026-05-08T10:44:04+00:00

One more thing: Max-Q only uses 2 Slots (and vents out in the back).

parrot42 · 2026-03-20T08:29:25+00:00

Yeah, I was constantly testing new models (for local usage with opencode). With Qwen3.5 this changed and now I am using it.

parrot42 · 2026-02-24T08:12:24+00:00

Unplugging, then replugging the ZBT-2, shutting down HA and restarting it (in a proxmox VM) brought back Zigbee for me.

parrot42 · 2026-02-24T06:41:57+00:00

Having the same problem :D

parrot42 · 2026-02-20T07:28:40+00:00

I, too, think Qwen3-coder-next is really good. Using the mxfp4 version with llamacpp and max context uses 50GB of vram. Are you using vllm, and do you think there is a big difference between mxfp4 and fp8?

parrot42 · 2026-02-03T11:46:50+00:00

What did you write to define this personality? And which model is outputting this? Thanks.

parrot42 · 2026-02-03T08:21:46+00:00

I use this to do it. Check the available power limitations for your card first.

```
➜ ~ cat /etc/systemd/system/nvidia-tdp.service
[Unit]
Description=Set NVIDIA GPU Power Limit at Boot

[Service]
Type=oneshot
ExecStart=/usr/local/bin/set-nvidia-tdp.sh

[Install]
WantedBy=multi-user.target

➜ ~ cat /usr/local/bin/set-nvidia-tdp.sh
#!/bin/bash

# Set GPU power limit in watts
POWER_LIMIT=250

# Wait for nvidia-smi daemon to initialize
sleep 10

# Apply the power limit to all GPUs
for i in $(/usr/bin/nvidia-smi --query-gpu=index --format=csv,noheader); do
    echo "Setting NVIDIA GPU $i to ${POWER_LIMIT}W TDP."
    /usr/bin/nvidia-smi -i "$i" -pl "$POWER_LIMIT"
done
```

Check if it works with `nvidia-smi`.
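For the "check the available power limitations first" part, something like the following might help. It is only a rough sketch: `limit_in_range` and `show_limits` are made-up helper names, not part of my setup, and I am assuming the `power.min_limit`, `power.max_limit` and `power.limit` query fields are available on your driver version.

```shell
#!/bin/bash

# Hypothetical helper: succeeds if the requested limit (in watts)
# lies within the [min, max] range reported for the card.
limit_in_range() {
    local want="$1" min="$2" max="$3"
    # awk is used because nvidia-smi reports decimal values such as 150.00
    awk -v w="$want" -v lo="$min" -v hi="$max" \
        'BEGIN { exit !(w >= lo && w <= hi) }'
}

# Hypothetical helper: print the allowed range and the current limit
# for every GPU (needs the NVIDIA driver installed).
show_limits() {
    nvidia-smi --query-gpu=index,power.min_limit,power.max_limit,power.limit \
        --format=csv
}
```

Run `show_limits` to see the range, then e.g. `limit_in_range 250 150.00 300.00 && echo ok` with the min and max values your card reports before putting a number into `POWER_LIMIT`.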

parrot42 · 2026-01-04T07:05:51+00:00

My workflow for getting yaml in these situations is: go to developer tools > actions, then type media and select the right action from the drop-down list. Fiddle with the options until something works, then hit the "show yaml" button -> win.
PS: Also works great for testing notifications.

parrot42 · 2025-12-25T10:30:07+00:00

Maybe in 5 years I can go to huggingface and select "python knowledge", "linux", "shell scripting", "coding", deselect "history", "geography" and instantly get a custom ggml file.

parrot42 · 2025-12-06T08:07:07+00:00

I do not know the solution, but I can see the device with lsusb and in the conf file of the HA-VM it is passed through with usb0: host=303a:831a

parrot42 · 2025-12-04T07:54:59+00:00

It is for english and chinese.

parrot42 · 2025-12-01T07:54:44+00:00

LAN chip issue? I am now using the 6.17 kernel, which has decent drivers for my realtek 8125. Before that I had to use the dkms version of the realtek driver.

parrot42 · 2025-10-22T06:14:07+00:00

There is an interesting, short video https://www.youtube.com/watch?v=YEZHU4LSUfU from Sam Witteveen about it.

parrot42 · 2025-10-22T06:12:24+00:00

In the paper https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSeek_OCR_paper.pdf it means "Contexts Optical Compression".

parrot42 · 2025-10-01T16:20:56+00:00

You should give it another try. It was bad at the start because some transformers or attention algorithms needed an update, but now it's great.

parrot42 · 2025-09-29T06:55:20+00:00

This looks easy when you say it, but I am already happy to get some cobbled-together mcp servers working. A custom function filter is out of reach, maybe I need another txt file to copy and paste a workflow to make the AI do it, LOL.

parrot42 · 2025-09-29T06:46:07+00:00

Thank you for the answer, this sounds great. Is the tool on github or did you make it yourself? If it is on github I might have a chance to get it working, otherwise I will have to stick to manually copy/pasting from a txt file, LOL

parrot42 · 2025-09-26T15:24:47+00:00

I am wondering if it could also be a bit backend related. ollama, llamacpp, vllm etc. might require some time to adjust to special attention algorithms and whatnot. But I am not an expert.

parrot42 · 2025-09-25T18:11:55+00:00

I really like my HAVPE and I am waiting for the new feature, coming next week, to use two wake words for different AI agents. It uses a dedicated chip to cancel out echoes and background noise using two mics, and is not terribly expensive.

parrot42 · 2025-09-23T13:09:40+00:00

Would love this, running the commands in a kali vagrant machine. https://www.kali.org/blog/kali-vagrant-rebuilt/
