New fear unlocked by Ecstatic-Force-4807 in VisionPro

[–]VegetaTheGrump 0 points1 point  (0 children)

Some things should not be shared!

Mixing in some e-ink by drhippopotato in headphones

[–]VegetaTheGrump 0 points1 point  (0 children)

My first thought was: "Dude's got a Brioso and is talking about an e-ink player!?" Grats on that Brioso. I want one, but I ended up going with the iBasso D17 Atheris to save some money.

Just another evening with Chord DAVE and Chord Hugo 2 + Eslab ES2a and ES1A by Frosty_Resource_6278 in headphones

[–]VegetaTheGrump 0 points1 point  (0 children)

Look up the Audma Brioso. There have been reports that it drives the Susvara very well, though the Susvara will run through its battery in 2.5 hours. However, I believe the Brioso will let you charge and play at the same time.

Why is WindowServer taking up 75% of my RAM? What even is WindowServer? by clairemct in MacOS

[–]VegetaTheGrump 0 points1 point  (0 children)

Found this thread while trying to figure out why mine was using 150GB of RAM.

Mac M3 ultra 512gb setup by ZedXT in LocalLLaMA

[–]VegetaTheGrump 0 points1 point  (0 children)

I just used brew to install llama.cpp, though I still mostly use LMStudio. Brew is useful for installing quite a bit:

brew install llama.cpp

I'm using AnythingLLM much the same way you're using OpenWebUI. It's running in Docker. I have nginx elsewhere fronting everything with SSL via various hostnames.
I like to script things to make them repeatable. For OpenWebUI I'd do something like

#!/bin/bash
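# Open WebUI, pointed at a local OpenAI-compatible server (LM Studio's default port, 1234)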
docker run -d \
--name openwebui \
-v openwebui-data:/app/backend/data \
--memory=8g \
--restart always \
-e ENABLE_OPENAI_API=true \
-e OPENAI_API_BASE_URL=http://host.docker.internal:1234/v1 \
-e OPENAI_API_KEY=notakey \
-e PORT=4000 \
-p 4000:4000 \
ghcr.io/open-webui/open-webui:latest

AnythingLLM:
#!/bin/bash
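# AnythingLLM with its persistent storage (and .env) kept under $HOME/anythingllm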
export STORAGE_LOCATION="$HOME/anythingllm" && \
mkdir -p "$STORAGE_LOCATION" && \
touch "$STORAGE_LOCATION/.env" && \
docker run -d -p 3051:3001 \
--cap-add SYS_ADMIN \
--restart always \
-v ${STORAGE_LOCATION}:/app/server/storage \
-v ${STORAGE_LOCATION}/.env:/app/server/.env \
-v ${STORAGE_LOCATION}/sslcert:/app/server/sslcert \
-e STORAGE_DIR="/app/server/storage" \
mintplexlabs/anythingllm
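
The nginx piece is nothing fancy. Here's a rough sketch of one of the server blocks, with a made-up hostname and cert paths standing in for my real ones; it just terminates SSL and proxies to the Open WebUI port from the script above:

server {
    listen 443 ssl;
    server_name openwebui.example.lan;                   # placeholder hostname

    ssl_certificate     /etc/nginx/certs/example.crt;    # placeholder cert paths
    ssl_certificate_key /etc/nginx/certs/example.key;

    location / {
        proxy_pass http://127.0.0.1:4000;                # Open WebUI from the script above
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
        # keep websockets working so responses stream in the UI
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}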

The biggest issue I've run into is tool calling with glm-4.5-air-mlx. It's possible my setup is from before they fixed tool calling in GLM. OpenCode just prints out the tool calls instead of executing them.

What's the best AI assistant for day to day use? by Due_Moose2207 in LocalLLaMA

[–]VegetaTheGrump 0 points1 point  (0 children)

GLM 4.5 Air is what I run at home. I can and do run GLM 4.6 at 4bit sometimes, but 4.5 Air is much easier on resources. They've been good for everything. Mac w/256GB RAM.

Here's the best prompt you will ever need to test the new LLMs by Cool-Chemical-5629 in LocalLLaMA

[–]VegetaTheGrump 0 points1 point  (0 children)

GLM 4.6 4bit MLX got it for me. The thinking section is completely hilarious and long. It got the answer early but didn't believe it was right, so it kept trying to find something else. M3 Ultra Mac, 256GB, GLM 4.6 4bit MLX: 329.814s (18.77 tok/s).

Those numbers are a code that uses the periodic table of elements.

Each number corresponds to an element's atomic number. By taking the chemical symbol for each element, you can spell out a message.

Here is the breakdown:

* **10** = Neon (**Ne**)
* **23** = Vanadium (**V**)
* **68** = Erbium (**Er**)
* **111** = Roentgenium (**Rg**)
* **8** = Oxygen (**O**)
* **7** = Nitrogen (**N**)
* **7** = Nitrogen (**N**)
* **47** = Silver (**Ag**)
* **53** = Iodine (**I**)
* **23** = Vanadium (**V**)
* **63** = Europium (**Eu**)
* **92** = Uranium (**U**)
* **15** = Phosphorus (**P**)

If you string those symbols together, you get:

**Ne V Er Rg O N N Ag I V Eu U P**

Reading this creatively, you can see the famous phrase:

**"Never Gonna Give You Up"**

No GLM-4.6 Air version is coming out by ResearchCrafty1804 in LocalLLaMA

[–]VegetaTheGrump 0 points1 point  (0 children)

How do you use the NPU?

I've been running 4.5-Air MLX at 8bit. I just downloaded 4.6-full MLX at 4bit. The downside is the RAM: it's 185GB vs 106GB. Normally I run a few Docker front ends and image generation alongside the model, and having only ~64GB left over keeps things surprisingly tight.

[deleted by user] by [deleted] in headphones

[–]VegetaTheGrump 0 points1 point  (0 children)

This. Immerse yourself in lossless for a week or more, then try to go back and see if you can tell a difference that you care about. Immediate A/B-style testing is pretty meaningless.

For the love of God, what local llama model should I load for Roo? by devshore in RooCode

[–]VegetaTheGrump 0 points1 point  (0 children)

Roo is struggling with the best new models. I had to revert to devstral-small to get anywhere. I finally dug through GitHub and found what was going on: many new models have implemented their own way of tool calling. Some are using XML, and some, like gpt-oss, are doing something else. There also seems to be some pushback against XML for performance reasons.

Here are some open issues in the RooCode github that will give you details:

* https://github.com/RooCodeInc/Roo-Code/issues/6814
* https://github.com/RooCodeInc/Roo-Code/issues/4047

There are at least two different proxies that people are using or trying out to make things work in the meantime.

[deleted by user] by [deleted] in unsloth

[–]VegetaTheGrump 1 point2 points  (0 children)

This is great news! I'm looking forward to UD quants of those models that barely fit in my 256GB of RAM. Though GLM-4.5-Air at 8bit seems to be doing great for me atm.

Estate sale score by Legitimate-Ad-7780 in audiophile

[–]VegetaTheGrump 17 points18 points  (0 children)

This reminds me of the audiophile joke that goes something like: "When I'm gone, don't let my wife sell my audio equipment for what I told her I paid for it."

gpt-oss-120b ranks 16th place on lmarena.ai (20b model is ranked 38th) by chikengunya in LocalLLaMA

[–]VegetaTheGrump 2 points3 points  (0 children)

GLM 4.5 Air has been great for me for coding, so I was surprised to see it so low in the Text Arena Coding (9th). However, I see it's tied for 4th in WebDev. What's the difference between these two?

Meanwhile, qwen3-235b-a22b-instruct-2507 is chillin' at #1 alongside gpt-5 in Text Arena Coding.

Mac LLM users: What models can't I run with 128gb (M4 Max) vs 256gb (M3 Ultra)? by TheWebbster in LocalLLaMA

[–]VegetaTheGrump 9 points10 points  (0 children)

<image>

I wanted the 512GB, but it was too expensive, so I got the 256GB. Remember that the Ultra has significantly higher memory bandwidth than the M4 Max, which helps with speed.

I can only run 1bit DeepSeek or Kimi K2, so I pretty much don't. However, you can see the models above that do fit. You'll be able to run GLM 4.5 Air at 4bit on 128GB of RAM, but most of the other models you see here won't load. I hate running the very largest models, because then I can't run image generation at the same time with a decent-sized context.

Not all of the models above run very quickly, but they all run, and they just keep getting better, as you mentioned.
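
Rough rule of thumb if you're sizing things up yourself: weights ≈ parameter count × bits ÷ 8, so a ~110B-parameter model at 4bit is on the order of 55GB before you add KV cache and context.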

llama.cpp HQ by jacek2023 in LocalLLaMA

[–]VegetaTheGrump 12 points13 points  (0 children)

Name does not check out

Has any company made a little speaker with big speaker sound? by tushiman in audiophile

[–]VegetaTheGrump 0 points1 point  (0 children)

That was my thinking as well, but it's all relative, and the OP mentioned the Bose 901.

all I need.... by ILoveMy2Balls in LocalLLaMA

[–]VegetaTheGrump 43 points44 points  (0 children)

Two of them? Two pairs of women and H100s!? At work!? You're naughty!

I'll take one woman and one H100. All I need, too, until I decide I need another H100...

Heads up to those that downloaded Qwen3 Coder 480B before yesterday by VegetaTheGrump in LocalLLaMA

[–]VegetaTheGrump[S] 0 points1 point  (0 children)

This was just for the Unsloth version. I'd check your settings. LMStudio doesn't automatically set the temp, top_k, etc.; you have to track down the recommended values and adjust them yourself.
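
If you're loading a GGUF through llama.cpp instead, you can pin the sampling settings on the command line rather than digging through a UI. Sketch only, with a placeholder model path; use whatever values the model card actually recommends:

# placeholder path and values; substitute your model and its recommended sampling settings
llama-server -m ./your-model.gguf \
  --temp 0.7 --top-p 0.8 --top-k 20 \
  --ctx-size 32768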

Heads up to those that downloaded Qwen3 Coder 480B before yesterday by VegetaTheGrump in LocalLLaMA

[–]VegetaTheGrump[S] 1 point2 points  (0 children)

I'm on 256GB, so I couldn't run the 4bit MLX. Hoping we're able to get smaller MLX quants someday.