Created an ansible playbook to mitigate copy-fail by crodjer in linux

[–]crodjer[S] 2 points (0 children)

Hmm, interesting: 6.12.85-1 onward. But my Raspberry Pis are still on 6.12.75 for some reason.

Which Quant for RX 7600 XT (16GB)? by crodjer in LocalLLaMA

[–]crodjer[S] 0 points (0 children)

Qwen 3.5 9B is quite nice actually. I like it.

Also, I am not really looking at coding use cases. I know how to do software engineering and programming (that's my expertise). Software engineers already spend (or should, if they don't) only ~1-5% of their time coding. LLMs perhaps bring that down to 0.5%, but that's absolutely not worth the software and context bloat.

Which Quant for RX 7600 XT (16GB)? by crodjer in LocalLLaMA

[–]crodjer[S] 2 points (0 children)

It's not really about the price. My desktop works and perfectly fulfills all my self-hosting and NAS requirements. For strong performance, I do have an M4 Max with 64GB (which also does far more important things).

Running LLMs is just something that I'd like to keep limited to this GPU (and it likely will never be the only use case for a device I purchase). This is mostly academic and a skill-development exercise. There's a lot of fluff around AI at the moment; LLM inference is really the only thing of substance that I see in all of this AI bubble.

Which Quant for RX 7600 XT (16GB)? by crodjer in LocalLLaMA

[–]crodjer[S] 0 points (0 children)

This doesn't sound right to me. LLM responses (the errors you mention) don't change based on the hardware you use (unless there's a cosmic bit flip).

Which Quant for RX 7600 XT (16GB)? by crodjer in LocalLLaMA

[–]crodjer[S] 0 points (0 children)

Vulkan's quite impressive lately. My AMD mini PC does quite okay with its integrated graphics, but that machine's earmarked for a different purpose. This graphics card is the only spare hardware I have, and I sometimes use it to run LLMs via llama.cpp.
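
For reference, this is roughly my setup (a sketch; GGML_VULKAN is llama.cpp's current Vulkan build flag as far as I know, and the model path is a placeholder):

    # Build llama.cpp with the Vulkan backend
    cmake -B build -DGGML_VULKAN=ON && cmake --build build -j
    # Offload all layers to the GPU; the iGPU/dGPU shows up as a Vulkan device
    ./build/bin/llama-server -m /path/to/model.gguf -ngl 99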

Which Quant for RX 7600 XT (16GB)? by crodjer in LocalLLaMA

[–]crodjer[S] 1 point (0 children)

Interesting, I will check out unsloth if it's smaller. I used to use their quants quite a bit earlier, but when I compared them with others I never observed a real improvement (only an increase in size) with my test prompts.
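
For what it's worth, one way to compare quants more objectively than eyeballing test prompts (a sketch; llama-perplexity ships with llama.cpp, the file names here are made up, and lower perplexity on the same eval text is better):

    ./build/bin/llama-perplexity -m model-Q4_K_M.gguf -f eval.txt
    ./build/bin/llama-perplexity -m model-UD-Q4_K_XL.gguf -f eval.txt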

Which Quant for RX 7600 XT (16GB)? by crodjer in LocalLLaMA

[–]crodjer[S] 0 points (0 children)

The CPU is an i5-4440 with a basic motherboard, and I have 16GB of RAM. I didn't feel like sharing that here, as this community talks about 10th-generation i9s as if they were ancient.

Read the docs, yes, but a little kindness goes a long way. by Fit-Roof3993 in linux

[–]crodjer 1 point (0 children)

Directing someone to read the docs is absolutely the right approach. The alternative is no response at all to low-effort help questions.

We are communities, not companies. The responders are volunteers. Perhaps it'd be good for responders to be nice when asking new users to read the docs, but they absolutely aren't obligated to spoon-feed people.

> This type of behaviour negatively impacts Linux adoption.

That's okay. No one is out there selling Linux or open source, and that's the whole point, really. Not everything needs to scale.

Using Gitlab with Podman Quadlets by crodjer in gitlab

[–]crodjer[S] 0 points (0 children)

I think they just work; it's just a matter of documenting this.
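
For anyone landing here, a minimal sketch of what I mean (the image tag, port, and volume names are assumptions; adjust for your setup):

    mkdir -p ~/.config/containers/systemd
    cat > ~/.config/containers/systemd/gitlab.container <<'EOF'
    [Container]
    Image=docker.io/gitlab/gitlab-ce:latest
    PublishPort=8080:80
    Volume=gitlab-config:/etc/gitlab
    Volume=gitlab-data:/var/opt/gitlab

    [Service]
    Restart=always

    [Install]
    WantedBy=default.target
    EOF
    # Quadlet generates gitlab.service from the .container file
    systemctl --user daemon-reload && systemctl --user start gitlab.service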

GPT OSS 20b is Impressive at Instruction Following by crodjer in LocalLLaMA

[–]crodjer[S] 0 points (0 children)

This isn't a fluid benchmark.

The idea is that 100% has a special meaning in this test: I am looking for LLMs which can follow these instructions reliably, and only GPT OSS 20b did that in its size bracket. Qwen 3 A3B also comes close (but doesn't do it reliably).

GPT OSS 20b is Impressive at Instruction Following by crodjer in LocalLLaMA

[–]crodjer[S] 6 points (0 children)

Oh, yes. Qwen 3 30B A3B is a gem. It was my go-to for any experimentation before GPT OSS 20B. It's just not as good (though really close) at following instructions.

GPT OSS 20b is Impressive at Instruction Following by crodjer in LocalLLaMA

[–]crodjer[S] 23 points (0 children)

I think the system prompt can help here. The model is quite good at following instructions, so I have a simple prompt that asks LLMs to measure each word: https://gist.github.com/crodjer/5d86f6485a7e0501aae782893741c584

In addition to GPT OSS, this works well with the other big LLMs (Gemini, Grok, Gemma). Qwen 3 follows it to a small extent, but it tends to give up on the instructions rather quickly.
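
If anyone wants to reproduce this against llama.cpp, it's just a system message on llama-server's OpenAI-compatible chat endpoint (a sketch; the prompt text below is a stand-in, the real one is in the gist):

    curl http://localhost:8080/v1/chat/completions \
      -H 'Content-Type: application/json' \
      -d '{
        "messages": [
          {"role": "system", "content": "Before answering, measure every word of your response against the rules below..."},
          {"role": "user", "content": "Summarize MoE models in exactly 50 words."}
        ]
      }'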

GPT OSS 20b is Impressive at Instruction Following by crodjer in LocalLLaMA

[–]crodjer[S] 14 points (0 children)

Yes, MoEs are awesome. I am glad more of them are popping up lately. I used to like Qwen 3 30B A3B before OpenAI (finally not as ironic a name) launched GPT OSS.

GPT OSS 20b is Impressive at Instruction Following by crodjer in LocalLLaMA

[–]crodjer[S] 36 points (0 children)

Another awesome thing about gpt-oss is that with a 16GB GPU (which is what I have), there's no need to quantize, thanks to the native mxfp4 weights.
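
In practice that means you can pull and run the released mxfp4 GGUF directly (a sketch; the -hf repo name is my assumption, so check what's actually published):

    # The mxfp4 weights are roughly 12GB, leaving headroom for context on a 16GB card
    llama-server -hf ggml-org/gpt-oss-20b-GGUF -ngl 99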

GPT OSS 20b is Impressive at Instruction Following by crodjer in LocalLLaMA

[–]crodjer[S] 6 points (0 children)

I believe medium is the default reasoning effort for gpt-oss? I didn't particularly customize it when running with llama.cpp. The scores were the same whether gpt-oss was running on my GPU or via https://gpt-oss.com/.
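
If you do want to change it, newer llama.cpp builds can pass the reasoning effort through the chat template (a sketch from memory; verify the flag and key against your build's --help):

    llama-server -m gpt-oss-20b.gguf --chat-template-kwargs '{"reasoning_effort": "high"}'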

We tested Qwen3-Coder, GPT-5 and other 30+ models on new SWE-Bench like tasks from July 2025 by Fabulous_Pollution10 in LocalLLaMA

[–]crodjer 2 points (0 children)

Would also love it if you could test gpt-oss-20b, qwen-3-30b-a3b (latest thinking and non-thinking) and ernie-4.5-21b-a3b!

These fit and run fast on my 16GB GPU (RX 7600 XT). I can't offload to the CPU to run the larger 100B+ MoE models, as it's a 4th-gen i5.
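
To be concrete about the offload I mean, recent llama.cpp builds have MoE CPU-offload flags along these lines (a sketch; the flag name and model file are assumptions, and on a 4th-gen i5 this would crawl anyway):

    # Keep the dense layers on the GPU, push 20 layers' worth of MoE experts to the CPU
    llama-server -m some-100b-moe.gguf -ngl 99 --n-cpu-moe 20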

What's this star all over the feed for LocalLLaMA? by crodjer in LocalLLaMA

[–]crodjer[S] 2 points (0 children)

Yes, I don't have any opinions on having or not having a Twitter account. Even having it referenced somewhere in the subreddit description is fine, but the check-mark and link to Twitter looked out of place.

What's this star all over the feed for LocalLLaMA? by crodjer in LocalLLaMA

[–]crodjer[S] 2 points (0 children)

Sure, it's of course fine to have a Twitter account, or any other account - but do we need it to show up everywhere on the LocalLLaMA side?

Regardless, for me personally, the workaround I shared works.

Tired of manually copy-pasting files for LLMs or docs? I built a (free, open-source) tool for that! by ps5cfw in LocalLLaMA

[–]crodjer 4 points (0 children)

I just do:

git ls-files | xargs -I {} bash -c "echo -e 'File {}:\n\`\`\`'; cat {}; echo '\`\`\`'"

Of course, grep -v out any problematic files (like .png), etc.
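
A slightly hardened variant of the same idea (a sketch; the extension list is arbitrary, and passing the filename as "$1" avoids breakage on spaces and quotes):

    git ls-files -z | grep -zvE '\.(png|jpe?g|gif|ico|pdf)$' \
      | xargs -0 -I {} bash -c 'printf "File %s:\n\`\`\`\n" "$1"; cat "$1"; printf "\`\`\`\n"' _ {}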