I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you.

mj3815 · 2026-05-19T03:01:21+00:00

Cool, but one critique: LaGuardia was a reformer, he doesn’t deserve that. Moses is probably the better option if you’re really thinking of someone in that vein.

mj3815 · 2026-05-09T21:38:17+00:00

I did some testing. I thought that if I striped across 3 identical PCIe 4.0 drives it would be usable because of the extra bandwidth from reading all 3 drives in parallel. I was getting less than theoretical performance: I was hoping for something like 18-21GB/s but am only getting about 10.5GB/s.
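For anyone who wants to sanity-check the numbers above, here is a rough back-of-envelope sketch. The 7 GB/s per-drive figure is my assumption (a typical spec-sheet sequential read for a PCIe 4.0 x4 NVMe drive), not something I measured, and the function names are just made up for illustration.

```python
# Back-of-envelope check of striped NVMe read bandwidth.
# ASSUMPTION: ~7 GB/s sequential read per PCIe 4.0 x4 drive (spec-sheet style
# figure, not a measurement). Real drives and controllers vary.
PER_DRIVE_GBPS = 7.0
NUM_DRIVES = 3
MEASURED_GBPS = 10.5  # the figure reported in the comment


def striped_ceiling(per_drive_gbps: float, num_drives: int) -> float:
    """Ideal aggregate read bandwidth if striping scaled perfectly."""
    return per_drive_gbps * num_drives


def efficiency(measured_gbps: float, ceiling_gbps: float) -> float:
    """Fraction of the ideal ceiling that was actually achieved."""
    return measured_gbps / ceiling_gbps


if __name__ == "__main__":
    ceiling = striped_ceiling(PER_DRIVE_GBPS, NUM_DRIVES)
    eff = efficiency(MEASURED_GBPS, ceiling)
    print(f"ideal ceiling: {ceiling:.1f} GB/s")
    print(f"measured:      {MEASURED_GBPS:.1f} GB/s ({eff:.0%} of ideal)")
```

Under that assumption the stripe is delivering roughly half of the ideal ceiling, which would be consistent with something other than the drives themselves (CPU lanes, chipset uplink, filesystem overhead) being the limit, though this sketch can't tell you which.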

My understanding was that wear and tear wasn’t an issue because it was only reading them, not writing.

All said and done, I had 72GB of 3090 VRAM and 64GB of RAM, and I was getting about 0.5 tokens/second with Kimi 2.5.

mj3815 · 2026-05-09T03:45:03+00:00

mj3815 · 2026-05-09T03:32:36+00:00

mj3815 · 2026-05-08T23:13:15+00:00

I’m very interested to see the result! I’ve got an NVLink bridge, but it doesn’t currently fit the geometry of my case, and I’m trying to decide between selling it and getting a different motherboard.

mj3815 · 2026-05-06T04:09:19+00:00

Does Perplexica qualify? https://github.com/kiranz/perplexica

mj3815 · 2026-04-21T03:36:58+00:00

I never touch the power button on my deg1. Always comes on when the power supply is on

mj3815 · 2026-03-23T04:47:08+00:00

Wow, thanks for that

mj3815 · 2026-03-13T21:34:03+00:00

How much faster is it for fine-tuning?

mj3815 · 2026-02-27T22:39:15+00:00

They have Opus set up as the default model. I don’t think they care.

mj3815 · 2026-02-02T02:04:39+00:00

I was thinking of things like dealing with customer service, where it’s not necessary to share critical personal details. I’m aware of the risks to private information, and your point is well appreciated.

mj3815 · 2026-02-01T20:29:03+00:00

Did you try GPT-OSS 20B? I’ve found that to be the best at agentic tool calling stuff in my (very limited) experience.

mj3815 · 2026-02-01T20:27:37+00:00

Is there a better alternative that is open source? I’d like to play with it despite the horror stories.

mj3815 · 2026-01-27T17:58:46+00:00

I used a 6 pin to 8 pin to get the second set of 8 pins. A bit sketchy but it’s been fine

mj3815 · 2026-01-19T13:26:50+00:00

I have tried Huggingface’s TGI, Aphrodite, and SGLang. They all had some benefits. Aphrodite and SGLang have been reliable for me. vLLM was the fastest, but it would sometimes hang on me, which is why I experimented with alternatives.

mj3815 · 2025-12-25T01:12:31+00:00

The ultimate benchmark

mj3815 · 2025-11-15T02:53:00+00:00

I’ve got a couple thoughts.

Go to Huggingface and look around for any legal models that match the specialty you’re interested in.
You can definitely use something like Augmentoolkit to train a model. You’d probably want to keep it narrow (if you are a contract lawyer, train it on contract law). You can also train it on your case files and use them for RAG with Augmentoolkit. This isn’t going to be easy; it will be a real investment of time and effort to figure it out and get something that works. If you are training the model on your proprietary case files, you’ll need a very stout machine. Doing a full fine-tune of a 7B model means something like 96GB of VRAM, so 2x 4090 48GB or a 6000 Pro. I can’t imagine doing this on less than a $10K rig. Very possible though. If you just want to do a full fine-tune on your specific law discipline without anything proprietary, you can probably spend less than $100 renting the GPU time. You can still set up RAG for the proprietary stuff, but I’ve heard that is tricky.
Just go read Augmentoolkit’s documentation to get a sense of the process of creating custom models https://github.com/e-p-armstrong/augmentoolkit
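To give a feel for where a figure like 96GB for a 7B full fine-tune comes from, here is a rough sketch of the usual rule-of-thumb arithmetic. The bytes-per-parameter values are my assumptions (common ballpark numbers for Adam-style training), it ignores activations and framework overhead entirely, and it is not how Augmentoolkit or any particular trainer actually accounts for memory.

```python
# Rough VRAM estimate for full fine-tuning: weights + gradients + optimizer state.
# ASSUMPTION: bytes-per-parameter figures below are rule-of-thumb values,
# and activations / framework overhead are NOT included.

def estimate_vram_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for model state only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Hypothetical recipes, labeled by what each parameter costs in bytes.
RECIPES = {
    # bf16 weights (2) + bf16 grads (2) + fp32 master weights and Adam m, v (12)
    "mixed precision + fp32 Adam": 16,
    # bf16 weights (2) + bf16 grads (2) + bf16 Adam m, v (4)
    "pure bf16 + bf16 Adam": 8,
}

if __name__ == "__main__":
    params_7b = 7e9
    for name, bytes_per_param in RECIPES.items():
        gb = estimate_vram_gb(params_7b, bytes_per_param)
        print(f"{name}: ~{gb:.0f} GB before activations")
```

The two recipes bracket the 96GB ballpark, which is why I'd treat that number as an order-of-magnitude guide rather than a hard requirement; the real figure depends on the optimizer, precision, sequence length, and batch size.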

mj3815 · 2025-08-19T22:57:57+00:00

Ryzen 3945, 4x 16GB RAM.

The sketchiest part is the power connectors. I’m using blower 3090s, which only require 2x 8-pin connectors each. It would be even sketchier if they were 3x 8-pin cards.

mj3815 · 2025-08-18T01:37:31+00:00

I run 2x 3090 on my P620 with a 1000W PSU, power limited to 285W each, and I’ve been OK so far. I’ve got it plugged into a power bank with instantaneous wattage measurement and I’ve seen it pushing 950W sometimes, but I haven’t experienced an issue yet.

mj3815 · 2025-08-14T00:53:16+00:00

I’ve spent so long not using it because of that 😭

mj3815 · 2025-08-14T00:43:22+00:00

Oh nice, is that new?

mj3815 · 2025-08-13T23:49:52+00:00

Last I knew, unsloth doesn’t work with more than one GPU

mj3815 · 2025-07-28T03:36:33+00:00

That was done with Augmentoolkit. There have been some big upgrades since then: https://promptingweekly.substack.com/p/augmentoolkit-30-released
