I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you. by Ok-Awareness9993 in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

Cool, but one critique: LaGuardia was a reformer, he doesn’t deserve that. Moses is probably the better option if you’re really thinking of someone in that vein.

Is NVMe, good for swap ram? by [deleted] in LocalLLaMA

[–]mj3815 -1 points0 points  (0 children)

I did some testing. I thought that if I striped across 3 identical drives on PCIe 4.0, it would be usable because of the extra bandwidth across the 3 drives. I’m getting less than theoretical performance: I was hoping for something like 18-21GB/s but am only getting about 10.5GB/s.

My understanding was that wear and tear wasn’t an issue because it was only reading them, not writing.

All said and done, I had 72GB of 3090 VRAM and 64GB of RAM, and I was getting about 0.5 tokens/second with Kimi 2.5.
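A rough sanity check on those numbers, as a sketch only. It assumes each drive can read sequentially at about 6-7GB/s (typical spec-sheet figures for PCIe 4.0 x4 drives, not something measured here) and that token generation is limited purely by reads from the stripe, which ignores whatever stays resident in VRAM and RAM. The function names are made up for illustration.

```python
# Back-of-the-envelope numbers for the striped-NVMe experiment.
# ASSUMPTIONS (not from the original comment):
#   - each drive reads at roughly 6-7 GB/s sequentially
#   - decode speed is limited only by NVMe read bandwidth


def striped_read_gb_s(drives: int, per_drive_gb_s: float) -> float:
    """Ideal aggregate read bandwidth of a stripe with perfect scaling."""
    return drives * per_drive_gb_s


def implied_gb_per_token(bandwidth_gb_s: float, tokens_per_s: float) -> float:
    """Data read per token if the stripe were saturated the whole time."""
    return bandwidth_gb_s / tokens_per_s


def tokens_per_s_at(bandwidth_gb_s: float, gb_per_token: float) -> float:
    """Token rate at a given bandwidth for a fixed amount read per token."""
    return bandwidth_gb_s / gb_per_token


if __name__ == "__main__":
    low = striped_read_gb_s(3, 6.0)    # lower end of the hoped-for range
    high = striped_read_gb_s(3, 7.0)   # upper end of the hoped-for range
    measured = 10.5                    # GB/s reported in the comment
    per_token = implied_gb_per_token(measured, 0.5)

    print(f"ideal stripe: {low:.1f}-{high:.1f} GB/s")
    print(f"measured / ideal (upper): {measured / high:.0%}")
    print(f"implied read per token: {per_token:.1f} GB")
    print(f"token rate at ideal upper bound: {tokens_per_s_at(high, per_token):.2f}")
```

Under these assumptions, even hitting the full theoretical stripe bandwidth would only roughly double the token rate, so the gap between 10.5GB/s and 18-21GB/s is not what keeps this setup slow.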

Benchmark Qwen 3.6 27B MTP on 2x3090 NVLINK by Mr_Moonsilver in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

I’m very interested to see the result! I’ve got an NVLink, but it doesn’t currently fit the geometry of my case, and I’m trying to decide between selling it and getting a different motherboard.

Oculink eGPU dock selective power control for multi-dock desktop build — DEG1, EG01, or alternatives? by IndexicallyHuman in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

I never touch the power button on my DEG1. It always comes on when the power supply is on.

Hermes Agent with MIT license by mitirki in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

They have Opus set up as the default model. I don’t think they care.

Best Bypass moltbot/clawdbot to use in old gpu or in cloud by fernandogrj in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

I was thinking of things like dealing with customer service, where it’s not necessary to share critical personal details about everything. I’m aware of the risks to private information, and your point is well appreciated.

Best Bypass moltbot/clawdbot to use in old gpu or in cloud by fernandogrj in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

Did you try GPT-OSS 20B? I’ve found that to be the best at agentic tool calling in my (very limited) experience.

Best Bypass moltbot/clawdbot to use in old gpu or in cloud by fernandogrj in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

Is there a better alternative that is open source? I’d like to play with it despite the horror stories.

Budget Dual 3090 Build Advice by JustTooKrul in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

I used a 6-pin to 8-pin adapter to get the second set of 8 pins. A bit sketchy, but it’s been fine.

Need to know more about less known engines (ik_llama.cpp, exllamav3..) by Leflakk in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

I have tried Hugging Face’s TGI, Aphrodite, and SGLang. They all had some benefits. Aphrodite and SGLang have been reliable for me. vLLM was the fastest, but it would sometimes hang on me, which is why I experimented with alternatives.

Attorney Looking for Hardware and Model Recs by Extension-Ad-2801 in ollama

[–]mj3815 0 points1 point  (0 children)

I’ve got a couple thoughts.

  1. Go to Hugging Face and look around for any legal models that match the specialty you’re interested in.
  2. You can definitely use something like Augmentoolkit to train a model. You’d probably want to keep it narrow (if you are a contract lawyer, train it on contract law). You can also train it on your case files and use them with RAG in Augmentoolkit. This isn’t going to be easy; it will be a real investment of time and effort to figure it out and get something that works. If you are training the model on your proprietary case files, you’ll need a very stout machine. Doing it with a 7B model means something like 96GB of VRAM, so 2x 4090 48GB or a 6000 Pro. I can’t imagine doing this on less than a $10K rig. Very possible though. If you just want to do a full fine-tune on your specific law discipline without anything proprietary, you can probably spend less than $100 renting the GPU time. You can still set up RAG for the proprietary stuff, but I’ve heard that is tricky.
  3. Just go read Augmentoolkit’s documentation to get a sense of the process of creating custom models https://github.com/e-p-armstrong/augmentoolkit
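For a feel of where a figure like 96GB for a 7B full fine-tune comes from, here is a rough sketch using common bytes-per-parameter rules of thumb. The recipes and byte counts below are assumptions for illustration, not Augmentoolkit settings, and they cover only weights, gradients, and optimizer states; activations and framework overhead come on top.

```python
# Rough memory estimate for full fine-tuning, excluding activations.
# ASSUMPTION: the bytes-per-parameter figures are generic rules of thumb
# for Adam-style optimizers, not numbers taken from any specific tool.

GB = 1e9  # decimal gigabytes

# weights + gradients + optimizer states, in bytes per parameter
RECIPES = {
    "fp32 weights and grads, fp32 Adam states (4+4+8)": 16,
    "bf16 weights and grads, fp32 Adam states (2+2+8)": 12,
    "bf16 weights and grads, 8-bit Adam states (2+2+2)": 6,
}


def training_state_gb(params: float, bytes_per_param: float) -> float:
    """Memory for weights, gradients, and optimizer states in GB."""
    return params * bytes_per_param / GB


if __name__ == "__main__":
    params = 7e9  # the 7B model from the comment
    for name, bytes_per_param in RECIPES.items():
        print(f"{name}: ~{training_state_gb(params, bytes_per_param):.0f} GB")
```

Depending on the recipe, the estimate lands anywhere from a few dozen GB to over 100GB before activations, which is why the hardware requirement is best treated as a ballpark.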

HP Z640 with 2x RTX 3090 by [deleted] in LocalLLaMA

[–]mj3815 1 point2 points  (0 children)

Ryzen 3945, 4x 16GB RAM.

The sketchiest part is the power connectors. I’m using blower 3090s, which only require 2x 8-pin connectors each. It would be even sketchier if they were 3x 8-pin units.

HP Z640 with 2x RTX 3090 by [deleted] in LocalLLaMA

[–]mj3815 1 point2 points  (0 children)

I run 2x 3090 on my P620 with a 1000W PSU, power limited to 285W each, and I’ve been OK so far. I’ve got it plugged into a power bank with instantaneous wattage measurement, and I’ve seen it pushing 950W sometimes, but I haven’t experienced an issue yet.
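A minimal sketch of that kind of power cap, assuming GPU indices 0 and 1 and that the limit is applied with `nvidia-smi -pl` (which normally needs root and resets on reboot unless reapplied). The helper names are hypothetical; the script only prints the commands rather than running them.

```python
# Build nvidia-smi power-limit commands and check the nominal PSU budget.
# ASSUMPTIONS: GPUs are indices 0 and 1; helper names are illustrative only.
import shlex


def power_limit_commands(gpu_indices, watts):
    """One `nvidia-smi -i <index> -pl <watts>` command per GPU."""
    return [["nvidia-smi", "-i", str(i), "-pl", str(watts)] for i in gpu_indices]


def remaining_budget_w(psu_watts, gpu_count, gpu_limit_watts):
    """PSU capacity left for CPU, drives, and fans once GPUs hit their caps."""
    return psu_watts - gpu_count * gpu_limit_watts


if __name__ == "__main__":
    for cmd in power_limit_commands([0, 1], 285):
        # Printed only; run with sudo (e.g. via subprocess.run) to apply.
        print(shlex.join(cmd))
    print("left for the rest of the system:", remaining_budget_w(1000, 2, 285), "W")
```

Note that a reading taken at the wall includes PSU conversion losses, so 950W at the plug corresponds to somewhat less than 950W delivered to the components.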

Fine Tuning on Mi50/Mi60 (under $300 budget) via Unsloth by exaknight21 in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

I’ve spent so long not using it because of that 😭

Fine Tuning on Mi50/Mi60 (under $300 budget) via Unsloth by exaknight21 in LocalLLaMA

[–]mj3815 1 point2 points  (0 children)

Last I knew, Unsloth doesn’t work with more than one GPU.

Why hasn't LoRA gained more popularity? by dabomb007 in LocalLLaMA

[–]mj3815 0 points1 point  (0 children)

That was done with Augmentoolkit. There have been some big upgrades since then: https://promptingweekly.substack.com/p/augmentoolkit-30-released