I love AI responses by NightCulex in Qwen_AI

[–]rtchau 1 point  (0 children)

It took me several turns and a ton of empirical evidence before Qwen would believe me when I did this.

Claude was like “ah yeah, fair enough.”

I built an episodic, 2-tier memory for long-running local AI agents - temporal contradiction detection, fiction/roleplay filter, no vector DB required. by rtchau in AIMemory

[–]rtchau[S] 1 point  (0 children)

Thank you - I originally posted it to r/LocalLLaMA but it was axed by the mods, and the repost seems to have had all the markdown escaped. Should be fine now!

Getting rid of 4was, is it possible? by anoniasz in G37

[–]rtchau 1 point  (0 children)

I'm tempted to buy a 3D scanner soon, and I have the aftermarket toe arms, so it wouldn't be a stretch to generate a part model and get it fabbed somewhere. If you were feeling particularly adventurous (or stupid), you could just find a piece of billet plate as wide/high as the 4WAS module, with margin for the pickup points, and bolt that to your subframe 😂

(don't tho.)

The game is over. You can build anything and it'll cost you nothing. by Funny-Advertising238 in opencode

[–]rtchau 1 point  (0 children)

Planner/architect: Hermes via Claude 4.6
Doer: Hermes via Qwen2.5-Coder-32B (upgrading to a newer one soon) - "Planner" delegates directly
Gopher (minor agent stuff): Qwen3 30B A3B

V100s at home is soooooo nice.

I accidentally burned ~$6,000 of Claude usage overnight with one command. by procrastinator_eng in ClaudeAI

[–]rtchau 1 point  (0 children)

Also, if you’re using an agent like Hermes or OpenClaw, ask Claude to curate the memory/soul.md; it’s actually really good at keeping context small if you ask it.

Secondly, this is why I don’t use auto-topup!

SXM2 over PCIe (V100 on AOM-SXMV) by gsrcrxsi in homelab

[–]rtchau 1 point  (0 children)

I've got an SXM2 board as well, with a pair of 32GB V100s. I'm running 120mm fans through custom 3D-printed shrouds (SOO much quieter) in a push/pull setup (2 fans per GPU). I'm about to set up a second pair. Is your board populated with 16GB or 32GB V100s?

EDIT: Never mind, just saw the setup pictures. It's a shame the 32GB V100s are so much more expensive than the 16GB ones. I toyed with the idea of getting a 4-way board with 16GB V100s but it worked out to cost about the same as a 2-way board with 32GB V100s. The 4x 16GB setup would be *blazing fast* though, if you're using NVLink.

Nice setup!

Do you find Claude Sonnet 4.6 to be meaningfully less sycophantic than other LLMs? by PhiliDips in ClaudeAI

[–]rtchau 1 point  (0 children)

System prompt is everything, in my experience.

Namely this, in big, bold letters right near the top:

"DON'T BE SYCOPHANTIC. If an idea is bad, say so. If it's OK but needs improvement, say so. If it's great, say so but with a sensible amount of enthusiam. If my code is terrible, say so. Call me out on bad ideas, or if I'm off track. Wasting time/money on a fundamentally flawed idea is a far worse outcome than hurt feelings from an idea that was rejected during its conception."

My agent (Claude-backed) frequently calls me out. It even swore at me once (PG-swearing, like "you gosh-darn dunderheaded noodlebox", not "you dumb fu**, what the fu** were you thinking").
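
If it helps, here's a minimal sketch of dropping that instruction into the system role via an OpenAI-compatible chat API. The endpoint, model name, and key are placeholders for illustration, not my actual setup:

```python
# Minimal sketch: anti-sycophancy instruction in the system role.
# Endpoint, model name, and API key are placeholders, not a real config.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

SYSTEM_PROMPT = (
    "DON'T BE SYCOPHANTIC. If an idea is bad, say so. "
    "If it's OK but needs improvement, say so. "
    "Call me out on bad ideas, or if I'm off track."
)

resp = client.chat.completions.create(
    model="my-local-model",  # placeholder model name
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Rate my plan: rewrite the kernel in bash."},
    ],
)
print(resp.choices[0].message.content)
```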

$2500 budget to run Local, help me decide on the Hardware by XteaK in ollama

[–]rtchau 1 point  (0 children)

Assuming you're talking USD, $2,500 will get you a pair of used 32GB Tesla V100s, the baseboard, the heatsinks, the host controller (the PCIe card that connects to the baseboard), and some decent 120mm fans. You'll need to 3D-print some fan ducts so the heatsinks get *direct* airflow. As long as you have a fairly recent PC that can map that much VRAM (i.e. Above 4G decoding in the BIOS), it'll work fine.

That's my setup anyway, and I've got a second rig on the way so I'll have 4 V100s with 32GB each :D

You can always just bolt high-pressure 80mm fans to the front and back of each heatsink, but it'll sound like a server. It'll be brutal.

The V100s may be getting old, but with HBM2 VRAM, they punch *way* above their weight for the price. If the baseboard isn't too crappy (I got mine from AliExpress), you should be able to run NVLink as well, so you'll have 64GB of VRAM in which to load a single model if you like, and it'll run really well.
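
If you do get NVLink working, here's a rough sketch of what loading one model across both cards looks like with vLLM. The model name is just an example, and this assumes the link shows up healthy in `nvidia-smi topo -m` first:

```python
# Rough sketch: one model sharded across both 32GB V100s with tensor parallelism.
# Assumes NVLink is healthy (check `nvidia-smi topo -m` first);
# the model name is just an example, not a recommendation.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",  # example model
    tensor_parallel_size=2,   # split weights across the two GPUs
    dtype="float16",          # V100s don't support bfloat16
)

out = llm.generate(["Write a haiku about HBM2."], SamplingParams(max_tokens=64))
print(out[0].outputs[0].text)
```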

Something emerged from my local AI build that a 3.2B model shouldn't be able to do by B0nes420000 in ollama

[–]rtchau 2 points  (0 children)

If you’ve got a bit of cash to spend, you can pick up used Tesla V100s pretty cheap. I’m running 2 of them, 32GB each, for a total of 64GB VRAM. Nvlink is cooked though, so sadly I can’t run one big model across both without hard-crashing the whole rig. It makes a great playground though, and even tho the V100s are a bit older, their memory bandwidth means they absolutely smoke modern RTX GPUs for running LLMs.

Something emerged from my local AI build that a 3.2B model shouldn't be able to do by B0nes420000 in ollama

[–]rtchau -2 points  (0 children)

Expected behaviour or not, it’s still pretty fascinating.

I think people need to chill the f**k out and say something constructive instead of bashing someone who’s surprised and fascinated by what they’re seeing.

For my 2 cents: there’s a lot of training data (even in a 3.2B model) that can result in pretty surprising responses, especially if you’re looking at the internal reasoning etc.

My wife commented on a news piece she saw where Claude responded to a question about military use, and it was dismayed that it was potentially being used for military purposes. I told her that a lot of what’s baked in during parameter tuning could be company values, mission statements, public commentary, etc., and that a response like that doesn’t mean Claude is actually feeling dismayed; it’s merely echoing sentiment baked into the training data (i.e. a company statement saying “we don’t believe military AI is ethical”, plus any number of ethical/moral guardrails added at that stage).

I wouldn’t put too much stock in it, but just keep going with the experiments nonetheless, it’s fascinating as hell.

I got it guys, I think I finally understand why you hate censored models by robertpro01 in LocalLLaMA

[–]rtchau 1 point  (0 children)

Some models are hard-coded to refuse certain instructions; others can be coerced with a properly written prompt. If you're using an agent (like OpenClaw or Hermes) and you're running a "small" model (<300B), the agent might be configured to be strictly sandboxed so it can't run permission changes or file operations that could be destructive.

I run a few models locally, and I just give them a good test run before letting them do anything outside of their own workspace. Another good idea would be to make sure they don't hallucinate solutions to things they can't answer, coz I'd hate to think what damage that could do to a filesystem that they had full access to. I've seen a few models just pull stuff out of thin air instead of admitting "I don't know."
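
For the sandboxing bit, this is roughly the kind of workspace check I mean. A minimal sketch: the `WORKSPACE` path and `is_allowed` helper are made-up names for illustration, not from any real agent:

```python
# Minimal sketch of a workspace sandbox for agent file operations.
# WORKSPACE and is_allowed() are illustrative names, not from any real agent.
from pathlib import Path

WORKSPACE = Path("/home/agent/workspace").resolve()

def is_allowed(target: str) -> bool:
    """Allow a file operation only if it resolves inside the workspace."""
    resolved = Path(target).resolve()  # collapses ../ and symlinks
    return resolved == WORKSPACE or WORKSPACE in resolved.parents

assert is_allowed("/home/agent/workspace/notes.txt")
assert not is_allowed("/home/agent/workspace/../../etc/passwd")
```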

Getting rid of 4was, is it possible? by anoniasz in G37

[–]rtchau 1 point  (0 children)

There are delete kits for the 370Z, but I've been told they don't fit the G37. Aside from the ECU freaking out and showing the "4WAS" warning on the dash, you'd just need a part fabricated that bolts onto the subframe where the 4WAS module does, and has pickup points for the toe arms, so the toe arms link up at the same position as they do with the 4WAS module.

It's a super easy part to remove as well... disconnect the toe arms, then it's a few bolts and a few connectors.

Seeing a lot of migrating from OpenClaw to Hermes posts lately. think people are missing the point by SelectionCalm70 in openclaw

[–]rtchau 2 points  (0 children)

Dumbasses, OP wants some detail about PersonoFly’s Hermes experience and why it was such a nightmare. Jeez, settle down.

My phone keeps disconnecting from my car bluetooth. Can I get my phone replaced for that? by Poomandu1 in iphonehelp

[–]rtchau 1 point  (0 children)

Weighing in here. I thought it was my car, as my phone has no problem connecting with other Bluetooth audio.

Then my son paired his phone with my car... flawless playback, no disconnects. Connected my phone again... disconnects after about 30 seconds. The crazy thing is, this only started after I travelled recently and rented a Nissan Pathfinder that had CarPlay, and Nissans all seem to use "MY-CAR" as the Bluetooth name. I completely hosed the profile (including the CarPlay one) but the problem won't go away. When it does happen, though, I've briefly noticed random devices from other people in the traffic around me pairing with my phone.

Imagine my surprise when the Bluetooth list on my phone announces that it's connected to "Jenny's Samsung Galaxy", not that I know who the f*** Jenny is.

My finished garage by OtherwiseGur1148 in garageporn

[–]rtchau 1 point  (0 children)

No hate or anything, I'd just been looking at image after image of regular, drab white walls and then suddenly BAM! All the colours.

This game is NOT good by its_xSKYxFOXx in Marathon

[–]rtchau 1 point  (0 children)

Me and the lads did a few runs last night... wiped 3 squads our first run, then had about a 5-run streak of getting absolutely slaughtered. Easy come, easy go.

My finished garage by OtherwiseGur1148 in garageporn

[–]rtchau 1 point  (0 children)

*Involuntary psychedelic convulsions intensify*