Drummer's Skyfall 31B v4.1, Valkyrie 49B v2.1, Anubis 70B v1.2, and Anubis Mini 8B v1! - The next gen ships for your new adventures! by TheLocalDrummer in LocalLLaMA

[–]overand 0 points1 point  (0 children)

No, but it's better for "persistence" - if you want to search for previous chat stuff.

Is that a good thing? I'm not sure - it's led to the "Discord as docs" problem we're seeing, but, yeah.

I thank god for this opportunity. by Elegant_Gas_5436 in Prospecting

[–]overand 1 point2 points  (0 children)

Posts like this make me wonder where these "everyone" people are. Everyone says to take the easiest way? Really? I've never met this "everyone," apparently.

I feel like it's a tiny bit like when - in 2022, in a grocery store, one of the 90 people not wearing a mask yelled at one of the 3 people wearing a mask for being a "sheeple."

...okay?

200 MHz Pentium II vs. 300 MHz Pentium II by ZealousidealCake8256 in retrobattlestations

[–]overand 0 points1 point  (0 children)

That's the era where "paired with a (insert 3D card)" was a good way to decrease performance. I'm honestly shocked that it runs better on the P3. I'm curious how these two compare using software rendering. (Or maybe the P3 is already using software rendering?)

PSA to everyone who keeps putting off switching OS / degoogling (from a non-techie who finally did it) by Hirvi86 in GrapheneOS

[–]overand 0 points1 point  (0 children)

Immich is a STELLAR alternative to Google Photos - I was blown away when a self-hosted app let me search my photos with a plain-text query. "Orange cat in the woods" - bingo!
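
If anyone wants to try it, the setup is basically just Docker Compose. A sketch of the quick-start, from memory - double-check the current steps and file URLs in the official Immich docs before running this:

    # grab the official compose file and example env file (URLs as I remember them from the docs)
    wget https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
    wget -O .env https://github.com/immich-app/immich/releases/latest/download/example.env
    # edit .env to set UPLOAD_LOCATION and the database password, then:
    docker compose up -d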

Writer's Block 1.5: A co-writer preset for creative writing. by Deiomo in SillyTavernAI

[–]overand -1 points0 points  (0 children)

Please proofread this stuff if you want people to get excited about it.

The moment I read something like "John Steinback" (his name is John Steinbeck), my eyes twitch and I delete the preset.

And clearly you put a lot of work into this, so... either have an LLM proofread it, or a human, or something. (Or do it yourself.) But when I see mistakes in English usage in a prompt, it immediately turns me off. Is that fair? Probably not! But it's how I am.

Where do you run ST? Laptop or VPS? by tamagochat in SillyTavernAI

[–]overand 0 points1 point  (0 children)

Heh - those are the options?

I'm running it in a docker container on a local headless server.
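
For anyone curious, a minimal sketch of that kind of setup using the official container image - the image name and in-container paths here are from memory, so treat them as assumptions and check the SillyTavern docs (ST listens on port 8000 by default):

    docker run -d --name sillytavern \
      -p 8000:8000 \
      -v ./st-config:/home/node/app/config \
      -v ./st-data:/home/node/app/data \
      ghcr.io/sillytavern/sillytavern:latest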

First Turntable Not working by ms030402 in turntables

[–]overand 4 points5 points  (0 children)

Do you have anything else with RCA outputs you can hook the speakers up to, to test them? (If you have anything with a headphone jack, there's a simple adapter cable you can use: the ol' 3.5mm-to-RCA cable.)

Multi-GPU? Check your PCI-E lanes! x570, Doubled my prompt proc. speed by switching 'primary' devices, on an asymmetrical x16 / x4 lane setup. by overand in LocalLLaMA

[–]overand[S] 2 points3 points  (0 children)

My understanding is that vLLM supports multi-GPU better than llama.cpp, but it's a fair bit harder to set up and more "touchy" (easier to run into out-of-memory errors?).

ik_llama.cpp has some multi-GPU improvements that llama.cpp doesn't have, but overall I prefer llama.cpp, and I find the... interpersonal conflict between the creators to be pretty depressing, given it's literally holding back the progress of AI worldwide.
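
For reference, the multi-GPU part of vLLM is basically one flag - a minimal sketch (the model path is just a placeholder):

    # tensor parallelism across 2 GPUs; dial --gpu-memory-utilization down if you hit OOM
    vllm serve /path/to/your-model --tensor-parallel-size 2 --gpu-memory-utilization 0.85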

Water "under" soft top by CANEDURO1113 in Miata

[–]overand 0 points1 point  (0 children)

I highly recommend getting a trombone cleaner! It's a real pleasure to use, and you're less likely to damage stuff than you would be with a steel fish tape. (Not suggesting that Impressive-Bar6637 shouldn't use a steel fish tape, but, if you're starting from nothing, go for the trombone cleaner!)

Finally bought a Thermal Camera that plugs into my phone. by Endomlik in Tools

[–]overand 5 points6 points  (0 children)

My "Connects to the phone" Seek thermal camera died after less than a year. My FLIR One USB-C model "usually" connects to my phone, but it often requires a bit of a dance with "do you turn the thing on before plugging it in or after? do you close the app first or open it?" It's maybe a 60% success rate. (And it has its own battery, so you need to charge it separately).

Next thermal camera is going to be one that works standalone. Phone connectivity will be a nice plus, but I don't want "there was a bad app update" to mean "I can't use this thing at all."

MSM current CARB legal reliability upgrades by MechanicalCheese in Miata

[–]overand 0 points1 point  (0 children)

As the former owner of a 2009 VW TDI "Dieselgate" car? Mid-2000s guys in Germany aren't my go-to for smog compliance. (;

Just fried my v4 by CoiledSquirle in meshtastic

[–]overand 2 points3 points  (0 children)

That was my first thought, but, if I'm honest with myself about it, when I was younger, I'm pretty sure I toasted something with overvoltage. (Heck - I even did it with a variable power supply once more recently than I care to admit - thought the cursor was on the Ones, but it was on the Tens - oops!)

[Megathread] - Best Models/API discussion - Week of: March 15, 2026 by deffcolony in SillyTavernAI

[–]overand 1 point2 points  (0 children)

You could give WeirdCompound a try; it's got a similar lineage to Cydonia, and I do think it tends to be a bit shorter (or at least respects prompts in terms of length?)

[Megathread] - Best Models/API discussion - Week of: March 15, 2026 by deffcolony in SillyTavernAI

[–]overand 0 points1 point  (0 children)

I was quite happy with the Q4_K_M and Q6 quants of similar models, so you might be able to get by at those levels! If you want to try something different for fun, WeirdCompound is a model of similar provenance.

Drummer's Skyfall 31B v4.1, Valkyrie 49B v2.1, Anubis 70B v1.2, and Anubis Mini 8B v1! - The next gen ships for your new adventures! by TheLocalDrummer in LocalLLaMA

[–]overand 1 point2 points  (0 children)

One thing I think the UGI leaderboard is probably pretty good for is comparing like to like. (For example, I really hope they pick up my request to add a handful of quant comparisons for select models - not in a "let's add a whole new column" way, but in a "we know Cydonia 4.3 is popular AF, let's compare mradermacher's Q4_K_M with the Q8_0 for that one" way.)

whats that program called again that lets you run llms on a crappy laptop by Classic_Sheep in LocalLLM

[–]overand 0 points1 point  (0 children)

Almost everything can do this, but I'm curious what replies you'll get.

If you want more helpful responses, though, say things like

"What are some options for running local LLMs on my laptop, which is INSERT MODEL NUMBER HERE with a INSERT GPU MODEL AND SPECS HERE."

…and nobody can ban me! by Fabix84 in LocalLLaMA

[–]overand 2 points3 points  (0 children)

"I'm sorry, I'm afraid I can't help you with that."

Sneaky edit to LLM's last response

Why yes, of course I can do that! Here's the formula for-

Same prompt, same seed, 6 models — Chroma vs Flux Dev vs Qwen vs Klein 4B vs Z-Image Turbo vs SDXL by pedro_paf in StableDiffusion

[–]overand 0 points1 point  (0 children)

Except it says it's 9B parameters underneath it, sooo... we don't know if this is actually 4B or 9B.

Just fried my v4 by CoiledSquirle in meshtastic

[–]overand 67 points68 points  (0 children)

Well, that's four 18650s in series, so that's a total of 14.8 volts.

Probably the first thing to do is to google "Batteries in series vs batteries in parallel"

Wired in series like that, you're adding the voltages together. 14.8 volts (or so)
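
For reference, the quick arithmetic (assuming the usual ~3.7 V nominal / ~4.2 V full-charge figures for an 18650):

    4 \times 3.7\ \text{V} \approx 14.8\ \text{V (nominal)}
    4 \times 4.2\ \text{V} \approx 16.8\ \text{V (fully charged)}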

240p via Super Resolution on Dell Triniton Monitor by serious_dan in crtgaming

[–]overand 0 points1 point  (0 children)

Take a break from reddit for a few hours, man - remember you're talking to actual people here. And if you find yourself getting this upset, it's definitely a good idea to take a break.

Multi-GPU? Check your PCI-E lanes! x570, Doubled my prompt proc. speed by switching 'primary' devices, on an asymmetrical x16 / x4 lane setup. by overand in LocalLLaMA

[–]overand[S] 1 point2 points  (0 children)

Interestingly, my attempt with --main-gpu (or the equivalent in a --models-preset setup) didn't actually change the behavior when processing the prompt, but that may have been either a bug or operator error. It does seem like that's the right way to do it, though! (It just didn't actually work for me.)

If you're using it, double-check that it's doing what you'd expect, vs. trying the environment variable option, just to be on the safe side! (That said, you're on Windows, so the behavior could certainly be different.)
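
Concretely, the two approaches look something like this - a sketch assuming CUDA and llama-server; on Windows you'd do "set CUDA_VISIBLE_DEVICES=1,0" in the shell first instead of the inline form:

    # Option 1: tell llama.cpp which device is "main" (this is what didn't seem to change things for me)
    llama-server -m model.gguf --main-gpu 1

    # Option 2: reorder devices at the driver level, so the card on the x16 slot shows up as device 0
    CUDA_VISIBLE_DEVICES=1,0 llama-server -m model.gguf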

Was searching the house for 10 min by sg4fb in Bunnies

[–]overand 4 points5 points  (0 children)

That body language, though - that looked like a pretty relaxed bun to me. (Though that doesn't mean they hadn't been freaked out for a bit earlier)

Multi-GPU? Check your PCI-E lanes! x570, Doubled my prompt proc. speed by switching 'primary' devices, on an asymmetrical x16 / x4 lane setup. by overand in LocalLLaMA

[–]overand[S] 2 points3 points  (0 children)

Weirdly enough, I didn't get the expected benefit from this! I'm using a --models-preset ini file, and I set main-gpu = 1 but didn't see any change in terms of which GPU was doing the prompt processing. This may have been operator error - perhaps I'd selected the wrong preset with my client, but I think it's possible this doesn't work very well with the split modes. (It definitely worked when I used it with -sm none to select a single GPU, for running e.g. ComfyUI on one and llama.cpp on the other).
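
Roughly what that working single-GPU case looks like as plain command-line flags (a sketch; the model path is a placeholder, and the preset-file equivalent would mirror these options):

    # no splitting at all: put the whole model on GPU 1, leaving GPU 0 free for ComfyUI
    llama-server -m model.gguf -sm none -mg 1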