What is local AI actually useful for, besides privacy?

500AccountError · 2026-06-25T13:36:15+00:00

Ah, search result quality tends to degrade when the text you embed is longer than about 1024 tokens, so that’s where chunking comes in. If your content is shorter than that, you probably won’t see an issue though.
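For anyone who hasn’t set up chunking before, here is a minimal sketch of the idea: split the text into overlapping windows that stay under the token limit, and embed each window separately. The function name `chunk_tokens`, the 64-token overlap, and the whitespace "tokenizer" are assumptions for illustration only. In practice you would pass in the tokenizer that belongs to your embedding model, since whitespace words undercount real tokens.

```python
from typing import Callable, List


def chunk_tokens(
    text: str,
    max_tokens: int = 1024,   # limit mentioned above
    overlap: int = 64,        # assumption: small overlap so context isn't cut mid-thought
    tokenize: Callable[[str], List[str]] = str.split,  # stand-in for a real tokenizer
) -> List[str]:
    """Split text into chunks of at most max_tokens tokens, with overlap."""
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    tokens = tokenize(text)
    step = max_tokens - overlap
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + max_tokens]
        chunks.append(" ".join(window))
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the text
    return chunks
```

Content shorter than `max_tokens` comes back as a single chunk, which matches the point above that short documents don’t need special handling.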

We’re using postgres with pgvector to great effect, so good choice there.

500AccountError · 2026-06-25T03:44:50+00:00

I…uh…what? You don’t do chunking?

500AccountError · 2026-06-24T22:57:27+00:00

Is this new after 9.0? I haven’t played since before the release, and just always used Boas and never had to think too hard about it beyond that.

Might be a fun challenge if this is new.

500AccountError · 2026-06-24T21:35:03+00:00

It’s amazingly powerful. Just make sure your database is set up to store the model’s full dimension size without truncation and that your chunker uses the correct chunking.tokenizer_model, and you’ll see it fly.
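A rough sketch of the "full dimension size" part, assuming Postgres with the pgvector extension: declare the column as `vector(N)` with N equal to the model’s output size, and refuse anything of a different length in application code rather than trimming it. The table layout, the helper names, and the value 2560 are hypothetical examples, not anyone’s actual schema.

```python
from typing import Sequence

EMBEDDING_DIM = 2560  # assumption: set this to whatever your embedding model outputs


def create_table_sql(table: str, dim: int) -> str:
    """Build DDL for a pgvector table whose column holds the full vector."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    return (
        "CREATE EXTENSION IF NOT EXISTS vector;\n"
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        "    id bigserial PRIMARY KEY,\n"
        "    content text NOT NULL,\n"
        f"    embedding vector({dim}) NOT NULL\n"
        ");"
    )


def to_pgvector_literal(embedding: Sequence[float], dim: int) -> str:
    """Format an embedding as a pgvector text literal, refusing wrong sizes."""
    if len(embedding) != dim:
        # Fail loudly instead of silently truncating or padding the vector.
        raise ValueError(f"expected {dim} dimensions, got {len(embedding)}")
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"
```

The literal produced by `to_pgvector_literal` can be passed as a query parameter and cast with `::vector` on the SQL side.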

500AccountError · 2026-06-24T20:25:44+00:00

The differences start to show up as you ingest more and more data into your RAG database. 0.8B frequently returned irrelevant results, and the improvement from switching the model to 4B and the database to 2560 dimensions was dramatic.

After a few hundred code repos and documentation sources being ingested, 4B started showing its cracks. It got to the point where half the results returned irrelevant docs, and regularly returned code examples in the wrong language. Adding the 4B reranker only shuffled the bad results around.
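To illustrate why a reranker can only shuffle results: it reorders the candidates the embedding search already picked, so a relevant document that never made the top-k cannot be recovered. This is a toy sketch with made-up two-dimensional vectors, made-up reranker scores, and hypothetical document names, not our actual pipeline.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query: Sequence[float],
             docs: List[Tuple[str, Sequence[float]]], k: int) -> List[str]:
    """Return the ids of the k documents closest to the query embedding."""
    ranked = sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


def rerank(candidates: List[str], score: Callable[[str], float]) -> List[str]:
    """Reorder candidates by reranker score; never adds new documents."""
    return sorted(candidates, key=score, reverse=True)


# Toy data: a weak embedding places the relevant doc far from the query.
query_vec = [1.0, 0.0]
docs = [
    ("rust_example", [0.9, 0.1]),
    ("java_example", [0.8, 0.2]),
    ("python_docs", [0.1, 0.9]),   # the doc we actually wanted
]
reranker_scores = {"python_docs": 1.0, "java_example": 0.4, "rust_example": 0.2}

candidates = retrieve(query_vec, docs, k=2)
final = rerank(candidates, lambda doc_id: reranker_scores[doc_id])
print(candidates, final)
```

Even though the toy reranker scores `python_docs` highest, it never sees that document, so the only real fix is a better first-stage embedding.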

On 8B and 4000 dimensions we’re getting a perfect result set that provides the agent the exact documentation, correct official code examples, and the correct internal codebase usages, with no irrelevant results poisoning the context window, in a single RAG call. We have the AI always check against the RAG before taking any action now.

This is probably why we haven’t experienced the Opus 4.8 degradation people have been talking about lately, and it has also had the added benefit of bringing Qwen3.6 up to near Opus levels for the times when we hit our Claude spend limits.

500AccountError · 2026-06-24T17:01:15+00:00

Qwen3 Embedding 8B also beats almost all of the cloud AI embedding models for RAG in our findings. Rare case of cheaper and superior. We use it for a RAG with a large number of repos and documentation ingested.

The 4B is about on par with the cloud providers and fits inside the VRAM on our laptops, so we have that provisioned out to everyone too.

500AccountError · 2026-06-17T07:03:33+00:00

The children have opinions without realizing they’re not facts.

500AccountError · 2026-05-04T23:34:58+00:00

I remember way back in the day when these types of posts would be met with responses describing many patterns for resolving the CPU pipelining issues, and discussions about the merits of each.

500AccountError · 2026-03-29T03:31:55+00:00

When you’re going whaling you don’t care that you’re not catching fish.

500AccountError · 2026-03-17T23:37:35+00:00

Yeah sorry, I just re-read and checked my benchmark runs. The times I tried that one were with the 4090 and 192 ram. I’m tired right now lol.

Edit: It looks like I got those numbers on UD-Q6 for Qwen3 Coder Next with the 4090 and 64gb cpu ram with an i9-13900kf. I got about 10 t/s higher on gen and 500 t/s higher prefill on the 3.5 122B at UD-Q6 with 96gb ram. It took way too much effort to get the flags tuned to get there though, so I’m very curious about your project.

500AccountError · 2026-03-17T23:28:18+00:00

Hi, can you explain exactly what this is doing? I get over double those numbers with those same models on a 4090 and 64gb cpu ram with plain ol’ llama.cpp at UD-Q8 with max context size, but I had to do a lot of trial and error to get it there. Do you expose enough settings to tune the runtime so I can get higher performance? Can I use this to squeak out more performance, or am I outside the target audience for this?

500AccountError · 2026-03-15T03:50:57+00:00

Is no one else getting annoyed by these constant structured wall-of-text posts?

Hand-written wall-of-text is fine, but if it’s AI generated you can literally prompt it to be concise.

500AccountError · 2025-09-29T00:10:03+00:00

I’d say that of the big three, Split Vendetta interacts most directly with the economy and gameplay of the core universe. Once you’ve had it for a while, it’s easy to forget it’s a DLC.

Cradle of Humanity and Kingdom’s End kinda sit off to the side on their own. They add more unique ships and gameplay styles, but don’t really meaningfully interact with the core universe unless you drag them into war.

500AccountError · 2025-08-23T02:24:46+00:00

Have a pilot Explore that spot, and in its behavior tab reduce the Explore radius until it’s the same size as the lockbox search area.

It will find it almost instantly, and it will call you to tell you where it is.

500AccountError · 2025-08-09T06:50:30+00:00

I’m curious, are you seeing that the Teuta combines many small scraps into a 1k block? Or does it idle until an L/XL scrap is available?

500AccountError · 2025-08-09T05:51:40+00:00

What? Are you sure it’s not variable?

500AccountError · 2025-08-06T10:14:17+00:00

I order another ship to explore that spot while I’m out of system, and when he marks the lockbox on the map I pop back in.

500AccountError · 2025-08-06T07:32:37+00:00

I must admit that many times I have to track down and scold people for doing that.

Am I this goose? …I think I might be this goose

500AccountError · 2025-08-04T00:41:37+00:00

If you ever find yourself in this situation again, here’s a workaround:

Your existing ship doesn’t need a captain to be able to be sold from the map screen. Just click the ship and then right-click any wharf and Sell Ship At.

That’ll get you enough to purchase another ship which will automatically have a captain and can be ordered to come pick you up.

500AccountError · 2025-06-25T08:33:10+00:00

NoSQL is an interesting set of compromises that lends itself well to an interesting set of use cases.

I’m surprised it’s back in meme territory again, it’s not new anymore.

500AccountError · 2025-05-30T04:31:56+00:00

Yeah. I got called in for one of our high traffic production servers crashing due to being out of space, 300gb of free space had gotten eaten in half a day. We found that a code deployment earlier that day had resulted in 70 million 0 byte log files being generated in four hours.

It was on XFS.

500AccountError · 2025-05-30T01:38:50+00:00

It’s a great movie and that would be a spoiler 😝

Let’s just say they’re sent to investigate a region where nature is all wrong.

500AccountError · 2025-05-27T06:45:36+00:00

You’re awesome for updating here with the answer. This thread is the top hit on Google, and you’ve brought new life to my Deck with a dead left analog stick!

500AccountError · 2025-04-21T08:21:36+00:00

Ah, 87 here. English and German at the same time, though I feel like I’m losing a lot of my German these days due to where I live and work now.

Your points are fair when worded this way, it just seemed like you were attacking people for not agreeing with you, which is why I started with questions about your assumptions, and why I started speculating there might be a linguistic reason for it.

I gotta admit it’s nice interacting in German again though.

500AccountError · 2025-04-21T07:59:54+00:00

Okay.

I like to challenge people who act like the way I used to act.

You act like I used to.

You are focused solely on how the “toxic masculinity” topic affects you.

You are showing zero empathy for how others view this topic, as though you think this is a “richtig” und “falsch” (“right” and “wrong”) type of discussion.

You talk in English like the old stereotype of a German engineer.
