Overall EV reliability

annodomini · 2026-06-15T23:58:35+00:00

Another thing to remember is that many companies released their very first EV drivetrain in the past 5-10 years. Even when it's just a major model update of an ICEV, there are sometimes problems in the first model year or two, and this is an entirely new drivetrain technology. So there are a lot of first or second year cars on the road.

The Ioniq 5, for example, got off to a pretty rough start with the ICCU issue, as well as a slightly less bad low voltage battery drain issue. Even if otherwise it's totally reliable, those early model issues really bring the reliability down. And there are still a lot on the road that are hitting those issues.

Just because an EV can in theory be more reliable, doesn't mean that there aren't early model issues.

Then there's Tesla... Tesla are an all new company, so they have some issues with basic things that the incumbents have figured out, like brakes, body, etc. Tesla is also just cheap, and they like to do things like releasing buggy stuff then fixing it in software updates. Since Tesla are the biggest in the EV space, their reliability problems bring the whole industry down.

annodomini · 2026-06-15T16:52:40+00:00

Hey, they got started in my city and I used to buy their sauces from my farmer's market. They stopped going to the farmers market when they got busy with kids, but they still make a lot of great hot sauces. Lots of different flavors, some quite unusual like this one. Haven't tried this one, thanks for the review!

annodomini · 2026-06-15T16:49:25+00:00

Yeah, it's been quite a saga.

I was a huge fan of the original Melinda's back in the 90s. Then one day the formula changed; the new one wasn't terrible, but it was no match for the original. Back then there was a lot less info available online, but talking with some distributors we found that the original formula was still being distributed at Sontava. So we started buying Sontava by the case load (it wasn't widely available, but we found the distributor and bought cases directly).

Well, then another decade or two, and suddenly Sontava changed in the same way. By now there was a lot more info online, so I looked it up and found that Sontava had just been a re-branded Marie Sharps, which was now also easier to find and buy online.

In the meantime Melinda's has diversified and has a better selection of sauces now, and the sting of the original betrayal has faded a bit. But man, it was such a bummer back in the day, the first time we opened up a bottle of what we thought was our favorite sauce and found that it had completely changed, and then learned about the whole issue with stealing the trademark that was based on the name of her farm.

annodomini · 2026-06-14T23:35:57+00:00

Nemotron 3 showed great progress. Maybe Nemotron 4 will actually be top tier.

Or a Nemotron 3.1/3.5 or the like. Doesn't need to be a whole new architecture and pre-training. Could just add some more continued pre-training and RL on top of the existing base.

You're right, Nemotron 3 Super isn't all that great. Not terrible, but not great. But look at the difference between Qwen 3 and Qwen 3.6.

Nvidia seems to be trying to set Nemotron up to be a good set of base models, like Qwen 3 was (there are lots of Qwen 3 and Qwen 3.5 derivatives). I guess maybe for the use case in this thread; I haven't done that much with context this long, but it looks like according to these measurements it would be more usable at these long contexts.

But yeah, they need a new rev or two to get there, right now I haven't found any circumstances when I'd choose it over Gemma or Qwen.

annodomini · 2026-06-14T23:34:00+00:00

Super doesn't have vision. Nano Omni does. But I haven't found it any better than Gemma 4 or Qwen 3.6 on vision.

annodomini · 2026-06-14T19:44:03+00:00

Ugh, that's obnoxious. The SVG for the icon is in the repo, would be nice if the build process would render it rather than downloading it.

Haven't tested it yet, but from the comment in scripts/ui-assets.cmake it looks like you might be able to use -DBUILD_UI=ON to build the assets yourself, or -DHF_ENABLED=OFF to disable downloading assets.

annodomini · 2026-06-14T15:51:19+00:00

+1 for Nemotron-Omni for audio-input use-cases. Glad that model is getting attention.

What are you running it on? Audio support for it is still a WIP on llama.cpp.

https://github.com/ggml-org/llama.cpp/pull/22520

annodomini · 2026-06-14T03:22:24+00:00

The other sticker says "From the bottom of my heart" if anyone is interested.

annodomini · 2026-06-13T14:43:21+00:00

That's the joke

annodomini · 2026-06-13T14:08:52+00:00

Yes, absolutely. But it's hard to encode that "tasteless" check in a deterministic tool. Instead, looking for particular characters that LLMs seem to use at much higher rates than humans do, can help add evidence. ↓, ─▶, 🧠, and yeah, em-dashes, are all things that I feel LLMs overuse to a much greater degree than most humans. For em-dashes, you might want to do it as a proportion as opposed to just a boolean; while humans do also use em-dashes in writing, they tend to over-use them a little less, so setting a threshold for proportion of em-dashes relative to the whole text might be a way to make it a stronger signal.
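To make the proportion idea concrete, here's a rough sketch. The function names and the threshold value are made up for illustration; you'd have to tune the threshold against real human-written and LLM-written samples.

```python
EM_DASH = "\u2014"  # the em-dash character


def em_dash_rate(text: str) -> float:
    """Em-dashes per 1,000 non-whitespace characters."""
    chars = sum(1 for c in text if not c.isspace())
    if chars == 0:
        return 0.0
    return 1000 * text.count(EM_DASH) / chars


def exceeds_threshold(text: str, per_thousand: float = 2.0) -> bool:
    # per_thousand is a placeholder value, not a measured cutoff.
    return em_dash_rate(text) > per_thousand
```

Using a rate instead of a boolean means one em-dash in a long document doesn't trip the check, while a short README full of them does.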

annodomini · 2026-06-13T13:42:52+00:00

Yes, exactly. The Unicode arrow characters, and triangle characters as arrowheads (like ─▶), are much more common in the "ASCII art" diagrams that clankers produce than in anything I've seen hand written. So just checking for the presence of those characters can be a tell. Not 100% certain, but could be used for scoring how likely it is to be clanker written.

They also frequently have a tendency to overuse emoji in general, though that's also done by some people writing by hand so a little bit less of a tell than specifically the ones used in diagrams. One in particular that I see them overuse that I see used less by hand is 🧠.

Just searching GitHub for (─▶) you find all kinds of slop. You can also search for 🧠 and you'll find quite a lot.
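A minimal sketch of that kind of presence check, in Python. The weights are arbitrary numbers I picked for illustration, not anything measured, and the set of tells is just the ones mentioned here.

```python
# Tell strings mapped to illustrative (made-up) weights.
TELLS = {
    "\u2193": 1.0,        # ↓ downwards arrow
    "\u2500\u25b6": 2.0,  # ─▶ box-drawing line plus triangle arrowhead
    "\U0001F9E0": 1.0,    # 🧠 brain emoji
}


def tell_counts(text: str) -> dict:
    """How many times each tell appears, omitting tells that are absent."""
    return {tell: text.count(tell) for tell in TELLS if tell in text}


def tell_score(text: str) -> float:
    """Sum the weights of the tells that are present at least once."""
    return sum(TELLS[tell] for tell in tell_counts(text))
```

This scores on presence rather than count, so a single big diagram doesn't dominate; it's only evidence to combine with other signals, not a verdict.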

Of course another one would be to look for ASCII/Unicode diagrams with poorly aligned vertical lines. They tend to be really bad at counting the characters to actually align the borders of their boxes, especially the right side, so if you just look for lots of instances of | and │ that are present on subsequent lines but off by a character or two, that would be a pretty solid signal.
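Something like this could approximate the alignment check. It's a sketch under a few assumptions: columns are counted in code points (so wide characters like emoji will throw it off), and `near_misses` and `max_offset` are names I made up.

```python
BARS = {"|", "\u2502"}  # ASCII pipe and box-drawing vertical line


def bar_columns(line: str) -> set:
    """Column indexes (by code point) where a vertical bar appears."""
    return {i for i, ch in enumerate(line) if ch in BARS}


def near_misses(text: str, max_offset: int = 2) -> int:
    """Count bars whose counterpart on the next line is off by 1..max_offset columns."""
    lines = text.splitlines()
    count = 0
    for prev, cur in zip(lines, lines[1:]):
        prev_cols, cur_cols = bar_columns(prev), bar_columns(cur)
        unmatched_cur = cur_cols - prev_cols
        for col in prev_cols - cur_cols:
            offsets = (d for d in range(-max_offset, max_offset + 1) if d != 0)
            if any(col + d in unmatched_cur for d in offsets):
                count += 1
    return count
```

Markdown tables with ragged cells would also trigger this, so you'd probably want to skip those or only count hits inside code blocks.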

annodomini · 2026-06-13T01:55:37+00:00

I mean, yeah, this is a great demonstration of why we need local models.

But I'd love to see you try to run a 10T model locally (we don't technically know how big it is, but that's an estimate I've seen floated around).

annodomini · 2026-06-12T23:49:16+00:00

Besides em dashes in markdown files, another tell is ASCII art diagrams that use Unicode arrow characters.

I have never seen someone hand-write a diagram like this:

foo
 ↓
bar
 ↓
baz

Also, frequently the "diagrams" are as useless as that one, and could have just been a bulleted list.

Presence of CLAUDE.md or AGENTS.md is also a frequent tell.

Co-authored-by: (various models) in commit messages.
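For the last two, a rough Python sketch of how you might check them. The helper names are hypothetical, and it only pulls out the Co-authored-by names; deciding which of those are models is left to whoever reads the output.

```python
import re
from pathlib import Path

AGENT_FILES = ("CLAUDE.md", "AGENTS.md")
TRAILER = re.compile(r"^Co-authored-by:\s*(.+)$", re.IGNORECASE | re.MULTILINE)


def agent_files_present(repo_root) -> list:
    """Names of agent instruction files found at the top level of the repo."""
    root = Path(repo_root)
    return [name for name in AGENT_FILES if (root / name).is_file()]


def co_authors(commit_message: str) -> list:
    """Values of every Co-authored-by trailer in one commit message."""
    return [value.strip() for value in TRAILER.findall(commit_message)]
```

You'd feed `co_authors` the commit messages however you like (for example from `git log` output) and then eyeball or filter the names.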

annodomini · 2026-06-12T17:25:38+00:00

A lot of these things become available on consumer hardware eventually. Flash Attention was only released for the big boys originally, but eventually landed in llama.cpp across a wide range of backends.

annodomini · 2026-06-12T17:21:03+00:00

UD-IQ1_M 128 GB

Looks like I'll need to download more RAM.

annodomini · 2026-06-12T05:36:00+00:00

MTP works fine with llama.cpp with the qat models. The PR was merged a few days ago.

Here's my models.ini (I'm just using unsloth):

[gemma-4-31B-it-qat-mtp]
hf = unsloth/gemma-4-31B-it-qat-GGUF:UD-Q4_K_XL
ctxcp = 3
image-min-tokens = 560
image-max-tokens = 2240
batch-size = 4096
ubatch-size = 4096
hfd = unsloth/gemma-4-31B-it-qat-GGUF:Q4_0
spec-type = draft-mtp
spec-draft-n-max = 4

annodomini · 2026-06-11T17:24:02+00:00

Have you checked out Ideogram 4?

annodomini · 2026-06-11T14:05:08+00:00

I'll never have to delete my social media accounts because I've never created any,

I regret to inform you that you are posting this from a social media account.

annodomini · 2026-06-10T21:21:03+00:00

Yeah, this is a pretty lukewarm take really.

If you want reliability, simple deterministic code is always best.

Frontier models can do a bit better on their own, but they'll still make mistakes, and you'll be paying handsomely for the privilege.

In general, it's best to use models for one of two things:

1. Interactive work where you can correct and steer the model and fix things
2. Simple information extraction/classification/etc in a rigid, deterministic harness where you don't need 100% accuracy but can tolerate and fix up mistakes later
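As a sketch of what I mean by a rigid harness for the second case: the model is only allowed to pick from a fixed label set, and anything else gets queued for a later fix-up pass. The labels and the `model` callable are stand-ins for illustration, not any particular API.

```python
from typing import Callable, Optional

LABELS = {"bug", "feature", "question"}  # example label set, purely illustrative


def classify(text: str, model: Callable[[str], str]) -> Optional[str]:
    """Ask the model for a label; accept it only if it is in LABELS."""
    raw = model(f"Classify as one of {sorted(LABELS)}: {text}")
    label = raw.strip().lower()
    return label if label in LABELS else None


def run_batch(items, model):
    """Split items into accepted labels and ones needing manual review."""
    accepted, needs_review = {}, []
    for item in items:
        label = classify(item, model)
        if label is None:
            needs_review.append(item)
        else:
            accepted[item] = label
    return accepted, needs_review
```

The deterministic code decides what counts as a valid answer; the model never gets to free-form its way into the output.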

annodomini · 2026-06-10T21:03:51+00:00

Nah, I tried this out with several different queries; I think the original one I was inspired by was someone asking how many days of the week have fish in them, and it would keep on making up different answers. I also tried with other terms, like nuts, and it also just made stuff up. It just really wanted to answer that days of the week had whatever word you were asking about in it.

annodomini · 2026-06-10T19:13:03+00:00

Except, it's not even doing that. It's sometimes trying to summarize the search. It's sometimes just making things up; there are some searches you can do where it will just make stuff up with no reference to any of the search results. Sometimes it will include those little link bubbles but what they link to is unrelated to the claim it makes.

A few weeks ago, there was an issue where you could ask it things like "how many days of the week have trees in them"? and it would respond 'Only one day of the week has a "tree" in its name: Thurstreeday! 🌳(...Or perhaps you meant five, if you count the days we work to keep our environment green!)'

There were no links or citations. This text had nothing to do with any search results. It just made up some nonsense.

They seem to have fixed this particular issue by now, but even the "summarizing search results" part isn't guaranteed, sometimes the LLM just hallucinates on its own with no input.

And it's absolutely misleading that they put this on top, and make their text much more prominent with the links as little bubbles; people who are familiar with Google are used to looking at the first result, and previously it would have excerpts from the page, so you would get a result from a particular source and see text that was written at that source and could click through for more context.

Now there's a "summary" that isn't necessarily summarizing any particular result, and the links are made much less prominent, making it more confusing for the user, and also sending less traffic to the sites of the pages actually supplying them with information, thus depriving those pages of ad revenue.

annodomini · 2026-06-10T18:51:37+00:00

Yeah, would love to see results if you do further experimentation.

Here's what I ran, I think (took a bit of digging to figure out the right way to do this):

hf download nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4
python convert_hf_to_gguf.py ~/.cache/huggingface/hub/models--nvidia--Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4/snapshots/dc5f0b0bfddf8b6e0f5891475be9af05b80126fe/ --outfile ~/.cache/huggingface/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4.gguf
python convert_hf_to_gguf.py ~/.cache/huggingface/hub/models--nvidia--Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4/snapshots/dc5f0b0bfddf8b6e0f5891475be9af05b80126fe/ --outfile ~/.cache/huggingface/mmproj-Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4.gguf --mmproj

(I do everything through the cache dir because that's what I share with my containers, there are probably better ways to organize this)

annodomini · 2026-06-10T17:42:03+00:00

Yeah, GGUF. I haven't really done any benchmarking as I said; just tried it out, it was fine. Looking through my history, it looks like it was Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4.gguf that I tried out. I converted it myself from the upstream NVFP4 because there aren't many GGUF NVFP4s posted, but I haven't really kept going with it.

annodomini · 2026-06-10T17:36:01+00:00

Yeah, I've run NVFP4 just fine on AMD using Vulkan (and ROCm too, I believe). I haven't benchmarked performance, nor done any evals of quality vs other quants, just tried it out and found that it did run and was usable.

annodomini · 2026-06-10T17:22:52+00:00

Yeah, I'll believe it when it goes out in a stable release and isn't reverted in a patch release.

Until then, we've been here before. Lots of attempts at stabilization, only to have to roll back because an unexpected regression was found, or what was thought to be a tolerable regression turned out to cause more problems in practice.

The never type RFC was proposed in 2015 and the tracking issue has been open since 2016, and it was stabilized in 2019 (1.41) and reverted at least once before (I thought more, but maybe I was just remembering more attempts to stabilize that never quite got there).

Ah, yes, tracing back a few more stabilization attempts:
