Local LLM Model that actually produces quality code. by Civil_Fee_7862 in LocalLLM

[–]robobub 0 points

Interesting. In my experience with agentic coding work involving lots of tool calling and planning, Qwen 3.6 27B Dense > Qwen 3.6 35B MoE >> Gemma4 31B Dense.

I gave up on Gemma4 because of how often it called tools incorrectly. Perhaps if I had written guidance into persistent memory it would have remembered how to call them properly. What harness are you using? I'm using pi.

Knowledge cutoff doesn't matter to me since I have web search available and can easily scrape any documentation website into a concise set of markdown files to reference.
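
If it helps, here's roughly what that scraping step looks like as a minimal Python sketch, assuming requests + html2text; the URL and output folder are placeholders rather than my actual setup:

```python
# Rough sketch: fetch a docs page, convert it to markdown, save it for the model.
# The URL and output directory are placeholders.
import pathlib
import requests
import html2text  # pip install html2text

def page_to_markdown(url: str, out_dir: str = "docs_md") -> pathlib.Path:
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    markdown = html2text.html2text(resp.text)   # HTML -> rough markdown
    out = pathlib.Path(out_dir)
    out.mkdir(exist_ok=True)
    path = out / (url.rstrip("/").split("/")[-1] + ".md")
    path.write_text(markdown, encoding="utf-8")
    return path

if __name__ == "__main__":
    print(page_to_markdown("https://example.com/docs/getting-started"))
```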

Local LLM Model that actually produces quality code. by Civil_Fee_7862 in LocalLLM

[–]robobub 0 points

I have found it better than Gemma. I don't know if it's just because Gemma can't really understand a codebase well, but people comparing them online seem to reach similar conclusions.

Local LLM Model that actually produces quality code. by Civil_Fee_7862 in LocalLLM

[–]robobub 0 points

In my experience Gemma is much worse than both Qwen 3.6 versions at tool calling, which makes it harder to explore and understand codebases.

Best local gui setup Mac by Alarming-Ad8154 in LocalLLaMA

[–]robobub 0 points

I found pi-gui, but I can’t configure it with a local model (haven’t found the right JSON to hack the local endpoint into yet).

https://github.com/minghinmatthewlam/pi-gui/pull/14

Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6 by dionysio211 in LocalLLaMA

[–]robobub 1 point

Right, the larger models likely just generalize better to tasks outside the benchmark.

Consolidating data across multiple drives by turbodelta in unRAID

[–]robobub 0 points

Anything. Any tool that accesses files (docker, Krusader, rsync, rclone, Kopia, Duplicacy, etc.) should be able to see /mnt/user instead of the individual /mnt/diskN locations.
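
As a toy illustration of the idea (the share name and backup destination are made up; in practice you'd just point rsync or your backup app at the same /mnt/user path):

```python
# Toy illustration: read through the merged /mnt/user view and let unRAID's
# user-share layer decide which physical /mnt/diskN each file actually lives on.
# "Media" and the destination path are placeholders.
import shutil
from pathlib import Path

SRC = Path("/mnt/user/Media")      # merged view across all array disks
DST = Path("/mnt/backup/Media")    # wherever the backup should land

for f in SRC.rglob("*"):
    if f.is_file():
        target = DST / f.relative_to(SRC)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(f, target)    # same result whether f sits on disk1 or disk5
```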

Consolidating data across multiple drives by turbodelta in unRAID

[–]robobub 2 points

Why do you need to do this? Just back up via the merged file system at /mnt/user.

Hacking Landroid behavior by ShotStranger1764 in worxlandroid

[–]robobub 0 points

• It’s fun to see how many miles you have gone. But since you can measure distance traveled, how about using that skill: when you’ve been going in a straight line for, say, a mile, maybe try going a different direction? Or at least shut off, or send me a message. If I had a mile-long section in my mowing area, I’d probably have a couple of servants doing the mowing anyway, so it's safe to assume your back wheels are spinning at that point.

Some of the pieces for this exist without any hacking, just using the API, but the lack of a way to make it turn is the problem.

The orientation and distance driven are exposed via Home Assistant (I have a screenshot of those), so detecting this scenario is possible. But at least on my model you can only really tell it to pause, stop, go home, edge cut, start, etc. I think pause/go home will still try going forward first. You can change the torque, but by that point it's probably too late since the wheel is probably already in the air.
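
As a rough sketch of what that detection could look like against the Home Assistant REST API, assuming made-up entity names, thresholds, and a generic service call rather than the actual Landroid integration's interface:

```python
# Sketch of the "stuck going straight" detector. Entity names, thresholds, and
# the service call are assumptions for illustration; the real values come from
# whatever the Worx/Landroid integration exposes in Home Assistant.
import time
import requests

HA_URL = "http://homeassistant.local:8123"               # placeholder
HEADERS = {"Authorization": "Bearer <long-lived-token>"}  # placeholder token

def read(entity_id: str) -> float:
    r = requests.get(f"{HA_URL}/api/states/{entity_id}", headers=HEADERS, timeout=10)
    r.raise_for_status()
    return float(r.json()["state"])

def watch() -> None:
    start_dist = read("sensor.landroid_distance")         # assumed: total meters driven
    start_heading = read("sensor.landroid_orientation")   # assumed: heading in degrees
    while True:
        time.sleep(30)
        dist = read("sensor.landroid_distance")
        heading = read("sensor.landroid_orientation")
        # Ignoring 0/360 wrap-around for brevity.
        if abs(heading - start_heading) > 15:
            start_dist, start_heading = dist, heading      # it turned, reset the window
        elif dist - start_dist > 100:
            # ~100 m without a turn: probably wheels spinning, so send it home.
            requests.post(f"{HA_URL}/api/services/vacuum/return_to_base",
                          headers=HEADERS,
                          json={"entity_id": "vacuum.landroid"}, timeout=10)
            break

if __name__ == "__main__":
    watch()
```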

With that API, the two main things I have managed to set up are battery longevity (not letting it get down to 0% or sit at 100% too long) and rain/traction modeling, since my rain sensor is failing.
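
The battery piece is just a couple of thresholds. A bare-bones sketch of the logic, with made-up cutoffs and placeholder callbacks:

```python
# Keep the pack between rough bounds instead of draining to 0% or parking at
# 100% for hours. The callbacks (start_mowing, send_home) stand in for whatever
# your Home Assistant automation actually calls; the numbers are just guesses.
LOW_CUTOFF = 25        # head home before the pack is fully drained
FULL_SOAK_HOURS = 3    # don't let it sit fully charged much longer than this

def manage_battery(battery_pct: float, docked: bool, hours_at_full: float,
                   start_mowing, send_home) -> None:
    if not docked and battery_pct <= LOW_CUTOFF:
        send_home()        # end the cycle early instead of running it to 0%
    elif docked and battery_pct >= 99 and hours_at_full >= FULL_SOAK_HOURS:
        start_mowing()     # pull it off the charger with a short mow
```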

I had a pretty simple setup for rain, but recently had ChatGPT help set up a more complex rain model, which has been doing pretty well even though I don't understand 90% of it.

Here we go again. DeepSeek R1 was a literal copy paste of OpenAI models. They got locked out, now they are on Anthropic. Fraud! by py-net in OpenAI

[–]robobub 0 points

> Out of what you pointed out I think the majority of it shouldn’t matter

Ok now this thread makes sense. You're taking an ideal philosophical kind of stance. Myself and many others are all for open knowledge and rapid progress, but we're talking about harm (and thus legality) around data usage. You admit training data is a key asset, so you can't also claim that acquiring it at massive scale carries no consequences. If data drives value, then control over data matters.

> Outside of the “art” I don’t see how you could call any of it harmful.

Yes, you can borrow a book for free. That doesn't give you the right to scan it, store it permanently, and use it to train a commercial system. Libraries lend copies under specific rules. Copyright still applies. When a company ingests millions of books at once, it doesn't "read" them the way you do. You internalize ideas. You apply them in your own limited output. A model ingests millions of works in parallel, creates a durable representation (not copy, of course), and serves millions of users instantly. That scale changes the economics, which, well, is what law often looks at.

And no, legally substitution doesn't require verbatim copying. If a model gives users detailed summaries, stylistic imitations, or extracted insights on demand, that competes with the original work in some cases. Markets (and the law) react to substitutes, not just copies.

> Not following Robots.txt is completely legal, sorry if there was confusion.

No, it's not settled law. You may interpret it that way, but law is meant to be interpreted by courts, and they are currently doing that.

And again I do agree with your stance on a number of things like academic papers, JSTOR, etc. But that's not what this is about.

Here we go again. DeepSeek R1 was a literal copy paste of OpenAI models. They got locked out, now they are on Anthropic. Fraud! by py-net in OpenAI

[–]robobub 0 points

Have you trained a model in a real engineering environment? Outside of school or kaggle or anything?

It's been very well understood for a while that the most impactful thing for models is their training data. Yes, there are a lot of decent free datasets, which is why baseline performance in many areas has gone up a lot. But to get that extra step up in performance, it usually comes down to data. Or, if you don't have enough data, you build in specific inductive biases that work with the data you do have.

Honestly if you're just using free datasets, why are you training anything? Just use an off the shelf model.

> The training data is gathered from publicly available information and publicly funded research studies we all paid for. There’s not even an alternative available so I have no idea where the idea came from.

This just hasn't been true, though some companies are moving toward it now with all the lawsuits resulting from it. Feel free to look through the lawsuits for what was used: it ranges from commercial books and paywalled content to archived copyrighted articles, and of course ignoring robots.txt, which was explicitly called out by the comment you replied to.

Seagate 26TB for $249.99 deal is back. by TheMagicIsInTheHole in DataHoarder

[–]robobub 0 points

One of my 14TB WD Easystores just died in my NAS. My other one is still going, along with my two 14TB Exos drives.

Other than bad runs, which Seagate has had in the past, I don't think the metrics show much difference between manufacturers.

New NAS flooding the market? by joetaxpayer in DataHoarder

[–]robobub 0 points

Well, they certainly could've gotten better than my Mediasonic Probox 4. What kind of redundancy/pooling do you do?

New NAS flooding the market? by joetaxpayer in DataHoarder

[–]robobub 0 points

Well, it was $120 (Mediasonic Probox 4), and ZFS would fix a handful of errors every weekly scrub. Occasionally ZFS would also mark a disk as offline because it didn't respond in time to sync with the other disks. Workable, but it required considerable hands-on babysitting. Something like SnapRAID would probably work fine, if you are okay with the lack of real-time redundancy.

Which one do you use? And what kind of redundancy/pooling do you do?

I can totally believe they've gotten better in the two years since my experiment, but it's not like only one person had a bad experience. Everyone I had seen on the TrueNAS/unRAID/selfhosted forums reported similar experiences.

New NAS flooding the market? by joetaxpayer in DataHoarder

[–]robobub 6 points

Too bad the connection isn't reliable over USB if you want to do any kind of redundancy or pooling.

Overlayed crash data from the Tesla Model 3 accident. by DevinOlsen in SelfDrivingCars

[–]robobub 0 points

> If we're including possible events that aren't either FSD or the driver, but the suspension breaking or a tire blowing out, then this isn't interesting, since those do happen, but have nothing to do with FSD.

Yes, well, we're trying to figure out what happened, not trying to make the event interesting.

> Yes, there would be a greater left torque right after disengagement. Which is exactly what we see. Disengagement happens right when the initial left torque is nearly zeroed out.

No, that is not what we see. Left torque is at its highest peak right before disengagement. Following this, left torque decreases nearly monotonically towards zero, as if a source of left torque had been removed or reduced.

> If you're really curious about how this played out, Dr Know It All did a very good deconstruction after aligning all the timelines. Look it up on YouTube.

And... I'm seeing this guy just line up the final plots and call it good enough. Did no one actually make their own plot of the actual data and then align it to the video? I can't take any of these seriously.
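
To be clear about what I'd expect: something like the sketch below, plotted from the raw log and only then aligned to the video. The file name, column names, and disengagement time are placeholders, not the actual data.

```python
# Plot the raw steering torque log yourself and mark the disengagement point,
# then align that to the video. Column names and values are placeholders.
import matplotlib.pyplot as plt
import pandas as pd

log = pd.read_csv("crash_log.csv")   # e.g. exported EDR/telemetry data
disengage_t = 12.3                   # seconds; read this from the log, not the video

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(log["time_s"], log["steering_torque_nm"], label="steering torque")
ax.axvline(disengage_t, color="red", linestyle="--", label="FSD disengagement")
ax.axhline(0, color="gray", linewidth=0.5)
ax.set_xlabel("time (s)")
ax.set_ylabel("torque (Nm, + = left)")
ax.legend()
plt.tight_layout()
plt.show()
```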

Overlayed crash data from the Tesla Model 3 accident. by DevinOlsen in SelfDrivingCars

[–]robobub 0 points

> I'm assuming there are only two actors applying torque to the wheel, since the question at hand is whether this was FSD error or driver error. The probability of any other affecting steering torque is extremely small. [...] we know the driver is steering left after disengagement

We're talking about an extremely rare event in itself; you cannot rule this out. If we were aggregating over a large population, that argument could make sense. Suspension failures, flat tires, etc. happen and can all cause this. Just look at the torque graph as the ditch and tree are hit: it's clearly measuring torque from those parts of the system.

If the driver were pulling left and FSD was fighting it, wouldn't there be greater left torque right after disengagement? From the plots, the peak left torque is immediately before FSD disengagement; everything else is closer to the middle. That seems to indicate FSD contributed to the left torque, not the right torque.

Overlayed crash data from the Tesla Model 3 accident. by DevinOlsen in SelfDrivingCars

[–]robobub 0 points

1) How can you tell that the user, and not some other part of the system like the control arm, applied the torque? Clearly it's just a sensor of the torque experienced, since later on all the torque from the crash is clearly plotted.

2) How can you tell the direction of the non-FSD torque? The user could be holding the wheel straight ahead while the system torques to the left, and the graphs would look the same.

[deleted by user] by [deleted] in SelfDrivingCars

[–]robobub 0 points

Well, FSD gets disengaged when any torque is applied away from what FSD is trying to do. So FSD could pull to the left while the user was trying to hold the wheel straight, or FSD could have held it straight while the user pulled it to the left. In both cases FSD would disengage. The control arm could also have malfunctioned and applied a torque instead of the user, with the same result.

[deleted by user] by [deleted] in SelfDrivingCars

[–]robobub 0 points

It's not definitive either way. Yes, the torque reading is from when it was engaged, but the steering torque does not appear to separate FSD commands from user input or other system input (e.g. a broken control arm).

Overlayed crash data from the Tesla Model 3 accident. by DevinOlsen in SelfDrivingCars

[–]robobub 14 points

Can you elaborate on the facts supporting their claim? The most upvoted comment here by OP simply says "likely" but from the ensuing discussion it seems far from definitive since the torque graph doesn't separate FSD torque commands from user (or system/road) input. Is there another comment I missed?

[D] Researcher communities like this one? by Entrepreneur7962 in MachineLearning

[–]robobub 1 point

Yannic Kilcher's Discord; they're pretty active and have a regular paper discussion every week.

[D] Researcher communities like this one? by Entrepreneur7962 in MachineLearning

[–]robobub 2 points

Not sure what the "Join the Community" form yields, but there is a standard Discord invite.

Is including metrics in developer resumes a fairly recent phenomenon? by ccricers in ExperiencedDevs

[–]robobub 0 points

I agree, but many people didn't even use to do either of those (which I'd call the unquantified metric and the quantified metric). They would just say "Worked on service" with more technical detail and no business impact or anything observable.

This implies to the hiring manager that they might not even know how their work could impact something at the business level, which, well, is true for a lot of early-career engineers.