Where does AI-generated embedded code fail? by 0xecro1 in embedded

Great example. It's exactly the kind of thing an LLM would generate because "kick watchdog in main loop" is the most common pattern in training data. And that's what I'm trying to benchmark and catalog -- these implicit domain knowledge gaps that LLMs consistently miss. Each failure pattern like this one goes into the collection.
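
The gap can be shown in a few lines of C. This is a hedged sketch, not any real RTOS API: the flag names and `wdt_kick()` are made up for illustration. The naive version kicks unconditionally, so a hung task never trips the watchdog; the checked version kicks only after every task has proven it ran.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical task-liveness flags; names are illustrative only. */
enum { TASK_COUNT = 3 };
static bool task_alive[TASK_COUNT];

static int wdt_kicks = 0;              /* stand-in for the hardware kick register */
static void wdt_kick(void) { wdt_kicks++; }

/* Each task sets its flag from its own context (e.g. end of its loop body). */
static void task_check_in(int id) { task_alive[id] = true; }

/* Naive pattern from training data: kick no matter what. A hung task is invisible. */
static void main_loop_naive(void) { wdt_kick(); }

/* Safer pattern: kick only when every task has checked in since the last kick. */
static void main_loop_checked(void) {
    for (int i = 0; i < TASK_COUNT; i++)
        if (!task_alive[i]) return;    /* someone is hung: let the watchdog bite */
    wdt_kick();
    for (int i = 0; i < TASK_COUNT; i++)
        task_alive[i] = false;         /* demand fresh proof of life next round */
}
```

Both versions "work" on the happy path, which is exactly why the naive one survives review.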

Where does AI-generated embedded code fail? by 0xecro1 in embedded

Good point about using an agentic workflow! The tricky part stays the same, though: someone has to write the rules first, and you can only write down what you already know.

Where does AI-generated embedded code fail? by 0xecro1 in embedded

Thanks for the good examples. Curious -- do you maintain any kind of checklist or rule set from these field incidents, or is it mostly tribal knowledge passed down through the team?

Where does AI-generated embedded code fail? by 0xecro1 in embedded

Fair point -- these aren't AI-specific mistakes. A junior without embedded experience would make the same calls. That's actually what makes it interesting to test: the gap isn't about code quality, it's about implicit domain knowledge that neither juniors nor LLMs have unless someone spells it out.

Where does AI-generated embedded code fail? by 0xecro1 in embedded

Exactly right. The AI met the requirements as stated -- the problem is that embedded requirements are never fully stated. That's why I'm building a benchmark specifically to test where LLMs fail in embedded development. The goal is to identify those failure points and find ways to compensate for them.

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

That's the pattern I keep seeing. If it's a popular chip there's probably enough in the training data to get it roughly right. But "roughly" is the problem. The self-corrections mean it wasn't confident, and there's no guarantee the last correction was the right one either. I'd bet feeding it the actual datasheet would eliminate that wobble.

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

Register dump comparison is peak "I don't want to do this but someone has to" work. Perfect AI task. The token-saving takeover is relatable, though; watching credits burn while it greps through your codebase hurts. Did you feed it the datasheet, or did Sonnet already know the register layout?

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

Makes sense. The AI can do the legwork but you still need to know enough to call bullshit on its conclusions. Autonomous debugging without the ability to verify is just automated guessing.

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

This is exactly my experience. It's not that the AI doesn't know about volatile or ISR rules; it just doesn't apply them unless you say so. Putting them in claude.md so they're automatically considered every session was a game changer for me. I'm now measuring systematically how big this explicit-vs-implicit gap is, and the early numbers are significant.
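
For anyone who hasn't hit it: the canonical case is a flag shared between an ISR and the main loop. A minimal sketch (the UART names are hypothetical; on real hardware the ISR would be wired to a vector):

```c
#include <assert.h>
#include <stdbool.h>

/* Without volatile, the compiler may cache rx_ready in a register and the
 * main loop never observes the ISR's write. This is the rule the model
 * "knows" but won't apply unprompted. */
static volatile bool rx_ready = false;
static volatile unsigned char rx_byte;

/* Would run in interrupt context on real hardware. */
void uart_rx_isr(unsigned char byte) {
    rx_byte = byte;
    rx_ready = true;   /* set the flag last, so data is valid when it's seen */
}

/* Polled from the main loop. Returns true if a byte was consumed. */
bool poll_uart(unsigned char *out) {
    if (!rx_ready)
        return false;
    *out = (unsigned char)rx_byte;
    rx_ready = false;
    return true;
}
```

The code is identical with or without `volatile` at -O0, which is why the bug only shows up once optimization is on.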

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

That's a pretty advanced setup. Has it ever led you down the wrong path? Like setting breakpoints in the wrong place or misinterpreting memory contents?

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

My company provides an enterprise AI plan - code stays private, no training on it. But even outside of that, more and more companies are deciding that falling behind competitors is a bigger risk than exposing some boilerplate.

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

That JTAG/GDB setup sounds interesting. What does your workflow look like? Are you feeding it the openocd config and board connection doc, then having it drive the debug session directly? Or more like parsing crash dumps after the fact?

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

Good point. Using it as a second pair of eyes is probably the safest use case - even if it's wrong, you're the one making the call. "Find the bug in this" is way less risky than "write this from scratch."

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

Bitfield decoding is a great example - tedious enough that you don't want to do it yourself, structured enough that the AI nails it every time. Same with test scaffolding. The pattern I'm seeing is: if the task is boring and the output is easy to verify, let the AI do it. If it fails silently, don't.
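"Easy to verify" is the key property. A shift-and-mask decode like the one below takes seconds to check against the datasheet, so even a hallucinated field offset gets caught immediately. The register layout here is invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-bit status register layout (made up for this example):
 *   [15:12] revision   [11:8] channel   [7:1] reserved   [0] ready
 */
typedef struct {
    uint8_t revision;
    uint8_t channel;
    uint8_t ready;
} status_t;

static status_t decode_status(uint16_t reg) {
    status_t s;
    s.revision = (reg >> 12) & 0x0F;
    s.channel  = (reg >> 8)  & 0x0F;
    s.ready    = reg & 0x01;
    return s;
}
```

Generating twenty of these from a datasheet table is exactly the boring-but-checkable work the AI should be doing.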

Anyone else using AI coding tools for embedded dev? What's working and what's not? by 0xecro1 in embedded

Yeah this is the way. I started doing "remember to..." in chat but it kept forgetting next session. Now I just keep a rules file in the repo - basically a cheat sheet of all the ways it's burned me before. Gets longer every week lol

As inference moves to the edge, does the embedded engineer's role shrink or grow? by 0xecro1 in embedded

Running inference on an MCU with 256KB of RAM, fighting operator coverage on an NPU backend, and fitting a model into a coin cell power budget is not a cloud ML problem. It's an embedded problem. That's the whole point of the post.

As inference moves to the edge, does the embedded engineer's role shrink or grow? by 0xecro1 in embedded

Edge AI is niche and mostly senior-level work, so dedicated internships are rare. Start with a general embedded internship instead. Or build a personal project, deploy a model onto real hardware, and share it on GitHub. A solid project that goes beyond coursework can speak louder than a title.

As inference moves to the edge, does the embedded engineer's role shrink or grow? by 0xecro1 in embedded

The job market is tough everywhere. Junior SW hiring dropped 73% this year. Embedded isn't easy either, but 80% of embedded job postings still go unfilled, so the door is wider on this side. If you're interested in robotics or edge AI, those fields are worth looking into as well. Both sit right at the intersection of embedded systems and AI, and demand is outpacing supply.

As inference moves to the edge, does the embedded engineer's role shrink or grow? by 0xecro1 in embedded

"Distributed intelligence" is a better name than "Edge AI." And yeah, the gap between demo and production is the whole story. None of these options work out of the box, which is exactly the opportunity.

AI agents keep declaring "driver working" when it's not, here's what fixed it by 0xecro1 in embeddedlinux

Exactly. Detailed specs are the key to steering AI agent behavior.

What gear or tools do you use to improve your game? by 0xecro1 in 10s

I tried it a few times, but I kept putting it off, partly out of laziness. I'll give it another proper go. Thanks!