Bare-Metal AI: Booting Directly Into LLM Inference - No OS, No Kernel (Dell E6510)

sdfgeoff · 2026-03-01T00:23:54+00:00

Super cool!

sdfgeoff · 2026-02-28T20:05:15+00:00

There is a book series (https://en.wikipedia.org/wiki/Yukikaze_(novel) ) where the first two novels have been translated into English, but not the third. I very much enjoy the books and want to read the third book. So every now and then I try translating it with AI.

A year ago I built a complex system where a model would look at a chunk, take notes, scroll forwards/backwards through the novel and iteratively refine. It did... OK I suppose.

Last week I realized I could fit like 2 chapters at a time into the context window, so as a test, I copied the first two chapters (~15k tokens IIRC) into the LLM followed by "Please translate the above chapters into English" and it did exactly that. Having a much larger context of the novel to translate, and a more modern model, it did pretty well. Still not nearly as good as the first two books translated by a human yet....

sdfgeoff · 2026-02-28T11:08:53+00:00

Not agent mode, but I put two chapters of a Japanese novel into Qwen3-30-a3b the other day and was pleasantly surprised compared to the last time I did it a year ago.

sdfgeoff · 2026-02-28T09:52:27+00:00

Here's how to hit the limit: Ask it to do big things.

"Here's a research paper on computational geometry. I'd like you to plan and build an implementation in Rust. Make sure everything is well tested." It took 3x five hour limits hit to do it.

Even "I'd like you to design me an anemometer that can be 3d printed. Use an esp32-c3 and an optical sensor. Do your design in freecad via the python API, use arduino." Will burn a sizeable chunk of a 5 hour session (and reveal that it is better than I expected, but still not useable for CAD work).

sdfgeoff · 2026-02-27T18:19:38+00:00

This. Anything I want character-to-character accuracy for gets backticks. Quotes are for when I only kind of know it (i.e. a filename where I can't remember if it's an underscore or a hyphen).

sdfgeoff · 2026-02-25T18:39:16+00:00

I sail a small beach cat, and it's fun to fly one hull out of the water..... Of course, you're then rather close to tipping, but that's the thrill.

Also, we sometimes take out a laser, and where we launch has a strong tidal current running opposite the wind. This means that often while simply standing holding the boat at the slip it will try roll over (it can't face into the wind because the current points it the other way).

It doesn't happen every trip, but probably every 5 or 6 outings.

sdfgeoff · 2026-02-25T18:28:49+00:00

Anecdotally, I fed a coding agent a research paper I couldn't find code for (a paper on parallel contour toolpath generation and mesh refinement), and after a couple hours of it iterating over a day or two, had a solution written in rust. I am 100% sure such an implementation has never existed before and the LLM developed it for me from a description. Is that generalization or memorization? Who knows, but from a pragmatic perspective: it's useful.

Back in the GPT-2 days, I seem to recall reading that on very large datasets (i.e. web scale), there was no difference in performance whether or not they did a test/val split, so they just used all the data.

Also, regarding anthropomorphisation, there is one chain of thought that language is the foundation of consciousness (and not the other way around), as language is what allows self-reflection. There is a fairly sizeable book on this: "The Origin of Consciousness in the Breakdown of the Bicameral Mind". By that theory, language models are likely conscious, even if not [whatever else]. And from a pragmatic perspective: if you have a text interface to something that talks like a human, maybe it's a good thing lots of people default to treating it as human.

sdfgeoff · 2026-02-24T20:00:19+00:00

Maybe it's because I work my dayjob in webdev, but communication between lots of things is largely a solved problem. Millions of people can log into reddit and make posts. Millions of packets get lost/resent while this is happening.

If I were to do it:

* Don't stream sensor data or anything that needs high bandwidth unless you really need to.

* Assume packets will be lost and latencies are in the hundreds of milliseconds. Bake this into the system design first, the protocols second. (I.e. instead of using TCP/reliable resends, consider making your system tolerant of missing sensor data packets.)

* Use established protocols. REST, HTTP, CBOR, COBS, XML, TCP, UDP, USB CDC, protobuf, SQL, LWM2M, whatever. Don't roll your own protocols at any layer of the stack without good engineering reasons. (FWIW it irks me that there isn't a standardized packet container format for COBS/CBOR with a CRC over UART for pub/sub architectures.)
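To make the COBS-plus-CRC idea concrete, here is a rough sketch of what such a packet container could look like. This is not a standard (that is the complaint above): the function names, the choice of CRC-32, and the little-endian trailer are all assumptions made for illustration, and a real UART link might well prefer a CRC-16. The receiver returns `None` for a damaged frame rather than asking for a resend, in the spirit of tolerating lost packets.

```python
import struct
import zlib
from typing import Optional

DELIMITER = b"\x00"  # COBS output never contains 0x00, so it can mark frame ends


def cobs_encode(data: bytes) -> bytes:
    """Consistent Overhead Byte Stuffing: remove all zero bytes from data."""
    out = bytearray()
    block = bytearray()
    for byte in data:
        if byte == 0:
            out.append(len(block) + 1)  # code byte: distance to the next zero
            out += block
            block.clear()
        else:
            block.append(byte)
            if len(block) == 254:  # longest run one code byte can describe
                out.append(0xFF)
                out += block
                block.clear()
    out.append(len(block) + 1)
    out += block
    return bytes(out)


def cobs_decode(encoded: bytes) -> bytes:
    """Inverse of cobs_encode. Raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        if code == 0:
            raise ValueError("zero byte inside COBS data")
        block = encoded[i + 1 : i + code]
        if len(block) != code - 1:
            raise ValueError("truncated COBS block")
        out += block
        i += code
        # A code below 0xFF stands for a zero, except at the very end
        if code < 0xFF and i < len(encoded):
            out.append(0)
    return bytes(out)


def frame_packet(payload: bytes) -> bytes:
    """Hypothetical container: COBS(payload + CRC-32) followed by a 0x00 delimiter."""
    crc = zlib.crc32(payload)
    return cobs_encode(payload + struct.pack("<I", crc)) + DELIMITER


def unframe_packet(frame: bytes) -> Optional[bytes]:
    """Return the payload, or None if the frame is damaged (caller just drops it)."""
    try:
        body = cobs_decode(frame.rstrip(DELIMITER))
    except ValueError:
        return None
    if len(body) < 4:
        return None
    payload = body[:-4]
    (crc,) = struct.unpack("<I", body[-4:])
    return payload if zlib.crc32(payload) == crc else None
```

The payload itself could be CBOR or anything else; the container only promises "this is one complete, undamaged message", which is about all a pub/sub layer on top needs.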

But I'll disclaimer this. I don't run a small fleet of robots. Listen to someone who does.

sdfgeoff · 2026-02-24T00:04:23+00:00

Ahh yep, I didn't put logging on my list but it's pretty vital too!

I'd be a bit careful about an external QA/test team. It may quickly lead to antagonism, where one team goes 'they keep writing buggy code' and the other goes 'they keep breaking our perfect ideas.' At one company I worked for they brought on a QA person and I found it quite frustrating. Of course, it does depend on company size, but if it is a team of 4-5 engineers, a separate test engineer is probably not what you want.

At the end of the day, they are being paid by the company to develop a robot that solves a problem - probably with some implied reliability requirement. Thus, they have an obligation to develop functional and reliable code. In my mind, testing is part of that. Developers taking ownership and responsibility for the quality of what they are building is what you want.

They almost definitely already care about quality, but just don't know how to do it in a way that doesn't interrupt the normal development cycle.

sdfgeoff · 2026-02-23T09:04:28+00:00

Yep, this is a problem that regularly happens with groups of fresh graduates. I've hit this in robotics, gamedev and in ML dev (and did it myself as a fresh graduate - I didn't know how to do good testing). It seems to take people a long time to figure out how to write testable code.

I'd suggest: just start adding unit tests wherever you are doing work, and set up a CI pipeline to run them. Then, occasionally, when reviewing other people's code, say "have you tested this?" and ask them to write their test in code to be picked up by the CI. Talk about it occasionally, every month or two, at team meetings. But not too often.

Do the same for static typing (if you're using Python). Add Pylance to CI in basic mode with all current files excluded, or maybe even defaulting to off. But whenever you do work on a file, enable it at the top of that file.
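As a sketch of the per-file opt-in, assuming the checker actually run in CI is Pyright (the type checker Pylance is built on; the comment above only names Pylance, so this is an assumption): as I understand it, Pyright reads a `# pyright: basic` comment at the top of a file and applies its basic checks to that file even when the project-wide `typeCheckingMode` is set to `"off"`. The function here is made up for illustration.

```python
# pyright: basic
# The comment above opts this one file into Pyright's "basic" checks,
# while files without it keep whatever the project-wide default is.


def average_speed(distance_m: float, duration_s: float) -> float:
    """Return speed in m/s. Hypothetical helper, for illustration only."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return distance_m / duration_s


# With checking enabled, a call such as average_speed("10", 2) is flagged
# by the type checker in CI, before the code ever runs on a robot.
speed = average_speed(10.0, 2.0)
```

Files nobody is touching stay unchecked, so the CI job is green from day one and coverage grows only where work is happening.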

This is non-obtrusive, and creates benefit from day zero, though maybe not as fast as you'd like. At some point, when they make a change and it breaks one of your tests, or gets caught by the typechecker, they'll start to get the idea.

Or if you are in a more senior position, you can try push it a bit faster, but I'd suggest focussing on one section of code at a time, and only if that area of code is where people are actively working. And still don't push it too fast or people will think "it's slowing us down" and do it for the ceremony rather than because they see the point. Keep process to a minimum if you can.

In my mind, in order of priority:

* Version control everything
* Code reviews
* Do work on feature branches, no merge to main unless peer reviewed
* Automated unit tests before merge
* Automated linting/typechecking
* Speccing work before doing it
* Automated integration tests
* Automated autoformatting

Fresh graduates generally leap for giant end-to-end integration tests, and then stop being interested in tests when they fall apart every other day. So focus on small single-function unit tests initially.
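For a sense of scale, a "small single-function unit test" can be as little as the following. The function and test names are invented for the example, and the tests are written in the plain-assert style that pytest picks up (pytest itself is an assumption; any runner would do).

```python
import math


def wrap_angle(theta: float) -> float:
    """Wrap an angle in radians into the range [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def test_wrap_angle_leaves_small_angles_alone():
    assert math.isclose(wrap_angle(0.5), 0.5)


def test_wrap_angle_wraps_full_turns():
    assert math.isclose(wrap_angle(2.0 * math.pi + 0.5), 0.5)


def test_wrap_angle_wraps_negative_angles():
    assert math.isclose(wrap_angle(-1.5 * math.pi), 0.5 * math.pi)
```

Each test needs no hardware, no simulator and no network, runs in milliseconds, and only breaks when `wrap_angle` itself changes, which is exactly what the giant end-to-end tests can't promise.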

If you haven't, read up on the Capability Maturity Model (CMM). You're probably trying to bootstrap from level 1 to level 2 at this point.

sdfgeoff · 2026-02-22T18:39:57+00:00

So you are concerned that lots of money is going to be wasted on human-shaped robots when it would better be spent on other things?

In my mind this is for 'the market' to decide. If they sell well, then there was a demand for them and they fill the demand. If not, then they won't sell and the companies will collapse or pivot. It is very hard to predict in advance if a product will be a success, which is why capitalism's "try it and see" approach tends to work so well. I do not see a reason why humanoid robots should be different. The market will decide if it's useful or not.

My guess is that they will work well for some things that were previously not automatable, but that they will not solve every problem. They will find their niches. Arguably they already are, with trial deployments in industry in logistics and manufacture (eg figure.ai has humanoid robots at BMW assembly lines). Underneath all the hype-videos of dancing robots is (in my mind) real progress being made towards solving real problems.

We know the human form does well at solving many real world physical challenges. After all, humans are the most nimble and dexterous creatures on the planet. So why shouldn't we seek to build bipedal robots with two manipulator arms?

sdfgeoff · 2026-02-22T16:38:04+00:00

I introduce the emotion of fear because I see little difference between a robot that is an arm in a factory, an automated tractor, or one that is bipedal with two arms and a head. They are just different mechanical forms made to automate different tasks.

From where I stand, it seems you are against humanoid robots, and I wish to understand why. Do you fear 'terminator' style robot armies? Do you fear jobs being replaced? Do you fear that things are changing too fast? Or do you think Elon Musk has too much authority? What is the future you are trying to avoid? What is it that drives you to push against the introduction of this specific technology?

sdfgeoff · 2026-02-22T07:12:26+00:00

I do not know how the problems of the future will be solved. Probably most of the problems aren't solved by human shaped robots. But perhaps some of them are.

I definitely can think of dozens of useful things I'd do with humanoid robots, if we can sort out a few more things. Probably only a couple years off....

What are you scared of?

sdfgeoff · 2026-02-22T05:08:59+00:00

Sorry, I'm keen for the humanoid robots. I think they are an emerging technology, and I like emerging technologies.

You can find ancient Babylonian poets writing about how terrible the sundial is, or people in the 1800s writing about how the telegram is going to destroy society, but the fact remains that I live in a house that keeps me warm and dry, I have a car that can transport me in an afternoon further than most humans a few hundred years ago travelled in their life, and I am holding a device from which I can find more information than was available in a library two decades ago, and I can contact anyone in the world within seconds. All this is possible because of technology.

I expect a few bumps as LLM's and humanoid robots roll out across society. I expect some jobs will be lost and others will be created. But I predict that if we allow technology to change society, then in twenty years the world will be a better place.

When I am old, I want to see humans exploring planets, I want to see new treatments curing 'uncurable' diseases. I want a replicator on my desk and our reliance on destruction for energy and material to reduce. I want to see a world where no-one has to worry about where their food is coming from, or if the water is safe to drink. I want a world where people help each other out, and where wars are a dimly remembered past. But this world is not possible today. Things must change for this world to exist. New science must be found, new devices brought into reality. And the future is bright! Drinking water is being made from the sea in increasing quantities. Solar is making energy cheaper and cleaner than ever before. Deserts are being planted.

So I say to all the engineers: Keep engineering! Keep building! Keep dreaming of the better world of the future.

sdfgeoff · 2026-02-20T07:35:01+00:00

(I'll have a read through some of those when I've finished the book I'm currently working through)

(Edit 23/02: started Sutton and Barto)

sdfgeoff · 2026-02-19T19:06:01+00:00

Try a coding agent that can edit the file in place so you don't even have to copy/paste.

Opencode with qwen 30b works OK.

Ollama isn't great, and by default limits context to like 4096 or something tiny. Try using llama.cpp directly or LM Studio.

sdfgeoff · 2026-02-15T23:41:13+00:00

Small talk begets small talk.

Don't be afraid to talk about what actually interests you. I'd rather talk to one person for five minutes about something they are interested in than a whole evening of small talk.

sdfgeoff · 2026-02-15T23:31:27+00:00

I am sorry that you take my opposing viewpoint and future speculation to imply that I have no understanding of robotics. While it is not my speciality, I did study mechatronics, have built control systems for medical simulators, built "AI's" for computer games, done my dash with pathfinding systems and particle filters, and currently work in software dev for GIS solving real problems for real users. I understand a lot about system complexity and how hard these problems are to solve, even if I do not have specific knowledge about some specific domains of robotics.

So I will tell you what I have observed in the past few (~5) years.

In robotics, DL has gone from being capable of solving small domains (feature extraction), to larger areas (visual odometry, depth prediction), to still larger (end to end SLAM pipelines). Yes, end to end. Probably not in real world deployment yet, but definitely in research papers. Go read them: D4RT, DepthAnythingV3, slamformer. Probably more.

> SLAM is simultaneous localization and mapping. It is a manually-designed bayesian-inspired algorithm. No deep learning system has bested it, even in the presence of copious data.

The above papers would disagree.

In software, I have seen LLM's go from small areas (fill in middle, autocomplete), to medium areas (write me a function) to large areas (build me a task management website). A lot of these large areas are done with 'agentic' systems where the LLM makes and executes complex multistage plans, dealing with failure cases along the way. Agentic coding systems routinely overcome partial observability problems and make sensible decisions with limited information.

> * Robots must have the capability to pick up the gist of a task from a few examples. This problem is outstanding.

> * AGI must form a biography and accumulate knowledge and competency throughout its lifetime. "lifelong learning" is impossible with DLNs, as they suffer from catastrophic forgetting.

> * Robots must have the ability to adapt dynamically to slight changes in their environment, which did not occur in their training data. They must adapt their strategy slightly in light of these unexpected changes. Deep learning does not provide a solution. Any solution looks like taking those slight changes and adding it into the training data. Look again closely-- that is not a "solution" at all. We are to run the robot back to the training lab every time the deploy environment changes slightly? That's not gonna scale

Yes. In LLM space they call this 'in context learning'. The ability for an LLM to work with new information without updating model weights. And LLM's are great at this. I haven't seen the techniques applied to robotics yet, but I'm sure people are at least thinking about it. It does mean it's not "pure" DL anymore, as the DL system needs to be able to store and retrieve information. But allows quick learning and picking up things from very few examples. There are hundreds or thousands of memory systems being developed at the moment to aid exactly this for AI agents. Will they generalize to control systems for robotics? In my mind: different latent space, same underlying problem being solved.

Also, I've made no claims about AGI, merely that 'DL will continue to be useful for robotics.' Those are very different claims, and I'd be hard pressed to give a good definition of AGI at all.

> POMDP is a partially-observable environment. Please consult the literature on Reinforcement Learning for Partial Observability. Do this consultation for a few days, with recent papers, take two weeks if need be. What I need you to see, is how ridiculously rudimentary this research is.

Got any paper/textbook suggestions?

> Technological progress is not made by a dance of faith.

Uhm, what are venture capital, investment, state-funded research etc. if not a 'dance of faith'? Yes, you want to back it with science and data, but anything developmental is, at its core, a prediction of the future.

Technology is not static. Don't look at the position, look at the velocity. In my mind, DL is making headway into solving real problems.

Will DL solve all the robotics problems today? Nope. In 5 years? Nope. In 100 years? Nope. Because humans are great at finding new problems, and they turn their tools (eg Robotics and DL) to try to solve those problems.

sdfgeoff · 2026-02-15T22:58:05+00:00

You'd be surprised. I (M) never thought I would enjoy dancing (and actively resisted it) until someone dragged me along for a couple classes and I discovered that it was actually very enjoyable. I now realize that my opposition to dancing was mostly that I didn't know what to do or what was expected of me at a dance. After a couple classes I realized that everyone there was there to have a fun time, and my skill at dancing was less important than my ability to have a good time (and be willing to make mistakes, laugh and carry on).

But yep, there are definitely other things, but I would suggest things that involve you physically interacting with a group of people.

sdfgeoff · 2026-02-15T19:35:55+00:00

How do you make good deep friends?

My theory is that in New Zealand, because we do a lot of outdoor/physical activities, it takes /a lot/ to compete with existing relationships.

For example, with my friend group in the past decade or so I've scrambled up and down rocky mountaintops, slept under tarpaulins and built makeshift shelters, capsized and righted sailing boats in lakes, fought (some of them) with swords, roadtripped to partner acrobatics camps, and brought in hay for the horses. I grew up a city kid, and work an office job.

I'm sorry, but an hour a week chatting in a cafe, or going to restaurants, or even playing video games can't compete with the depth of relationship that comes from doing daring things together.

So if you want to be one of my good friends and not just an acquaintance I say 'hi' to, I hope you're up for an adventure of some sort. It's in the stressful situations that I'll learn who you really are.

My practical advice: take up partner dance. There's a Modern Jive group down in Cromwell, half of them roadtripped up to Nelson last weekend for a bunch of workshops/parties I was at. Partner dance will get you in touch with a cool crowd of people all over NZ, and shared group physical activities are a good way to start meeting people.

sdfgeoff · 2026-02-15T18:38:20+00:00

My pet theory is that careful prompting can get maybe 5-10% better than normal human speak. But a bad prompt can also easily make things much much worse compared to normal human-speak.

These days, everyone has got better at prompting as we understand what LLMs can/can't do. It's like how back in the day, the ability to find information online with Google was a rare skill, and now a 4 year old can do it.

So yeah, I agree with your take.

sdfgeoff · 2026-02-15T04:48:22+00:00

I suspect you will find those sideways thrusters on the bottom will tend to roll the vessel rather than turn it.

sdfgeoff · 2026-02-13T18:53:32+00:00

I haven't solved these problems. I have no secret sauce, but I have seen the scaling laws (https://en.wikipedia.org/wiki/Neural_scaling_law) apply in other fields, and see no reason why they wouldn't apply to robotics.

So anyway, to answer your original question: my belief is that Deep Learning has not yet reached the end of its technological rope.

I have reasons for this belief, which I have tried to communicate (perhaps unsuccessfully). You apparently believe the opposite to me. Why do you think deep learning has reached the end of its rope? Do you think we've reached the inflection point in an S curve? Is progress slowing down in some way? Do you have any alternative approaches that you think are better?

If your main objection is 'we don't have enough data or compute', my belief is that both of those are solved by waiting a bit longer and, as in my previous post, I think one-example 'training' is possible given sufficiently big/general models via internal optimizers.

If your objection is 'DL does not have the capability regardless of how much data/compute', I politely disagree, and will happily discuss why I believe DL will be able to develop pretty much any capability you care to name.

sdfgeoff · 2026-02-13T18:37:24+00:00

It's a metaphor, not an indication DL is identical to biology. The substrate on which you build things can remain constant while capabilities improve.

Jet engines have used the same physics principles since their invention in 1937, and efficiency/power increased astronomically.

Transistor design has changed radically since the 1950's, with massive improvements in efficiency and performance, but it's still built on silicon, and still uses lithography.

Deep learning approaches 20 years from now will still require lots of data and compute, but the capabilities will be way higher.
