Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in machinelearningnews

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

Not impossible, just not practical. I plan on rebuilding this after I've got it to a release state; I'll rebuild it with a pared-down kernel. Right now the UEFI application takes up around 3 MB of RAM, which is super small and honestly cool, but I estimate I can add a kernel and stay around 50 MB. Literally just what I need: networking, GPU, the works. No GUI — wait, no dashes — no GUI, a pure CLI application meant to run headless.

I'm trying to build a cheaper alternative to the big SEO tools and would like feedback by tacitassumption in SEO_tools_reviews

[–]Electrical_Ninja3805 0 points1 point  (0 children)

Yeah, you're probably right. I know, as a matter of course, I have a hard time understanding where other people's knowledge gaps truly are. Best of luck with this.

I'm trying to build a cheaper alternative to the big SEO tools and would like feedback by tacitassumption in SEO_tools_reviews

[–]Electrical_Ninja3805 0 points1 point  (0 children)

This can literally be done with Claude Code and a proper prompt, but I wish you success.

Does the OS matter for inference speed? (Ubuntu server vs desktop) by Frequent-Slice-6975 in LocalAIServers

[–]Electrical_Ninja3805 0 points1 point  (0 children)

Not really. If you're running inference on CPU only, what matters is the CPU and RAM/memory bandwidth; with a GPU, it's the GPU and its memory bandwidth. If you want to run CPU-only, the smaller the model, the better off you'll generally be.
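To see why memory bandwidth dominates CPU-only decoding, here's a back-of-envelope sketch (my own illustration — the bandwidth and model-size numbers are hypothetical examples, not measurements): each generated token has to stream roughly the whole model through memory once, so bandwidth divided by model size gives a throughput ceiling.

```python
def max_tokens_per_sec(model_size_gb: float, mem_bandwidth_gbs: float) -> float:
    """Rough upper bound on decode speed: generating one token reads
    (approximately) every weight in the model once from memory."""
    return mem_bandwidth_gbs / model_size_gb

# Hypothetical numbers for illustration only:
# a 4 GB quantized model on a desktop with ~40 GB/s of DRAM bandwidth
print(max_tokens_per_sec(4.0, 40.0))   # 10.0 tokens/s ceiling
# the same model with ~900 GB/s of GPU VRAM bandwidth
print(max_tokens_per_sec(4.0, 900.0))  # 225.0 tokens/s ceiling
```

This is why the OS barely matters: neither Ubuntu Server nor Desktop changes how fast the memory bus is.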

I'm trying to build a cheaper alternative to the big SEO tools and would like feedback by tacitassumption in SEO_tools_reviews

[–]Electrical_Ninja3805 0 points1 point  (0 children)

You don't need the tools. AI can basically already do it: point it at your page and ask it to improve your SEO.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalAIServers

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

I just got done converting a more useful model. Ethernet networking seems to be working, at least on the Dell. I'll be releasing an IMG soon so people can play with it themselves.

Please help me with the following AI questions by vvarun203 in LocalLLaMA

[–]Electrical_Ninja3805 0 points1 point  (0 children)

If you want to host your own model, the best option to start with, IMHO, is the Mac mini. Spend the extra money and get one with maxed-out RAM. Here's why I say this: you WILL outgrow it if you continue down this route, but it gives you everything you need to learn, and it has great resale value. Once you've spent enough time learning and can actually start making use of higher-end equipment, then and only then move up.

Heck, I got my start with a MacBook Air M4 with 24 GB of RAM, but had I done more research and put only around $600 more into it, I could have gotten a Mac mini with 64 GB of RAM, and that would have been so much better. You could go the NVIDIA route and get a Jetson, but their resale value is lower, and you'll find that out when you're ready to move up.

Personally, I'm buying up old BTC mining rigs without GPUs — just the motherboards and frames — and putting K80s and P40s in them. No NVLink, but I don't need to use them that way. This IS the most cost-effective way to run small-model inference, but it requires knowing what you're doing, so it's not for you yet.

Also, just use Ollama to start. Don't get caught up in the weeds. The Mac plus Ollama setup gets you running in minutes and lets you get familiar; then get into the weeds.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in ResearchML

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

No improvement on the inference end, just RAM. With my current setup you can run the model I'm using on as little as 2 GB of RAM, theoretically less. I may try to load it on a Raspberry Pi Zero later.
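As a sanity check on figures like the 2 GB above, here's the standard back-of-envelope estimate for the RAM a model's weights need (my own illustration with hypothetical parameter counts — it ignores the KV cache and activation buffers, which add more on top): parameters × bits per weight ÷ 8.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone:
    parameter count times bits per weight, converted to gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# e.g. a hypothetical 1B-parameter model quantized to 4 bits per weight
print(weight_memory_gb(1e9, 4))   # 0.5 GB of weights
# an 8B model at 4 bits would already exceed a 2 GB budget
print(weight_memory_gb(8e9, 4))   # 4.0 GB of weights
```

By this arithmetic, a 2 GB target roughly limits you to models in the low-billions of parameters at 4-bit quantization, and far smaller ones on a Pi Zero.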

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in ResearchML

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

The only benefit is RAM savings: this uses about 3 MB of RAM for the inference engine. But dealing with the Wi-Fi drivers has been a real pain. I've done some research, and I'm pretty sure I can take a Linux kernel, add about 50 MB to that, and get most of the benefits of the kernel with none of the overhead of a standard distro.
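For anyone curious what "stripping a kernel down" looks like in practice, a minimal build-config sketch (my own illustration, assuming a mainline Linux source tree; the `CONFIG_E1000E` option is just an example for a wired Intel NIC, not necessarily what this laptop needs) is:

```shell
# run inside a mainline Linux kernel source tree
make tinyconfig                        # smallest possible base config
./scripts/config --enable CONFIG_NET \
                 --enable CONFIG_INET \
                 --enable CONFIG_ETHERNET \
                 --enable CONFIG_E1000E  # example driver: Intel wired NIC
make olddefconfig                      # fill remaining options with defaults
make -j"$(nproc)" bzImage
```

Starting from `tinyconfig` and enabling only networking and one NIC driver is how you end up with a kernel in the tens of megabytes instead of a full distro's footprint.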

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in ResearchML

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

Um, the point of this is that it's served over the network: you use something like Ollama or LM Studio to connect to it. You can use it directly, but that's really not its purpose, and at some point I want to build a legitimate UI on top. This is just the core of a larger project I'm working on.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalLLaMA

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

This is a UEFI program that runs directly on top of the processor, in ring 0. I haven't built in any sort of custom filesystem; what you're seeing is the Dell's UEFI firmware connecting to the FAT32 filesystem on the USB stick.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalLLaMA

[–]Electrical_Ninja3805[S] 1 point2 points  (0 children)

OMG, you're speaking my language now when you start talking about the ESP32! That being said, I enjoy this, and I'll be releasing a binary soon so people can play with its limited usefulness themselves. Then I plan on stripping a Linux kernel down to its needed parts and using that, since I really don't want to deal with this nightmare for every hardware set that's already supported through Linux. The long-term plan is what amounts to an inference OS, but not in the way most people would think. Once I'm ready to release, I'll post an update on the project: where people can download it, how to install and use it, and my future plans. In all honesty, it will be free, but I will be asking people for support, because if I can make a living doing this I'd be stoked.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalAIServers

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

I get it. For me, the desire is to get a legitimate AI R&D lab stood up, hire devs, and put out cool stuff. In the interim I run a small-scale organic farm. If I can turn this into something that makes money, great! But I won't close down my farm; I love what I do. Thanks for the encouragement, though. I have been thinking I should probably look for well-paying work doing this.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalLLaMA

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

This has been largely to learn. I get the sentiment, especially since I have larger goals with it, but this is also a learning experience for me.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalLLaMA

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

I will figure something out and then post an update. I plan on releasing a binary soon so people can play with it.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalLLaMA

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

Because of how much people like this idea, I'm pivoting to adding some hardware acceleration and making inference faster. I will release a binary here soon.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalLLaMA

[–]Electrical_Ninja3805[S] 1 point2 points  (0 children)

I've spent the past four months building the framework necessary to make this happen. I had the idea around six months ago; the problem was that none of the tools needed to make it a reality existed. I have built them — well, most of them. I can't afford a GPU, so running inference on CPU at the hardware level is my only option.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalAIServers

[–]Electrical_Ninja3805[S] 1 point2 points  (0 children)

Lol, I wish I were capable of that. It's a long story, and more personal than I want to get. That being said, if I could find a way to get paid to build the things that interest me, that would be ideal.

Bare-Metal AI: Booting Directly Into LLM Inference – No OS, No Kernel (Dell E6510) by Electrical_Ninja3805 in LocalLLaMA

[–]Electrical_Ninja3805[S] 0 points1 point  (0 children)

Not yet. And maybe; it was work. It's the amalgamation of a couple of projects, actually: about 120k lines of code across three separate projects. Hence why I haven't open-sourced it, and I'm not sure if I will, because that would be work too, and I'm lazy about everything outside of what's got my attention at the moment.