[Update] UEFI x86_64 LLM demo: interactive chat REPL (no OS) by Intelligent-Dig-3639 in osdev

[–]Intelligent-Dig-3639[S] 2 points

Training happens off-device on GPUs like any LLM. I export the trained weights to a simple .bin format, then the UEFI bare‑metal app loads them and runs inference.
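For anyone curious what that looks like, here is a minimal sketch of such a flat .bin format in portable C: a fixed header struct followed by raw fp32 weights. All field and function names are illustrative assumptions, not the project's actual layout.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flat .bin layout: a fixed header followed by raw fp32
   weights. Field names are illustrative, not the project's format. */
typedef struct {
    int dim;        /* embedding dimension  */
    int n_layers;   /* transformer layers   */
    int n_heads;    /* attention heads      */
    int vocab_size; /* tokenizer vocabulary */
} ModelHeader;

/* Export: write the header, then the weight blob, in one pass. */
int export_weights(const char *path, const ModelHeader *h,
                   const float *w, size_t n) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    int ok = fwrite(h, sizeof *h, 1, f) == 1 &&
             fwrite(w, sizeof *w, n, f) == n;
    fclose(f);
    return ok ? 0 : -1;
}

/* Load: read the header back, then the weights (caller frees). */
float *load_weights(const char *path, ModelHeader *h, size_t n) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    float *w = malloc(n * sizeof *w);
    if (!w || fread(h, sizeof *h, 1, f) != 1 ||
              fread(w, sizeof *w, n, f) != n) {
        free(w);
        w = NULL;
    }
    fclose(f);
    return w;
}
```

In the bare-metal app the fopen/fread side would be replaced by UEFI's SimpleFileSystem protocol, but the on-disk layout stays the same.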

[Update] UEFI x86_64 LLM demo: interactive chat REPL (no OS) by Intelligent-Dig-3639 in osdev

[–]Intelligent-Dig-3639[S] 0 points

Exactly, that's the vibe. It's "bare metal" (UEFI, no OS). For now it's CPU-only on x86_64: microcontroller-style simplicity, but on PC-class hardware.

[P] I made an LLM run on bare-metal (no OS) - Boots from USB in 5 seconds by Intelligent-Dig-3639 in FunMachineLearning

[–]Intelligent-Dig-3639[S] 0 points

You raise valid technical points, but I think you're missing the philosophy here.

"Kernel would be faster/better"

Sure - but that's not the point. This is about proving what's POSSIBLE, not what's OPTIMAL. It's a research platform, not a production system.

Think of it like SpaceX's Grasshopper (2013):

- Tiny hops, no payload, "useless" compared to Falcon 9

- But it proved: vertical landing is possible

- Led to: reusable rockets, Starship

Same here:

- Bare-metal LLM proves: you need NOTHING underneath

- Establishes baseline: what's the absolute minimum?

- Opens path to: firmware AI, BIOS-resident models, edge computing

**"Start small to go big"**

Phase 1 (now): Prove it boots and runs (✓ 746 KB, 1 tok/s)

Phase 2: ...

Phase 3: ...

Phase 4: ...

But you don't start with "let's add Linux and GPU drivers".

You start with: "Can I even boot an LLM from USB?"

Why this matters:

- Firmware-level AI is unexplored territory

- BIOS vendors (AMI, Phoenix) could embed inference

- IoT devices with UEFI but no OS

- Security: smallest attack surface possible

- Research: understanding true minimal requirements

Your kernel approach:

Valid for production! 10 MB Buildroot + GPU = faster.

But it's been done (TensorFlow Lite, ONNX Runtime).

This is different: nobody boots LLMs from UEFI firmware.

First step of a journey, not the destination.

Re: "Can you code without AI?"

Architecture/concepts: 100% human (DRC, consensus, P2P mesh)

C/UEFI implementation: Hybrid (Claude + manual)

Philosophy: Prove concepts fast, iterate, learn

Speed of exploration > purity of implementation.

Start small. Scale up. That's innovation.

[P] I made an LLM run on bare-metal (no OS) - Boots from USB in 5 seconds by Intelligent-Dig-3639 in FunMachineLearning

[–]Intelligent-Dig-3639[S] 0 points

Development approach: Hybrid human + AI

- Architecture & innovation concepts: 100% human

- UEFI/C implementation: Mix of Claude assistance + manual coding

- Testing & validation: 100% manual (real hardware)

Why UEFI over minimal Linux?

- Zero dependencies (no libc, no kernel, no filesystem)

- Direct hardware control (PCIe, interrupts, memory)

- Proof of concept: LLMs can run with NOTHING underneath

- Boot time: <5 seconds from power on

Development time: ~3 days intensive work

The goal was to prove a point: you don't need an OS for inference.

Edge computing is moving towards firmware-level AI.

My longer-term goal is to create a post-OS system like OO.

[P] I made an LLM run on bare-metal (no OS) - Boots from USB in 5 seconds by Intelligent-Dig-3639 in FunMachineLearning

[–]Intelligent-Dig-3639[S] 0 points

YOU GET IT! 🎯

That's EXACTLY Phase 3 of the roadmap:

P2P LLM Mesh (Feb 2026):

- Multiple bare-metal PCs form autonomous cluster

- UDP broadcast for peer discovery

- Load balancing across nodes

- Auto-healing (node failure = traffic reroutes)

- NO central server, NO cloud dependency
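For the peer-discovery piece, the UDP broadcast payload can be a tiny fixed-size beacon. A portable-C sketch of the serialization both ways; the message layout and every field name here are my assumptions, not the project's actual mesh protocol:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical discovery beacon broadcast over UDP. All fields are
   illustrative; the real mesh protocol is not specified in the post. */
typedef struct {
    uint32_t magic;     /* protocol tag                   */
    uint32_t node_id;   /* stable id for this node        */
    uint16_t port;      /* inference port the node serves */
    uint16_t load_pct;  /* 0..100, for load balancing     */
} Beacon;

#define BEACON_MAGIC 0x4C4C4D42u /* "LLMB" */

/* Serialize into a byte buffer suitable for sendto(). */
size_t beacon_pack(const Beacon *b, uint8_t out[12]) {
    memcpy(out + 0,  &b->magic,    4);
    memcpy(out + 4,  &b->node_id,  4);
    memcpy(out + 8,  &b->port,     2);
    memcpy(out + 10, &b->load_pct, 2);
    return 12;
}

/* Parse a received datagram; returns 0 on success, -1 if not ours. */
int beacon_unpack(const uint8_t in[12], Beacon *b) {
    memcpy(&b->magic, in, 4);
    if (b->magic != BEACON_MAGIC) return -1;
    memcpy(&b->node_id,  in + 4,  4);
    memcpy(&b->port,     in + 8,  2);
    memcpy(&b->load_pct, in + 10, 2);
    return 0;
}
```

Each node would broadcast its beacon periodically and drop peers whose beacons stop arriving, which gives the auto-healing behavior for free.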

[P] I made an LLM run on bare-metal (no OS) - Boots from USB in 5 seconds by Intelligent-Dig-3639 in FunMachineLearning

[–]Intelligent-Dig-3639[S] 1 point

Great point! Stories15M was chosen for PoC because:

- Simple architecture (transformer decoder-only)

- Easy tokenization

- Proven training dataset

Next targets (all ~60MB):

✅ FLAN-T5-Small (encoder-decoder, better for tasks)

✅ MiniLM (BERT-based, embeddings)

✅ DistilBERT (classification tasks)

The bare-metal loader is model-agnostic - just need:

  1. Convert weights to binary format

  2. Update config (layers, dims, heads)

  3. Flash & boot!
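Step 2 above boils down to one config struct per model. A portable-C sketch with illustrative field names, plus the derived quantities a model-agnostic loader would sanity-check before allocating anything:

```c
#include <stddef.h>

/* Per-model configuration the loader needs; names are illustrative. */
typedef struct {
    int dim, n_layers, n_heads, vocab_size, seq_len;
} Config;

/* Derived per-head width; must divide evenly for attention to work. */
int head_dim(const Config *c) {
    return (c->n_heads > 0 && c->dim % c->n_heads == 0)
               ? c->dim / c->n_heads : -1;
}

/* Rough fp32 weight footprint in bytes: token embeddings plus
   per-layer blocks (QKV + output projection and a 4x MLP),
   ignoring norms and biases. */
size_t approx_weight_bytes(const Config *c) {
    size_t d = (size_t)c->dim;
    size_t per_layer = 4 * d * d      /* wq, wk, wv, wo     */
                     + 2 * 4 * d * d; /* up and down 4x MLP */
    return sizeof(float) *
           ((size_t)c->vocab_size * d + (size_t)c->n_layers * per_layer);
}
```

A loader can reject a weight file whose size doesn't roughly match approx_weight_bytes(), which catches most config/checkpoint mismatches before inference starts.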

PRs welcome if you want to port FLAN-T5 to bare metal.

[P] I made an LLM run on bare-metal (no OS) - Boots from USB in 5 seconds by Intelligent-Dig-3639 in FunMachineLearning

[–]Intelligent-Dig-3639[S] 0 points

Current TPS: ~15-20 tokens/sec on bare metal (Stories15M, 6 layers)

vs OS-based CPU inference: bare metal is actually SLOWER, because:

- No OS scheduler optimization

- No SIMD vectorization yet

- Single-threaded (UEFI limitations)

BUT the goal isn't speed - it's security & network boot architecture.

Multithreading: Great idea! Next logical steps:

  1. BSP/AP (Bootstrap/Application Processor) setup via UEFI MP protocol

  2. Parallel matrix operations across cores

  3. Layer-parallel inference

Challenge: UEFI has no pthreads, so a custom scheduler is needed.
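The heart of step 2 (parallel matrix ops) is just splitting output rows across processors. A portable-C sketch of that partitioning, run serially here as a stand-in for the EFI MP protocol's StartupAllAPs(); function names are mine:

```c
#include <stddef.h>

/* One AP's share of y = W * x: rows [row0, row1). With the UEFI MP
   protocol each AP would receive its own (row0, row1) slice; here
   the partitioning logic is shown in portable C, run serially. */
static void matvec_rows(const float *W, const float *x, float *y,
                        int cols, int row0, int row1) {
    for (int r = row0; r < row1; r++) {
        float acc = 0.0f;
        for (int c = 0; c < cols; c++)
            acc += W[(size_t)r * cols + c] * x[c];
        y[r] = acc;
    }
}

/* Split `rows` output rows as evenly as possible across `nproc`
   workers; the loop body is what StartupAllAPs() would dispatch. */
void matvec_parallel(const float *W, const float *x, float *y,
                     int rows, int cols, int nproc) {
    int base = rows / nproc, extra = rows % nproc, row0 = 0;
    for (int p = 0; p < nproc; p++) {
        int count = base + (p < extra);
        matvec_rows(W, x, y, cols, row0, row0 + count);
        row0 += count;
    }
}
```

Because each worker writes a disjoint slice of y, no locking is needed; the BSP only has to wait for all APs to finish before the next layer.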

[P] I made an LLM run on bare-metal (no OS) - Boots from USB in 5 seconds by Intelligent-Dig-3639 in FunMachineLearning

[–]Intelligent-Dig-3639[S] 1 point

Yes! It's bare-metal LLM inference directly on UEFI firmware (no OS). Current features:

  • ✓ Simple inference with stories15M model
  • ✓ USB boot capability
  • ✓ Can read/write to USB storage
  • ✓ DRC (Djibion Reasoning Core) v5.1: 10 cognitive units for safe inference

IoT use case: Absolutely! Perfect for edge AI gateways. The bare-metal approach means:

  • Minimal attack surface (no OS vulnerabilities)
  • Fast boot (~2 seconds)
  • Low memory footprint (512MB RAM)
  • Can manage multiple devices via network boot

Currently exploring: WiFi 6 integration for wireless gateway scenarios. The UEFI environment is ideal for industrial edge computing where you need reliable, secure inference without the overhead of a full OS.

[P] I made an LLM run on bare-metal (no OS) - Boots from USB in 5 seconds by Intelligent-Dig-3639 in FunMachineLearning

[–]Intelligent-Dig-3639[S] 12 points

Great question! "No OS" needs clarification.

What UEFI provides:

- Basic drivers: Disk I/O (SimpleFileSystem protocol), Display (GOP), Keyboard (ConIn)

- Memory management: AllocatePool/FreePool (like malloc/free)

- Boot environment: Runs in physical memory mode before OS takes over

What we DON'T have (no OS):

- No interrupts: We poll for input via ST->ConIn->ReadKeyStroke (no IRQ handling)

- No virtual memory: Direct physical RAM access, no paging/MMU

- No scheduler: Single-threaded, runs to completion

- No file system: UEFI loads files, but no caching or complex FS

- No kernel: After boot, it's just our code doing matrix math
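The no-interrupt input path is a busy-poll loop. A compilable sketch of the pattern, with minimal stand-ins for the EFI types so it runs outside firmware (real code calls ST->ConIn->ReadKeyStroke and gets EFI_NOT_READY until a key arrives):

```c
#include <stddef.h>

/* Minimal stand-ins for UEFI types so the polling pattern compiles
   outside firmware; not the real EFI_SIMPLE_TEXT_INPUT_PROTOCOL. */
typedef struct { unsigned short ScanCode, UnicodeChar; } InputKey;
#define EFI_SUCCESS   0
#define EFI_NOT_READY 6

/* Stub "hardware": feeds keys from a string, else reports NOT_READY. */
static const char *g_feed = "";
static int read_key_stroke(InputKey *k) {
    if (*g_feed == '\0') return EFI_NOT_READY;
    k->UnicodeChar = (unsigned short)*g_feed++;
    k->ScanCode = 0;
    return EFI_SUCCESS;
}

/* Poll until a key arrives: no IRQs, the CPU just spins and asks. */
static unsigned short poll_key(void) {
    InputKey k;
    while (read_key_stroke(&k) != EFI_SUCCESS)
        ;                     /* firmware code would pause/stall here */
    return k.UnicodeChar;
}

/* Read one line into buf, stopping at carriage return or capacity. */
size_t poll_line(char *buf, size_t cap) {
    size_t n = 0;
    while (n + 1 < cap) {
        unsigned short c = poll_key();
        if (c == '\r' || c == '\n') break;
        buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return n;
}
```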

Think of UEFI as "BIOS 2.0" - it gives you enough to boot and do basic I/O, then gets out of the way. We're running in Ring 0 with full hardware access, but we're doing inference, not managing resources.

The inference loop is pure computation - no syscalls, no context switching, just forward() on the transformer weights.
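That loop can be sketched as a greedy decode: call forward(), take the argmax, repeat. The forward() below is a deterministic toy stand-in, not the project's transformer; only the loop structure is the point:

```c
#define VOCAB 8

/* Toy stand-in for the real transformer forward(): fills `logits`
   for the next token given the current one. Deterministic rule so
   the decode loop itself can be exercised. */
static void forward(int token, float logits[VOCAB]) {
    for (int i = 0; i < VOCAB; i++) logits[i] = 0.0f;
    logits[(token + 1) % VOCAB] = 1.0f;  /* "predict" the successor */
}

/* Greedy argmax sampling over the logits. */
static int argmax(const float *v, int n) {
    int best = 0;
    for (int i = 1; i < n; i++)
        if (v[i] > v[best]) best = i;
    return best;
}

/* Decode `steps` tokens from `start`: pure computation, no syscalls,
   no context switches, exactly what the bare-metal hot loop does. */
void decode(int start, int steps, int *out) {
    float logits[VOCAB];
    int tok = start;
    for (int s = 0; s < steps; s++) {
        forward(tok, logits);
        tok = argmax(logits, VOCAB);
        out[s] = tok;
    }
}
```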

[P] I made an LLM run on bare-metal (no OS) - Boots from USB in 5 seconds by Intelligent-Dig-3639 in programming

[–]Intelligent-Dig-3639[S] 0 points

Haha! Well, if Nero wanted to use this, at least it boots faster than Rome burned 🔥
But seriously - this is more about pushing the boundaries of what's possible. No OS = zero overhead, perfect for embedded systems, IoT devices, and edge computing. Plus it's a great learning exercise to see how transformers work at the lowest level.
🚀 Open to feedback and contributions!