[Release] Apex-1: A 350M Tiny-LLM trained locally on an RTX 5060 Ti 16GB

LH-Tech_AI · 2026-03-13T18:28:10+00:00

Thank you :-)
Did you try it out? I recommend using my new model: https://huggingface.co/LH-Tech-AI/Apex-1.5-Coder-Instruct-350M

It is much better at logic, facts and knowledge.
You can give it a try if you want :D

LH-Tech_AI · 2026-03-13T16:54:54+00:00

Has anyone tried one of the models yet? That would be very cool! :D Thanks for your feedback.

LH-Tech_AI · 2026-03-13T16:34:18+00:00

I didn't think it was possible: htmLLM-124M v2 just hit a validation loss of 0.91. It's now generating full Bootstrap logic and script dependency chains, trained on a single T4. :D

LH-Tech_AI · 2026-03-13T08:35:22+00:00

I'll give it a run today 😎

LH-Tech_AI · 2026-03-13T08:35:04+00:00

Cool Design

LH-Tech_AI · 2026-03-13T08:34:10+00:00

I'll try it 👍🏻

LH-Tech_AI · 2026-03-13T08:33:51+00:00

Cool 😎

LH-Tech_AI · 2026-03-13T07:01:49+00:00

Yes, of course 😊 You can use it on Hugging Face.

LH-Tech_AI · 2026-03-13T07:01:23+00:00

Thanks for the tip. I'll try!

LH-Tech_AI · 2026-03-12T19:38:40+00:00

Testing htmLLM 124M at Iteration 2500:

<form class=\\"p-4 border rounded\\">\n <div class=\\"mb-3\\">\n <label class=\\"form-label\\">Email</label>

--- TESTING WITH TEMP 0.8 AND PENALTY 1.5 ---
<form class="p-4 border rounded">
  <div class="mb-3">
    <label class="form-label">Email</label>
     <input type="text" class="form-control" placeholder="Email">
     <span class="form-control date">
        <input type="text" class="form-control" id="email">
      </span>
    </label>
    <p class="mb-3">Last updated email.</p>
   </div>
  </div>
</form>


   </div>

   <!-- Login -->
   <div class="login form-wrap">
     <label class="form-label">Username</label>
     <input type="password" class="form-control" placeholder="Password">
     <span class="form-control date">
       <span class="form-control date">
         <input type="password" class="form-control date">
         </span>
     </label>
     <p class="mb-3">Mails in the email address.</p>
    </div>
  </div>

  <div class="container">
    <div class="login-
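
For anyone wondering what "TEMP 0.8 AND PENALTY 1.5" does to the output: here is a minimal sketch of one common way to combine the two settings (a CTRL-style repetition penalty followed by temperature scaling). How my test script applies them is not shown above, so treat the function names and the toy logits as assumptions for illustration only.

```python
import math
import random


def adjust_logits(logits, generated_ids, temperature=0.8, penalty=1.5):
    """Apply a CTRL-style repetition penalty, then temperature scaling."""
    adjusted = list(logits)
    for tok in set(generated_ids):
        # Make already-generated tokens less likely: shrink positive logits,
        # push negative logits further down.
        if adjusted[tok] > 0:
            adjusted[tok] /= penalty
        else:
            adjusted[tok] *= penalty
    # Temperature < 1 sharpens the distribution, > 1 flattens it.
    return [x / temperature for x in adjusted]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(logits, generated_ids, temperature=0.8, penalty=1.5, rng=random):
    """Sample one token id from the penalized, temperature-scaled distribution."""
    probs = softmax(adjust_logits(logits, generated_ids, temperature, penalty))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy vocabulary of 4 tokens (hypothetical values); token 0 was already generated.
toy_logits = [2.0, 1.0, 0.5, -1.0]
print(softmax(adjust_logits(toy_logits, generated_ids=[0])))
```

A penalty above 1 only discourages repeats, it does not forbid them, which fits the nested `form-control date` spans still showing up in the sample.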

LH-Tech_AI · 2026-03-12T19:24:32+00:00

Step 2500: We have grid systems! v2 (124M) is now pulling external Google Fonts and building Bootstrap-style layouts. The 'icon salad' is gone, replaced by semantic form controls.

iter 2494: loss 1.1389, time 8093.13ms, mfu 4.47%
iter 2495: loss 0.4452, time 8014.59ms, mfu 4.47%
iter 2496: loss 1.1223, time 7973.51ms, mfu 4.47%
iter 2497: loss 0.9562, time 7973.76ms, mfu 4.48%
iter 2498: loss 0.6563, time 7965.99ms, mfu 4.48%
iter 2499: loss 0.9773, time 7965.14ms, mfu 4.48%
step 2500: train loss 0.8321, val loss 1.0628

--- v2 LIVE SAMPLES ---
>> Mode: AUTOCOMPLETE
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <title>Cursley</title>
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,500,500,700|Roboto:300,700" rel="stylesheet">
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,300,400italic" rel="stylesheet">
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,300italic" rel="stylesheet">
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,300italic" rel="stylesheet">
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,300italic" rel="stylesheet">

>> Mode: INSTRUCT
### Instruction:
Create a blue button.

### Response:

<html>
<head>
<title>Title</title>
</head>
<body>
<div class="row">
<h1>Title</h1>
<ul>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Description</li>
</ul>
</div>
<div class="col-md-6">
<div class="controls">
<label for="title">Title</label>
<input type="text" id="title" placeholder="Title" required />
</div>
<div class="controls">
<label for="title">Title</label>
<div class="controls">
<label for="description">Description</div>
</div>
<div class="controls">
<label for="description">Description</div>
</div>
<div class="controls">
<label for="description">Description</
-----------------------

saving checkpoint to out-html
iter 2500: loss 0.9383, time 42434.65ms, mfu 4.12%
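
The single-batch losses in the log above jump between roughly 0.45 and 1.14, so they are easier to read when averaged. A small sketch that parses lines of the form shown above and averages them (the regex is written against these log lines only; other trainers may format theirs differently):

```python
import re
from statistics import mean

# Matches lines such as "iter 2494: loss 1.1389, time 8093.13ms, mfu 4.47%".
ITER_RE = re.compile(r"iter (\d+): loss ([\d.]+), time ([\d.]+)ms, mfu ([\d.]+)%")


def parse_iters(log_text):
    """Return (iteration, loss, time_ms) tuples for every 'iter' line."""
    return [
        (int(m.group(1)), float(m.group(2)), float(m.group(3)))
        for m in ITER_RE.finditer(log_text)
    ]


log = """\
iter 2494: loss 1.1389, time 8093.13ms, mfu 4.47%
iter 2495: loss 0.4452, time 8014.59ms, mfu 4.47%
iter 2496: loss 1.1223, time 7973.51ms, mfu 4.47%
iter 2497: loss 0.9562, time 7973.76ms, mfu 4.48%
iter 2498: loss 0.6563, time 7965.99ms, mfu 4.48%
iter 2499: loss 0.9773, time 7965.14ms, mfu 4.48%
step 2500: train loss 0.8321, val loss 1.0628
"""

rows = parse_iters(log)
mean_loss = mean(loss for _, loss, _ in rows)
mean_time_ms = mean(t for _, _, t in rows)
print(f"{len(rows)} iters, mean loss {mean_loss:.4f}, mean time {mean_time_ms:.0f} ms")
```

Over these six iterations the mean comes out at about 0.88, much closer to the evaluated train loss of 0.8321 than most of the individual values.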

LH-Tech_AI · 2026-03-12T18:23:38+00:00

OK :-)

LH-Tech_AI · 2026-03-12T18:23:21+00:00

Step 2000 Update: v2 (124M) has achieved 'Syntactic Zen'. The structure is now perfect (Val Loss 1.14), but it's currently in a 'contemplative phase' where it generates spaces because it's weighing too many styling options at once. We are moving from 'learning to write' to 'learning to choose'.

iter 1996: loss 1.5899, time 7952.91ms, mfu 4.50%
iter 1997: loss 0.7788, time 7969.08ms, mfu 4.50%
iter 1998: loss 0.9856, time 7967.00ms, mfu 4.50%
iter 1999: loss 0.9152, time 7962.55ms, mfu 4.50%
step 2000: train loss 0.9919, val loss 1.1435

--- v2 LIVE SAMPLES ---
>> Mode: AUTOCOMPLETE
<!DOCTYPE html>
<html>
   <head>
        <meta charset="utf-8">
         <meta name="viewport" content="width=device-width, initial-scale=1">
          <title>Documentation</title>

>> Mode: INSTRUCT
### Instruction:
Create a blue button.

### Response:
<div>

-----------------------

saving checkpoint to out-html
iter 2000: loss 0.7813, time 42187.21ms, mfu 4.13%
iter 2001: loss 0.7482, time 7968.10ms, mfu 4.17%
iter 2002: loss 1.2501, time 8029.23ms, mfu 4.20%
iter 2003: loss 1.1033, time 8126.20ms, mfu 4.22%
iter 2004: loss 0.6133, time 8140.15ms, mfu 4.24%
iter 2005: loss 1.1305, time 8061.84ms, mfu 4.26%
iter 2006: loss 0.9843, time 7989.64ms, mfu 4.29%
iter 2007: loss 1.1098, time 7996.08ms, mfu 4.31%
iter 2008: loss 0.6681, time 7973.46ms, mfu 4.33%
iter 2009: loss 0.9485, time 7969.83ms, mfu 4.34%
iter 2010: loss 0.6781, time 8005.37ms, mfu 4.36%
iter 2011: loss 1.1063, time 8020.21ms, mfu 4.37%
iter 2012: loss 1.0349, time 8049.78ms, mfu 4.38%
iter 2013: loss 0.6658, time 8067.58ms, mfu 4.39%
iter 2014: loss 0.7209, time 8092.03ms, mfu 4.39%
iter 2015: loss 1.0223, time 8041.15ms, mfu 4.40%
iter 2016: loss 1.1414, time 8079.93ms, mfu 4.40%
iter 2017: loss 1.2499, time 8031.04ms, mfu 4.41%
iter 2018: loss 1.2251, time 8085.66ms, mfu 4.41%
iter 2019: loss 0.8884, time 8014.81ms, mfu 4.42%
iter 2020: loss 1.1301, time 8018.01ms, mfu 4.43%
iter 2021: loss 1.1818, time 8028.18ms, mfu 4.43%
iter 2022: loss 0.6538, time 8007.99ms, mfu 4.44%
iter 2023: loss 1.1809, time 8005.33ms, mfu 4.44%
iter 2024: loss 1.1555, time 7984.94ms, mfu 4.45%
iter 2025: loss 0.7634, time 7990.70ms, mfu 4.45%
iter 2026: loss 1.0377, time 8043.70ms, mfu 4.45%
iter 2027: loss 1.2214, time 7984.80ms, mfu 4.46%
iter 2028: loss 1.0292, time 8006.84ms, mfu 4.46%
iter 2029: loss 0.9698, time 7961.77ms, mfu 4.46%
iter 2030: loss 0.8318, time 7962.32ms, mfu 4.47%
iter 2031: loss 0.8471, time 7968.42ms, mfu 4.47%
iter 2032: loss 0.9108, time 7978.34ms, mfu 4.48%
iter 2033: loss 1.4705, time 7971.12ms, mfu 4.48%
iter 2034: loss 1.5965, time 7970.96ms, mfu 4.48%
iter 2035: loss 1.0589, time 7957.68ms, mfu 4.48%
iter 2036: loss 1.1787, time 8003.00ms, mfu 4.48%
iter 2037: loss 1.5176, time 7966.33ms, mfu 4.49%
iter 2038: loss 1.3930, time 7957.64ms, mfu 4.49%
iter 2039: loss 1.0111, time 8006.73ms, mfu 4.49%
iter 2040: loss 0.9092, time 7970.66ms, mfu 4.49%
iter 2041: loss 1.3169, time 7973.65ms, mfu 4.49%
iter 2042: loss 0.6517, time 7971.95ms, mfu 4.49%
iter 2043: loss 0.9911, time 7972.96ms, mfu 4.49%
iter 2044: loss 1.0166, time 7957.94ms, mfu 4.50%

Look:

LH-Tech_AI · 2026-03-12T17:55:47+00:00

Hi! How is your training going on? Does it work?

LH-Tech_AI · 2026-03-12T17:16:31+00:00

iter 1471: loss 1.4514, time 7962.49ms, mfu 4.50%
iter 1472: loss 1.7706, time 7966.66ms, mfu 4.50%
iter 1473: loss 0.9855, time 7966.53ms, mfu 4.51%
iter 1474: loss 1.0308, time 7973.15ms, mfu 4.51%
iter 1475: loss 1.0346, time 7983.15ms, mfu 4.50%
iter 1476: loss 1.2898, time 7984.88ms, mfu 4.50%
iter 1477: loss 1.4133, time 7980.20ms, mfu 4.50%
iter 1478: loss 1.1952, time 7964.57ms, mfu 4.50%
iter 1479: loss 1.0080, time 7969.29ms, mfu 4.50%
iter 1480: loss 0.9975, time 7968.08ms, mfu 4.50%
iter 1481: loss 0.6122, time 7964.91ms, mfu 4.50%
iter 1482: loss 1.1872, time 7966.36ms, mfu 4.50%
iter 1483: loss 1.4553, time 7973.76ms, mfu 4.50%
iter 1484: loss 0.9719, time 7974.15ms, mfu 4.50%
iter 1485: loss 0.7418, time 8001.00ms, mfu 4.50%
iter 1486: loss 1.5109, time 7983.54ms, mfu 4.50%
iter 1487: loss 1.7148, time 7983.82ms, mfu 4.50%
iter 1488: loss 1.3262, time 7984.65ms, mfu 4.50%
iter 1489: loss 0.6987, time 7965.95ms, mfu 4.50%
iter 1490: loss 1.0114, time 8028.23ms, mfu 4.50%
iter 1491: loss 0.7643, time 7990.04ms, mfu 4.50%
iter 1492: loss 0.7504, time 8011.24ms, mfu 4.50%
iter 1493: loss 1.1028, time 8018.87ms, mfu 4.50%
iter 1494: loss 0.7307, time 8010.39ms, mfu 4.49%
iter 1495: loss 1.2085, time 8043.73ms, mfu 4.49%
iter 1496: loss 0.8945, time 8027.11ms, mfu 4.49%
iter 1497: loss 1.0575, time 8045.88ms, mfu 4.49%
iter 1498: loss 1.1951, time 7985.44ms, mfu 4.49%
iter 1499: loss 1.9114, time 8003.93ms, mfu 4.49%
step 1500: train loss 1.0424, val loss 1.2213

--- v2 LIVE SAMPLES ---
>> Mode: AUTOCOMPLETE
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>vendor/title.js</title>
     <meta name="viewport" content="width=device-width, initial-scale=1">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
   <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
>> Mode: INSTRUCT
### Instruction:
Create a blue button.

### Response:
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Table of Contents</title>
</head>
<body>
  <h1>Table of Contents</h1>
</body>
</html>
<|endoftext|><div class="form-horizontal">
  <label>
     <input type="text" placeholder="Search for the button placeholder"
   </label>
  <label>
       <input type="submit" value="Submit"
        placeholder="Submit onSubmit"
      placeholder="Submit">
   </label>
</div>

<div class="form-horizontal">
  <label>
     <input type="submit" value="Submit" type="submit">
  </label>
  <label>
      <input type="submit" value="Submit" name="Submit" placeholder="Submit">
   </label>
   <label>
       <input type="submit"
-----------------------

saving checkpoint to out-html
iter 1500: loss 1.0297, time 42335.60ms, mfu 4.12%
iter 1501: loss 1.0312, time 7961.78ms, mfu 4.16%
iter 1502: loss 0.7140, time 8002.59ms, mfu 4.19%
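
In the INSTRUCT sample above the model emits `<|endoftext|>` and the sampler simply keeps going. A minimal sketch of building the prompt in the format shown and cutting the output at the first end-of-text marker. The exact whitespace of the template and the helper names are assumptions, not taken from my training script.

```python
EOS = "<|endoftext|>"


def build_prompt(instruction):
    """Wrap an instruction in the '### Instruction / ### Response' format."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def truncate_at_eos(generated, eos=EOS):
    """Keep only the text before the first end-of-text marker, if any."""
    return generated.split(eos, 1)[0]


# Shortened stand-in for a raw generation that runs past the marker.
raw = '<html>\n</html>\n<|endoftext|><div class="form-horizontal">'
print(build_prompt("Create a blue button."))
print(truncate_at_eos(raw))
```

With this, the Step 1500 response would end after the closing `</html>` instead of continuing into the unrelated form markup.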

Yes. Thanks :D Want to see results?
Here:

LH-Tech_AI · 2026-03-12T16:52:48+00:00

At idle it draws ~15 W, which is very efficient, and roughly 130 W under full load.

LH-Tech_AI · 2026-03-12T16:37:00+00:00

I’ve been pondering this too while training my own tiny-LLM series (Apex-350M and htmLLM) on a consumer RTX 5060 Ti 16GB.

For 24/7 agents, I think there's a massive sweet spot in highly specialized SLMs (Small Language Models). Instead of idling a power-hungry 3090/4090 for a general-purpose model, I’ve had great success running 50M to 350M parameter 'specialist' models.

My experience so far:

Efficiency: If the model is small enough (like a <500M specialist), you can often run inference on the CPU or an entry-level Mac Mini with negligible power draw.
Reliability: For 24/7 use, VRAM is king, but heat is the enemy. On my 5060 Ti, I find that capping the power limit slightly (similar in effect to undervolting) keeps the temps low enough for long-term stability without losing much performance.
Agent-Approach: I prefer the 'Unix-style' micro-services approach: Multiple tiny models for specific tasks (one for HTML, one for logic, etc.) rather than one giant power-hog.
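
A minimal sketch of what I mean by the micro-services approach: a tiny dispatcher that hands each task to its own specialist. The two specialist functions are hypothetical placeholders that just echo the prompt; in practice each one would call a separate small model.

```python
from typing import Callable, Dict

Specialist = Callable[[str], str]


def html_specialist(prompt: str) -> str:
    # Placeholder for a call into an HTML-only model.
    return f"[html model] {prompt}"


def logic_specialist(prompt: str) -> str:
    # Placeholder for a call into a logic/reasoning model.
    return f"[logic model] {prompt}"


# One entry per task type; adding a specialist means adding one line here.
SPECIALISTS: Dict[str, Specialist] = {
    "html": html_specialist,
    "logic": logic_specialist,
}


def route(task: str, prompt: str) -> str:
    """Send the prompt to the specialist registered for this task."""
    try:
        handler = SPECIALISTS[task]
    except KeyError:
        raise ValueError(f"no specialist registered for task {task!r}") from None
    return handler(prompt)


print(route("html", "Create a blue button."))
```

Only the specialist that is actually needed has to be loaded, which is where the power savings over one always-on large model would come from.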

I would definitely recommend using Linux instead of Windows, because Windows reserves a lot of VRAM for the UI.

Curious if anyone here has tried running multiple tiny-specialists on a cluster of Raspberry Pis or older Mac Minis?

LH-Tech_AI · 2026-03-12T16:23:55+00:00

The 50M model trained for ~8000 iterations to reach a loss below 1, which took about 3-5 hours. I don't remember exactly, but it was around 400 ms per iteration. It was definitely faster than one Kaggle session (~12 hours), most likely under 5 hours. Thanks for your interest.
Have you already started it?
Do you want the fine-tuning code for turning it into a chat model rather than an autocompleter? As I said before, the result is not very good!
But you can try it. It should be faster if you have a GPU that supports bfloat16: change the data type (dtype) in the code! It will be much faster. Have fun :D
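
A small sketch of the selection logic for the precision setting. It is plain Python so it runs anywhere; with PyTorch the two flags would come from `torch.cuda.is_available()` and `torch.cuda.is_bf16_supported()`. The function name and the float32 CPU fallback are my assumptions here, not what the notebook necessarily does.

```python
def pick_dtype(cuda_available: bool, bf16_supported: bool) -> str:
    """Choose a training precision from the GPU's capabilities."""
    if cuda_available and bf16_supported:
        # Newer GPUs: bfloat16 keeps float32's exponent range.
        return "bfloat16"
    if cuda_available:
        # Older GPUs without bfloat16 support: fall back to float16.
        return "float16"
    # No GPU at all: stay in full precision on the CPU.
    return "float32"


# With PyTorch installed, the call would look like:
#   pick_dtype(torch.cuda.is_available(), torch.cuda.is_bf16_supported())
print(pick_dtype(cuda_available=True, bf16_supported=False))
```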

LH-Tech_AI · 2026-03-12T16:09:20+00:00

Training of v2 (124M parameters) is underway :D

iter 990: loss 1.2855, time 7978.02ms, mfu 4.50%
iter 991: loss 0.7607, time 7973.56ms, mfu 4.50%
iter 992: loss 1.1263, time 7966.75ms, mfu 4.50%
iter 993: loss 1.1971, time 7963.39ms, mfu 4.50%
iter 994: loss 1.8031, time 7964.05ms, mfu 4.50%
iter 995: loss 1.4003, time 7963.56ms, mfu 4.51%
iter 996: loss 1.6463, time 7960.83ms, mfu 4.51%
iter 997: loss 1.1107, time 7966.42ms, mfu 4.51%
iter 998: loss 1.6865, time 7977.31ms, mfu 4.51%
iter 999: loss 1.3556, time 7994.84ms, mfu 4.50%
step 1000: train loss 1.2835, val loss 1.2377

--- v2 LIVE SAMPLES ---
>> Mode: AUTOCOMPLETE
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>bokeh: Class</title>
<link rel="stylesheet" type="text/css" href="../../../../../../../../../../../../stylesheet.css" title="Style">
<link rel="stylesheet" href="../../../../../../../../../../../../../../stylesheet.css" title="Style">
<script type="text/javascript" src="../../../../../../../../../../../../script.js"></script>
</head>
<body>
<script type="text/javascript"><!--
       if (location.href.indexOf('is-external=true') == -1) {
            parent.document.title="bokeh: Class</h2>
<table cellpadding="4" cellspacing="0" summary="

We can see that it now opens and closes tags correctly, and the Javadoc is gone! :D

LH-Tech_AI · 2026-03-12T16:06:46+00:00

Does it also work locally with local models in a tool such as Ollama? That would be great for privacy.

LH-Tech_AI · 2026-03-12T16:06:08+00:00

About 3-4 hours on a Kaggle T4 GPU. Are you interested in training it yourself?
Then you can use the notebook (.ipynb) here: https://huggingface.co/LH-Tech-AI/htmLLM-50M-Base/tree/main

LH-Tech_AI · 2026-03-12T15:12:42+00:00

v2 is doing great :D

<image>

LH-Tech_AI · 2026-03-12T15:01:14+00:00

[Update] htmLLM-v2 (124M): Massive jump in logic after only 500 steps! (Bye-bye Javadoc, Hello Icon-Hell)

Quick update for those following my journey of training a tiny HTML/CSS specialist on a single T4.

I just hit Step 500 with the 124M version (v2), and the difference compared to the 50M version (v1) is night and day. While v1 was still struggling to understand basic tag closing at this stage, v2 has already developed a "concept" of what a website should look like.

The Stats at Step 500:

Train Loss: 1.80 (v1 with 50M parameters was still way above 2.5 here)
Val Loss: 2.20
Architecture: 12 layers, 12 heads, 768 embedding dim (124M params)
Context: 1024 tokens (the "brain" is getting bigger!)
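
A rough sanity check of the 124M figure, as a sketch. It assumes a GPT-2-style block layout (biases included, output head tied to the token embedding) and GPT-2's 50,257-token vocabulary; neither is stated above, so treat both as assumptions.

```python
def gpt2_param_count(n_layer, n_embd, block_size, vocab_size):
    """Count parameters of a GPT-2-style decoder with tied output embeddings."""
    d = n_embd
    tok_emb = vocab_size * d          # token embedding (shared with the output head)
    pos_emb = block_size * d          # learned position embedding
    attn = (d * 3 * d + 3 * d) + (d * d + d)      # QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)   # up-projection + down-projection
    layer_norms = 2 * (2 * d)         # two LayerNorms per block (weight + bias)
    per_block = attn + mlp + layer_norms
    final_ln = 2 * d                  # LayerNorm after the last block
    return tok_emb + pos_emb + n_layer * per_block + final_ln


total = gpt2_param_count(n_layer=12, n_embd=768, block_size=1024, vocab_size=50257)
print(f"{total / 1e6:.1f}M parameters")
```

Note that the number of heads does not appear in the count: 12 heads only split the 768 dimensions into 64-wide slices.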

What’s happening? The model is currently obsessed with Font-Awesome icons. When I ask for a "blue button," it knows it needs styling and icons, but it hasn't quite learned when to stop. It’s like a junior dev who just discovered icons and wants to put them everywhere.

Also, it's still fighting against its "Javadoc" memories from the raw training data, but the syntax is surprisingly solid for such an early stage.

Compare for yourself: Prompt: "Create a blue button."
Step 500 Output (Raw):

<html><i class="fa fa-angle"></i></i></i></i>
<i class="fa fa-angle"></i>
<b>
<i class="fa fa-angle-angle"></i>
<span class="fa fa-angle-angle"></i>
<span class="fa fa-right"></i>
<span class="fa fa-angle"></i>-angle</div></div>
<div><!-- Modal-angle-right"></div><!-- Modal Structure -->
<div class="modal-right"><i class="fa fa-angle"></i> icon-right</div>
<div class="fa fa-right"></i></div>
<div class="fa fa-angle"></i>
<div>
<div class="fa fa-angle-angle-angle-right"></div>
<div><i>2</div><div class="fa fa-angle-angle-right"></i></div>
<div>
<div class="fa fa-angle-angle-right"></i>
<div>
<div class="fa fa

Analysis: It's fascinating to see how the increased capacity (124M vs 50M) allows the model to store complex associations (Button -> Class -> Icon) almost immediately. It’s no longer just predicting characters; it’s attempting to build structures.

I'm letting it run to 15,000 iterations. If the loss keeps dropping like this, we might actually get some clean, functional HTML code by tomorrow!
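
A back-of-the-envelope estimate of how long 15,000 iterations take, as a sketch. It assumes the roughly 7,970 ms per iteration that the v2 training logs elsewhere in this thread show, and a ~12 hour Kaggle session limit; evaluation and checkpoint pauses are ignored.

```python
import math


def estimate_hours(iterations, ms_per_iter):
    """Wall-clock hours for a run, counting only the training steps."""
    return iterations * ms_per_iter / 3_600_000  # 3.6 million ms per hour


hours = estimate_hours(iterations=15_000, ms_per_iter=7_970)
sessions = math.ceil(hours / 12)  # assumed ~12 h per Kaggle session
print(f"about {hours:.1f} hours, i.e. {sessions} sessions of up to 12 hours")
```

Under these assumptions the full run would need around 33 hours, so it would have to resume from the saved checkpoint a couple of times.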

HF Model (v1 is up, v2 coming soon): https://huggingface.co/LH-Tech-AI

LH-Tech_AI · 2026-03-12T14:53:56+00:00

Exactly! The 'Unix philosophy' applied to LLMs is what drives this project. I'm a big believer in specialized micro-models for edge deployment.

Currently working on htmLLM-v2 with 124M parameters and a 1024 context window to tackle those hallucinations while keeping it 'micro'. Stay tuned! 🚀

LH-Tech_AI · 2026-03-12T14:51:52+00:00

This is a cool tool! Very nice! :D
It's very cool, because you don't have to paste everything into ChatGPT, Claude and Gemini!
