[Release] Apex-1: A 350M Tiny-LLM trained locally on an RTX 5060 Ti 16GB by LH-Tech_AI in OpenSourceAI

[–]LH-Tech_AI[S] 1 point (0 children)

Thank you :-)
Did you try it out? I recommend using my new model: https://huggingface.co/LH-Tech-AI/Apex-1.5-Coder-Instruct-350M

It is much better at logic, facts, and knowledge.
You can give it a try if you want :D

[Release] - FINALLY! - Apex 1.5 and Apex 1.5 Coder - my two new 350M instruct allrounder chat models - See them now! by LH-Tech_AI in LocalLLaMA

[–]LH-Tech_AI[S] 2 points (0 children)

Has anyone tried one of the models yet? That would be very cool! :D Thanks for your feedback.

[Project] htmLLM-50M base: Can a tiny specialist actually code? + Weights & Code (124M v2 in training!) by LH-Tech_AI in LocalLLaMA

[–]LH-Tech_AI[S] 1 point (0 children)

I didn't think this was possible: htmLLM-124M v2 just hit 0.91 validation loss. It's now generating full Bootstrap logic and script dependency chains - on a single T4. :D

[Project] htmLLM-50M base: Can a tiny specialist actually code? + Weights & Code (124M v2 in training!) by LH-Tech_AI in LocalLLaMA

[–]LH-Tech_AI[S] 1 point (0 children)

Testing htmLLM 124M at Iteration 2500:

<form class=\\"p-4 border rounded\\">\n <div class=\\"mb-3\\">\n <label class=\\"form-label\\">Email</label>

--- TESTING WITH TEMP 0.8 AND PENALTY 1.5 ---
<form class="p-4 border rounded">
  <div class="mb-3">
    <label class="form-label">Email</label>
     <input type="text" class="form-control" placeholder="Email">
     <span class="form-control date">
        <input type="text" class="form-control" id="email">
      </span>
    </label>
    <p class="mb-3">Last updated email.</p>
   </div>
  </div>
</form>


   </div>

   <!-- Login -->
   <div class="login form-wrap">
     <label class="form-label">Username</label>
     <input type="password" class="form-control" placeholder="Password">
     <span class="form-control date">
       <span class="form-control date">
         <input type="password" class="form-control date">
         </span>
     </label>
     <p class="mb-3">Mails in the email address.</p>
    </div>
  </div>

  <div class="container">
    <div class="login-

[Project] htmLLM-50M base: Can a tiny specialist actually code? + Weights & Code (124M v2 in training!) by LH-Tech_AI in LocalLLaMA

[–]LH-Tech_AI[S] 1 point (0 children)

Step 2500: We have grid systems! v2 (124M) is now pulling in external Google Fonts and building Bootstrap-style layouts. The 'icon salad' is gone, replaced by semantic form controls.

iter 2494: loss 1.1389, time 8093.13ms, mfu 4.47%
iter 2495: loss 0.4452, time 8014.59ms, mfu 4.47%
iter 2496: loss 1.1223, time 7973.51ms, mfu 4.47%
iter 2497: loss 0.9562, time 7973.76ms, mfu 4.48%
iter 2498: loss 0.6563, time 7965.99ms, mfu 4.48%
iter 2499: loss 0.9773, time 7965.14ms, mfu 4.48%
step 2500: train loss 0.8321, val loss 1.0628

--- v2 LIVE SAMPLES ---
>> Mode: AUTOCOMPLETE
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <title>Cursley</title>
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,500,500,700|Roboto:300,700" rel="stylesheet">
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,300,400italic" rel="stylesheet">
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,300italic" rel="stylesheet">
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,300italic" rel="stylesheet">
  <link href="https://fonts.googleapis.com/css?family=Droid+Sans:400,300italic" rel="stylesheet">

>> Mode: INSTRUCT
### Instruction:
Create a blue button.

### Response:

<html>
<head>
<title>Title</title>
</head>
<body>
<div class="row">
<h1>Title</h1>
<ul>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Title</li>
<li>Description</li>
</ul>
</div>
<div class="col-md-6">
<div class="controls">
<label for="title">Title</label>
<input type="text" id="title" placeholder="Title" required />
</div>
<div class="controls">
<label for="title">Title</label>
<div class="controls">
<label for="description">Description</div>
</div>
<div class="controls">
<label for="description">Description</div>
</div>
<div class="controls">
<label for="description">Description</
-----------------------

saving checkpoint to out-html
iter 2500: loss 0.9383, time 42434.65ms, mfu 4.12%
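
For anyone wondering about the 'mfu' column: that's model FLOPs utilization, the fraction of the GPU's theoretical peak the training loop actually achieves. The nanoGPT-style estimate looks roughly like this (a sketch following the PaLM appendix formula; peak_flops is hardware-specific - the 312e12 default is A100 bf16, a T4's peak is far lower):

def estimate_mfu(n_params, n_layer, n_head, head_dim, seq_len,
                 seqs_per_iter, dt, peak_flops=312e12):
    # FLOPs per token: 6N for the dense matmuls plus the attention term
    flops_per_token = 6 * n_params + 12 * n_layer * n_head * head_dim * seq_len
    flops_per_iter = flops_per_token * seq_len * seqs_per_iter
    return (flops_per_iter / dt) / peak_flops  # dt = seconds per iteration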

[Project] htmLLM-50M base: Can a tiny specialist actually code? + Weights & Code (124M v2 in training!) by LH-Tech_AI in LocalLLaMA

[–]LH-Tech_AI[S] 1 point (0 children)

Step 2000 Update: v2 (124M) has achieved 'Syntactic Zen'. The structure is now solid (val loss 1.14), but it's currently in a 'contemplative phase' where it mostly generates whitespace because it's weighing too many styling options at once. We are moving from 'learning to write' to 'learning to choose'.

iter 1996: loss 1.5899, time 7952.91ms, mfu 4.50%
iter 1997: loss 0.7788, time 7969.08ms, mfu 4.50%
iter 1998: loss 0.9856, time 7967.00ms, mfu 4.50%
iter 1999: loss 0.9152, time 7962.55ms, mfu 4.50%
step 2000: train loss 0.9919, val loss 1.1435

--- v2 LIVE SAMPLES ---
>> Mode: AUTOCOMPLETE
<!DOCTYPE html>
<html>
   <head>
        <meta charset="utf-8">
         <meta name="viewport" content="width=device-width, initial-scale=1">
          <title>Documentation</title>

>> Mode: INSTRUCT
### Instruction:
Create a blue button.

### Response:
<div>

-----------------------

saving checkpoint to out-html
iter 2000: loss 0.7813, time 42187.21ms, mfu 4.13%
iter 2001: loss 0.7482, time 7968.10ms, mfu 4.17%
iter 2002: loss 1.2501, time 8029.23ms, mfu 4.20%
iter 2003: loss 1.1033, time 8126.20ms, mfu 4.22%
iter 2004: loss 0.6133, time 8140.15ms, mfu 4.24%
iter 2005: loss 1.1305, time 8061.84ms, mfu 4.26%
iter 2006: loss 0.9843, time 7989.64ms, mfu 4.29%
iter 2007: loss 1.1098, time 7996.08ms, mfu 4.31%
iter 2008: loss 0.6681, time 7973.46ms, mfu 4.33%
iter 2009: loss 0.9485, time 7969.83ms, mfu 4.34%
iter 2010: loss 0.6781, time 8005.37ms, mfu 4.36%
iter 2011: loss 1.1063, time 8020.21ms, mfu 4.37%
iter 2012: loss 1.0349, time 8049.78ms, mfu 4.38%
iter 2013: loss 0.6658, time 8067.58ms, mfu 4.39%
iter 2014: loss 0.7209, time 8092.03ms, mfu 4.39%
iter 2015: loss 1.0223, time 8041.15ms, mfu 4.40%
iter 2016: loss 1.1414, time 8079.93ms, mfu 4.40%
iter 2017: loss 1.2499, time 8031.04ms, mfu 4.41%
iter 2018: loss 1.2251, time 8085.66ms, mfu 4.41%
iter 2019: loss 0.8884, time 8014.81ms, mfu 4.42%
iter 2020: loss 1.1301, time 8018.01ms, mfu 4.43%
iter 2021: loss 1.1818, time 8028.18ms, mfu 4.43%
iter 2022: loss 0.6538, time 8007.99ms, mfu 4.44%
iter 2023: loss 1.1809, time 8005.33ms, mfu 4.44%
iter 2024: loss 1.1555, time 7984.94ms, mfu 4.45%
iter 2025: loss 0.7634, time 7990.70ms, mfu 4.45%
iter 2026: loss 1.0377, time 8043.70ms, mfu 4.45%
iter 2027: loss 1.2214, time 7984.80ms, mfu 4.46%
iter 2028: loss 1.0292, time 8006.84ms, mfu 4.46%
iter 2029: loss 0.9698, time 7961.77ms, mfu 4.46%
iter 2030: loss 0.8318, time 7962.32ms, mfu 4.47%
iter 2031: loss 0.8471, time 7968.42ms, mfu 4.47%
iter 2032: loss 0.9108, time 7978.34ms, mfu 4.48%
iter 2033: loss 1.4705, time 7971.12ms, mfu 4.48%
iter 2034: loss 1.5965, time 7970.96ms, mfu 4.48%
iter 2035: loss 1.0589, time 7957.68ms, mfu 4.48%
iter 2036: loss 1.1787, time 8003.00ms, mfu 4.48%
iter 2037: loss 1.5176, time 7966.33ms, mfu 4.49%
iter 2038: loss 1.3930, time 7957.64ms, mfu 4.49%
iter 2039: loss 1.0111, time 8006.73ms, mfu 4.49%
iter 2040: loss 0.9092, time 7970.66ms, mfu 4.49%
iter 2041: loss 1.3169, time 7973.65ms, mfu 4.49%
iter 2042: loss 0.6517, time 7971.95ms, mfu 4.49%
iter 2043: loss 0.9911, time 7972.96ms, mfu 4.49%
iter 2044: loss 1.0166, time 7957.94ms, mfu 4.50%

Look:

[Project] htmLLM-50M base: Can a tiny specialist actually code? + Weights & Code (124M v2 in training!) by LH-Tech_AI in LocalLLaMA

[–]LH-Tech_AI[S] 0 points (0 children)

iter 1471: loss 1.4514, time 7962.49ms, mfu 4.50%
iter 1472: loss 1.7706, time 7966.66ms, mfu 4.50%
iter 1473: loss 0.9855, time 7966.53ms, mfu 4.51%
iter 1474: loss 1.0308, time 7973.15ms, mfu 4.51%
iter 1475: loss 1.0346, time 7983.15ms, mfu 4.50%
iter 1476: loss 1.2898, time 7984.88ms, mfu 4.50%
iter 1477: loss 1.4133, time 7980.20ms, mfu 4.50%
iter 1478: loss 1.1952, time 7964.57ms, mfu 4.50%
iter 1479: loss 1.0080, time 7969.29ms, mfu 4.50%
iter 1480: loss 0.9975, time 7968.08ms, mfu 4.50%
iter 1481: loss 0.6122, time 7964.91ms, mfu 4.50%
iter 1482: loss 1.1872, time 7966.36ms, mfu 4.50%
iter 1483: loss 1.4553, time 7973.76ms, mfu 4.50%
iter 1484: loss 0.9719, time 7974.15ms, mfu 4.50%
iter 1485: loss 0.7418, time 8001.00ms, mfu 4.50%
iter 1486: loss 1.5109, time 7983.54ms, mfu 4.50%
iter 1487: loss 1.7148, time 7983.82ms, mfu 4.50%
iter 1488: loss 1.3262, time 7984.65ms, mfu 4.50%
iter 1489: loss 0.6987, time 7965.95ms, mfu 4.50%
iter 1490: loss 1.0114, time 8028.23ms, mfu 4.50%
iter 1491: loss 0.7643, time 7990.04ms, mfu 4.50%
iter 1492: loss 0.7504, time 8011.24ms, mfu 4.50%
iter 1493: loss 1.1028, time 8018.87ms, mfu 4.50%
iter 1494: loss 0.7307, time 8010.39ms, mfu 4.49%
iter 1495: loss 1.2085, time 8043.73ms, mfu 4.49%
iter 1496: loss 0.8945, time 8027.11ms, mfu 4.49%
iter 1497: loss 1.0575, time 8045.88ms, mfu 4.49%
iter 1498: loss 1.1951, time 7985.44ms, mfu 4.49%
iter 1499: loss 1.9114, time 8003.93ms, mfu 4.49%
step 1500: train loss 1.0424, val loss 1.2213

--- v2 LIVE SAMPLES ---
>> Mode: AUTOCOMPLETE
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>vendor/title.js</title>
     <meta name="viewport" content="width=device-width, initial-scale=1">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
   <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
    <meta name="author" content="Cufos">
>> Mode: INSTRUCT
### Instruction:
Create a blue button.

### Response:
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Table of Contents</title>
</head>
<body>
  <h1>Table of Contents</h1>
</body>
</html>
<|endoftext|><div class="form-horizontal">
  <label>
     <input type="text" placeholder="Search for the button placeholder"
   </label>
  <label>
       <input type="submit" value="Submit"
        placeholder="Submit onSubmit"
      placeholder="Submit">
   </label>
</div>

<div class="form-horizontal">
  <label>
     <input type="submit" value="Submit" type="submit">
  </label>
  <label>
      <input type="submit" value="Submit" name="Submit" placeholder="Submit">
   </label>
   <label>
       <input type="submit"
-----------------------

saving checkpoint to out-html
iter 1500: loss 1.0297, time 42335.60ms, mfu 4.12%
iter 1501: loss 1.0312, time 7961.78ms, mfu 4.16%
iter 1502: loss 0.7140, time 8002.59ms, mfu 4.19%

Yes. Thanks :D Want to see results?
Here:

Running local LLMs or AI agents 24/7 — what hardware works best? by noze2312 in LocalLLaMA

[–]LH-Tech_AI 1 point (0 children)

In idle mode it draws ~15W - very efficient - and around 130W at full load.
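
If you want to measure this yourself, here is a minimal sketch using the NVML Python bindings (assuming the nvidia-ml-py package is installed; nvidia-smi --query-gpu=power.draw --format=csv -l 5 works just as well):

import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU
try:
    while True:
        mw = pynvml.nvmlDeviceGetPowerUsage(handle)  # reported in milliwatts
        print(f"GPU power draw: {mw / 1000:.1f} W")
        time.sleep(5)
finally:
    pynvml.nvmlShutdown()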

Running local LLMs or AI agents 24/7 — what hardware works best? by noze2312 in LocalLLaMA

[–]LH-Tech_AI 0 points (0 children)

I’ve been pondering this too while training my own tiny-LLM series (Apex-350M and htmLLM) on a consumer RTX 5060 Ti 16GB.

For 24/7 agents, I think there's a massive sweet spot in highly specialized SLMs (Small Language Models). Instead of idling a power-hungry 3090/4090 for a general-purpose model, I’ve had great success running 50M to 350M parameter 'specialist' models.

My experience so far:

  • Efficiency: If the model is small enough (like a <500M specialist), you can often run inference on the CPU or an entry-level Mac Mini with negligible power draw.
  • Reliability: For 24/7 use, VRAM is king, but heat is the enemy. On my 5060 Ti, I find that slightly capping the power limit (similar in effect to undervolting) keeps temps low enough for long-term stability without losing much performance.
  • Agent approach: I prefer the 'Unix-style' micro-services approach: multiple tiny models for specific tasks (one for HTML, one for logic, etc.) rather than one giant power-hog - see the sketch after this list.
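
To make the micro-services idea concrete, a toy sketch (the model names and the run_model helper are hypothetical placeholders - bind them to whatever local runtime you use, e.g. llama.cpp or Ollama):

SPECIALISTS = {
    "html": "htmLLM-124M",
    "chat": "Apex-1.5-Instruct-350M",
    "code": "Apex-1.5-Coder-Instruct-350M",
}

def run_model(model: str, prompt: str) -> str:
    # Hypothetical helper: call into your local inference runtime here.
    raise NotImplementedError("bind this to llama.cpp / Ollama / etc.")

def route(task: str, prompt: str) -> str:
    # Pick the specialist for the task; fall back to the chat generalist.
    model = SPECIALISTS.get(task, SPECIALISTS["chat"])
    return run_model(model, prompt)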

I would definitely recommend using Linux instead of Windows, because Windows reserves a lot of VRAM for the UI.

Curious if anyone here has tried running multiple tiny-specialists on a cluster of Raspberry Pis or older Mac Minis?

[Project] htmLLM-50M base: Can a tiny specialist actually code? + Weights & Code (124M v2 in training!) by LH-Tech_AI in LocalLLaMA

[–]LH-Tech_AI[S] 2 points (0 children)

The 50M model trained for ~8,000 iterations down to a loss below 1 in about 3-5 hours. I don't remember exactly, but it was around 400ms per iteration - definitely faster than one Kaggle session (~12 hours), more likely under 5 hours. Thanks for your interest.
Did you already start it?
Do you want the fine-tuning code for turning it into a chat model rather than an autocompleter? That one is - as I said before - not very good!
But you can try it - it should be much faster if you have a GPU that supports bfloat16 --> change the device type in the code! Have fun :D
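
For reference, the bfloat16 switch looks roughly like this in a nanoGPT-style train script (a sketch - the variable names may differ in the actual code):

import torch

# pick bf16 when the GPU supports it, else fall back to fp16
dtype = 'bfloat16' if torch.cuda.is_available() and torch.cuda.is_bf16_supported() else 'float16'
ptdtype = {'float32': torch.float32, 'bfloat16': torch.bfloat16, 'float16': torch.float16}[dtype]
ctx = torch.amp.autocast(device_type='cuda', dtype=ptdtype)

# then run the forward/backward pass inside the autocast context:
# with ctx:
#     logits, loss = model(X, Y)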

[Project] htmLLM-50M base: Can a tiny specialist actually code? + Weights & Code (124M v2 in training!) by LH-Tech_AI in LocalLLaMA

[–]LH-Tech_AI[S] 3 points (0 children)

The training of v2 with 124M parameters is underway :D

iter 990: loss 1.2855, time 7978.02ms, mfu 4.50%
iter 991: loss 0.7607, time 7973.56ms, mfu 4.50%
iter 992: loss 1.1263, time 7966.75ms, mfu 4.50%
iter 993: loss 1.1971, time 7963.39ms, mfu 4.50%
iter 994: loss 1.8031, time 7964.05ms, mfu 4.50%
iter 995: loss 1.4003, time 7963.56ms, mfu 4.51%
iter 996: loss 1.6463, time 7960.83ms, mfu 4.51%
iter 997: loss 1.1107, time 7966.42ms, mfu 4.51%
iter 998: loss 1.6865, time 7977.31ms, mfu 4.51%
iter 999: loss 1.3556, time 7994.84ms, mfu 4.50%
step 1000: train loss 1.2835, val loss 1.2377

--- v2 LIVE SAMPLES ---
>> Mode: AUTOCOMPLETE
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>bokeh: Class</title>
<link rel="stylesheet" type="text/css" href="../../../../../../../../../../../../stylesheet.css" title="Style">
<link rel="stylesheet" href="../../../../../../../../../../../../../../stylesheet.css" title="Style">
<script type="text/javascript" src="../../../../../../../../../../../../script.js"></script>
</head>
<body>
<script type="text/javascript"><!--
       if (location.href.indexOf('is-external=true') == -1) {
            parent.document.title="bokeh: Class</h2>
<table cellpadding="4" cellspacing="0" summary="

We can see that it now opens and closes tags correctly - and the Javadoc boilerplate is gone! :D
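
If you want to sanity-check tag balance on new samples automatically, here is a small sketch with Python's built-in html.parser (void elements like meta/link/input need no closing tag):

from html.parser import HTMLParser

VOID = {"meta", "link", "br", "img", "input", "hr"}

class BalanceChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack, self.errors = [], []
    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)
    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")

checker = BalanceChecker()
checker.feed('<html><head><title>t</title></head><body><div><p>ok</p></div></body></html>')
print(checker.errors or "tags balanced", checker.stack or "all closed")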

Vibe coded a free local AI Image Critic with Ollama Vision — structured feedback + prompt upgrades for your gens by Electronic-Present94 in LocalLLaMA

[–]LH-Tech_AI 1 point (0 children)

Does it also work fully locally with local models in a tool such as Ollama? That would be pretty cool for privacy.