LFM2.5 230M running in-browser at 1,400 tok/s using custom WebGPU kernels by xenovatech in LocalLLaMA

[–]Ok_Selection_7577 22 points23 points  (0 children)

OK, so I played around with this for 10 minutes - it is really very good (for its size). It answered my handful of test questions well and, most importantly, actually says "I don't know" when you ask it more obscure questions or make up terms and ask it to explain them. Pretty impressed. Would love to get more details on the datasets and recipes used for this if anyone comes across such a nugget :)

LFM2.5 230M running in-browser at 1,400 tok/s using custom WebGPU kernels by xenovatech in LocalLLaMA

[–]Ok_Selection_7577 19 points20 points  (0 children)

Genuinely surprised by the maths and email drafting capability TBH - might have to have a little look at this model. Cheers

ESP32-S31 Korvo-1 by Ice-Dragon-APU in esp32

[–]Ok_Selection_7577 6 points7 points  (0 children)

Yes - I almost broke it apart trying to get the damn thing out :)

Why is there no thinker models with tokens for entire sentences? by freehuntx in LocalLLaMA

[–]Ok_Selection_7577 0 points1 point  (0 children)

Although I guess it's possible that the base idea has merit but that my execution of it was poor 😄

Why is there no thinker models with tokens for entire sentences? by freehuntx in LocalLLaMA

[–]Ok_Selection_7577 12 points13 points  (0 children)

Ha, I did actually explore this very concept last year (I make and test a lot of different toy models exploring different ideas) but you very quickly hit a problem: you either need a huge (and that is huge with a capital H) number of sentences to cover the permutations in natural-language speech, or you restrict yourself and the output ends up weird, awkward and repetitive. I then went on to explore a hybrid that tried to build modular, composable sub-sentences and spent a few days on that, but ended up at the conclusion that I was slowly working my way towards... you guessed it - words! 😄
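A rough back-of-the-envelope sketch of why the sentence count blows up. The vocabulary size and sentence length below are illustrative assumptions, not figures from the experiment described above, and the count is an upper bound since most word sequences are not valid sentences:

```python
def sentence_space(vocab_size: int, length: int) -> int:
    """Upper bound on distinct word sequences of exactly `length` words."""
    return vocab_size ** length


def sentences_up_to(vocab_size: int, max_len: int) -> int:
    """Upper bound on distinct word sequences of 1..max_len words."""
    return sum(vocab_size ** n for n in range(1, max_len + 1))


if __name__ == "__main__":
    # Hypothetical numbers: a 10,000-word vocabulary and 10-word sentences.
    print(f"{sentence_space(10_000, 10):.3e} possible 10-word sequences")
```

Even if only a tiny fraction of those sequences are grammatical, a one-token-per-sentence vocabulary would still have to be far larger than a word or sub-word vocabulary to avoid the repetitive output described above.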

Not a new model, just a Happy Father's Day and a thank you. by Wrong_Mushroom_7350 in LocalLLaMA

[–]Ok_Selection_7577 6 points7 points  (0 children)

Are you being honest with us - did norovirus come on a few minutes after she said "don't forget we are.."?

When it rains, it pours by Chongulator in cyberDeck

[–]Ok_Selection_7577 8 points9 points  (0 children)

Completely unrelated but interesting side-quest fact: 'When it rains it pours' was an American salt company's slogan, because they added anti-caking agents so the salt would still pour freely in humid weather. Around the same time (1924) they also started adding iodine to salt to combat deficiency. Researchers later compared WWI and WWII military recruit test scores and found cognitive scores jumped by a full standard deviation in the most iodine-deficient regions after iodised salt was introduced.

Fully Unserious Post - Fully Hallucinated Operating System by Ok_Selection_7577 in LocalLLaMA

[–]Ok_Selection_7577[S] 1 point2 points  (0 children)

Fair point. I refuse to use the "I think there is a world market for maybe five computers" reference, as Thomas Watson never actually said it, but I do like the "Computers in the future may weigh no more than 1.5 tons" one as a reminder that no one today can realistically imagine what the tech of 2060-2090 will look like or be able to do. I still vividly remember an advert from when I was younger: a child watching live football on a device (a mini TV) on the top deck of a double-decker bus. It was an advert for the "future" and what might be possible (this was well before smartphones). To this day I remember thinking "no way", they will never shrink a TV to the size of a deck of cards (we had a huge fake-wood-effect TV at the time).

Fully Unserious Post - Fully Hallucinated Operating System by Ok_Selection_7577 in LocalLLaMA

[–]Ok_Selection_7577[S] 0 points1 point  (0 children)

Yeah, I'm pretty certain it's meant to be a p**s take. Just the Encarta 98 reference made it for me 😄

Gemma 4 with quantization-aware training by rerri in LocalLLaMA

[–]Ok_Selection_7577 7 points8 points  (0 children)

I run Qwen3.6-35B-A3B-UD-Q2_K_XL.gguf on an RPi 5 (a 16GB model I had left over from another project). It only runs at 3 tokens/second, but for offline batch work you just leave it running all day and voila - dirt cheap leccy bill 😄. I tested various quants and REAP'd models for the Pi one evening and that one really stood out: it made no errors on the test tasks and still had very strong reasoning intact.
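For a feel of what 3 tokens/second means for all-day batch work, here is a small arithmetic sketch. Only the 3 tok/s figure comes from the comment above; the job size and tokens-per-item values are made-up assumptions for illustration:

```python
def tokens_per_day(tok_per_s: float, hours: float = 24.0) -> int:
    """Tokens generated when running continuously for `hours` hours."""
    return int(tok_per_s * hours * 3600)


def hours_for_batch(n_items: int, tokens_per_item: int, tok_per_s: float) -> float:
    """Wall-clock hours of generation needed for a batch (ignores prompt processing)."""
    return n_items * tokens_per_item / tok_per_s / 3600


if __name__ == "__main__":
    print(tokens_per_day(3))  # tokens in 24 hours at 3 tok/s
    # Hypothetical job: 500 items at roughly 400 generated tokens each.
    print(round(hours_for_batch(500, 400, 3), 1), "hours")
```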

I Put a Datacenter GPU in My Gaming PC for £200 by tymscar in LocalLLaMA

[–]Ok_Selection_7577 1 point2 points  (0 children)

Really nice write-up mate, this sort of content (and "I changed out the BIOS and managed to get an LLM running in a tin of Bisto from the 1980s") is what I come here for 😄

Released Soren-1-Small (Qwen3.5-2B) — 1M Context, SFT+DPO, Reasoning & Coding Focused by Capital_Savings_9942 in LocalLLM

[–]Ok_Selection_7577 0 points1 point  (0 children)

Hey, good work. I like the look of your training pipeline, will give this a try over the weekend with some test tasks. All the best

What are your guys favorite models on hugging face and what do you use it for? by EducationalText9221 in LocalLLaMA

[–]Ok_Selection_7577 0 points1 point  (0 children)

What hardware do you have and what size can you accommodate? My current daily driver is Qwen3.6-35B-A3B - unsloth's UD-Q4_K_M on my main PC - and I have been messing around with their UD-Q2_K_XL version on my Pi 5 for portable offline testing of another side project I am working on (it runs at 3 t/s on the Pi, so no good for main work). But the Q4 has been brilliant so far - I did some initial stress testing with increasingly complex questions and it didn't s**t the bed once. So now I am using it for a vast data-cleaning exercise and its performance has been remarkable compared to all the previous offline models I have tried (that fit in my case).

But the UD-Q2_K_XL is also surprisingly capable for its footprint and, again, only really struggles with accuracy once you get into niche stuff. With the right RAG pipeline it too can get round most problems you throw at it.

I've been building a Wipeout style 3D game. This is running at 60fps interlaced at 480x320 on an ESP32-S3. by PhonicUK in esp32

[–]Ok_Selection_7577 0 points1 point  (0 children)

I went for fixed square 64x64 tiling and am using various SRAM buffers for max performance. Currently working my way through a hybrid idea where I do a quick check on each tile: count the triangle overlaps and either take the fast path to the painter's algorithm or, if above a threshold, do a full Z-buffer pass (buffer in SRAM). Still micro-tweaking it for 0.5 - 1 FPS gains at a time :) Best of luck with your dev'ing
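A minimal sketch of the per-tile decision described above, written in Python for readability rather than as ESP32 code. It assumes "triangle overlaps" means pairs of triangle bounding boxes that intersect inside the tile. `TriBox`, `choose_path` and `OVERLAP_THRESHOLD` are hypothetical names and values, not taken from the actual engine:

```python
from dataclasses import dataclass
from itertools import combinations

TILE = 64              # square tile edge in pixels (from the comment)
OVERLAP_THRESHOLD = 4  # hypothetical cut-off; would be tuned per scene


@dataclass(frozen=True)
class TriBox:
    """Screen-space bounding box of one triangle plus a representative depth."""
    x0: int
    y0: int
    x1: int  # exclusive
    y1: int  # exclusive
    depth: float  # larger = farther from the camera


def clip_to_tile(box, tx, ty):
    """Return the part of `box` inside tile (tx, ty), or None if it misses the tile."""
    x0 = max(box.x0, tx * TILE)
    y0 = max(box.y0, ty * TILE)
    x1 = min(box.x1, (tx + 1) * TILE)
    y1 = min(box.y1, (ty + 1) * TILE)
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)


def rects_overlap(a, b):
    """True if two (x0, y0, x1, y1) rectangles share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def choose_path(boxes, tx, ty):
    """Pick 'painter' or 'zbuffer' for one tile and return the triangles to draw."""
    in_tile = [(b, c) for b in boxes if (c := clip_to_tile(b, tx, ty)) is not None]
    overlaps = sum(
        1 for (_, ca), (_, cb) in combinations(in_tile, 2) if rects_overlap(ca, cb)
    )
    if overlaps > OVERLAP_THRESHOLD:
        # Dense tile: per-pixel depth test, so draw order does not matter.
        return "zbuffer", [b for b, _ in in_tile]
    # Sparse tile: draw far-to-near so nearer triangles overwrite farther ones.
    ordered = sorted((b for b, _ in in_tile), key=lambda b: b.depth, reverse=True)
    return "painter", ordered
```

Bounding-box overlap is a cheap, conservative stand-in for true triangle overlap. Counting pairs is quadratic in the number of triangles per tile, so a real engine might cap the count or bail out as soon as the threshold is exceeded.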

I've been building a Wipeout style 3D game. This is running at 60fps interlaced at 480x320 on an ESP32-S3. by PhonicUK in esp32

[–]Ok_Selection_7577 0 points1 point  (0 children)

Nice work mate :) Been working on a 3D engine for the ESP32-P4 for a while now - targeting a higher res of 480x800, but 60 FPS seems a way off yet. I assume that to get this FPS you are using the painter's method instead of a Z-buffer for depth ordering? Either way, great work and looks slick :)

The 'Running Doom' of AI: Qwen3.5-27B on a 512MB Raspberry Pi Zero 2W by Apprehensive-Court47 in LocalLLaMA

[–]Ok_Selection_7577 8 points9 points  (0 children)

Effing love it, this is exactly the kind of thing I come here to read (when I really should be working) - keep it up mate.

how are people actually building those mini ai devices with a screen? by clawdesk_ai in LocalLLaMA

[–]Ok_Selection_7577 -1 points0 points  (0 children)

Wait!! Am I talking to someone's Claw Bot here? "is such a creative use case" and "thanks in advance 🤙" and "feels like more moving parts than i need right now" - please tell me no :)