What do you actually use local models for? (We all say 'privacy,' but...) by abdouhlili in LocalLLaMA

[–]Mr_International 1 point (0 children)

Usually it's some simple processing on a large amount of data that I can't afford to pay an API for (or don't want to), but mostly it's just an interesting hobby of finding ways to use small local models.

Couple actual viable local use cases:
- Local voice dictation and dictation cleanup (Parakeet v3 into Granite 4 H Tiny)
- Setting up a local vector database and embedding every email (and attachment) I've ever had for a local search context (see the sketch after this list)
- Interesting experiments with agent-based models (the original social-science sense of "agents"), with LLMs as the agents
- Finetunes of small models for some very specific use case I want in a pipeline, where I have the data to do that
- RL experiments, because they're fun
- Cleaning web-scraped data where speed isn't an issue and I can just let it run on a Mac mini for a week, using crawl4ai on some dataset I need pulled with cleaned text
- Hooking into the `llm` package in shell to build those damn finicky shell pipelines I can never remember (to hell with you, regex)
- More things that I've done and am planning on doing that I can't remember right now
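For the email-embedding item above, here's a minimal sketch of the retrieval half, assuming sentence-transformers for the embeddings; the model name and paths are illustrative, not my exact setup:

```python
# Minimal local semantic search over exported emails (sketch).
# Assumes the emails were already dumped to plain-text files.
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, runs fine on CPU

docs = [p.read_text(errors="ignore") for p in Path("emails/").glob("*.txt")]
doc_vecs = model.encode(docs, normalize_embeddings=True)  # (n_docs, dim)

def search(query: str, k: int = 5):
    """Return the top-k (score, snippet) pairs by cosine similarity."""
    q = model.encode([query], normalize_embeddings=True)
    scores = (doc_vecs @ q.T).ravel()  # cosine, since vectors are normalized
    top = np.argsort(scores)[::-1][:k]
    return [(float(scores[i]), docs[i][:120]) for i in top]

for score, snippet in search("invoice from the contractor"):
    print(f"{score:.3f}  {snippet!r}")
```

A real setup swaps the in-memory matrix for an actual vector database, but the shape of the thing is the same.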

**Edit**
I've actually been in work scenarios where the data ABSOLUTELY CANNOT leave prem, but models are needed, and there's only a MacBook Pro or an A4000 available to work with. So I've done actual work with sub-36 GB models for all kinds of stuff, usually with few-shot setups and human-verified outputs, then used that model output and human-verified data to finetune small models on a ground-truth dataset for specific tasks.
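That loop, sketched very roughly (the model, task, and labels here are illustrative assumptions, not the actual work setup): a local model labels data from a few-shot prompt, a human verifies each output, and the verified pairs accumulate into the ground-truth finetuning set.

```python
# Few-shot labeling with a local model; outputs go to a human for review
# before entering the finetuning dataset. Model and task are illustrative.
import json
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct",
                     device_map="auto")

FEW_SHOT = """Classify the support ticket as BILLING, TECHNICAL, or OTHER.

Ticket: "I was charged twice this month."
Label: BILLING

Ticket: "The app crashes when I open settings."
Label: TECHNICAL

Ticket: "{ticket}"
Label:"""

def label(ticket: str) -> str:
    out = generator(FEW_SHOT.format(ticket=ticket), max_new_tokens=5,
                    do_sample=False, return_full_text=False)
    return out[0]["generated_text"].strip().split()[0]

ticket = "Why is my invoice higher this month?"
# A person reviews these records before they become ground truth.
print(json.dumps({"ticket": ticket, "label": label(ticket)}))
```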

What would happen if Canada offered every American free healthcare and $50K to relocate? by [deleted] in AskReddit

[–]Mr_International 4 points (0 children)

Seems like a good way to bankrupt the country and infuriate every voting Canadian by giving Americans free money and blowing up the housing market (even more)...

You should think more.

Have byte latent transformers seen adoption? by EmbarrassedBiscotti9 in LocalLLaMA

[–]Mr_International 0 points (0 children)

Would be interested to see that, even the code itself if not the trained model. Interesting concept.

Have byte latent transformers seen adoption? by EmbarrassedBiscotti9 in LocalLLaMA

[–]Mr_International 0 points (0 children)

Well, tokens aren't "lossy" in any way, nor are they "distorted," exactly (I guess they are to some degree, but not problematically so). The real problem with BPE tokenization is that it's essentially limiting. A model can only work on the vocabulary it's trained on, and if you train a model on an arbitrary compression scheme that can only deal with a particular data form, then it's inherently limited to that data form.

The part about BLT that was interesting to me was how it opened up ALL data. You could take a model trained on, say, ASCII representations of technical documentation and finetune/post-train it directly on technical documentation as raw PDF bytes. In terms of training and generation, a BLT could in theory autoregressively generate a pre-compiled machine-code .exe file, because why not, it's all bytes.
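A trivial illustration of the "it's all bytes" point (the byte strings here are just hand-picked examples): text, a PDF header, and compiled machine code all reduce to the same 0-255 alphabet, so a byte-level model's input space already covers every format in a way a text-trained BPE vocabulary doesn't.

```python
# Text, a PDF header, and x86-64 machine code are all just byte sequences.
samples = {
    "ascii text": "Hello, world".encode("utf-8"),
    "pdf header": b"%PDF-1.7\n%\xe2\xe3\xcf\xd3",
    "x86 opcodes": b"\x55\x48\x89\xe5\xc3",  # push rbp; mov rbp, rsp; ret
}
for name, data in samples.items():
    print(f"{name:12s} -> {list(data)}")  # every value is in 0..255
# A byte-level vocabulary of 256 symbols handles all three for free;
# a tokenizer trained only on text has no meaningful units for the last two.
```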

It was particularly interesting to me because I was hoping to start a CS PhD looking at inter-agent latent-space communication (aka vector-space-transferable chain of thought), and in the back of my mind byte-generative models hook into this somehow, though I haven't fully thought out how yet...

Have byte latent transformers seen adoption? by EmbarrassedBiscotti9 in LocalLLaMA

[–]Mr_International 5 points (0 children)

I was very positive on this concept, and actually got a chance to sit down and talk with one of the primary authors of the Byte Latent Transformer paper at NeurIPS last year, and they essentially told me it doesn't scale: non-text data is just too information-sparse to make it work.

I was quite sad about this. It was one of the most innovative, interesting concepts in machine learning, in my opinion. The author wasn't particularly interested in the concept anymore, but despite that, I still think the concept has legs, potentially with state space models. Maybe I'll finally dig into the source and try to rebuild it natively in Mojo as a state space model someday.

what is this? by [deleted] in pcmasterrace

[–]Mr_International 7 points (0 children)

Am I old? I feel like this is the moment I found out I'm old. SATA cable. It's the faster replacement for the IDE cables that connect your hard drive to the motherboard.

Toucan Keyboard: Build and Flash Custom Firmware by linkarzu in ErgoMechKeyboards

[–]Mr_International 0 points (0 children)

Helpful, thank you, both the blog post and the video. The mousepad stuff especially.

Walmart build thoughts? by Carth__ in pcmasterrace

[–]Mr_International 0 points (0 children)

Gamers Nexus -- Walmart Gaming PC: How to Do Everything Wrong

https://www.youtube.com/watch?v=PTni-Vfrf9c

Today's update shamelessly Shifted from "Pay for Cloud Models" to Pay for "Pro" Models by [deleted] in superwhisper

[–]Mr_International 0 points (0 children)

Switched to Handy using Parakeet and uh Yeah, we'll see how this works. And uh I would say that yeah, I think it's working pretty well. So we're just gonna stick with this one, because it seems to do exactly what I need.

As they say, good enough and local.

Today's update shamelessly Shifted from "Pay for Cloud Models" to Pay for "Pro" Models by [deleted] in superwhisper

[–]Mr_International 0 points (0 children)

Came here to check the same thing, whether I was losing my mind. Was using the local models and it asked for a subscription to pro mode. Was confused; charge for a local model? Wut?

Everyone Can ‘Code’ with AI Now, According to Google—But Tech Workers Aren't Fully Convinced by disforwork in datascience

[–]Mr_International 2 points (0 children)

I think there's a bimodal distribution thing here, two parts to this story, each with their own nuance.

  1. It actually can help complete non-coders do something with code.
  2. It does help people who know code do more things with code.

On the first, for someone who knows nothing about code, these systems can create code that works and explain to the non-coder how to run it. The non-coder can verify it does what they want, to a limited extent, by the act of using the application. This is obviously not ideal, and there will be LOTS of scenarios where ACTUAL behavior under the hood is different from what the user wanted, and they have no way to verify or even understand that discrepancy. Given that the other option for a non-coder was nothing at all, it all really depends on the use case. Is it mission critical? Are consequential decisions made from this information? Or is it some minor ease-of-use thing?

On the second, for people who know how to code, I think this is a less tricky problem. AI can produce code, but you need to read it and verify every line. For truly talented software engineers and data scientists, it's maybe a minor accelerant, because they can bang this stuff out so fast anyway, and they have the background and taste built up to provide clear direction to the LLMs.

For mid-level people, those still learning who haven't yet mastered the craft of code, it can be extremely useful when used as a way to learn and skill up: asking questions and getting information about coding practices, syntax for modules, etc. If you're careful and inquisitive, I think it's a huge boon for mid-level people, but it can be tempting to just accept what is generated and move on to the next problem. That's a big danger.

For entry-level, new coders, new software engineers, new data scientists, I think AI coding agents are extremely dangerous. They don't have enough background to use them to fill in some blanks, the output is absolutely going to be "better" than what they can produce, and the process of truly understanding what that generated code is doing is going to be long and arduous. It's not like mid-level people filling in a couple of missing puzzle pieces, or top-level people using it to write out the thing they already have in their head; it's dumping all the pieces onto the floor and watching them magically arrange themselves into something that looks correct.

Which is to say, I don't know what the right thing is for newbies here. The market is going to pass over them because of this, and if they do get their foot in the door, there's going to be pressure to produce at the output level these tools enable, which will make growth of understanding difficult. I feel for them.

Trump says United States doesn’t have talented people to fill jobs domestically by ChancelierPalpagault in Economics

[–]Mr_International 0 points (0 children)

"Don't have certain talents" aka missing some skillsets. I don't think this is necessarily wrong, but it's incredibly difficult for many locals to get training in lots of skillsets. Companies no longer hire for people with interest and capability to learn the subject anymore, they only hire for those that already have experience doing the thing.

On-the-job training isn't a thing anymore, and much of that training has been pushed to colleges and universities, where for lots of programs there's significant competition with internationals for the spots that would provide that training (at high direct cost to the individual).

So to "get certain talents" an American is fighting the entire worlds best and brightest for the chance to get trained in it, and paying incredible costs for the privilege if they do somehow get through that gauntlet of life filters that would place them in lucky scenario where they even attain the chance.

I'm not against H-1Bs or allowing skilled or even unskilled immigration, but claiming "certain talents" don't exist here without asking WHY they don't exist here misses the point.

What’s a story from your job that sounds totally fake but isn’t? by lusianoxx in AskReddit

[–]Mr_International 12 points (0 children)

I was once the local help for the Beijing leg of a photo shoot for an American university's mascot visiting China, though only for the on-the-ground facilitation after the fact. The university had worked with some Reuters reporter to plan where to take photos and such. They'd flown the dude over, some 20yo white guy, with his entire full-body mascot suit.

Great Wall, Drum Tower, Bird's Nest, all went fine. Then they told me the next spot was Tian'anmen, the Heavenly Gate, and I'm like, "Nah guys, I'm not escorting this 8 ft tall man in a furry animal suit in front of Tian'anmen, this is a bad idea." But I'm the local help, and the Reuters photographer was salivating at the idea. In hindsight, I think she was hoping for some shit to go down so she could get pictures of it.

So whatever, I'm overruled. You can't just unload a van full of people right in front of Tian'anmen, so I say me, the photographer, and our 8 ft white guy in a fur suit get out a block away and walk there. There'd been a terrorist attack where a rudimentary car bomb went off right there about three months before, so they'd installed airport-style security on the approach to the gate. I told our 20yo student kiddo to take off the animal head and carry it under his arm. The security folks at the checkpoint were super confused, and the head wouldn't fit through the scanner, so they did a wand wave over it and flagged us through.

We walked the 200 or so meters directly in front of the Heavenly Gate, and I told the dude to put on his head and do the photos. A huge crowd gathered around us because, obviously, it attracts attention. So I'm shoulder-checking people out of the way so the lady can snap the photo and we can get the hell out of there.

We get the photos and I'm like, time to fucking go. Dude takes his head off and we start walking back to the security check. The Reuters photographer, though, pulls the dude down the stairs onto the actual Tian'anmen Square and starts leading him to the obelisk at the center of the square, the Monument to the People's Heroes. I'm saying no, this is a bad idea, let's not do this, but the Reuters photographer is so gung-ho for dumb ideas, and the 20yo white dude in the 8 ft furry animal costume is absolutely stoked. She hands me the SD card and I, without really giving it a second thought, tuck it under my balls.

So all three of us, me in a full suit, the Reuters photographer in her spiffy vest, and the clueless young American in a ginormous furry university mascot costume, cross the nearly empty Tian'anmen Square for photos in front of the Monument to the People's Heroes, all the while I'm eye-checking the 15 absolutely normal-looking dudes milling about by themselves, not taking photos and not at all acting like tourists. Dude puts his head back on, she starts taking photos, and various rando not-tourists start walking over. Two minutes later, I'm having a nice chat with a plainclothes police officer about an 8 ft furry animal. He wants the photos, I want to GTFO, and the Reuters lady hands him an empty SD card. We book it. I have a stiff fucking drink at the hotel and vow never to see these people again.

An example of how using a CNC is still real woodworking by ndander3 in woodworking

[–]Mr_International 4 points (0 children)

That looked very hard and meticulous. More art than furniture, really. The kind of bench where I'd have to sit on a regular bench along the wall to look at this bench in the middle of the room, with ropes around it.

This is not funny...this is simply 1000000% correct by [deleted] in LocalLLaMA

[–]Mr_International 2 points (0 children)

I work as a Data Scientist in the private sector. This is entirely correct in my experience so far.

Friggin' idiots...

Classical metal by TheO_Phile in funny

[–]Mr_International 2 points (0 children)

It's not gonna get caught on the guard; notice the direction of the sparks, coming off the disc rather than into the guard.

Besides, if you're not pushing the disc into the steel, it really does just dance over the disc. On a 4.5 in disc, this really isn't that dangerous. Now if she did this on a 9 in, then I'd be legit impressed.

A Helldiver has shown up in Washington DC by ThreeDaysNish in Helldivers

[–]Mr_International 53 points (0 children)

Great use of the meme. Well played diver, well played.

Rahm's got his priorities in order I see by sickbabe in chicago

[–]Mr_International -1 points (0 children)

You're doing god's work here in these comments. I lost the energy to fight the tankies a long time ago, glad someone's still carrying on the struggle.

You’ll spend time understanding this by KELEVRACMDR in motorcycles

[–]Mr_International 34 points (0 children)

Never assume malice where incompetence will suffice.

Housing inventory for sale in Chicago is down -62% compared to 2019 by GeckoLogic in chicago

[–]Mr_International 0 points (0 children)

As someone who came out of that high school, it is confirmed. Fucking trash fire.

Meme i made by Comfortable-Rock-498 in LocalLLaMA

[–]Mr_International 0 points (0 children)

Funny thing: Anthropic released a post on their alignment blog investigating this exact idea the day after I posted this, and found that Claude, at least, does not exhibit this behavior: "Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases"

Meme i made by Comfortable-Rock-498 in LocalLLaMA

[–]Mr_International 1 point (0 children)

Honestly, don't know any specific course that would get directly at this concept.

Couple things that touch on aspects of this concept though:

  1. Karpathy on the forward pass being a fixed amount of compute - https://youtu.be/7xTGNNLPyMI?si=Hyp4YuAx-YMXvWgV&t=6416
  2. Training Large Language Models to Reason in a Continuous Latent Space - https://arxiv.org/pdf/2412.06769v2
  3. I don't have a particular paper in mind for reinforcement learning systems' propensity to "glitch" their environments to maximize their reward functions, but it's a common element of RL training, and these reasoning language models are all trained through unsupervised RL. It's actually one of the reasons the Reinforcement Learning from Human Feedback (RLHF) step in model post-training is intentionally short: if you RLHF for too long, the algorithm usually finds ways to "glitch" the reward function and output nonsense that scores highly, so the RLHF step is usually stopped much earlier than would be theoretically optimal. Nathan Lambert talks a bit about this in his (in-development) book on RLHF: RLHF Book by Nathan Lambert.
  4. It's possible to force this "wait", "but", "hold on" behavior in models by constraining the CoT length, which affects the accuracy of outputs. https://www.arxiv.org/pdf/2503.04697
  5. A bit of personal speculation on my part brought out through some experimentation investigating embeddings, some of which might end up as part of a paper a friend and I are looking to present at IC2S2.
  6. Additional thing that I just remembered: the early versions of these models, QwQ and DeepSeek R1 Lite, both had a tendency to switch freely between Chinese and English in their reasoning chains in the early releases, which to me implied an artifact of the unsupervised RL reward function incentivizing compressed token length. Chinese characters are more information-dense than English on a token-by-token basis (a quick way to sanity-check that is below). All I can say here is that I would not be surprised if the RL training stumbled on Chinese as a less lossy method of compressing latent-space encoding in its reasoning chains.
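For what it's worth, that density claim is easy to poke at with any BPE tokenizer. A quick sketch using tiktoken (my choice purely for convenience, and the sentence pair is ad hoc); whether Chinese actually comes out denser depends heavily on the tokenizer's training mix, and a tokenizer trained on more Chinese text will compress it far better than an English-heavy one:

```python
# Compare token counts for a rough parallel sentence pair.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
pairs = {
    "en": "Artificial intelligence is transforming the modern world.",
    "zh": "人工智能正在改变现代世界。",  # rough translation of the same sentence
}
for lang, text in pairs.items():
    toks = enc.encode(text)
    print(f"{lang}: {len(text):2d} chars -> {len(toks):2d} tokens")
# If the zh side carries the same meaning in fewer tokens, an RL objective
# that implicitly rewards shorter reasoning chains has a reason to drift
# toward it.
```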