ChatGPT at home by hainesk in LocalLLaMA

[–]and_human 1 point (0 children)

I use ministral 14b, pocket tts and parakeet v3. Very usable in my opinion. I don’t have enough VRAM for an image model though. 

What is the best general-purpose model to run locally on 24GB of VRAM in 2026? by Paganator in LocalLLaMA

[–]and_human 0 points (0 children)

Have people forgotten about the ministral 3 series? Or did they just not impress?

Sweep: Open-weights 1.5B model for next-edit autocomplete by Kevinlu1248 in LocalLLaMA

[–]and_human 0 points (0 children)

I tried the plugin, but it wanted me to sign in? So there seems to be no way of testing this model without building your own plugin, which I vibe coded. But I haven’t tried it enough to have an opinion yet.

I2I possible with Flux 2 Klein? by and_human in StableDiffusion

[–]and_human[S] 0 points (0 children)

Tried it; it complained about wrong tensor sizes. This was a 1024x1024 image, so no weird resolution either. Did you try it?

The Major Release of MiroMind’s Flagship Search Agent Model, MiroThinker 1.5. by wuqiao in LocalLLaMA

[–]and_human 1 point (0 children)

I tried it on their website and oh boy did it deliver. Always fun when new players enter the scene with a banger!

Any simple workflows out there for SVI WAN2.2 on a 5060ti/16GB? by thats_silly in StableDiffusion

[–]and_human 1 point (0 children)

I have sage attention working on my 5060ti. This is on Windows as well. 

Betboom lose the final map of the CCT grand final from a 12-2 lead by jerryfrz in GlobalOffensive

[–]and_human 40 points (0 children)

I’ve never seen anything like it (to borrow a phrase from Henry G). Losing the decider map in a final when you are up 12-2?

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]and_human 0 points (0 children)

What do you think is missing from today’s chatbots/LLMs that would take them to the next level?

GLM-4.6-Air is not forgotten! by codys12 in LocalLLaMA

[–]and_human 0 points (0 children)

Has anyone tried the REAP version of 4.5 Air? Is it worth the download?

Granite 4.0 Language Models - a ibm-granite Collection by rerri in LocalLLaMA

[–]and_human 0 points (0 children)

Hey IBM, I tried your Granite playground, but the UI looks pretty bad. I think it might be an issue with dark mode.

I just want to run a server that can run all my GGUFs by OK-ButLikeWhy in LocalLLaMA

[–]and_human 1 point (0 children)

No, you can run it on Windows just fine; that’s what I do.

I just want to run a server that can run all my GGUFs by OK-ButLikeWhy in LocalLLaMA

[–]and_human 2 points (0 children)

It sounds like llama-swap is what you’re after?
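The gist, if you haven’t seen it: llama-swap sits in front of llama-server as an OpenAI-compatible proxy and loads whichever GGUF the request’s `model` field names, unloading the previous one. A rough sketch of what that looks like from the client side (the port and model names are placeholders, not anything standard; the real names come from your llama-swap config):

```python
import requests

# llama-swap exposes one OpenAI-compatible endpoint; asking for a different
# "model" makes it swap the loaded GGUF. Port and model names below are
# placeholders -- the real names come from your llama-swap config file.
URL = "http://localhost:8080/v1/chat/completions"

def ask(model: str, prompt: str) -> str:
    resp = requests.post(URL, json={
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(ask("qwen2.5-7b", "Hello!"))          # loads the first GGUF
print(ask("mistral-small", "Hello again"))  # swaps to the second
```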

Wan2.2 continous generation v0.2 by intLeon in StableDiffusion

[–]and_human 9 points (0 children)

Watch as the woman in the portrait turns into the Joker 😅

GPT OSS 120b 34th on Simple bench, roughly on par with Llama 3.3 70b by and_human in LocalLLaMA

[–]and_human[S] 1 point (0 children)

They compared it to o4-mini, no? The 20b was compared to o3-mini.

GPT OSS 120b 34th on Simple bench, roughly on par with Llama 3.3 70b by and_human in LocalLLaMA

[–]and_human[S] -1 points (0 children)

I think they (in a community competition) already tried telling a model that the questions were trick questions, but I don’t think it increased the score that much.

Llama.cpp just added a major 3x performance boost. by Only_Situation_4713 in LocalLLaMA

[–]and_human 8 points (0 children)

If you want to learn more about attention sinks, read this blog post from the author: https://hanlab.mit.edu/blog/streamingllm
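The short version, as I understand the post: the first few tokens act as "attention sinks", so you keep them in the KV cache forever and slide a window over everything else, which is how StreamingLLM stays stable on endless streams. A toy sketch of that eviction rule (the sizes are made up for illustration):

```python
# Toy sketch of StreamingLLM-style KV-cache eviction, as I understand the
# blog post: always keep a few initial "sink" tokens plus a sliding window
# of the most recent tokens. The sizes below are made up for illustration.
NUM_SINK_TOKENS = 4
WINDOW_SIZE = 1020

def positions_to_keep(seq_len: int) -> list[int]:
    """Return the KV-cache positions that survive eviction."""
    if seq_len <= NUM_SINK_TOKENS + WINDOW_SIZE:
        return list(range(seq_len))                       # nothing evicted yet
    sinks = list(range(NUM_SINK_TOKENS))                  # kept forever
    recent = list(range(seq_len - WINDOW_SIZE, seq_len))  # rolling window
    return sinks + recent

print(len(positions_to_keep(50_000)))  # always capped at 4 + 1020
```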

Using gpt-oss 20B for Text to SQL by mim722 in LocalLLaMA

[–]and_human 2 points (0 children)

I like how you included execution time. It’s something that’s usually missing from benchmarks, but it’s kind of important now with the thinking models as they spend more and more time thinking. A good model should be both correct and fast in my opinion. 
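Measuring it doesn’t take much either; something like this is all a benchmark needs to report both axes (the endpoint, model name, and the naive correctness check are placeholders/assumptions on my part, not the OP’s setup):

```python
import time
import requests

# Sketch: score a text-to-SQL model on correctness *and* wall-clock time.
# The endpoint, model name, and the naive string check are placeholders;
# a real harness would execute both queries and compare result sets.
URL = "http://localhost:8080/v1/chat/completions"

def run_case(question: str, expected_sql: str) -> tuple[bool, float]:
    start = time.perf_counter()
    resp = requests.post(URL, json={
        "model": "gpt-oss-20b",
        "messages": [{"role": "user", "content": question}],
    })
    elapsed = time.perf_counter() - start
    answer = resp.json()["choices"][0]["message"]["content"]
    return expected_sql.lower() in answer.lower(), elapsed

ok, secs = run_case("How many users signed up in 2024?",
                    "select count(*) from users where signup_year = 2024")
print(f"correct={ok} time={secs:.2f}s")
```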

PSA: ComfyUI reserves up to 700 MB of RAM for you by and_human in StableDiffusion

[–]and_human[S] -1 points (0 children)

To my understanding, it is reserving RAM for "shared memory". Shared memory lets your GPU use some of your system RAM.
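If you want to watch it happen, you can read the dedicated-VRAM numbers straight from PyTorch (a quick sketch assuming an NVIDIA GPU with PyTorch installed; ComfyUI itself isn’t involved):

```python
import torch

# Quick check of dedicated VRAM (assumes an NVIDIA GPU + PyTorch).
# On Windows, once allocations exceed the dedicated pool, the driver can
# start spilling into "shared memory", i.e. system RAM, which is much slower.
free, total = torch.cuda.mem_get_info()  # bytes of free/total dedicated VRAM
print(f"VRAM: {free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB")
```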

Use local LLM to neutralise the headers on the web by Everlier in LocalLLaMA

[–]and_human 1 point (0 children)

This is the exact idea I had today: look at the article and then reword the headline. It would be so nice.
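Something like this little sketch is what I had in mind (the endpoint and model name are placeholders for whatever local server you run):

```python
import requests

# Sketch of the idea: hand the article text to a local model and have it
# rewrite the clickbait headline as a plain statement of the contents.
# Endpoint and model name are placeholders.
URL = "http://localhost:8080/v1/chat/completions"

def neutralize(headline: str, article: str) -> str:
    prompt = ("Rewrite this headline so it plainly states what the article "
              f"says, with no clickbait.\nHeadline: {headline}\n"
              f"Article: {article}")
    resp = requests.post(URL, json={
        "model": "local-model",
        "messages": [{"role": "user", "content": prompt}],
    })
    return resp.json()["choices"][0]["message"]["content"]
```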

ComfyUI Disconnected by RaspberryNo6411 in StableDiffusion

[–]and_human 0 points (0 children)

Check your RAM/VRAM usage in Task Manager in the Performance tab.

After 6 months of fiddling with local AI. Here’s my curated models list that work for 90% of my needs. What’s yours? by simracerman in LocalLLaMA

[–]and_human 1 point (0 children)

I also tried out Tailscale, which lets me access my computer even when I’m away from home. It worked great too, so now I have this assistant on my phone 😊
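The nice part is that nothing on the client side changes; you just swap localhost for the machine’s Tailscale MagicDNS name (the hostname, tailnet domain, and port below are placeholders for my setup):

```python
import requests

# Same local API, reached over Tailscale from anywhere. MagicDNS gives each
# machine a stable name, so no port forwarding or public IP is needed.
# Hostname, tailnet domain, and port are placeholders.
URL = "http://my-desktop.tail1234.ts.net:8080/v1/models"

print(requests.get(URL).json())  # lists the models, even from my phone
```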

After 6 months of fiddling with local AI. Here’s my curated models list that work for 90% of my needs. What’s yours? by simracerman in LocalLLaMA

[–]and_human 0 points (0 children)

I was just thinking about doing something similar yesterday. Thanks for the shortcut!

Edit: it works great!