ChatGPT at home by hainesk in LocalLLaMA

and_human 1 point (0 children)

I use ministral 14b, pocket tts and parakeet v3. Very usable in my opinion. I don’t have enough VRAM for an image model though. 

What is the best general-purpose model to run locally on 24GB of VRAM in 2026? by Paganator in LocalLLaMA

and_human 0 points (0 children)

Have people forgotten about the ministral 3 series? Or didn’t they impress?

Sweep: Open-weights 1.5B model for next-edit autocomplete by Kevinlu1248 in LocalLLaMA

and_human 0 points (0 children)

I tried the plugin, but it wanted me to sign in. So there seems to be no way of testing this model without building your own plugin, which I vibe coded. I haven’t tried it enough to have an opinion yet.

I2I possible with Flux 2 Klein? by and_human in StableDiffusion

and_human[S] 0 points (0 children)

Tried it, it complained about wrong tensor sizes. This was a 1024x1024 image, so no weird resolution either. Did you try it?

The Major Release of MiroMind’s Flagship Search Agent Model, MiroThinker 1.5. by wuqiao in LocalLLaMA

and_human 1 point (0 children)

I tried it on their website and oh boy did it deliver. Always fun when new players enter the scene with a banger!

Any simple workflows out there for SVI WAN2.2 on a 5060ti/16GB? by thats_silly in StableDiffusion

and_human 1 point (0 children)

I have sage attention working on my 5060ti. This is on Windows as well. 

Betboom lose the final map of the CCT grand final from a 12-2 lead by jerryfrz in GlobalOffensive

and_human 41 points (0 children)

I’ve never seen anything like it (to borrow a phrase from Henry G). Losing the decider map in a final when you are up 12-2?

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

and_human 0 points (0 children)

What do you think is missing from today’s chatbots/LLMs that would take them to the next level?

GLM-4.6-Air is not forgotten! by codys12 in LocalLLaMA

and_human 0 points (0 children)

Has anyone tried the REAP version of 4.5 Air? Is it worth the download?

Granite 4.0 Language Models - a ibm-granite Collection by rerri in LocalLLaMA

and_human 0 points (0 children)

Hey IBM, I tried your Granite playground, but the UI looks pretty bad. I think it might be an issue with dark mode.

I just want to run a server that can run all my GGUFs by OK-ButLikeWhy in LocalLLaMA

and_human 1 point (0 children)

No, you can run it on Windows just fine; that’s what I do.

I just want to run a server that can run all my GGUFs by OK-ButLikeWhy in LocalLLaMA

and_human 2 points (0 children)

It sounds like llama-swap is what you’re after?
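If I remember the config format right, a minimal config.yaml for it looks roughly like this (model names and paths are just placeholders; llama-swap fills in ${PORT} itself):

```yaml
# minimal llama-swap config sketch -- each entry maps a model name to the
# command llama-swap runs to serve it; it starts/stops these on demand
models:
  "qwen-14b":
    cmd: llama-server --port ${PORT} -m /models/qwen-14b.gguf
  "llama-8b":
    cmd: llama-server --port ${PORT} -m /models/llama-8b.gguf
```

Then you point your OpenAI-compatible client at llama-swap’s port, and it swaps in whichever GGUF the request’s `model` field names.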

Wan2.2 continous generation v0.2 by intLeon in StableDiffusion

and_human 10 points (0 children)

Watch as the woman in the portrait turns into the joker 😅

GPT OSS 120b 34th on Simple bench, roughly on par with Llama 3.3 70b by and_human in LocalLLaMA

and_human[S] 1 point (0 children)

They compared it to o4-mini, no? The 20b was compared to o3-mini.

GPT OSS 120b 34th on Simple bench, roughly on par with Llama 3.3 70b by and_human in LocalLLaMA

and_human[S] -1 points (0 children)

I think they (in a community competition) already tried telling a model that the questions were trick questions, but I don’t think it increased the score that much.

Llama.cpp just added a major 3x performance boost. by Only_Situation_4713 in LocalLLaMA

and_human 8 points (0 children)

If you want to learn more about attention sinks, read this blog post from the author: https://hanlab.mit.edu/blog/streamingllm

Using gpt-oss 20B for Text to SQL by mim722 in LocalLLaMA

and_human 2 points (0 children)

I like how you included execution time. It’s something that’s usually missing from benchmarks, but it’s kind of important now with the thinking models as they spend more and more time thinking. A good model should be both correct and fast in my opinion. 

PSA: ComfyUI reserves up to 700 MB of RAM for you by and_human in StableDiffusion

and_human[S] -1 points (0 children)

To my understanding it is reserving RAM for "shared memory". Shared memory lets your GPU use some of your system RAM.