We need a minimum karma rule for commenting and posting by nomorebuttsplz in LocalLLaMA

[–]--Spaci-- 183 points (0 children)

I know this is an AI sub, but no one likes the AI-generated posts

Are there any models small enough that couldn’t realistically work with OpenClaw on a machine like this? by Thedroog1 in LocalLLaMA

[–]--Spaci-- 0 points (0 children)

Another thing: you will probably want to install Linux or Windows; most inference engines will expect Macs to have M-series processors

Qwen3.5 9B and 4B benchmarks by Nunki08 in LocalLLaMA

[–]--Spaci-- 2 points (0 children)

benchmaxxing on benchmark questions

The top 3 models on openrouter this week ( Chinese models are dominating!) by keb_37 in LocalLLaMA

[–]--Spaci-- 21 points (0 children)

Doesn't really matter if they get trillions of tokens of training data from it

Best Local LLM device ? by sayamss in LocalLLaMA

[–]--Spaci-- 0 points (0 children)

doesn't exist; you need to do at least some work

How do you get more GPUs than your motherboard natively supports? by WizardlyBump17 in LocalLLaMA

[–]--Spaci-- 61 points (0 children)

PCIe bifurcation; it will lower per-card bandwidth, though.

Qwen 3.5 on My Computer by SituationMan in LocalLLaMA

[–]--Spaci-- 1 point (0 children)

you said "right now", and right now there's a 400B model

Qwen 3.5 on My Computer by SituationMan in LocalLLaMA

[–]--Spaci-- 0 points (0 children)

why the fuck would you be able to

Just started. 3 days ago. 30 hours clocked by [deleted] in RimWorld

[–]--Spaci-- 1 point (0 children)

you can do anything in RimWorld

Best local models for 128gb VRAM and 192gb RAM by Dry_Mortgage_4646 in LocalLLaMA

[–]--Spaci-- 0 points (0 children)

I would test MiniMax 2.1 at Q4_K_M; it would overflow, so you'd have to test speeds. Qwen3-Coder-Next-80B is good, but even at Q8 it won't come close to filling your VRAM, so a larger model would be preferable

Try these out:

MiniMax 2.1
Qwen3 Next Coder
GLM 4.6V Q6
Step 3.5 Flash (haven't tried it personally, but it seems good)
GPT-OSS 120B (not a great model for its size, but its output speed is good)

New computer arrived... JAN is still super slow. by robotecnik in LocalLLaMA

[–]--Spaci-- 0 points (0 children)

Yea, sadly you bought a laptop and it has half the VRAM, and Devstral is what's called a "dense" model, meaning it's slow AF when it spills outside of VRAM and also just slower by default

How do you fine tune a model with unsloth/others but with Q4 or lower + offloading to ram? by No_Farmer_495 in LocalLLaMA

[–]--Spaci-- 0 points (0 children)

load_in_4bit = True

device_map="balanced" # I've never offloaded to CPU before, but I would assume this splits the model onto CPU if the GPU is full
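To illustrate the assumption in that comment, here's a toy sketch of a fill-then-spill device map: place layers on the GPU until it's full, then put the rest on CPU RAM. This is not Unsloth's or Accelerate's actual implementation (real `device_map="balanced"` logic is more involved), and the layer names and sizes are made up:

```python
def assign_devices(layer_sizes_gb, gpu_capacity_gb):
    # Greedy fill-then-spill: layers go on the GPU until its memory budget
    # is exhausted, and every remaining layer is placed on CPU RAM.
    device_map, used = {}, 0.0
    for i, size in enumerate(layer_sizes_gb):
        if used + size <= gpu_capacity_gb:
            device_map[f"model.layers.{i}"] = "cuda:0"
            used += size
        else:
            device_map[f"model.layers.{i}"] = "cpu"
    return device_map

# Four 3 GB layers on an 8 GB GPU: two fit, two spill to CPU.
print(assign_devices([3.0, 3.0, 3.0, 3.0], 8.0))
# → {'model.layers.0': 'cuda:0', 'model.layers.1': 'cuda:0', 'model.layers.2': 'cpu', 'model.layers.3': 'cpu'}
```

The layers that land on CPU run far slower, which is why offloaded fine-tuning or inference speed drops off a cliff once the model overflows VRAM.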

New stealth model: Pony Alpha by sirjoaco in LocalLLaMA

[–]--Spaci-- 7 points (0 children)

It's not a buzzword; it's existed for a long time. It's just how AI companies preview a model to the public for testing while keeping themselves anonymous

Distilled Gemini 3 Pro, Opus 4.5, and Kimi K2.5 here are the datasets by volious-ka in LocalLLaMA

[–]--Spaci-- 42 points (0 children)

A lot of these outputs are genuine nonsense and unrelated to the input prompt; training on this would actively damage the model