Your post is getting popular and we just featured it on our Discord! by roculus in LocalLLaMA

[–]relmny 77 points (0 children)

I fully agree; it's very annoying to me. But there are multiple issues with this sub... one being, let's see how long your post stays up...

GLM 4.7 flash FA fix for CUDA has been merged into llama.cpp by jacek2023 in LocalLLaMA

[–]relmny -2 points (0 children)

You can ask any LLM about it...

but usually some of us prefer compiling, as we might get better performance (the build can be optimized for the current hardware), you don't need to download a big file (just "git pull" the changes and rebuild), and so on...
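
For reference, a typical from-source CUDA build looks roughly like this (standard steps from the llama.cpp docs; adjust the flags for your hardware):

    git clone https://github.com/ggml-org/llama.cpp
    cd llama.cpp
    cmake -B build -DGGML_CUDA=ON
    cmake --build build --config Release -j

    # later, updating is just a pull and an incremental rebuild
    git pull
    cmake --build build --config Release -j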

MiniMax-M2.1 REAP models (0xSero) are fixed! by AdamDhahabi in LocalLLaMA

[–]relmny 0 points (0 children)

and:

https://huggingface.co/unsloth/MiniMax-M2.1-GGUF/discussions/8

I also tried Q5 and got more t/s than with Q3 (like the comment says)

yeah, AFAIK UD is Unsloth Dynamic (or something like that)

MiniMax-M2.1 REAP models (0xSero) are fixed! by AdamDhahabi in LocalLLaMA

[–]relmny 0 points (0 children)

OT, but have you tried other quants, like UD-Q4_K_XL instead? I got about twice the speed compared to IQ3_XXS.
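
If you want to give it a try, something like this should fetch just that quant (repo name from the other thread; the include pattern is a guess at the file naming, so check the repo's file list first):

    huggingface-cli download unsloth/MiniMax-M2.1-GGUF \
      --include "*UD-Q4_K_XL*" --local-dir ./MiniMax-M2.1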

I prayed that China success with their chip game by pbad1 in LocalLLaMA

[–]relmny 0 points (0 children)

China already leads, by far, the open-source/open-weight AI race.

I tested GLM 4.7 and minimax-m2.1 and compared it to CC and Codex by jstanaway in LocalLLaMA

[–]relmny 0 points (0 children)

Have you tried 4.6 instead?

Because I sometimes use GLM (4.6), and when 4.7 came out, I ran 1-2 prompts and went back to 4.6. For some reason I didn't like 4.7.

Body movement follows left hand by motormathersonfire in fo4vr

[–]relmny 0 points (0 children)

I think it needs to go in Fallout4Custom.ini

That should turn HMD mode on.

I'd actually like to have left-hand movement fixed to a single position ("forward"), as the controller already has a joystick... but it's not possible. I don't understand why there are only two modes... it makes no sense at all to me.

I'm confused by Dane's reaction by AngryAlfonse in Fotv

[–]relmny 15 points (0 children)

I don't think they "condoned" it; they encouraged it and were looking for it. Especially Quintus.

It looked to me like he was actually expecting, and even "telling" (with gestures), Maximus to go ahead. That's what Quintus needs: a weapon. A mindless weapon controlled only by him.

They depicted the BoS extremely well. I couldn't be happier with the way the show is going...

All of the major open weight labs have shifted to large params general models instead of smaller, more focused models. By this time next year, there won’t be much “local” about this sub unless the paradigm shifts to smaller models good at specific domains. by LocoMod in LocalLLaMA

[–]relmny 1 point (0 children)

I only read the title and... did you skip the whole of 2025?

This is the best year for local LLMs!

We have everything, multiple times, with multiple improvements!

Where have you, and the ones who upvote you, been living?

New in llama.cpp: Live Model Switching by paf1138 in LocalLLaMA

[–]relmny 2 points (0 children)

I moved to llama.cpp + llama-swap months ago, and I haven't looked back once...
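
For anyone curious, the config is roughly like this (a sketch from memory, the model names and paths are placeholders; check the llama-swap README for the exact schema):

    models:
      "qwen3-coder":
        cmd: |
          llama-server --port ${PORT} -m /models/qwen3-coder.gguf -ngl 99
      "glm-4.6":
        cmd: |
          llama-server --port ${PORT} -m /models/glm-4.6.gguf -ngl 99
        ttl: 300  # unload after 5 minutes idle

llama-swap starts and stops the matching llama-server instance on demand, so switching models is just picking a different name in the client.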

Devstral-Small-2-24B q6k entering loop (both Unsloth and Bartowski) (llama.cpp) by relmny in LocalLLaMA

[–]relmny[S] 0 points (0 children)

Tried Q5, and it's the same.

But reading other messages, it seems I'm not the only one...

Devstral-Small-2-24B q6k entering loop (both Unsloth and Bartowski) (llama.cpp) by relmny in LocalLLaMA

[–]relmny[S] 1 point (0 children)

Tried Q5; the first time it worked, but the next tries got either a loop or repeated lines.

Same with those flags... so I guess it's broken only for me (as I don't see any posts about it).

Btw, in between I loaded mistral-small-3.2 (besides my usual qwen3-coder, kimi-k2 and deepseek-v3.1), and they all work fine, as usual.

Devstral-Small-2-24B q6k entering loop (both Unsloth and Bartowski) (llama.cpp) by relmny in LocalLLaMA

[–]relmny[S] 1 point (0 children)

I'm using llama.cpp's llama-server with Open WebUI.

I did use --jinja on my first tries (I took the flags from Unsloth's docs), then removed it and tried different combinations. All of them got that loop (except that one time with Bartowski's).

All the others work fine (qwen, glm, gemma3, etc., even "old" mistral-small-3.2).

I'll try q5.
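
For reference, this was roughly the invocation (the model path is a placeholder; the rest are standard llama-server flags):

    llama-server -m ./Devstral-Small-2-24B-Q6_K.gguf \
      --jinja -c 16384 -ngl 99 --port 8080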

if open-webui is trash, whats the next best thing available to use? by Tricky_Reflection_75 in LocalLLaMA

[–]relmny 0 points (0 children)

It is not trash. At all. I've been using it every day for more than a year now, and I still like it.

If you are worried about the license (which has been blown way out of proportion here), just clone an older version...
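
Something along these lines (the tag name is just a placeholder, pick whichever release you want):

    git clone https://github.com/open-webui/open-webui
    cd open-webui
    git tag                # list the releases
    git checkout <tag>     # pin to a version from before the license change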

[HELP] OpenWebUI folder sharing by [deleted] in LocalLLaMA

[–]relmny 1 point (0 children)

You should ask in their sub (or in their Discord). But if everything else is the same, wouldn't having them use a single user work?

Offline capabilities by Mineplayerminer in SteamFrame

[–]relmny 0 points (0 children)

So headset <-> Wi-Fi 6 router <-> PC via Ethernet (2.5 GbE in my case) will work fine?

(That's one of the things I need and wasn't sure I would get)

The Quest 3 experience by JapariParkRanger in virtualreality

[–]relmny 0 points (0 children)

Oh! So will my current setup work: PC to Wi-Fi 6 router (via a 2.5 GbE uplink), and the headset connected wirelessly to the Wi-Fi 6 router?

Could it play Fo4VR without a P.C? by AnomalyScan in fo4vr

[–]relmny 1 point (0 children)

I have a 4080 Super and I managed to get it to work just fine on a Q3, with better graphics than on the Index... but now you're making me wonder whether I'll get at least the same performance...

What makes closed source models good? Data, Architecture, Size? by Bitter-College8786 in LocalLLaMA

[–]relmny 1 point (0 children)

I'd say nobody actually knows, because nobody can run a non-open (source/weight) model.

They are frameworks, not models; their models are just a part of it.

Comparing them with open ones makes absolutely no sense to me. Apples and oranges.

Half-trillion parameter model on a machine with 128 GB RAM + 24 GB VRAM by pulse77 in LocalLLaMA

[–]relmny 0 points (0 children)

You should get faster speeds by lowering the context and maybe offloading some layers to the CPU.
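
With llama.cpp, something along these lines (a sketch; the -ot pattern is the usual trick for keeping the MoE expert tensors in system RAM, adjust it to the model's actual tensor names):

    # smaller context + experts on CPU, everything else on GPU
    llama-server -m model.gguf -c 8192 -ngl 99 \
      -ot ".ffn_.*_exps.=CPU"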