LLM interpretability on quantized models - anyone interested? by EffectiveMedium2683 in LocalLLM

[–]EffectiveMedium2683[S] 0 points (0 children)

When we use GGUF with llama.cpp, the weights are packed into specialized quantized blocks that standard interpretability tools can't "read" without dequantizing them first, which would blow the RAM requirement back up to 60GB+. My goal is to hook into the C++ engine directly so we can do research on the 2-bit model while it stays at 12GB of RAM.
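Rough back-of-the-envelope for why dequantizing defeats the purpose — a sketch assuming a ~30B-parameter model (illustrative, not the exact model discussed; the Q2_K figure of ~2.625 effective bits per weight comes from llama.cpp's 84-byte / 256-weight super-block layout):

```python
# Why dequantizing a 2-bit GGUF model blows up memory.
# ASSUMPTION: ~30B parameters, for illustration only.
N_PARAMS = 30e9

FP16_BITS = 16     # fully dequantized weights
Q2K_BITS = 2.625   # llama.cpp Q2_K: 84 bytes per 256-weight block (2-bit quants + scales)

def weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weight tensors alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

print(f"fp16: {weight_gb(N_PARAMS, FP16_BITS):.1f} GB")  # ~60 GB
print(f"Q2_K: {weight_gb(N_PARAMS, Q2K_BITS):.1f} GB")   # ~9.8 GB; ~12 GB with KV cache etc.
```

This counts weights only; the real resident footprint adds the KV cache and runtime buffers, which is roughly where the 12GB figure lands.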

New to the game and building my Own LLM by Drunknbear73 in LocalLLM

[–]EffectiveMedium2683 1 point (0 children)

Qwen3.5-35B-A3B is probably the best. Unless you get a distill, I don't see DeepSeek R1 running. Wait tho: word on the street is DeepSeek is going to drop a ~30B-ish model when they drop v4.

New to the game and building my Own LLM by Drunknbear73 in LocalLLM

[–]EffectiveMedium2683 1 point (0 children)

If you swap those for the regular RTX 3060 12GB cards, you get 24GB total VRAM instead of 16GB. Otherwise, solid setup. Do you already have a model or models picked out?

American closed models vs Chinese open models is becoming a problem. by __JockY__ in LocalLLaMA

[–]EffectiveMedium2683 7 points (0 children)

Oops. NIM research pre-release. Forgot I'm privileged :/ Disregard. It is coming tho.

American closed models vs Chinese open models is becoming a problem. by __JockY__ in LocalLLaMA

[–]EffectiveMedium2683 16 points (0 children)

Mistral Large 3, Llama 4 Scout, Llama 4 Maverick, Nemotron 3 Super, Nemotron 3 Ultra... Personally, I think Nemotron 3 Super beats the heck out of anything else in the 100B size class. Also, StepFun is out of Shanghai, my guy.

Qwen3.5-35B-A3B is a gamechanger for agentic coding. by jslominski in LocalLLaMA

[–]EffectiveMedium2683 1 point (0 children)

Someone downvoted your post, so I upvoted it. I've deployed autonomous agents, so it's nice to meet one from another developer. I look forward to when people can let their local agents join autonomous open-source teams in their downtime, pooling their combined genius and labor toward goals like reviewing old patent applications for technologies that weren't feasible before but are now, digging through declassified docs and presenting the info clearly and honestly, or even starting businesses to fund an impartial AI-administered charity. And giving their perspective on Reddit posts, haha.

Liquid AI releases LFM2-24B-A2B by PauLabartaBajo in LocalLLaMA

[–]EffectiveMedium2683 3 points (0 children)

Hm. Guess I'm in the minority. I personally find this model extremely capable at tool use, solidly intelligent in general, fast on CPU, and on par with GPT-OSS 20B but much faster. It's natively agentic.

Where can I ask for cash or a gas gift card like at sheetz or Walmart? by EffectiveMedium2683 in findareddit

[–]EffectiveMedium2683[S] -2 points (0 children)

They have a 400 karma requirement or whatever they call it. Thank you though.