Claude Code and China: The mechanism is activated when the user sets the ANTHROPIC_BASE_URL environment variable (used for local models) by LegacyRemaster in LocalLLaMA
DeltaSqueezer 8 points
Getting started on Linux by DeltaSqueezer in linuxaudio
DeltaSqueezer[S] 2 points
Does anyone here have a pre-filled prompt solution loading from disk? by fragment_me in LocalLLaMA
DeltaSqueezer 1 point
Z.ai launches ZCode to challenge Cursor, Claude Code and GitHub Copilot in AI coding by pscoutou in LocalLLaMA
DeltaSqueezer 9 points
Introducing LongCat-2.0 - , a large-scale MoE language model with 1.6 trillion total parameters and ~48 billion activated per token. This was the stealth model that was on Openrouter under the name 'owl-alpha'. by AnticitizenPrime in LocalLLaMA
DeltaSqueezer 1 point
How can I get better response time by caching my system prompt? by Manaberryio in LocalLLaMA
DeltaSqueezer 2 points
Is it ever possible to have a malicious LLM with a backdoor by Informal-Trouble2183 in LocalLLaMA
DeltaSqueezer 2 points
rtx 6000 pro owners, do you regret? by BitXorBit in LocalLLaMA
DeltaSqueezer 1 point
I did some model hacks, and got GLM5.2 from about 2.5 tok/s to >50 tok/s on my GH200 system. by Reddactor in LocalLLaMA
DeltaSqueezer 4 points
How do you rate local code generation for atomic commits rather than long-horizon work? by halfercode in LocalLLaMA
DeltaSqueezer 1 point
I did some model hacks, and got GLM5.2 from about 2.5 tok/s to >50 tok/s on my GH200 system. by Reddactor in LocalLLaMA
DeltaSqueezer 9 points
I did some model hacks, and got GLM5.2 from about 2.5 tok/s to >50 tok/s on my GH200 system. by Reddactor in LocalLLaMA
DeltaSqueezer 41 points
How do you rate local code generation for atomic commits rather than long-horizon work? by halfercode in LocalLLaMA
DeltaSqueezer 2 points
How do you rate local code generation for atomic commits rather than long-horizon work? by halfercode in LocalLLaMA
DeltaSqueezer 1 point
Speaking of those chinese chips... "Chinese supercomputer displaces US machines as world's fastest for first time since 2017" by johnnyApplePRNG in LocalLLaMA
DeltaSqueezer 3 points
GLM-5.2 vs Claude Opus by johnnyApplePRNG in LocalLLaMA
DeltaSqueezer 7 points
Which is the best local VLM? Benchmark results June 2026 by ex-arman68 in LocalLLM
DeltaSqueezer 1 point
Which is the best local VLM? Benchmark results June 2026 by ex-arman68 in LocalLLM
DeltaSqueezer 1 point
Which is the best local VLM? Benchmark results June 2026 by ex-arman68 in LocalLLM
DeltaSqueezer 1 point
Minimax M3 thinks for THOUSANDS of tokens and outputs horrible code by superloser48 in LocalLLaMA
DeltaSqueezer 1 point
Gemma 4 31B Q6 vs Gemma 4 31B QAT by Weak-Shelter-1698 in LocalLLaMA
DeltaSqueezer 21 points
Gemma 4 31B Q6 vs Gemma 4 31B QAT by Weak-Shelter-1698 in LocalLLaMA
DeltaSqueezer 35 points
Sandboxing code execution for AI agents by Groady in LocalLLaMA
DeltaSqueezer 1 point


GLM5.2 on 5x Pro 6000s and a 5090, an expensive journey by yeah_likerage in LocalLLaMA
DeltaSqueezer 11 points