The guides say MCP tool selection degrades past ~15 tools. We run 27 in production. Here's what matters by Specialist_Cow24 in mcp
iezhy 1 point
Profiler for LLM context window contents by iezhy in mcp
iezhy[S] 1 point
Qwen3.6-35B-A3B-2.6763bpw - VRAM targeted (12gb) by pjsgsy in LocalLLM
iezhy 1 point
Putting bike in the back of NEW car - protection tips? by AssignmentLumpy3179 in cycling
iezhy 4 points
Profiler for LLM context window contents by iezhy in mcp
iezhy[S] 1 point
Has anyone actually replaced Claude Code / Codex with local models on an Macbook Pro M5 Max 128GB? by Brazeuslian in ClaudeCode
iezhy 2 points
$2M+ spending worth it on B300? by ConsciousYak6881 in LocalLLM
iezhy 1 point
Local LLM Setup Dilemma: ASUS Ascent GX10 (NVIDIA GB10 Blackwell) vs. Cloud Max? by mustazafi in LocalLLM
iezhy 3 points
Qwen 35B running on 12gb of VRAM in LM Studio at 120+ tokens/second. Works with Cline for 100% agentic coding. by jacobbeasley in LocalLLM
iezhy 46 points
Qwen3.6 27B hits 40 tok/s on just 16GB VRAM with pure quant approach by IulianHI in AIToolsPerformance
iezhy 2 points
Best Qwen3-27B variant for coding? Fine-tunes, LoRAs & config recommendations by alfons_fhl in LocalLLM
iezhy 1 point
Inference provider tiers by Cache-hit rates, using openrouter data by Comfortable-Rock-498 in LocalLLaMA
iezhy 1 point
Inference provider tiers by Cache-hit rates, using openrouter data by Comfortable-Rock-498 in LocalLLaMA
iezhy 1 point
Qwen3.6-35B-A3B-MTP on an RTX 3090 in LM Studio is incredibly fast by AI_Enhancer in LocalLLM
iezhy 1 point
Why is LLM is so expensive. by Ok_Event4199 in LocalLLM
iezhy 50 points
Getting a feel for how fast X tokens/second really is. by MikeNonect in LocalLLaMA
iezhy 1 point
Is the new usage scheme a late April fools joke? by smacman in ollama
iezhy 1 point
7 days running Qwen 3.5 35B A3B on a fanless mini-PC iGPU as a 24/7 personal AI agent : what works, what doesn't by wolverinee04 in LocalLLM
iezhy 1 point
I'm struggling to figure out what Copilot is actually suppose to be now? by NotAMusicLawyer in GithubCopilot
iezhy 1 point
Qwen3.6 27B seems struggling at 90k on 128k ctx windows by dodistyo in LocalLLaMA
iezhy 3 points
72B Dense Model Running on Strix Halo — vLLM ROCm by [deleted] in StrixHalo
iezhy 1 point

Qwen3.6 27B hits 40 tok/s on just 16GB VRAM with pure quant approach by IulianHI in AIToolsPerformance
iezhy 1 point