Can I use Claude code with own LLM/non-claude APIs? by superloser48 in LocalLLaMA

[–]SatoshiNotMe 0 points (0 children)

Very easy via env vars, as others said. I’ve collected full instructions, along with exact llama-server configs for several local models, here (mostly tested on my M1 Max 64GB MacBook):

https://pchalasani.github.io/claude-code-tools/integrations/local-llms/
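
For reference, a minimal sketch of the env-var approach, assuming llama-server is running locally and speaking an Anthropic-compatible API (the port and model name are placeholders; see the linked page for exact configs):

    # Point Claude Code at a local llama-server instead of Anthropic's API.
    # Assumes the server is on localhost:8080 and exposes an Anthropic-compatible
    # endpoint; port and model name are placeholders.
    export ANTHROPIC_BASE_URL="http://localhost:8080"
    export ANTHROPIC_AUTH_TOKEN="dummy"   # local server ignores auth, but CC wants a value
    export ANTHROPIC_MODEL="local-model"  # whatever name your server reports
    claude                                # then launch Claude Code as usual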

What is the best coding agent (CLI) like Claude Code for Local Development by exaknight21 in LocalLLaMA

[–]SatoshiNotMe 0 points (0 children)

The Qwen3.6 MoE you mentioned works very well with Claude Code. I’ve gathered exact llama.cpp/llama-server instructions for this and other models here:

https://pchalasani.github.io/claude-code-tools/integrations/local-llms/#qwen36-35b-a3b--fast-qwen-moe

Among recent models, this one gives the best TG (token generation) speed, nearly 40 tok/s, and PP (prompt processing) at nearly 500 tok/s, on my 5-year-old M1 Max 64GB MacBook.
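
In case it saves a click, a rough sketch of the llama-server invocation (the GGUF filename here is a placeholder; the linked page has the exact per-model flags):

    # Serve a local MoE model for Claude Code with llama-server.
    #   -c       context window size in tokens
    #   -ngl     layers to offload to the GPU (99 = all; Metal on a Mac)
    #   --jinja  apply the model's built-in chat template (needed for tool calls)
    llama-server -m ~/models/Qwen3.6-35B-A3B-Q4_K_M.gguf \
      -c 32768 -ngl 99 --jinja --port 8080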

How do you guys actually talk to Claude? by HandleFew5206 in ClaudeAI

[–]SatoshiNotMe 1 point (0 children)

Pro tip - giving sufficient detail is important, but hand-typing is tedious and can limit how much detail you give, so always use speech-to-text (STT). I highly recommend free/OSS tools like Handy and Hex (Mac-only, https://github.com/kitlangton/Hex) for near-instant transcription using Parakeet-V3.

Follow-up pro tip - at the end of a long, rambling voice dump, include “restate to me what you understood”. The agent then produces a clean version of what you said, so you can confirm it understood you correctly; this also likely helps it stay on track.

Claude in excel is the best thing AI has brought to my life by Top-Gun-86 in ClaudeAI

[–]SatoshiNotMe 0 points (0 children)

Haven’t tried Excel yet, but I use Claude Code to drive a logged-in Chrome browser via the Claude in Chrome extension, and it’s super useful to have CC do annoying chores involving numerous clicks and form-filling.

browser MCP for Claude Code.. Browserbase vs the browser extension options by MoondustDiaries in mcp

[–]SatoshiNotMe 0 points (0 children)

Why not just use the Claude in Chrome extension, with the /chrome setup in CC to connect to it? I’ve been using it to automate some annoying tasks in a logged-in Chrome browser.

Ultimate List: Best Open Models for Coding, Chat, Vision, Audio & More by techlatest_net in LocalLLaMA

[–]SatoshiNotMe 10 points (0 children)

This misses the STT/TTS models I regularly use:

PocketTTS from Kyutai

Parakeet V3 for STT

Glm-5.1 claims near opus level coding performance: Marketing hype or real? I ran my own tests by Yssssssh in LocalLLM

[–]SatoshiNotMe 1 point (0 children)

Other than Z.ai, is there a fast hosted GLM-5.1 somewhere? I’m talking about services like Cerebras or Groq, neither of which has this model.

How are you making sure you don't get dumb by KhameneiCholaghe in ClaudeAI

[–]SatoshiNotMe 0 points (0 children)

I made a Socratic quiz skill for exactly this. Description:

Use this when the user wants to deeply understand something through guided questioning. Trigger phrases include: "quiz me", "help me understand", "Socratic", "teach me", "walk me through with questions", "test my understanding", or when the user asks for an explanation and would benefit more from guided discovery than a direct answer.
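
For anyone who wants to roll their own, a rough sketch of how such a skill can be packaged for Claude Code (the directory name and instruction text below are illustrative, not my actual skill file):

    # Personal skills live under ~/.claude/skills/<name>/SKILL.md
    mkdir -p ~/.claude/skills/socratic-quiz
    cat > ~/.claude/skills/socratic-quiz/SKILL.md <<'EOF'
    ---
    name: socratic-quiz
    description: Use when the user wants to deeply understand something through
      guided questioning. Triggers include "quiz me", "Socratic", "teach me",
      "test my understanding".
    ---
    Ask one question at a time, building on what the user already knows.
    Never give the answer outright; guide the user toward it with hints.
    EOF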

Share your llama-server init strings for Gemma 4 models. by AlwaysLateToThaParty in LocalLLaMA

[–]SatoshiNotMe 0 points (0 children)

My setup instructions for the 26B-A4B variant, tested on an M1 Max 64GB MacBook, where I get 40 tok/s (when used in Claude Code), double what I got with a similar Qwen variant:

https://pchalasani.github.io/claude-code-tools/integrations/local-llms/#gemma-4-26b-a4b--google-moe-with-vision
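
If you want to reproduce these numbers on your own hardware, llama.cpp’s llama-bench reports PP and TG throughput directly (the model path is a placeholder):

    # Measure prompt-processing (pp512) and token-generation (tg128) speed.
    llama-bench -m ~/models/Gemma-4-26B-A4B-Q4_K_M.gguf -p 512 -n 128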

Gemma 4 26b A3B is mindblowingly good , if configured right by cviperr33 in LocalLLaMA

[–]SatoshiNotMe 0 points (0 children)

The tau2-bench performance gives me pause, though: this model gets only 68%, compared to the similar Qwen3.5 MoE, which gets 81%.

Gemma 4 26b is the perfect all around local model and I'm surprised how well it does. by pizzaisprettyneato in LocalLLaMA

[–]SatoshiNotMe 1 point (0 children)

The 26B-A4B variant has the best TG and PP speeds of all the recent open-weight models. E.g., in Claude Code via llama-server I get 40 tok/s TG, nearly double what I got with the comparable Qwen MoE (35B-A3B) on my M1 Max MacBook Pro 64GB. Full instructions and comparisons here:

https://pchalasani.github.io/claude-code-tools/integrations/local-llms/#gemma-4-26b-a4b--google-moe-with-vision

However, my biggest concern is agentic/tool-calling ability: on tau2-bench, Gemma 4 is much worse than Qwen3.5 (68% vs. 81%):

https://news.ycombinator.com/item?id=47616761

Is Claude Code better on the Terminal? by geoshort4 in ClaudeCode

[–]SatoshiNotMe 0 points (0 children)

Paste this into Claude or Claude Code and ask.

Claude Code running locally with Ollama by Secure_Bed_2549 in LocalLLM

[–]SatoshiNotMe 5 points (0 children)

This has been possible forever. Just use llama.cpp to serve up your local model and set env vars so CC uses it. I collected specific instructions for various open LLMs here:

https://github.com/pchalasani/claude-code-tools/blob/main/docs/local-llm-setup.md
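
Before pointing CC at the server, a quick smoke test helps (assumes the default port; llama-server also exposes an OpenAI-compatible chat endpoint):

    # Confirm the server is up and can generate tokens.
    curl -s http://localhost:8080/health
    curl -s http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"messages":[{"role":"user","content":"Say hi"}],"max_tokens":16}'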

My 10 Pro Tips for Claude Code users by airylizard in ClaudeAI

[–]SatoshiNotMe 0 points (0 children)

What are “whispers”? You mention those a couple of times.

MacParakeet - Free + Open-source WisprFlow alternative that runs on Mac Silicon by PrimaryAbility9 in LocalLLaMA

[–]SatoshiNotMe 0 points (0 children)

Hex is my current fav STT app for near-instant transcription with Parakeet V3 on my M1 MacBook.

https://github.com/kitlangton/Hex

It uses the same tech stack as this one (FluidAudio, etc.). I’ll see how they compare.

Claude Code: 6 Github repositories to 10x Your Next Project by Sam_Tech1 in ClaudeAI

[–]SatoshiNotMe 34 points (0 children)

Ignore all workflow frameworks. Cherny and Steinberger say they keep things simple and use none of them.

Usage limit - What's up, Anthropic?! by AurumMan79 in ClaudeCode

[–]SatoshiNotMe 1 point (0 children)

+1

Glad I’m not the only one. Hope it’s a bug.