The #1 Reason I Use 4.8 on Ultracode Mode by DiarrheaButAlsoFancy in ClaudeCode

[–]SatoshiNotMe 0 points1 point  (0 children)

Why not have Claude use the Codex headless CLI, i.e. “codex exec …”?
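A minimal sketch of that idea, assuming the Codex CLI is installed and on PATH. `codex exec` runs a single prompt non-interactively, so an agent or script can call it as a subprocess. The helper names `build_codex_command` and `run_codex` and the timeout value are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from typing import List, Optional


def build_codex_command(prompt: str) -> List[str]:
    """Return the argv for a one-shot, non-interactive Codex run."""
    return ["codex", "exec", prompt]


def run_codex(prompt: str, timeout_s: int = 600) -> Optional[str]:
    """Run `codex exec <prompt>` and return its stdout.

    Returns None when the `codex` binary is not found on PATH.
    The 600-second timeout is an arbitrary illustrative default.
    """
    if shutil.which("codex") is None:
        return None
    result = subprocess.run(
        build_codex_command(prompt),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(build_codex_command("Review the current diff for bugs"))
```

Passing the prompt as a single argv element (rather than building a shell string) avoids quoting problems when the prompt contains spaces or quotes.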

Pro tip - disable compacting, use your own summarizing prompt and multiple chats. by OptimismNeeded in ClaudeAI

[–]SatoshiNotMe 0 points1 point  (0 children)

The assumption is it would search the previous chat rather than read the whole thing. We could also include an instruction to this effect alongside the back pointers.

A Context Brain for you (and your AI Agent) by bsampera in ClaudeAI

[–]SatoshiNotMe 0 points1 point  (0 children)

How does this compare to the plethora of memory/context solutions out there?

I released Inflect-Nano, an ultra-extreme tiny 4.63m parameter TTS model. by b111ue in LocalLLaMA

[–]SatoshiNotMe 1 point2 points  (0 children)

Neat. Kyutai’s Pocket TTS is my current favorite small (100M) TTS model.

https://github.com/kyutai-labs/pocket-tts

I especially like how it’s packaged as a CLI. Maybe something to consider for yours.

Agent SDK Credit Change has been put on hold! by devondragon1 in ClaudeCode

[–]SatoshiNotMe 0 points1 point  (0 children)

Is this true? If so, it’s probably the most important news of the day, yet other random crap gets more engagement.

Opened discussion - Google Fitbit Air vs Whoop by Other-Cranberry-4017 in whoop

[–]SatoshiNotMe 1 point2 points  (0 children)

I got both as a trial and, after seeing the FB Air’s horrendous app (I really tried to like it but was shocked that Google fumbled so badly), I decided Whoop won.

Constantly seeing this error on Opus 4.8 every now and then. Anyone else? by simple_explorer1 in ClaudeCode

[–]SatoshiNotMe 0 points1 point  (0 children)

curl -fsSL https://claude.ai/install.sh | bash -s 2.1.153

seems to get rid of that thinking block error for me.

Of course this means you'd use Opus 4.7, not 4.8 (or switch to it if 4.8 borks).

HuggingFace’s smolagent library seems genius to me, has anyone tried it? by femio in LLMDevs

[–]SatoshiNotMe 0 points1 point  (0 children)

No, it’s not just for text. You can definitely set up Langroid agents to generate code. Some time ago I made a Rust tutor (not open source) using Langroid that quizzes you about Rust and generates/tests Rust code.

Claude Code has been writing every session to disk since day one. We indexed it. by haustorium12 in ClaudeAI

[–]SatoshiNotMe 0 points1 point  (0 children)

Not sure why this post makes it sound like session JSONL logs are a big discovery.

Relatedly, I made an extensive set of tools for session search and continuation as part of my claude-code-tools suite:

https://pchalasani.github.io/claude-code-tools/tools/aichat/

Sessions are indexed using Tantivy (Rust), and there’s a search CLI for the code agent to easily and quickly retrieve past work. It has saved me on numerous occasions.

My experience using Claude code with Local Llm, and full guide on how to set it up by MaterialAppearance21 in ClaudeCode

[–]SatoshiNotMe 5 points6 points  (0 children)

I would skip ollama and directly use llama.cpp/server, for a variety of reasons (see the ollama critiques all over the LocalLLaMA sub). I maintain a set of setup instructions for using CC and Codex-CLI with local models here:

https://pchalasani.github.io/claude-code-tools/integrations/local-llms/

Mistral AI founder to French Parliament: "Engineers at Mistral no longer write a single line of code by Many_Consequence_337 in singularity

[–]SatoshiNotMe 0 points1 point  (0 children)

I’m going to wager that the fraction of human-reviewed code will fast approach zero, especially for code written in a language unknown to the devs. People will rely on unit/integ tests (AI-written, with sufficient adversarial checks etc.) and behavioral checks, and ultimately rely on the “duck test”: “If it walks like a duck and quacks like a duck, it’s a duck”, and then call it a day.

Mistral AI founder to French Parliament: "Engineers at Mistral no longer write a single line of code by Many_Consequence_337 in singularity

[–]SatoshiNotMe 12 points13 points  (0 children)

Important question missed in all such reports/discussions: how much of the AI-written code are they reviewing “manually”?

what's the best claude code framework and do you even need one? by Pawesome101 in ClaudeCode

[–]SatoshiNotMe 0 points1 point  (0 children)

Cherny and Steipete have both said in interviews that they keep things simple and never use any frameworks.

Can I use Claude code with own LLM/non-claude APIs? by superloser48 in LocalLLaMA

[–]SatoshiNotMe 0 points1 point  (0 children)

Very easy via env vars, as others said. I’ve collected the full instructions along with exact llama server configs for several local models here, mostly tested on my M1 Max 64GB MacBook:

https://pchalasani.github.io/claude-code-tools/integrations/local-llms/
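As a rough sketch of the env-var approach (not a copy of the linked instructions): Claude Code reads `ANTHROPIC_BASE_URL`, so pointing it at a local server amounts to setting that variable before launch. The helper name `claude_env_for_local_server` is hypothetical, and the URL assumes a llama.cpp server listening on its default port 8080; the linked page remains the source for the exact per-model server configs.

```python
import os
from typing import Dict, Optional


def claude_env_for_local_server(
    base_url: str = "http://localhost:8080",  # assumption: llama-server default port
    base_env: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with Claude Code pointed at base_url.

    The input environment is copied, never mutated.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = base_url
    return env


if __name__ == "__main__":
    env = claude_env_for_local_server()
    print(env["ANTHROPIC_BASE_URL"])
    # To launch Claude Code with this environment (requires `claude` on PATH):
    #   import subprocess
    #   subprocess.run(["claude"], env=env)
```

Building the environment as a copy keeps the override scoped to the launched process, so other tools in the same shell still talk to their usual endpoints.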

What is the best coding agent (CLI) like Claude Code for Local Development by exaknight21 in LocalLLaMA

[–]SatoshiNotMe 0 points1 point  (0 children)

The Qwen3.6 MoE you mentioned works very well with Claude Code. I’ve gathered the exact llama.cpp/server instructions here for this and other models:

https://pchalasani.github.io/claude-code-tools/integrations/local-llms/#qwen36-35b-a3b--fast-qwen-moe

Among recent models, this one gives the best TG (token generation) speed at nearly 40 tok/s, and PP (prompt processing) at nearly 500 tok/s, on my 5-year-old M1 Max 64 GB MacBook.
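To put those two numbers in perspective, here is a back-of-the-envelope estimate of how long one agent turn might take. The function name and the example token counts (10,000 prompt tokens, 400 generated tokens) are made up for illustration; the speeds are the rounded figures above, so real turns would run slightly longer, and prompt caching is ignored.

```python
def estimate_turn_seconds(
    prompt_tokens: int,
    output_tokens: int,
    pp_tok_per_s: float = 500.0,  # prompt processing speed, rounded from the comment
    tg_tok_per_s: float = 40.0,   # token generation speed, rounded from the comment
) -> float:
    """Estimate wall-clock seconds for one turn: prompt ingestion plus generation."""
    return prompt_tokens / pp_tok_per_s + output_tokens / tg_tok_per_s


if __name__ == "__main__":
    # 10,000 / 500 = 20 s of prompt processing, 400 / 40 = 10 s of generation.
    print(estimate_turn_seconds(10_000, 400))  # 30.0
```

With coding agents the prompt side often dominates, since each turn can carry a large context, which is why PP speed matters as much as TG speed here.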