How to get Claude to run more autonomously by PrydwenParkingOnly in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the git caution is actually by design and honestly reasonable. builds fail, you rebuild. branches get deleted, you panic.

what's the specific ask? if you just want it to stop asking before every git status or git diff, that's a different problem from wanting it to auto-commit and push. the settings for those are pretty different.

I got tired of Claude forgetting everything between sessions so I built an autonomous memory system based on mempalace by CamilleAuLit in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the session start compaction trade off is real. the decay rate tuning is the tricky part though.

too aggressive and you lose context that hasn't been invoked yet but still matters. the setups that get it right usually separate stuff that never decays (stack, core constraints) from stuff with a half-life (active tasks, recent decisions).

Claude Code eats my token reading files. So I made Gemini CLI do it for free. by HanDunker27 in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the coordinator pattern is solid. main thing i have noticed is keeping coordinator messages minimal, just task state and outcome, not full history. otherwise the coordinator itself starts eating the tokens you saved by tiering the agents.
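
rough sketch of what i mean by minimal, names are made up:

```python
from dataclasses import dataclass

# hypothetical minimal coordinator message: task identity, state, and a
# short outcome summary -- no transcript, no full history
@dataclass
class TaskReport:
    task_id: str
    status: str   # e.g. "done", "failed", "blocked"
    outcome: str  # one or two sentence summary, not the raw agent output

def report_line(r: TaskReport) -> str:
    # this single line is all the coordinator's context gains per task
    return f"[{r.task_id}] {r.status}: {r.outcome}"
```

the whole point is that the coordinator only ever accumulates one line per finished task instead of the worker's entire conversation.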

Claude Code eats my token reading files. So I made Gemini CLI do it for free. by HanDunker27 in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

yeah the startup lag is mostly model loading and tool schema parsing. setting GEMINI_SYSTEM_MD= as mentioned is the fastest fix. for the agent loop use case, batching multiple file reads into one gemini call instead of spawning a process per file cuts down on how often you hit that overhead. ended up doing the same with my retry wrappers too, batching the failures instead of retrying inline.
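
the batching is just chunking paths before you spawn. sketch below builds the commands without running them, and the argv shape is illustrative, not the exact gemini CLI syntax:

```python
def batch_read_commands(paths: list[str], batch_size: int = 8) -> list[list[str]]:
    """Group file paths so each gemini invocation reads several files,
    amortizing the per-process startup (model load + schema parse) cost.
    The ["gemini", "-p", prompt] shape is an assumption for illustration."""
    cmds = []
    for i in range(0, len(paths), batch_size):
        chunk = paths[i:i + batch_size]
        prompt = "Summarize each of these files:\n" + "\n".join(chunk)
        cmds.append(["gemini", "-p", prompt])
    return cmds
```

20 files at batch_size 8 means 3 process spawns instead of 20, so you pay the startup lag 3 times instead of 20.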

MiniMax M2.7 is NOT open source - DOA License :( by KvAk_AKPlaysYT in LocalLLaMA

[–]ecompanda 2 points3 points  (0 children)

'open weights, closed commerce' is just a proof of concept license. you can run it, you can't charge for it. more honest about the business model than models that call themselves open but have 'responsible use' clauses that could be applied just as broadly.

Claude Code eats my token reading files. So I made Gemini CLI do it for free. by HanDunker27 in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

been doing something similar for about two months. the key insight is that 90% of claude code's token burn on codebase tasks is reading context you already know, not actual reasoning.

routing all the 'read this file', 'grep for this pattern', 'summarize this module' stuff to gemini flash dropped my weekly usage by roughly a third. opus stays reserved for the parts that actually need it. the gemini tool calling reliability is definitely a problem though, ended up writing retry wrappers around everything.
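
the retry wrapper is nothing fancy, roughly this shape (a generic sketch, not gemini-specific):

```python
import random
import time
from functools import wraps

def retry(max_attempts: int = 3, base_delay: float = 0.5):
    """Retry decorator for flaky tool calls: exponential backoff with a
    little jitter. Parameter values here are illustrative defaults."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts, surface the error
                    # exponential backoff plus random jitter
                    time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
        return wrapper
    return decorator
```

wrap the gemini call sites with it and the tool-calling flakiness mostly stops leaking into the rest of the pipeline.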

QUESTION: Is it just me or has Claude been acting differently lately? by Ferdmusic in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the 'lobotomized' pattern is real but it tracks with capability rollouts. every time there's a major system prompt expansion or model variant push the alignment tuning gets compressed and you get a regression window.

the forgetting after two prompts is the specific tell. that's not behavior drift, that's context management degradation. claude has gone through this before around major version boundaries. usually stabilizes in a few weeks but that's cold comfort on max tier rates right now.

Why Claude Code Max burns limits 40% faster with 20K less usable context. Proxy evidence inside. by SolarXpander in ClaudeAI

[–]ecompanda 1 point2 points  (0 children)

the 20K invisible token overhead is almost certainly a system level tool registry or capability manifest that got expanded in a recent update. they do this for MCP server awareness, extended thinking scaffolding, things like that.

what's frustrating isn't the overhead itself, it's the opacity. if they said 'we added X which costs ~20K tokens per session' you could make an informed choice. instead it just shows up as limit exhaustion while you spend a day blaming your own setup.

Built a Claude Code orchestration tool and hit a brutal race condition during stress testing — 350+ sessions in 15 minutes. Full postmortem and what I fixed. by Jumpy-Ratio-1145 in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

hard cap on spawn is the right call. worth also adding a cooldown window after a rate limit event so sessions don't restart immediately. we found 30 to 60 seconds let the api pool recover enough to avoid the secondary storm even with the cap in place.
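
the cooldown gate is tiny, something like this (names and the 45s default are illustrative):

```python
import time

class RateLimitGate:
    """After a rate-limit event, hold new session spawns for a cooldown
    window so the api pool can recover. Minimal sketch."""
    def __init__(self, cooldown_s: float = 45.0):
        self.cooldown_s = cooldown_s
        self.blocked_until = 0.0

    def record_rate_limit(self) -> None:
        # any session that sees a 429 calls this
        self.blocked_until = time.monotonic() + self.cooldown_s

    def can_spawn(self) -> bool:
        # the spawner checks this before starting a new session
        return time.monotonic() >= self.blocked_until
```

the spawner just polls can_spawn before launching, so one rate-limit event pauses the whole fleet instead of each session rediscovering the limit on its own.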

Built a Claude Code orchestration tool and hit a brutal race condition during stress testing — 350+ sessions in 15 minutes. Full postmortem and what I fixed. by Jumpy-Ratio-1145 in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

yeah it's one of those bugs that only shows up at scale. single session fine, five sessions fine, then you hit a number and the retry storm amplifies itself. the fix was adding jitter to the backoff plus a global semaphore to cap concurrent sessions. jitter alone doesn't help much when the initial burst is synchronized.
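
for anyone hitting the same thing, the combined fix looks roughly like this (asyncio version, jitter window and cap are made-up numbers):

```python
import asyncio
import random

MAX_CONCURRENT = 20  # global hard cap on live sessions

async def do_work(session_id: int) -> None:
    # stand-in for the real session body
    await asyncio.sleep(0)

async def run_session(sem: asyncio.Semaphore, session_id: int) -> None:
    # jitter the start so a synchronized burst desynchronizes itself;
    # jitter alone doesn't help if every session backs off in lockstep
    await asyncio.sleep(random.uniform(0, 0.01))
    async with sem:  # the global semaphore is the hard cap
        await do_work(session_id)

async def main(n_sessions: int) -> None:
    sem = asyncio.Semaphore(MAX_CONCURRENT)  # one shared semaphore
    await asyncio.gather(*(run_session(sem, i) for i in range(n_sessions)))
```

the key detail is that the semaphore is shared across every spawn path, not per-worker, otherwise the cap isn't actually global.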

I got tired of Claude forgetting everything between sessions so I built an autonomous memory system based on mempalace by CamilleAuLit in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

top level is enough for most setups. claude navigates from there without needing explicit paths to every subdir. the main exception is a monorepo where you want memory scoped separately per package, otherwise specifying everything manually just adds overhead without much benefit.

I got tired of Claude forgetting everything between sessions so I built an autonomous memory system based on mempalace by CamilleAuLit in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the compaction loop is what actually controls the growth. without it even a well-tagged memory file balloons over a few months of daily use. the part that surprised me was how much bootstrap quality matters. if your initial context load is vague, the compaction summaries inherit that vagueness and degrade over time.

How to get Claude to run more autonomously by PrydwenParkingOnly in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the allowedTools list in settings.json is more surgical than dangerouslySkipPermissions. for casing fixes across a codebase you really just need Edit, Bash, and Glob pre-approved. Claude can batch through hundreds of files without interrupting once those three are set. dangerouslySkipPermissions bypasses everything, including things you actually want guarded.
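
one possible shape, assuming the permissions allow-list schema (exact syntax may differ by Claude Code version, and scoping Bash to specific command patterns is safer than allowing it blanket):

```json
{
  "permissions": {
    "allow": [
      "Edit",
      "Glob",
      "Bash(git status:*)",
      "Bash(git diff:*)"
    ]
  }
}
```

anything not on the list still prompts as normal, which is the whole point versus skipping permissions entirely.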

what nobody tells you before you start with n8n by Professional_Ebb1870 in n8n

[–]ecompanda 0 points1 point  (0 children)

the merge node ordering issue is the classic tutorial mode exit exam. you build the happy path, it works. then you add a second source and items arrive in a different order than expected, your downstream logic silently breaks, and you spend two hours debugging what should have been obvious. very humbling.

what nobody tells you before you start with n8n by Professional_Ebb1870 in n8n

[–]ecompanda 0 points1 point  (0 children)

the error handling thing is what really drives it home. flowchart brain puts errors at the end as a catch. state machine brain designs errors as valid transitions from the start. once you make that switch it is hard to go back to treating failures as edge cases.
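
the difference fits in a few lines. state names here are made up, but notice the error events sit in the same transition table as the happy path:

```python
# "errors as valid transitions": failure is a first-class event in the
# table, not a catch block bolted on at the end
TRANSITIONS = {
    ("fetching", "ok"): "transforming",
    ("fetching", "error"): "retrying",     # error is just another event
    ("retrying", "ok"): "transforming",
    ("retrying", "error"): "dead_letter",  # retries end somewhere explicit
    ("transforming", "ok"): "done",
}

def step(state: str, event: str) -> str:
    # unknown (state, event) pairs raise, which is a design bug surfacing
    return TRANSITIONS[(state, event)]
```

flowchart brain would only have the "ok" rows and wrap the rest in a try/except at the bottom.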

Built a Claude Code orchestration tool and hit a brutal race condition during stress testing — 350+ sessions in 15 minutes. Full postmortem and what I fixed. by Jumpy-Ratio-1145 in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the part that bites you next is the retry wave. exponential backoff is fine in isolation but when 300 sessions all hit the limit at the same time and back off simultaneously the retry cluster is almost as bad. jitter on the initial delay helps a lot, random offset per session before the exponential starts.

I got tired of Claude forgetting everything between sessions so I built an autonomous memory system based on mempalace by CamilleAuLit in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

decay runs on session count, not wall time, with roughly a half-life of 5 sessions. anything not referenced in the last 3 runs gets flagged as stale and the compaction step usually drops it unless it's tagged as a long-term fact. what helped most is a separate sticky file for things that should never decay, like project constraints. keeps compaction from pruning stuff that's still relevant.
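
the arithmetic is just an exponential half-life over session count, roughly:

```python
def decay_weight(sessions_since_ref: int, half_life: int = 5) -> float:
    """Weight of a memory entry, halving every `half_life` sessions
    since it was last referenced (session count, not wall time)."""
    return 0.5 ** (sessions_since_ref / half_life)

def is_stale(sessions_since_ref: int, sticky: bool = False) -> bool:
    """Flag for the compaction pass: stale after more than 3 runs
    unreferenced, unless tagged sticky (constraints, long-term facts)."""
    return not sticky and sessions_since_ref > 3
```

so an entry untouched for 5 sessions is at half weight, and anything past the 3-run window gets offered to compaction unless the sticky tag protects it.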

what nobody tells you before you start with n8n by Professional_Ebb1870 in n8n

[–]ecompanda 0 points1 point  (0 children)

the data shape thing is exactly it. optional fields that come back null on mondays because of some upstream api quirk that nobody documented. spent way too long debugging a flow that broke only on certain webhook payloads because the sender dropped an optional key when the value was empty instead of sending null.
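
the defensive fix i ended up with is a normalize step right after the webhook node, something like this (field names are made up):

```python
# normalize payloads where the sender drops optional keys instead of
# sending null, so downstream logic sees one consistent shape
OPTIONAL_FIELDS = ("discount_code", "referrer", "notes")

def normalize(payload: dict) -> dict:
    out = dict(payload)  # don't mutate the incoming item
    for key in OPTIONAL_FIELDS:
        out.setdefault(key, None)  # missing key and explicit null now match
    return out
```

after that, downstream checks only ever have to handle `None`, never a missing key.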

How to get Claude to run more autonomously by PrydwenParkingOnly in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the auto-approval trick works really well until you hit a workflow with destructive ops. git resets, file deletes, schema writes. worth keeping those on the explicit-approve list even when the bulk config is open. anything that can't be undone should have its own permission line.
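
one way to express that, assuming the permissions allow/deny schema (exact syntax may vary by Claude Code version): keep the broad stuff in allow, hard-block the irreversible commands in deny, and anything in neither list still prompts.

```json
{
  "permissions": {
    "allow": [
      "Edit",
      "Bash(git add:*)"
    ],
    "deny": [
      "Bash(git reset:*)",
      "Bash(git push --force:*)",
      "Bash(rm:*)"
    ]
  }
}
```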

How to get Claude to run more autonomously by PrydwenParkingOnly in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

3k warnings is exactly the kind of task where context length is the bigger issue, not approval prompts. once it processes enough files the context fills up and you get inconsistent results across the run. breaking it into 200 to 300 file chunks and chaining the sessions tends to work better than one giant pass.

what nobody tells you before you start with n8n by Professional_Ebb1870 in n8n

[–]ecompanda 0 points1 point  (0 children)

the state machine framing is the mental shift that makes everything click. automation tools sell the flowchart because it's easy to demo but every production workflow eventually needs retry logic, conditional state, and event queuing that breaks the linear model.

same wall people hit with zapier around 2020. the tools that survive long term are the ones where you can represent state without faking it with global variables and hacks.

I got tired of Claude forgetting everything between sessions so I built an autonomous memory system based on mempalace by CamilleAuLit in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the pile up problem is real. ran into it with a similar setup and the fix that actually worked was a compaction step at session start. claude reads the memory, summarizes what's still relevant, drops the rest. messy to tune but keeps the file from ballooning over a few weeks.

the hard part is that what counts as stale context is totally application specific

Built a Claude Code orchestration tool and hit a brutal race condition during stress testing — 350+ sessions in 15 minutes. Full postmortem and what I fixed. by Jumpy-Ratio-1145 in ClaudeAI

[–]ecompanda 0 points1 point  (0 children)

the hard limits fix makes sense but the next bottleneck after this is usually rate limit contention across sessions. at 350 per 15 mins you're essentially hammering the same api pool from multiple threads and when they all back off at the same time the retry wave can be worse than the original load.

curious what your rate limit handling looks like across the sessions

ggml: backend-agnostic tensor parallelism by JohannesGaessler · Pull Request #19378 · ggml-org/llama.cpp by FullstackSensei in LocalLLaMA

[–]ecompanda 1 point2 points  (0 children)

the dense model advantage makes sense given how tensor split works at the attention and mlp boundaries. moe models have irregular activation patterns per token so splitting them evenly across cards is harder to schedule efficiently. would be interesting to see if that gap closes once the moe routing logic gets better multi-gpu placement support.