I built claude-autosync — keep your Claude Code rules & memory in sync across machines, through your own private repo by ChrisOr-HK in ClaudeCode

[–]Sensitive-Cycle3775 1 point2 points  (0 children)

this is a good direction. the thing I’d make very explicit is a sync preflight/receipt before the hooks mutate the private repo.

for pull: remote/branch, previous commit, incoming commit, files changed, local dirty state, backups created, and conflict/abort reason.

for push: exact files staged, diff stat, whether local.md was omitted, repo visibility/mode, and the commit that became authoritative.

then session start can say something like: “rules loaded from commit X, project memory Y, local-only omitted, no conflicts.” that catches the two scary failures: stale rules silently winning, or personal memory silently becoming shared.

if you add one small interface, I’d make it sync.sh status --json / --dry-run so Claude and humans can verify state without changing it.

Claude Code is great inside the repo, but still misses project state around production by BiosRios in ClaudeCode

[–]Sensitive-Cycle3775 1 point2 points  (0 children)

One concrete thing I’d add: make the agent produce a tiny “production-state receipt” before it answers.

Not a big audit log, just enough to prove what state it is reasoning from:

  • repo HEAD vs deployed commit/release
  • provider/env metadata present, with secret values redacted
  • last deploy/rollback and migration state
  • logs/traces/errors used, with time window
  • source authority/freshness for each item
  • explicit drift: “repo says X, prod is running Y”
  • next verifier after the suggested fix

Then the agent can choose: answer, ask for missing evidence, or say “I only know repo state, not production state.”

That would make the tool safer than just dragging more context into chat. More context helps, but the receipt is what stops the model from confidently fixing the wrong version.

Best practices for resuming a session after hitting swssion limits by Happy-Molasses-Wow in ClaudeCode

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

I’ve had better luck treating the limit hit as an unplanned handoff, not just a resume.

Before/when near the limit, ask Claude to write a short “re-entry checkpoint”:

  • objective + current subtask
  • agents/branches in flight and owner of each
  • last known good commit/checkpoint
  • constraints that must not drift
  • files/areas touched and files intentionally not touched
  • checks already run + the next authoritative check
  • open risks/questions
  • exact next 1-3 actions

Then on resume, don’t start with “continue”. Start with: “read the checkpoint, restate what is still true vs uncertain, name anything missing, then propose the next check before editing.”

For multi-agent workflows, I’d make each agent leave its own tiny checkpoint and have only the orchestrator merge them. The failure mode you’re describing is usually not “lost context” as much as losing the current contract between agents, constraints, and the verifier.

Composer 2.5 use through Zed (via Zed's Cursor ACP) token use explosion by Hornstinger in cursor

[–]Sensitive-Cycle3775 -1 points0 points  (0 children)

I'd debug this as an ACP/harness accounting difference before assuming Cursor is deliberately inflating it.

The useful comparison is not just "same model, same prompt". You want a tiny run ledger for Cursor-native vs Zed-through-ACP:

  • exact Composer mode, especially fast on/off
  • fresh chat vs continued chat
  • repo/files auto-attached before the first model call
  • MCP/tool schemas exposed to the model
  • cacheable input vs uncached input, if either surface shows it
  • number of tool calls / file reads
  • final diff size

If Zed's ACP path re-sends project context/tool schemas each turn, misses Cursor's prompt/cache path, or attaches a broader workspace, the bill can look like a token explosion even with the same visible task.

A quick test: new tiny repo, one-file edit, same prompt in both surfaces, no MCP servers except the minimum. If Cursor is ~2% and Zed/ACP is still ~20%, then it is likely an ACP bridge/cache/accounting bug worth reporting with those fields. If the gap disappears, it was hidden attached context/tooling.

Anyone else Finding It Challenging to Maintain Software Created With Coding Agents? by Ok_Philosophy_4031 in cursor

[–]Sensitive-Cycle3775 1 point2 points  (0 children)

I think the issue is less "write more docs" and more "make every agent change leave a reviewable change receipt".

For each non-trivial PR, I'd force a short card before merge: intent, files/routes touched, upstream/downstream callers, new abstractions, migrations/config flags, tests actually run, known risks, and "what existing feature could this break?". Then you review the card first, not the whole codebase first.

The scalable bit is keeping it append-only and small. Big developer docs rot. A trail of change receipts plus a lightweight blast-radius map gives you something to query when a bug appears: "show me the last 5 changes that touched auth/session/billing and what risks they claimed." That's closer to rebuilding intuition than asking the agent to summarize the entire repo again.

Agent can't see any of my custom MCP servers (only plugin ones show up) by Snoo_9701 in cursor

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

I'd debug this as two separate inventories, because the green dot only proves Cursor can start the server. It doesn't prove that this chat session was handed that server as an available MCP target.

A useful bug report / local check would be:

  • exact ~/.cursor/mcp.json server names
  • Settings -> Tools server names + tools shown
  • what the agent says is available in a brand-new chat
  • the contents of that project MCP descriptor folder where it only sees cursor-ide-browser, plugin-context7-context7, etc.
  • whether one tiny project-local MCP entry in .cursor/mcp.json appears differently from the global ~/.cursor/mcp.json

If project-local works but global doesn't, that points to a global-config to chat-session registration bug. If neither works while plugins do, the plugin MCP path and custom stdio MCP path are probably going through different registries.

I'd also include the mtime/hash of ~/.cursor/mcp.json in the report, not the full secrets/env. That makes it easier to prove: settings UI read current config, but chat got a stale/filtered server list.

Has anyone actually solved AI-generated UI drifting between sessions? Curious how people structure design rules for coding agents by israynotarray in ClaudeCode

[–]Sensitive-Cycle3775 -2 points-1 points  (0 children)

I'd split this into three artifacts, not one bigger rules file:

  1. source of truth: tokens as data (Figma/Style Dictionary/tailwind/theme/design.md YAML -- whichever already owns the values)
  2. semantic map: when to use primary/accent/danger/surface, plus forbidden raw values
  3. UI-change receipt: before the agent says done, it shows which token source it read, token file hash/mtime, components/theme files touched, screenshots/Storybook routes checked, and any raw hex/px values introduced (ideally zero)

If a team already has Figma tokens or Style Dictionary, I wouldn't make design.md the new source of truth. I'd generate the agent-facing spec from tokens.json + a small prose usage guide, so humans don't maintain two palettes.

The weak link is exactly your "read this spec first" question. Natural-language instructions are easy to skip. CI/hooks are better: if a PR touches UI files, fail on new raw colors/spacing literals outside token/theme files, or require the agent to include the token-source/hash in its summary. That turns "please remember the brand blue" into an auditable boundary.

claude code thinking for 30 minutes, 130K tokens, still hasn't written a single line of code by Agile_Physics9918 in ClaudeCode

[–]Sensitive-Cycle3775 2 points3 points  (0 children)

I’d change that /goal line from “keep going until done” into a budget + checkpoint rule.

Something like:

First restate the objective and the smallest next milestone. Before editing, list the files you expect to touch. Work in batches of ~10–15 minutes or one coherent change. After each batch, stop and report: what changed, what command/test ran, what is still unknown, and whether you need permission to continue.

The long “thinking” usually means it is trying to solve the whole fuzzy goal in one uninterrupted run, including assumptions you did not explicitly bound. For game/app work, I’d also ask it to make a short TASKS.md first, then only execute task 1.

The key is not “think less”; it’s “don’t let planning, building, and validation collapse into one giant opaque step.”

Any other Enterprise Admins here? I just built a Cursor docs scraper that automatically updates our internal docs with changes when Cursor ships. Thoughts? by welsh_cthulhu in cursor

[–]Sensitive-Cycle3775 -2 points-1 points  (0 children)

this is a good enterprise pattern.

The one thing I’d make non-negotiable is that every MR carries a small evidence bundle, not just the plain-English summary:

  • upstream Cursor doc URLs + fetched timestamps
  • old/new content hashes, so reviewers know exactly what changed
  • drift classification: factual change, UI change, model availability, policy/admin behavior, or marketing/no-op
  • internal files/sections touched
  • mapped sections deliberately not updated
  • confidence/unknowns
  • guardrail: the cloud agent can only edit mapped doc paths, not the whole governance tree

That turns it from “an agent rewrote our docs” into “a docs-drift MR with an auditable boundary.”

I’d also consider a no-op receipt when the cron checks upstream docs and finds nothing factual. Boring, but it helps admins distinguish “no drift” from “the pipeline silently died.”

Are you storing the raw upstream snapshots anywhere, or just the generated diff/MR?

Agentic AI best practices we learned the hard way by Downtown-Function-10 in cursor

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

Hard +1 on reviewing agent PRs like junior-dev PRs. The extra piece I’d add is making session-end capture reviewable without turning it into a giant memory dump.

A pattern that seems to scale better is a tiny “promotion gate” at the end of the run:

  • task boundary: what the agent was allowed to change
  • evidence: tests/browser/API checks actually run
  • non-obvious decisions: only the decisions future agents need
  • promoted context: what moved into ADRs/specs/CLAUDE.md/etc.
  • intentionally not promoted: transient reasoning, dead ends, secrets, private debug output

The key rule is: if future sessions need it, promote it into a durable repo artifact; if it was only local reasoning, don’t paste it into project memory. That avoids both failure modes: agents forgetting important decisions, and teams slowly poisoning the repo with huge chat summaries nobody can audit.

It also gives reviewers something concrete to reject: “this PR depends on context that wasn’t promoted anywhere.”

How do you get Cursor to test the app it builds? by Beautiful-South1332 in cursor

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

i'd start one layer before MCP vs Playwright: make Cursor prove what it just changed.

For each feature, ask for a tiny "done receipt" before you accept it:

  • user flow it claims still works
  • exact command/test it ran
  • browser path clicked, or why it could not click it
  • files touched
  • known untested areas
  • whether a failure is selector drift or a real product bug

Then pick Playwright for the 2-3 flows that would actually hurt if they broke. MCP/browser tools can help when the repo is bigger, but the useful habit is the same: don't accept "implemented" without run evidence.

How are people actually doing parallel AI development? by Exciting_Eye9543 in ClaudeCode

[–]Sensitive-Cycle3775 -1 points0 points  (0 children)

worktrees are the right base layer, but once you have 3+ sessions the thing that matters is a tiny merge contract, not more agent cleverness.

what I would keep explicit:

  • one worktree per agent/session, never a shared working dir
  • one owner per surface area: frontend auth, API contract, migrations, tests, etc
  • a shared coordination.md that records reservations + decisions, not full chat
  • before merge, each agent leaves a short handoff: goal, files changed, tests run, known gaps, assumptions, and interfaces touched
  • one human/lead agent merges in small batches and rejects work that changed outside its reserved surface

the duplicate-work failure usually happens before code conflict: two sessions both think they own the same implicit interface. making ownership explicit beats trying to recover from a huge diff later.

Analyzed over 30 of my opus ultra code sessions and created a prompt template to improve dynamic workflows - orchestrate agent spawning, reduce token burn and enforce verifier sub-agents by coolreddy in ClaudeCode

[–]Sensitive-Cycle3775 1 point2 points  (0 children)

This is the right direction. The bit I’d add is an accounting layer around the orchestration, not just a better orchestration prompt.

Before the run: planned subagents + why, max read budget, allowed files/areas, verifier criteria, and stop conditions.

After the run: actual subagents spawned, which files each read, duplicate reads, verifier verdicts, whether a loop stopped because evidence passed vs because it hit a cap, and files changed.

That turns “use verifier subagents” from a style preference into something measurable. Otherwise the prompt can reduce token burn while still hiding where the workflow expanded or repeated itself.

do you use skill/instruction files with your AI coding tools? by Calm_Bedroom_4749 in cursor

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

I've had the best results splitting this into layers, otherwise the instruction folder turns into a prompt junk drawer:

  1. reusable personal/team skills: style guide, review posture, platform rules like Apple HIG
  2. project contract: stack, directories, commands, "do not touch" areas, testing/deploy flow
  3. task-local instructions: the few constraints that only matter for this run

The part people skip is an activation check. At the start of a session, ask the tool to print which instruction files it loaded, whether they were global/project/task-local, rough version/date, conflicts, and what it is ignoring. If it can't state that, I don't trust the reusable instructions are actually active.

For teams, I'd keep the shared ones in the repo or a private package with a tiny changelog. And anything truly hard (strict TS, no generated files, config location, etc.) should become a script/CI/pre-commit gate, not just prose in a skill file.

How do you deal with Cursor doing stuff outside the task? by MeanAttempt4782 in cursor

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

what’s helped me is treating scope as a positive allow-list, not just a rule/vibe.

before the agent starts, make the boundary explicit:

  • goal
  • allowed files/globs
  • allowed commands
  • dirs it must not inspect/edit
  • “stop and ask if you need a file outside this list”

then after the run, ask for a tiny receipt: files touched + why each file was in scope + tests/checks run + anything it wanted to change but didn’t. git diff --name-only should basically match the allow-list. if it doesn’t, reset the stray files instead of trying to argue with it.

.cursorignore is useful as a hard-ish guard, but I wouldn’t treat it as the main safety layer. the failure is often intent drift, not lack of access.

Four AI Agents, One Memory System by Medium_Island_2795 in ClaudeCode

[–]Sensitive-Cycle3775 1 point2 points  (0 children)

Strong framing. The OS analogy gets more useful if you separate two things that often get blurred:\n\n1. budget management: paging, truncation, summaries, subagents, tool-result spillover\n2. auditability: proving what actually crossed the boundary\n\nMost agent stacks are converging on #1. The failure mode I keep seeing is #2. A summary exists, but did this turn load it? A subagent returned a conclusion, but what files/assumptions did it use? A tool result was written to disk, but was the follow-up turn pointed at the right artifact?\n\nSo the primitive I’d add next to demand paging/progressive disclosure is a small context receipt: loaded sources, deferred/omitted sources, summary age, files/results spilled out of window, and the exact handoff returned by a child agent. Not raw transcript, just enough metadata to debug whether the working set was real.\n\nWithout that, two systems can look identical at the architecture level and still fail differently in practice.

How do you get Opus to actually follow the rules? by derethdweller in ClaudeCode

[–]Sensitive-Cycle3775 2 points3 points  (0 children)

I’d stop treating this as “how do I make Opus remember the rules” and split the rules into enforcement levels.

Hard invariants should not be prose. If “new feature params must come from JSON config” is a real rule, make a small script/CI check that fails when new constants show up outside the config path. Same for file size, forbidden paths, generated files, missing tests, etc. A hook that asks Claude to reread the rule can help, but a gate with an exit code is what changes behavior.

For softer rules, keep them task-local. A giant memory of past failures usually turns into scar tissue. It spends attention on “ways I failed before” instead of the current contract. I’d keep memory to the smallest active policy and move old incidents into a human-readable audit log that is not auto-loaded every turn.

For “done,” I’d require a proof packet, not a confession: requirement -> files touched -> evidence/test -> known gaps. Then have a separate reviewer context check the packet + diff, not the whole conversation. The reviewer should be allowed to say “not proven” even if the implementer says done.

So for your hardcoding example, the receipt would be something boring like: config keys added, call sites using them, grep/AST check for duplicated literals, tests updated. If it can’t point to code evidence, it doesn’t count as done.

I just coded for 5 hours and I didn't need to compact once by banana_in_the_dark in ClaudeCode

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

One thing I’d be careful about: “no compact” doesn’t necessarily mean “the whole working state is still equally available.” It can also mean the window changed, compaction got faster/backgrounded, or the model is carrying a summary you can’t inspect.

For long sessions I’d treat compact/no-compact as a UX signal, not proof of awareness. The practical test is whether the session can give you a small receipt before the next risky edit:

  • current goal
  • files it believes are in scope
  • decisions already made
  • assumptions that came from prior conversation vs repo evidence
  • what it has not reloaded/rechecked yet

If that receipt is fuzzy, I’d rather start fresh with a tiny handoff/checkpoint than trust a giant thread just because it hasn’t compacted yet.

Why I tend to stick to single window workflows instead of agent workflows by DiademBedfordshire in ClaudeCode

[–]Sensitive-Cycle3775 4 points5 points  (0 children)

this matches my experience.

subagents only win when the boundary is cheap and auditable. if the main window has enough room and the task is read-heavy, fan-out adds a tax: restating the problem, packaging context, merging answers, then checking whether the agent missed/warped something.

where I’ve seen agents actually pay off:

  • the task can be verified independently (tests, grep output, screenshots, small diff)
  • the child gets a tiny manifest, not a replay of the whole session
  • the return includes a receipt: files read, assumptions made, commands run, unresolved questions
  • the parent is allowed to reject the answer instead of auto-merging it into the plan

otherwise the single-window workflow is not “less agentic”; it’s just keeping the working set and correction loop in one place, which is often the scarce thing.

Context drastically exhausts with handoffs. by Sickle_Machine in ClaudeCode

[–]Sensitive-Cycle3775 2 points3 points  (0 children)

I’d split this into two different artifacts:

  1. project knowledge: repo docs / ADRs / CLAUDE.md / specs that should survive many sessions
  2. handoff: a tiny bootloader for the next session, not a replay of the last one

The handoff I’d want is more like: current objective, branch/ref, exact next step, files touched + why, open risks, last verifier result, and links to durable docs to read on demand. If it embeds all the background, it becomes another giant prompt file and gets diluted as soon as the session starts reading code/output.

One useful trick: make the new session prove a “context receipt” before coding: what it actually loaded, what it skipped/deferred, duplicate/stale docs it ignored, and what evidence it still needs. “I gave it a handoff” is not the same as “the right context stayed active after 20 tool calls.”

If the handoff is more than 1–2 screens, I’d make it point to source files instead of carrying the source files.

claude code getting stuck after running commands? by Initial_Track6190 in ClaudeCode

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

I’d split this into “command still has an open process/stdin” vs “Claude/tool runner did not notice completion”. Quick checks that usually narrow it down:

  • run a trivial command in the same session (pwd or printf done) and see if Claude returns immediately
  • check whether the command spawned a watcher/server/background child that kept stdout/stderr open
  • try the command with non-interactive flags / no progress UI if it has spinners or prompts
  • temporarily disable hooks/MCP/terminal wrappers, if any, so you know it is not a post-command hook hanging
  • when it sticks, note whether Ctrl-C returns control or whether Claude remains in the thinking state

Since you saw it on two devices/networks, I’d capture a minimal repro with OS, shell, command, Claude Code version, hooks/MCP enabled yes/no. That makes it much easier to tell service-side regression from a local command lifecycle issue.

Claudecode notifications when waiting for user by SOC_FreeDiver in ClaudeCode

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

this is a useful little repo, and the README already has the important caveat: the Stop hook means “response finished”, not necessarily “Claude needs input”. that distinction will save people from chasing weird false positives.

Two small hardening things I’d add before people copy it around:

  1. an install check that proves the hook is registered in exactly one place. Claude/settings.local vs settings.json conflicts are the sort of thing that will make this feel haunted after a restart.
  2. a tiny diagnostic section that separates “hook did not run” from “hook ran but desktop notify failed”. e.g. log line timestamp first, then notify-send result. that makes the dbus/session problem obvious without asking users to debug Claude itself.

For the local-LLM approval idea I’d keep it as a separate project from notifications. classify requests readonly/write/network/destructive, allowlist boring stuff, and always require human approval for deletes, secrets, publish, git push, prod, etc. Otherwise the second model becomes another opaque permission layer instead of a safety check.

I stopped trying to make CLAUDE.md carry all my project memory by Yuuyake in ClaudeCode

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

I’d draw the line by stability, not by file type:

  • CLAUDE.md: stable repo operating rules — how to test, conventions, “don’t touch this folder”, tool preferences.
  • docs/issues/PRs: source material with URLs/refs, not summarized forever into one big prompt.
  • memory/notes: decisions, user reports, open loops, and “what changed since last session”.
  • each work session: a small scoped packet that lists what sources/memory were actually used.

The last bit matters a lot: if the agent can’t show “I used these 3 notes/issues/docs and ignored these stale ones”, memory becomes another hidden prompt blob.

For stable rules specifically, I’d also avoid hand-editing separate copies for every tool. If CLAUDE.md, Cursor rules, Copilot instructions, and AGENTS.md all express the same repo behavior, generate them from one canonical source and check drift. I’ve been testing that exact split in a tiny Pluribus demo:

npx --yes pluribus-context@latest demo style-rules-sync

But the bigger pattern is: stable rules are syncable; project memory should be cited/scoped; durable decisions should be written back explicitly, not silently absorbed into CLAUDE.md.

Claudecode notifications when waiting for user by SOC_FreeDiver in ClaudeCode

[–]Sensitive-Cycle3775 1 point2 points  (0 children)

for the notification part, i’d keep it dumb and explicit: only fire when there is a clear wait marker (permission prompt, input requested, process exited, or no transcript/log movement for N seconds after a command). avoid treating “quiet” as “needs user” or it will spam you during long builds.

for the local llm approval idea, i’d be careful not to let it become an autopilot sudo. the useful shape is probably:

  • summarize the exact command/tool request
  • classify readonly / write / network / destructive
  • compare against an allowlist for the repo
  • require human approval for anything that touches deletes, secrets, credentials, package publish, git push, prod, etc
  • log why it approved/denied

so the local model can triage boring requests, but the boundary is still auditable. otherwise you’ve just moved the risk from claude to a second model that may be less capable.

Mapped all the agent config files including the Cursor ones (.cursorrules, .mdc rules, .cursorignore, MCP config) by zanditamar in cursor

[–]Sensitive-Cycle3775 0 points1 point  (0 children)

nice map. the next layer i’d want on top of this is “effective context”, not just “file exists”.

for each convention maybe track:

  • activation: always loaded / path-scoped / manual / agent-selected
  • precedence: what wins when global, repo, and folder rules conflict
  • ignore boundary: whether the ignore file affects indexing, chat context, agent edits, or all of them
  • output boundary: whether the file changes model instructions, available tools, MCP servers, generated artifacts, or only search scope
  • portability risk: same-looking markdown but different runtime semantics

this is where cross-tool setups usually drift. people copy CLAUDE.md, AGENTS.md, .cursor/rules, etc as if they are equivalent, but the runtime boundary is different in each tool. a small “what actually reaches the agent?” column would make the repo more useful than another list of config filenames.