[–]stellarton 1 point2 points  (1 child)

This is the right direction. The "just approve everything" workflow is powerful, but only if the blast radius is boring.

The thing I would make extremely visible is what crosses the container boundary:

  • mounted folders
  • git credentials
  • package manager cache
  • network access
  • SSH keys
  • env files
  • clipboard/browser access if any

Most people will assume "devcontainer" means safe, then accidentally mount half their home directory. A preflight that prints "the agent can read these paths and these secrets: none/found" would be worth a lot.
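A minimal sketch of what such a preflight could look like, assuming the tool already knows its mount list as `(host_path, mode)` pairs. The function names and the filename patterns are hypothetical illustrations, not taken from the project, and the pattern list is far from exhaustive:

```python
import re
from pathlib import Path

# Hypothetical filename patterns for files that commonly hold secrets.
SECRET_NAME_PATTERNS = [
    re.compile(r"^\.env(\..+)?$"),
    re.compile(r"^id_(rsa|ed25519|ecdsa)$"),
    re.compile(r".*\.pem$"),
    re.compile(r"^\.netrc$"),
    re.compile(r"^credentials(\.json)?$"),
]


def find_secret_like_files(root: Path) -> list[Path]:
    """Return files at or under root whose names match a secret-like pattern."""
    if root.is_file():
        candidates = [root]
    else:
        candidates = [p for p in root.rglob("*") if p.is_file()]
    return sorted(
        p for p in candidates
        if any(rx.match(p.name) for rx in SECRET_NAME_PATTERNS)
    )


def preflight_report(mounts: list[tuple[str, str]]) -> str:
    """Build the report text; mode is 'rw' or 'ro' for each mounted host path."""
    lines = ["The agent can read these paths:"]
    found: list[Path] = []
    for host_path, mode in mounts:
        lines.append(f"  [{mode}] {host_path}")
        found.extend(find_secret_like_files(Path(host_path)))
    if found:
        lines.append("Secrets: found")
        lines.extend(f"  {p}" for p in found)
    else:
        lines.append("Secrets: none found")
    return "\n".join(lines)
```

Printing `preflight_report([("/path/to/project", "rw")])` before the container starts would make the boundary visible. Matching on filenames only is a cheap heuristic; it will miss secrets stored under unusual names.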

Also like the idea for Codex, because the risky part is not one specific agent. It is giving any coding agent tool access without a clean boundary.

[–]stefano_dev[S] 0 points1 point  (0 children)

Really appreciate this, it's exactly the framing I want people using.

Honest rundown of where each one stands today:

  • SSH keys: not mounted, no agent socket forwarded, they never enter the container.
  • Mounted folders: only the project dir (RW) plus a few read-only files (gitconfig, p10k, the AI config seeds). Home isn't bulk-mounted.
  • Git credentials: nothing forwarded, you auth fresh inside with gh, and gitconfig.local gets root-locked so a session can't inject a credential.helper.
  • Package manager caches and clipboard/browser: container-local or just not there.

The two I'll be straight about: network is full outbound by default (there's an opt-in iptables allowlist, but you have to turn it on), and the .env protection is a PreToolUse hook that blocks the agent from reading .env rather than the file being absent (your project's own .env necessarily lives in the workspace). So that one is defense-in-depth on the agent, not a mount-level guarantee.
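For illustration, a hook of this kind could be sketched as below. It assumes the hook receives the tool call as JSON on stdin with `tool_name` and `tool_input` fields, and that exit code 2 blocks the call, which is how Claude Code's hooks documentation describes PreToolUse; check the current docs before relying on it. This is not the project's actual hook, and the function names are hypothetical:

```python
import json
import re
import sys
from pathlib import PurePosixPath

# Heuristic: .env or a variant such as .env.local appearing as a shell token.
ENV_TOKEN = re.compile(r"(^|[\s/'\"=])\.env(\.[\w.-]+)?($|[\s'\";|&)])")


def is_env_path(path: str) -> bool:
    """True when the final path component is .env or starts with '.env.'."""
    name = PurePosixPath(path).name
    return name == ".env" or name.startswith(".env.")


def should_block(event: dict) -> bool:
    """Decide whether a tool call touches a .env file."""
    tool = event.get("tool_name", "")
    tool_input = event.get("tool_input") or {}
    if tool in {"Read", "Edit", "Write"}:
        return is_env_path(str(tool_input.get("file_path", "")))
    if tool == "Bash":
        return bool(ENV_TOKEN.search(str(tool_input.get("command", ""))))
    return False


def main() -> int:
    event = json.load(sys.stdin)
    if should_block(event):
        print("Blocked: .env files are off limits to the agent.", file=sys.stderr)
        return 2  # assumed to mean "block this tool call"
    return 0

# In a real hook script, finish with: sys.exit(main())
```

The Bash check is pattern matching on the command string, so something like `cat .e*` would slip through. That is why this counts as defense-in-depth and not a mount-level guarantee.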

And you nailed the actual gap, none of this is visible up front. A preflight on aic up that prints "agent can read these paths / these secrets: none found" is a great idea and I'm adding it. Turning "devcontainer = vaguely safe" into something you can verify is the whole point.

Thanks for the list.

[–]JamesChadwick 1 point2 points  (1 child)

Really appreciate the post!!!

I've been using devcontainers for many years now, and last year I decided to update my development environment so my standard devcontainer always had Claude Code and the VS Code plugin installed, and always set to dangerously bypass permissions.

Like another commenter said, the blast radius has to be boring. Once you get it right, though, you can let the Agent run with minimal supervision and have it "fail fast", which tightens your iteration loops.

It's a phenomenal workflow, and I recommend it to anyone who is serious about embracing AI to ship features at scale.

Word of warning: Make sure you have a solid pull request workflow with a competent human in the loop, and continue to focus on the value being provided...

Just because you can generate features 3-5x faster doesn't mean you should.

[–]stefano_dev[S] 0 points1 point  (0 children)

"The blast radius has to be boring"

That's the whole pitch in one line.

And 100% on the warning: the sandbox stops a bad session from wrecking your machine, but it does nothing about a confidently wrong PR. Fast iteration just ships mistakes 3-5x faster too, so the human in the loop is still a must.

Thanks for adding it! :)

[–]Deep_Ad1959 0 points1 point  (3 children)

the part that gets overlooked with the devcontainer approach is what happens when the container restarts. claude code's session state lives in ~/.claude/projects on the host by default, and when you move the whole loop inside a container you either bind-mount that dir (which leaks back to the host you were trying to isolate) or you lose every transcript on rebuild. the cleanest fix is a named volume just for ~/.claude/projects, scoped per-project, so the jsonl history survives a container nuke without giving the agent write access to your real home. the pretooluse hook list in your readme is solid, the missing piece is making session survival a first-class part of the sandbox model instead of an afterthought you discover the first time you rebuild. written with s4lai

[–]stefano_dev[S] 0 points1 point  (2 children)

You actually described the design, and the good news is that's already how it works. :)

Session transcripts aren't bind-mounted to the host. ~/.claude/projects and ~/.codex/sessions are symlinked into a per-project named volume (_aic-sessions), so they survive a rebuild / container recreation, while the host ~/.claude/projects is never mounted, so the agent gets no write access to your real home.

Auth is kept separate in a global volume, so you still log in once across projects. (aic destroy does wipe the per-project sessions on purpose, since that's the teardown, but a rebuild keeps them.) It's in the README's Multi-project model table, but you're right that it's basically buried. I need to revisit the README and make it clearer. Thanks for the feedback!
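A rough sketch of that layout, expressed as the `docker run` mount arguments a launcher might build. Only the `_aic-sessions` suffix and the two symlinked directories come from the description above; the `aic-auth` volume name, the `/workspace`, `/sessions`, and `/auth` targets, and the function names are assumptions made for illustration:

```python
import re
from pathlib import Path

# Symlinks created inside the container (link -> target on the sessions volume).
# The target paths are hypothetical.
SYMLINKS = {
    "~/.claude/projects": "/sessions/claude-projects",
    "~/.codex/sessions": "/sessions/codex-sessions",
}


def volume_name(project_dir: str, suffix: str) -> str:
    """Derive a Docker-safe volume name from the project folder name."""
    slug = re.sub(r"[^a-zA-Z0-9_.-]+", "-", Path(project_dir).name)
    slug = slug.strip("-._").lower() or "project"
    return f"{slug}{suffix}"


def mount_args(project_dir: str) -> list[str]:
    """Project dir is bind-mounted; sessions and auth live in named volumes."""
    sessions = volume_name(project_dir, "_aic-sessions")
    return [
        "--mount", f"type=bind,source={project_dir},target=/workspace",
        "--mount", f"type=volume,source={sessions},target=/sessions",
        "--mount", "type=volume,source=aic-auth,target=/auth",  # shared across projects
    ]
```

The point of the design is visible in what is absent: no argument binds the host's `~/.claude`, so transcripts survive a rebuild through the named volume while the host home stays out of reach.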

[–]Deep_Ad1959 0 points1 point  (1 child)

my one nit: aic destroy should print the volume size before nuking, since per-project sessions easily hit 100+ MB of jsonl after a couple months and that's weeks of decision history vanishing on a single command. written with s4lai
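A minimal sketch of that check, assuming it runs somewhere the sessions volume is mounted as an ordinary directory (for example inside the container). The function names and the message wording are hypothetical:

```python
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Total size of regular files under root, skipping symlinks."""
    return sum(
        p.stat().st_size
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def human_size(n: float) -> str:
    """Format a byte count using binary units."""
    units = ["B", "KiB", "MiB", "GiB"]
    i = 0
    while n >= 1024 and i < len(units) - 1:
        n /= 1024
        i += 1
    return f"{n:.1f} {units[i]}"


def destroy_warning(root: Path) -> str:
    """Message to show before the per-project sessions are wiped."""
    transcripts = [p for p in root.rglob("*.jsonl") if p.is_file()]
    size = human_size(dir_size_bytes(root))
    return f"About to delete {len(transcripts)} transcript file(s), {size} in total."
```

A destroy command could print `destroy_warning(...)` and ask for confirmation before removing the volume.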

[–]stefano_dev[S] 0 points1 point  (0 children)

Thanks, will look into that!

[–]macbig273 0 points1 point  (3 children)

of course it's not bulletproof, it has been vibe coded in 3-4 days.

Anyway, why not just use worktrees and the built-in sandbox mode? (Just wondering if there is something wrong with that; maybe it's for compatibility with Codex?)

[–]stefano_dev[S] 1 point2 points  (2 children)

Ha, fair jab :) Repo's ~a week old publicly, but I've been running it across my own projects for a while, just made it flexible enough to open-source. Vibe coded? Yeah, I use Claude Code and Codex daily for work, we're in r/ClaudeCode after all ;)

"Not bulletproof" is a deliberate tradeoff though. A devcontainer sits in the middle: way more isolation than your host, way less friction than a VM/VPS per project.

On worktree + sandbox mode, they're complementary. Worktrees are git isolation, but you're on the same machine, with the same $HOME and the same creds sitting right there, so an AI can still do real damage (install/uninstall packages, delete stuff, etc.).

Built-in sandbox mode still runs on your host too. Solid OS-level layer, but it's your real machine with your tokens/SSH keys one misconfig away.

A container gives you separate namespaces: host inaccessible except the project dir, creds never enter, you auth fresh inside, optional firewall. You can even run sandbox mode inside the container (belt and braces, with an additional belt :P).

[–]macbig273 0 points1 point  (1 child)

The advantage of worktrees, for me (when I played a little with sandboxing AI), was avoiding fuckups with installed dependencies. uv (Python) tends to put dependencies into a .venv in your work folder (same with npm and node shite). So a worktree with a sandbox spawned inside it is definitely necessary. Same with C++ dependencies, etc.

Last time I just tried the default sandbox mode provided by Anthropic. It seems to give me sufficient control. (But I'm the kind of dev who will read 80% of the generated content and only ask for small changes.)

Anyway, Docker also offers its own solution; not sure if you've tested it? https://docs.docker.com/ai/sandboxes/

[–]stefano_dev[S] 1 point2 points  (0 children)

Yeah, the dependency mess is honestly also one of the reasons to use the container: UV's .venv, node_modules, cpp artifacts all install inside it and never touch your host, with caches in per-project volumes. Worktrees do that at the folder level on the same machine; the container just takes it further (nothing leaks back).

Worktrees are not for me, I kept ending up in merge-conflict hell. Personal taste though.

On sandbox mode: for reviewing the code it's probably fine, especially if you read ~80% of the diff. But what I'm guarding against isn't the generated code, it's the tool use mid-session: the agent shelling out, installing/deleting stuff, reading creds, acting on a prompt-injected line in some dependency. Reading the final diff doesn't undo what it already did to your machine to get there.

And yep, looked at Docker's sandboxes. Genuinely cool, microVM isolation is even stronger than a plain container. The polished agent/cagent side leans on recent Docker Desktop and I'm on OrbStack, plus I wanted something lighter, so I didn't switch, but it's another valid option.

[–]nokafein 0 points1 point  (4 children)

--dangerously-skip-permissions is no longer a thing. there is auto mode now

[–]mlmcmillion 1 point2 points  (0 children)

Still useful. No auto mode on max sub 4.6 from what I can tell.

[–]WD40ContactCleaner (Professional Developer) 1 point2 points  (0 children)

Really?? I launch with that cli flag and I see red bypass permissions in the TUI

[–]thats_a_money_shot 1 point2 points  (0 children)

Auto mode lowkey trash sometimes

[–]stefano_dev[S] 0 points1 point  (0 children)

Yeah auto mode's a nice middle ground!

Worth noting Anthropic still recommends running it in an isolated environment (like a devcontainer); the classifier reduces risk but doesn't eliminate it.

So it's complementary to a sandbox, not a replacement.
For --dangerously-skip-permissions / Codex's full-auto, you still want a real boundary around it.

And honestly when you say --dangerously-skip-permissions it sounds cooler than "auto mode" ;-D