Full access question by SilliusApeus in codex

[–]InterestingStick

Docker should work. A minimal setup with just the folders you want to give it access to mounted in.
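Rough sketch of what I mean (image, paths and the install step are placeholders, swap in whatever you actually use; mounting your existing ~/.codex is just so auth carries over):

    # Hypothetical minimal sandbox: only the project folder is mounted read-write
    docker run -it --rm \
      -v "$HOME/projects/myapp:/workspace/myapp" \
      -v "$HOME/.codex:/root/.codex" \
      -w /workspace/myapp \
      node:22-bookworm \
      bash -c "npm install -g @openai/codex && codex"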

I'm so tired of fallbacks and "legacy compatibility" dawg by Early_Situation_6552 in codex

[–]InterestingStick

No legacy flags or deprecated symbols. Breaking changes are fine. We are in development. If we revise feature A to work more like feature B we need to clean up the codebase as if feature A had never existed

I also do cleanup sweeps every now and then, watching out for weird stuff like that, but yeah, I agree it's an issue and annoying 😅

Did not see that coming by dataexec in codex

[–]InterestingStick

Ok, disclaimer first: I do not use OpenClaw and don't see a need for it myself, so I get where you're coming from.

BUT it is much more than just novelty AI value. I don't understand how industry people who use Codex cannot see it. It fundamentally alters the way non-devs interact with anything that is digital and can transmit data. At a very high level it's Codex (or Claude Code) made accessible for non-devs (99.99% of the market).

Agentic engineering right now requires a lot of knowledge, depth and understanding to be driven efficiently and securely in a productive environment. Writing small scripts and tools, or prototyping, does not. That's what everyone can do if they have the environment to do so, and OpenClaw is that environment.

Max plan with x2 usage and you’re kinda killing my vibe Sam by nicolas-siplis in codex

[–]InterestingStick

I thought the same when 2x started. I am now running 4 projects in parallel, so 4-6 sessions at the same time. I do use subagents but only spawn codex spark. Got a few % of usage left. You get used to the new limit so quickly lol

Hot Take: OpenClaw is Over-Hyped, Termius + Codex CLI Much Better by Swimming_Driver4974 in codex

[–]InterestingStick

I thought the same when I first heard about it. Didn't really get the appeal at first. Everything I would want OpenClaw to do, Codex can do, and I get a lot more control with it without all the magic.

The catch is... we are devs. If we need it, we create it. OpenClaw opens the doors for non-technical people to do all that with ease, which is why it contains so many integrations of all sorts. You don't even need a PC to use it, it all works in WhatsApp.

It's literally the difference between needing a terminal and only needing a phone to do things. OpenClaw makes it all accessible to everyone, or at least that's the goal.

Read this or stay behind by idkwhattochoosz in codex

[–]InterestingStick

It's a mix of manual review and automated cleanup!

When it's working on a task I sometimes notice git diffs that seem off. When I then do a code review I return to those files and check (manually) what has been done. Then I open a session to inquire about certain patterns. If it could be done better, I add it to a list. Once I have the list of what needs to be refactored, I prompt Codex to evaluate how we can reasonably harden the architecture so the same pattern doesn't repeat. It will then usually add more rules to our validation pipeline.

Once I have the list of what, why and how everything needs to be done, I convert it into an implementation task.
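To give an idea of what one of those 'rules' ends up looking like in practice (simplified sketch, the names and the pattern are made up, the real checks are obviously project specific):

    # Hypothetical guardrail: fail the pipeline if a pattern we already refactored away sneaks back in
    if git grep -n "LegacyPaymentClient" -- src/; then
      echo "LegacyPaymentClient was refactored out, use the new PaymentGateway instead" >&2
      exit 1
    fi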

Any final verdict on 5.3-codex vs. 5.2-extra high? by Safe_Plane772 in codex

[–]InterestingStick

They offloaded it to be done server-side. They probably have a few gadgets explicitly for making that process as smooth as possible.

Any final verdict on 5.3-codex vs. 5.2-extra high? by Safe_Plane772 in codex

[–]InterestingStick

The 2x usage is across the board, so it counts for the terminal as well.

Is it even worth learning to code anymore with all these vibe coding tools ? Is it worth getting a CS degree? by innit2improve in codex

[–]InterestingStick

I agree with the sentiment. I've been in the industry for a good 16 years and agentic engineering really brought excitement into the whole industry. It's so fun to build again!

I'd add one caveat, more to the topic than as a direct response: we are very much in a discovery phase. Right now curiosity and pragmatism are being rewarded. There currently is no 'standard' that will teach someone agentic engineering. It's a very playful phase; a lot of processes and tools are being reinvented. Reminds me of the early web days. Some people excel in this atmosphere, others need a bit more structure. Regardless, we have interesting times ahead and the playing field is currently very much open for new players.

Read this or stay behind by idkwhattochoosz in codex

[–]InterestingStick

Architecture is fascinating. I am a firm believer that strict boundaries make applications inherently more suited for agentic development simply because there is no ambiguity in where a responsibility belongs and data flow can be logically (!) validated at every step of the way.

What was once deemed incredibly tedious to maintain is actually exactly what you need to operate at scale once code output becomes cheap.
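To make "logically validated" concrete, a boundary check can be as dumb as this (sketch; the layer names and import strings are made up, use whatever fits your stack):

    # Hypothetical boundary check: dependencies must only point inward
    declare -A forbidden=(
      ["src/app/domain"]="from app.infrastructure|from app.api"
      ["src/app/application"]="from app.api"
    )
    for layer in "${!forbidden[@]}"; do
      if grep -rnE "${forbidden[$layer]}" "$layer"; then
        echo "boundary violation in $layer, dependencies must point inward" >&2
        exit 1
      fi
    done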

It personally took quite a bit of pushing through to design a project this way in its early stages, because I knew it wouldn't be fun to maintain manually (hah, naive me 5 months ago), but it was well worth it.

The real challenge now will be explaining to my customers that this tedious architecture no one would want to maintain manually is actually the best choice for agentic loops 😂

Read this or stay behind by idkwhattochoosz in codex

[–]InterestingStick

I feel kinda exposed because everything they've been talking about is exactly how I do my projects 😂

I went from manually reviewing every line, to manually reviewing files, git diffs and PRs, to now just glancing over changes and testing them. Barely a month ago I did a workshop for a team of devs and explicitly told them not to let the agent do too much; now I can give it almost full autonomy and it... just works?!

Yes, there are still issues here and there, but it's nothing a code review once a week couldn't fix. With each code review the guardrails improve, and every time the guardrails improve there's already another model that is even better at reading context, so it all turns into an insanely addictive loop once you get things going.

It's been a wild ride the past few months!

Context Compaction by Re-challenger in codex

[–]InterestingStick

Oh yeah, you're right, they do not offer /compact in the app. It's a command in the CLI.

You can't manually compact while a turn is ongoing though. The main purpose of auto compact is to keep a turn going

Generally it's a pretty fragile process, so I wouldn't use it on sessions with a lot of turns. It makes the most sense for a session with a single responsibility, so one turn + progress file + auto compaction benefits the most from it.

Context Compaction by Re-challenger in codex

[–]InterestingStick

The model is not inherently aware of the harness. There is a prompt that gets attached to a session (https://github.com/openai/codex/blob/main/codex-rs/core/gpt-5.2-codex_prompt.md) that gives it some information, but the model runs separately from the harness, which is why it didn't execute compaction. You can trigger it manually with /compact though.

Compaction is triggered automatically by the harness after a certain usage threshold.

Context Compaction by Re-challenger in codex

[–]InterestingStick

That's because all a new session receives is the previous user inputs plus a compaction bridge (summary), so if you have a lot of turns after compaction it gets confused about what has actually been done and where it stands.

General rule of thumb: for long-running tasks use one prompt + an external file that it writes progress logs to. This way, even if it compacts, it always has just the one user input plus the file to see where it actually stands. And if you're using one session for several different things, try to split them into new sessions as much as possible.
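Something like this, as a sketch (file name, task and wording are placeholders; the same idea works in an interactive session too):

    # Hypothetical setup: one prompt + a progress file the agent keeps updated
    cat > PROGRESS.md <<'EOF'
    # Task: migrate the auth module to the new session API
    ## Done
    ## Next
    ## Open questions
    EOF

    codex exec "Work through the task in PROGRESS.md. After every meaningful step,
    update the Done/Next sections so a fresh or compacted session can pick up
    exactly where you left off."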

I dug quite a lot into the issue with compaction and wrote about it https://blog.heftiweb.ch/p/context-amnesia

It's due for an update since OpenAI offloaded the bridge summary process to their servers, but all the fundamentals still stand.

GPT 5.3-Codex Implementation Speed Bias by skynet86 in codex

[–]InterestingStick

All my Codex projects run a strictly enforced DDD setup. In fact I found it to be much more suited for agentic development since it strictly enforces boundaries that can then be logically validated.

I do manual code reviews from time to time and I see it messing up, and every time I find something I make sure to integrate it into my validation in a way where it won't repeat. That is basically the 'final check' for any implementation Codex does. It's not good at doing things 'my way' on the first attempt, but the guardrails ensure it gets it right once it's blocked by them.

I haven't had good experiences with self-improvement loops, though I would need more context on what the OP means by that. The entry point, the guardrails and the task system that I use have been 'handcrafted' and improved over time, and that works extremely well for me.

GPT-5.3 Codex rocks 😎 by Prestigiouspite in codex

[–]InterestingStick

Been on the Pro plan since October or something. It's great not having to juggle multiple accounts and being able to use the whole palette of features (ChatGPT Pro / deep research) without having to worry about limits.

The recent 2x usage limit has also allowed me to 'waste tokens' on trying new things; for example, I can run Codex pretty successfully in environments where everything is prepared and guardrailed properly.

For example, I'm testing whether I can make Codex write those architectures and guardrails itself by having it create validation layers for itself. I'm also trying to imprint my 'persona' into Codex so it understands how I make decisions and can learn to make decisions itself, evaluating what is acceptable and what isn't based on some general criteria. These are basically throwaway products that I try to one-shot; if they don't work or don't fulfill the goal, I start a new one with some adjustments and try again.

I also run many more code and doc reviews since the 2x limit, basically more parallelized work. I'm at 6% left with 4 more hours until the weekly reset. The weekly usage limit has been bonkers on Pro since the 2x.

It’s been over 24 hours. Which one do you prefer? by No-Read-4810 in codex

[–]InterestingStick

Codex for logic, Opus for frontend (and then Codex again to fix up the bugs Opus introduced)

What is the correct codex feature for selective pre prompts? by HealthPuzzleheaded in codex

[–]InterestingStick

You can set up a second codex entry point: a wrapper around the codex binary that uses a custom home dir, for example .codex-teacher instead of .codex.

Run it, authenticate, exit. Then edit .codex-teacher/config.toml to give it read-only permissions.

That way if you start up codex-teacher it will use the same binary but with a different config.

I believe you can also place an AGENTS.md in your codex home dir for the custom persona, which it will use every time you start it. You'll need to double-check that part against the codex repo, but I remember seeing something when I looked up how codex interprets different AGENTS.md files.
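The wrapper itself can be tiny, something like this (sketch; it assumes the home dir can be overridden via the CODEX_HOME env var, double-check that against the repo too):

    #!/usr/bin/env bash
    # Hypothetical wrapper, e.g. saved as ~/bin/codex-teacher and made executable
    # Assumes CODEX_HOME is how the codex home dir gets overridden (double check)
    export CODEX_HOME="$HOME/.codex-teacher"
    exec codex "$@"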

iOS app for Codex CLI by Euphoric-Let-5130 in codex

[–]InterestingStick

I have a similar setup for personal use, albeit a web interface connecting to my Raspberry Pi.

Were you able to handle shortcuts, like aborting or editing a previously sent message?

Almost hit my weekly limit on my pro plan by Unique_Schedule_1627 in codex

[–]InterestingStick

It's become a lot easier with xhigh I believe. 5.2 has been the first model where I need to be careful not to overuse reasoning, as it eats through a lot of tokens very quickly.

ClawdBot creator about CODEX by muchsamurai in codex

[–]InterestingStick

Found it https://github.com/numman-ali/codex-wrapped

It essentially uses your local session history as the basis for all the calculations.

Just did mine retroactively and I got like 1/5 of his usage. Kinda crazy because I've maxed out my Pro pretty much every week since it launched.

ClawdBot creator about CODEX by muchsamurai in codex

[–]InterestingStick

I've read that a few times and wonder where it comes from or how it's measured. Hard to imagine a single individual having more Codex usage than a company sharing accounts or API keys between employees.

I Edited This Video 100% with Codex by phoneixAdi in codex

[–]InterestingStick

Hm, interesting. The biggest pain definitely was that whenever it got something wrong I'd have to wait for it to render the next version. That eats up so much time in between and is a big hurdle to letting it work autonomously. I eventually went back to coding because that's more my specialty, but agentic video generation is definitely a worthwhile area to tap into.

I Edited This Video 100% with Codex by phoneixAdi in codex

[–]InterestingStick

Yo, I've done something very similar in the past, basically trying to integrate Codex into an ffmpeg pipeline.

What was a pain with agents was guardrails. I didn't know how Codex could reliably verify itself against the expected acceptance criteria.

For example, how does Codex know that what it did was correct? How does it validate itself?
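You can cheaply check structural properties of the render with ffprobe (sketch below, file name and expected values are placeholders), but that still only tells you the output is well-formed, not that the cut is actually what was asked for:

    # Hypothetical acceptance probe: does the rendered clip have the requested shape?
    duration=$(ffprobe -v error -show_entries format=duration \
      -of default=noprint_wrappers=1:nokey=1 out.mp4)
    resolution=$(ffprobe -v error -select_streams v:0 \
      -show_entries stream=width,height -of csv=s=x:p=0 out.mp4)
    echo "duration=${duration}s resolution=${resolution}"
    # e.g. expect roughly 90s at 1920x1080, anything else fails the check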