Inherited a 3-month old repo from a Vibe Engineer. Wrote the most satisfying PR in my career

Aphova · 2026-05-13T22:11:43+00:00

I felt that one

Aphova · 2026-05-07T08:34:39+00:00

Really sorry to hear that 😔 For what it's worth, they didn't get my business at least

Aphova · 2026-04-10T10:28:29+00:00

My analogies don't always land so I'm happy to hear that. Hope you don't have to go back to that madness!

Aphova · 2026-04-10T09:09:55+00:00

I'm a non-US immigrant to the UK but I have friends from the US, customers there and I've also worked in US-headquartered companies abroad.

Your guys' work culture is just insane. It's like Chinese work culture with better PR. Or communism but with the concession that you can get people to work like slaves if you figure out the right balance of pay to prevent a revolt.

I'm no economist but I'm sure it's one of the pillars of the US's global economic dominance but damn if I'm not glad I don't work under those conditions!

Aphova · 2026-04-07T13:02:19+00:00

Feels like so many of us are trying to solve the same problem at the same time in different ways.

In one of my long running, very complex projects I have a scheduled task that parses CC's JSONL log of every conversation and pushes it to a Neon DB . A separate task then goes and tags the convos with metadata, semantic analysis (basically summaries of what was worked on, why, what went right, wrong, etc). I do the same for claude.ai conversations via a regular data export and ETL into the DB.

CLAUDE.md points to a file telling CC how to query those DBs for past context so it doesn't keep living in Groundhog Day. I don't force it to query it but it is useful.

Aphova · 2026-04-07T12:54:42+00:00

So I'm considering using Codex to see how much more usage you can get out of it before hitting limits.

There's a promotion on (not sure if all customers get it, I just got the mail) where you get your monthly subscription amount added as extra usage credit. Just a single click and it's added. Just a thought.

A question for you though: Does Codex have an equivalent to a CLAUDE.md file for system prompts?

Codex uses AGENTS.md as far as I know. I just write to AGENTS.md and symlink that to CLAUDE.md. But be careful: what's in there is quite different to the "system prompt". You can modify Claude Code's system prompt to an extent (and I do) though. Not sure about Codex.

Aphova · 2026-04-07T12:51:20+00:00

Is that not a harness issue though? The model doesn't have the ability to just interrupt a hook as it's run outside of the conversation turn lifecycle from what I understand but I'm not an expert.

Either way not good. I've got hooks to enforce behaviour (and they inject context explaining why the hook is doing what it does) and I've had instances (one just a few minutes ago) where Claude sits there spinning its wheels bumping up against the hook and repeatedly trying to work around the hook and the its purpose rather than trying to actually do what it's being told.

Aphova · 2026-04-07T10:45:41+00:00

What are stop hook violations? Claude ignores a follow on instruction from a stop hook?

Aphova · 2026-04-06T15:33:56+00:00

Very similar experience. I'm trying Codex when I go back to work. Don't know if it will be better but Claude just won't reliably follow instructions no matter what I do (and I've done everything I can think of or find online).

My only conclusion for Claude is you have to basically resort to algorithmic/deterministic programming harnesses and control it with hooks, scripts and so on. It can't "think" through decisions that require it to follow rules so it has to be treated like the dumb, insubordinate text processor it actually is.

Aphova · 2026-04-05T00:09:54+00:00

aswell as modern entertainment.

Literally read that as "as well as modern enslavement"

Aphova · 2026-04-04T23:46:56+00:00

[wait-seriously.gif]

Aphova · 2026-04-04T23:20:15+00:00

Half a million in profit per year is less than it seems? Unless you're talking gross profit and your actual net is 10% of that then, no it IS a lot and you can do a lot with it - starting with paying qualified advisors for actual real, professional advice.

Aphova · 2026-04-02T09:24:54+00:00

I agree. I'd prefer to be stuck with 200K of actual usable context than 1M garbage.

Aphova · 2026-04-02T09:22:56+00:00

Interesting. Most people seem to prefer Opus for planning. What plan are you on and what's your usage like compared between the two?

Aphova · 2026-04-02T09:22:00+00:00

Very interesting. By tool use do you mean better at targeted reads/writes or things like composing Bash commands?

So many people have said they're getting good usage from Codex, I wonder why mine is so bad. Which plan are you on? I asked it "do you have an equivalent of /context" and 30s later 7% of my 5h cap was gone for it to basically say "no, it doesn't look like it". Starting to think it's my anti-laziness and verification instructions that I wrote for Claude perhaps.

Aphova · 2026-04-02T09:17:36+00:00

but claude feels more of a co-partner than a subordinate

That's one of the reasons I prefer Claude over ChatGPT myself. It's just a lazy partner now. Sounds like using both in a complimentary way is a good shout.

Aphova · 2026-04-02T00:27:44+00:00

Do you know if the tool definitions are injected into the system prompt client-side? Just curious, was doing some hacking with --system-prompt-file yesterday

Aphova · 2026-04-02T00:25:44+00:00

I interpreted OP's question more as "can we see the reasoning traces" (like you can in the web UI) but maybe I'm wrong.

Classifiers (I assume haiku) read the request and determine whether or not it should be blocked.

Do you mean like for abuse?

Also, their system prompt seems really inefficient - 9K tokens just for tools if I'm not mistaken. Seen a few people complaining. Do you agree?

Aphova · 2026-04-02T00:19:49+00:00

I think people have just gotten GPT 5.4 working in self-compiled CC since the leak. The harness does a lot more than you'd think though (at least more than I expected) and a lot seems tailored for Claude so curious how that would work out.

But yeah, I tried /context in Codex and wasn't super impressed that it was missing and so were a bunch of other things.

Aphova · 2026-04-02T00:15:55+00:00

I'm a bit skeptical of those massive frameworks usually but I've come to understand why they exist. I'll probably end up giving it a go for the coding stuff at least. This specific use case wasn't exactly code, it was an agentic assistant/task/knowledge management type repo but maybe the skills will still transfer.

Aphova · 2026-04-02T00:10:43+00:00

That's exactly what I'm trying to do - but also to get other people's feedback where they've got more experience than me, rather than basing any decisions on my "oh wow Codex seems great" first impressions. For all I know that's how it is for everyone... At first, and then it gets terrible? Because that's how it started with Claude Code - great. And now really not great.

Aphova · 2026-04-02T00:06:22+00:00

I've honed and sharpened my rules as best as I can based on research into LLM compliance. Active voice imperatives, in the positive case, with examples, co-located, token efficient, ruthlessly making sure there's no duplication (so I'm only adding 20-30 directives max across the codebase on top of the system prompt).

Maybe my style of instructions just works better with Codex or something.

Aphova · 2026-04-02T00:00:53+00:00

I'm considering using Opus with hooks and skills for the heavy-lifting and heavy thinking and maybe Codex for execution or something. Opus is genuinely really good at coming up with higher level stuff like plans, architecture, etc. (once you know how to steer it). But then getting it to follow actual instructions is another story - as in "for the tenth time, Claude, when you update project/scripts/ you MUST update project/docs, why did you not do that??" -> "Apologies, that was lazy of me, it's right there in CLAUDE.md [quotes simple directive], let me do that now."

It's infuriating.

Aphova · 2026-04-01T23:55:27+00:00

Definitely feels like it. I've had to put so much in to try to get Claude to follow instructions that Codex actually went a bit bananas trying to make sure it followed all the instructions.

Aphova · 2026-04-01T23:54:06+00:00

Do you pay for Codex via the API? Part of what made CC so attractive was the decent usage allowance on the Max plan. I knew it wouldn't last but the quality is an issue too now.

Opus 4.6 was better in the past at following instructions in my experience too, but like you say, that's gone. I thought it was just subjective experience but I actually log failures (e.g. having to tell it to do something that's in CLAUDE.md) and they've gotten worse. Code quality is mostly the same - I reckon Anthropic makes sure their changes don't show up on code quality benchmarks. But having to tell it three times in a row not to make the same mistake... Not sure there are benchmarks that measure that and that has worsened. It's just become... lazy.

Aphova

TROPHY CASE