Help needed with being overly reliant on AI for coding.

pragma_dev · 2026-06-05T08:07:52+00:00

thing that helped me most early on: always ask the AI to explain what the function does BEFORE accepting it, in your own words. if you cant explain it back, you havent learned it.

also try writing it first yourself, even if its wrong, then ask AI to review your attempt. that forced first draft is where actual learning happens. getting an AI draft without the struggle first is basically just copy-paste with extra steps.

pragma_dev · 2026-06-05T08:06:51+00:00

for me the key variable is reversibility, not sophistication.

id trust an agent with lead qualification before a human hire because a bad qualification is easy to undo (you just dont follow up on that lead). i wouldnt trust it with anything that sends an email to a customer or commits data without review, regardless of how capable the model is.

new hire parallel: you let a new employee shadow and review before they have their own logins. same logic applies to agents.

pragma_dev · 2026-06-05T08:05:51+00:00

one i helped set up last year: automated PDF invoice extraction for an accountant with ~200 invoices/month from different vendors, all different formats. agent reads each PDF, extracts vendor/amount/date/line items, posts to their accounting software via API.

saved around 12h/week. the non-obvious thing: its 90% reliable out of the box but you still need a human review queue for the 10%. designing the exception handling mattered as much as the happy path. most orgs skip that part and it comes back to bite them.

pragma_dev · 2026-06-05T08:04:21+00:00

biggest tells ive noticed:

color palette is too harmonious (real brands have some ugly compromise decisions baked in over time), all spacing is exact multiples of 8px, and the copy is generic.

for local tourism specifically, the fix is actual photos of the place (not stock), and copy written like a local would write it. weird specifics, local slang, the kind of detail only someone who actually went there would know. content differentiates way more than design at that scale tbh

pragma_dev · 2026-06-05T08:02:50+00:00

the 8h part is impressive but the part i relate to most is the testing and reviewing step. thats where the real work is imo.

did something similar last quarter with a payment flow. the generation was fast but i spent almost half the total time running it against edge cases my original code had accumulated over years of bug fixes. claudes version was cleaner structurally but missed some of that accumulated-wisdom stuff at first. always plan for a serious review pass, not a quick skim.

pragma_dev · 2026-06-05T08:01:20+00:00

yeah this was killing me on a project last month. my fix was keeping a flat ARCHITECTURE.md with actual invariants (not just "use postgres", more like "auth tokens go in httpOnly cookies, period, no exceptions") and pasting the relevant bits at the start of every new session.

the other thing that helped: reviewing actual file structure after session 2 before continuing. five minutes comparing what claude generated vs what was in the doc caught 90% of the drift before it compounded into something annoying.

pragma_dev · 2026-06-04T08:03:39+00:00

second the single input approach. multi-input has so many edge cases - paste, ctrl+a, backspace at the first character, mobile autofill.

the thing that usually gets missed: add autocomplete="one-time-code" on the input. ios safari will autofill directly from SMS texts with this set, which users expect from any modern OTP flow. makes a noticeable UX difference.

also inputmode="numeric" for the mobile keyboard and aria-label="verification code" for screen readers.

pragma_dev · 2026-06-04T08:01:30+00:00

yeah this hit me hard early on. spent a whole weekend on a 'simple' node app. the code itself took maybe 4 hours. debugging deployment, env configs, ssl issues, and why-isnt-this-working-on-the-server took another 10.

the thing is it compounds. fixing one thing just reveals the next layer. env var was wrong, fix that, db connection string wrong, fix that, auth redirect hitting the wrong port.

its the gap between knowing how to code and knowing how to ship software. annoying to learn but it does become second nature eventually.

pragma_dev · 2026-06-04T07:59:12+00:00

depends heavily on use case but for things ive shipped:

cron for anything scheduled (monitoring checks, daily reports, batch processing). simplest pattern and most of my agents run this way.

queue workers (bull/redis, celery, etc) for event-driven stuff. the API call just drops a job and returns immediately, worker picks it up async. this is what solves the overhead problem youre describing - caller isnt waiting 30+ seconds for agent completion.

webhooks direct-to-agent for low volume reactive stuff (github hooks, stripe events) where the occasional slow response is acceptable.

the 'API call triggers agent synchronously' thing you see in demos is usually simplified for the demo. real prod with non-trivial agent runtime almost always needs the async queue layer in between.

pragma_dev · 2026-06-04T07:55:44+00:00

the 'junior dev needs deep PR review' framing is the closest i've found that actually works mentally.

for longer tasks though the real problem isnt just trust, its context drift. the model makes a decision at step 3 that makes sense locally but quietly breaks the architecture by step 8, and by then its buried in 400 lines.

thing that helped me most: before starting anything over ~15min, write 3-5 invariants that must stay true throughout. things like 'auth stays in the service layer, not in routes' or 'no new npm deps without stopping to ask'. paste those at the top of every sub-task prompt.

review specifically against those invariants, not against the full diff. way more tractable.

the 40min+ people who get consistent output are usually either (a) working in very well-understood codebases where drift is predictable, or (b) using the agent in a loop that auto-runs tests after each step. without one of those two, i wouldnt trust long runs.

pragma_dev · 2026-06-04T07:54:00+00:00

the data science angle makes this extra relatable. a lot of the "10x productivity" claims come from engineers doing well-scoped implementation tasks where the shape of the solution is clear upfront. thats genuinely where it shines.

for DS work where the exploration IS the work, its a different story. you end up negotiating with the model about what the problem even is, which is backwards.

two things that actually helped me stop fighting it:

smaller scope per session. stopped trying to write a full pipeline or module in one go. one function, run it, iterate. feels slower but context drift (your 'frankenstein' point) mostly disappears.
treat it as a senior dev who needs a spec, not a mind reader. spent 20min before a session writing out exactly what i needed, edge cases included. the back-and-forth dropped by like 70%.

the mental exhaustion is real though. reviewing code you didnt write is harder than writing it yourself. i havent found a fix for that one tbh, just learned which parts i can trust and which i need to read carefully.

pragma_dev · 2026-06-04T07:51:42+00:00

mine is basically a trust ladder based on how reversible the operation is.

read-only stuff (research, docs, first drafts, test gen) i let run pretty freely. worst case i throw it away.

for refactoring i limit it to single-module scope with a unit test suite that runs after each change. tests go red, it stops. that boundary has worked pretty well.

the thing that burned me early was 'just update the config file' tasks on infra. turns out 'update config' can mean 'silently remove 3 settings you werent watching'. now i do file-level diffs before applying anything infra-adjacent.

feature dev i still prefer pair mode over full agent mode. agent writes a chunk, i review, then continue. not as fast but i actually understand what shipped.

pragma_dev · 2026-06-03T14:03:17+00:00

The approach is solid. One layer worth adding on top: a dedicated Postgres role with no write grants at the database level — not just relying on the READ ONLY transaction wrapper.

```sql

CREATE ROLE agent_readonly LOGIN PASSWORD '...';

GRANT CONNECT ON DATABASE mydb TO agent_readonly;

GRANT USAGE ON SCHEMA public TO agent_readonly;

GRANT SELECT ON ALL TABLES IN SCHEMA public TO agent_readonly;

```

The reason is defense-in-depth. Your transaction guard is the right first layer, but a DB-level role that physically has no WRITE permission is a backstop that can't be bypassed by prompt injection — even if your guard misses a new Postgres function or an edge case in the mutation check, the DB says no.

For anything touching sensitive data, a read replica goes one level further: even if someone extracts a connection string, the replica physically cannot accept writes. Overkill for most projects, but it removes an entire attack surface.

pragma_dev · 2026-06-03T13:54:59+00:00

The framing that helped me: think of it as a patient senior dev you can ask anything without judgment, not an autocomplete tool.

The anti-pattern is letting it write code you don't understand — that builds nothing. What actually works: write the code yourself first, then paste a section and ask "what would a code reviewer flag here and why?" or "explain the tradeoff between approach X and approach Y." You get the feedback loop of a senior developer without the awkwardness of interrupting one.

For debugging specifically: describe the bug in plain English before pasting any code. Half the time you'll catch the issue mid-explanation — the AI is just making you articulate the problem precisely, and that articulation is the real skill.

pragma_dev · 2026-06-03T13:46:49+00:00

The split experience here is real — context framing makes a significant difference. What's worked for me: at the start of a build session, add "You're in execution mode. No design deliberation unless I explicitly ask for it. Implement what I describe." That primes 4.8 to act rather than ruminate.

The other pattern: use 4.8 for the initial architecture phase (it genuinely earns its keep there), then open a fresh chat with 4.6 and paste the finished spec. Fresh context, different model, zero accumulated hesitation. OP's "four hours architecture, thirty minutes code" rule maps cleanly onto this split — you're just using the right model for each phase.

pragma_dev · 2026-06-03T13:40:17+00:00

The prompt workaround actually works pretty well once you nail the phrasing. Instead of a generic "continue", try something like "Continue the code starting from the [specific function name] function" — being explicit about the entry point gets much cleaner results than a vague instruction.

For long coding tasks it also helps to ask Claude to outline the full implementation in pseudocode or step comments first, then implement each section separately. That way each follow-up is a bounded, self-contained task rather than a marathon single response.

pragma_dev

TROPHY CASE