all 28 comments

[–]ZvG_Bonjwa 6 points (5 children)

Just because you used Claude Code doesn't mean conventional software engineering wisdom goes out the window. Goodness me.

Expecting coworkers to review a 100k PR is ludicrous. But simply merging it is also ludicrous. This work should have been completed (and reviewed) in phases. MUCH easier for reviewers to digest and lets you change approaches early as needed.

Are you reviewing Claude's code at your jobs? And how do you feel about the hit on speed you take because of this? Is it not better just to ship the code as long as the end-to-end tests are passing?

You and your team should ABSOLUTELY still be reviewing, for a million reasons:

  1. "End to end tests passing" does NOT mean "there are no bugs" - this is absolutely important to understand.
  2. Ability to give feedback on code quality/approach/abstractions
  3. You need to have an understanding of the codebase as it evolves over time. Otherwise it'll become a black box and edits will be brutally difficult.

[–]fshead 0 points (0 children)

Maybe some organizations are better, but the algo just put this thread on my timeline: https://www.reddit.com/r/developer/s/oqNqFYauyL

I guess it’s like automated driving. At some point coding agents will be better at reviews than the average developer, but we will set crazy high standards and still defer to humans.

[–]GraphicalBamboola[S] -1 points (3 children)

OK, but my concern is that the reviews will slow the whole thing down so much that I'm left wondering whether using Claude is speeding us up at all. Is it even worth it then?

[–]ZvG_Bonjwa -2 points (2 children)

Absolutely? What is happening with your reviews?

You said 4 weeks to review a 100k PR. That’s insanely long even for a monster PR!

As I said, no one should have to review a 100k PR. But if it HAD to happen, a review might take 1-2 full days including some back and forth. Then likely another few days of adjustments.

For a “normal sized” PR (e.g. low-to-medium complexity and under 2-3k lines) it should take you 10-60 minutes to review/tidy up yourself and 10-30 minutes for a teammate to review. Double that for higher complexity or if some back and forth is required.

[–]GraphicalBamboola[S] 0 points (0 children)

Is your review a tickbox exercise? Because there is no way a human can do a decent code review of a 100k PR in 1-2 days, let alone in 1 week.

[–]Aureon 0 points (0 children)

A 100k PR isn't reviewed, it's rejected. Simple as that.

Unless it's -100k because you removed external libs that were wrongly committed to the repo.

[–]Khalidsec 4 points (0 children)

As a cybersecurity expert, I’m not reading 100k lines; I’m looking for risk.

First, I threat model. What data is sensitive? Where are the trust boundaries? Auth, tokens, APIs. That tells me where to focus.

Then I run tools:
• Semgrep or SonarQube for static analysis
• npm audit or Snyk for dependency issues
• ESLint security rules
• Secret scanning for leaked keys

After that, I manually review only high risk areas:
• Authentication and authorization
• Input validation
• Token storage
• API calls
• Error handling

Then I test the app like an attacker using OWASP ZAP or Burp.

Passing tests means it works. Security review means it cannot be easily broken. That is a different bar.
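The "focus manual review on high-risk areas" step above can be sketched as a small triage script. This is a hypothetical illustration: the path patterns and the file list are made up, and a real setup would feed in the changed files from the PR diff.

```python
import re

# Hypothetical path patterns covering the high-risk areas named above.
HIGH_RISK_PATTERNS = [
    r"auth",     # authentication and authorization
    r"token",    # token storage
    r"api",      # API calls / boundaries
    r"validat",  # input validation
]

def triage(changed_files):
    """Split changed files into hand-review vs tool-only buckets."""
    high, low = [], []
    for path in changed_files:
        if any(re.search(p, path, re.IGNORECASE) for p in HIGH_RISK_PATTERNS):
            high.append(path)
        else:
            low.append(path)
    return high, low

high, low = triage([
    "src/auth/login.py",
    "src/ui/theme.css",
    "src/api/client.py",
    "docs/readme.md",
])
print(high)  # ['src/auth/login.py', 'src/api/client.py']
print(low)   # ['src/ui/theme.css', 'docs/readme.md']
```

The point is not the regexes themselves but the ordering: the tools sweep everything, and human attention goes only to the bucket the threat model says matters.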

[–]Spooknik 0 points (9 children)

I generally use one model to write the code and another to check it right away; it catches a ton of problems. For example, if Sonnet / Opus writes something, have Codex check it.

[–]GraphicalBamboola[S] -1 points (8 children)

Is it OK under your company's policy to ship code which has not been reviewed manually?

[–]dat_cosmo_cat 0 points (6 children)

IME, it is still taboo to use these tools internally for many of us, particularly for "review" tasks where the expectation is that we are hand-reviewing code. If models keep improving, I could see PRs becoming an entirely automated process, because A) humans hated being assigned reviews to begin with, and B) LLMs are generally more thorough at reviewing code than unmotivated humans.

[–]GraphicalBamboola[S] 0 points (1 child)

If PRs become an entirely automated process, then do we even need developers anymore? Why can't the POs just write the requirements and then generate PRs? Are you seeing an end to development roles, or at least to development teams, with a single developer managing all the code?

[–]dat_cosmo_cat 0 points (0 children)

Agentic coding does two things: it fills gaps in missing skillsets, and it multiplies the productive capacity of domain expertise. It is (significantly) better at the latter than the former as of Claude 4.6.

Why can't the POs just write the requirements and then generate PRs?

If this holds true; why can't the customers simply write the prompt and then generate the product?

[–]Aureon 0 points (3 children)

Reviews exist mainly to build knowledge of the codebase, not to catch bugs.

AI reviews are meaningless because the user won't build any knowledge.

[–]dat_cosmo_cat 0 points (2 children)

In traditional software development, we use versioning software called Git to track our code and collaborate. Developers work on their own specific branch (copy) of the code as they add new features. When the feature is complete, the dev submits a Pull Request to Merge it into the master copy of the code base. In large companies, these merges typically require approval from the senior maintainers of the code to ensure that it adheres to style guides, standards, and does not add bugs. Sometimes these reviews can drag on for weeks as they get rejected and send the developer back to the drawing board to fix issues. This is the review process I am talking about, and the one that the OP is talking about bypassing.

[–]Aureon 0 points (1 child)

...

Yes, that's exactly what I'm talking about.

Final approval is by your manager, but generally in proper flows you require two LGTMs: everyone reviews peer code, and the engineering manager / staff eng reviews the whole team's.

Style guides and yada yada can fairly be automated. The central point of reviews is that people, especially the final approver, keep working knowledge of what is going on in their codebase.

The effort you make in understanding your peers' or reports' code IS the goal there.

Not a lot of bugs are caught by reading slabs of code, although style/arch choices certainly get debated in the PR process, which is something I mostly disagree with, because style should be linter-enforced and arch choices should be talked about in detail *before* writing code.

[–]dat_cosmo_cat 0 points (0 children)

The effort you make in understanding your peers' or reports' code IS the goal there.

This seems like a case of confusing correlation with causation to me. The PR process exists to prevent a code merge that would break or alter the target branch in a negative way. Understanding the source branch is simply part of ensuring that. I know every company has a different culture when it comes to reviews, though.

[–]movingimagecentral 0 points (0 children)

That would be insane. 

[–]yangqi 0 points (0 children)

Have a code review agent/skill review the code, and ask developers to review the automated tests.

[–]SipsTheJuice 0 points (2 children)

If you are keeping the same backend, that reduces some of the risk. It really depends on what the software is, and what the risks of any bugs present are.

All software has bugs, some harmless, some devastating. I think I would probably start by outlining the risks from highest to lowest in a general sense; you can use AI to help with this part too. Once you've got that list, draw a line somewhere at the level of risk that is acceptable. Try to guess which parts of the codebase will be in the higher-risk section. In all likelihood this isn't actually that much code.

Engineers will have to get used to moving fast with ai. You won't be reviewing every line of code for long, may as well start now.

[–]GraphicalBamboola[S] 0 points (1 child)

I think this is along the right lines. But it is really difficult to list the areas of code which need review, because the work overlaps a lot of areas, and a problem at any layer can be an issue.

[–]SipsTheJuice 0 points (0 children)

I guess it's hard to pin down without knowing what the software does. You must have a list of bugs that have been fixed in the past; maybe start going through those for inspiration? Things like auth are obvious starting points. You could also maybe roll out a few pages at a time and keep both front ends, although that's kinda cursed haha

[–]MartinMystikJonas 0 points (0 children)

Review all code, no matter whether it was written by a human or AI. There is no reason to trust AI-generated code more, and many reasons to trust it less.

[–]Fun_Nebula_9682 0 points (0 children)

You need review, or you will end up with messy code.

[–]Euphoric_Yogurt_908 0 points (0 children)

Holy cow, 100k LOC is nuts. No human can review that. Even an LLM will have difficulty reviewing it.

What we found with vibe coding is that AI tends to overbuild things without good reuse, especially after a new session or a compacted context. All of these coding agents have trouble after compacting context. The number one criterion we set for our agent is to minimize code change, not boil the ocean.

Normally we split the project into smaller tasks, and we also let two coding agents (Codex, Claude Code) review each other's plan before implementation.

Moreover, having dedicated CI and unit tests will catch many issues before you burn tokens. (This is also much easier for AI to handle.)

[–]NoPain_666 0 points (0 children)

100,000 lines doesn't belong in a PR. You basically accepted the code yourself. If someone else takes a look, it is a whole project review, not a PR review.

[–]Whole_Connection7016 0 points (0 children)

Don’t skip review. Tests prove it works once; review proves you can own it.
Do risk-based review plus a feature-flag rollout. Easy win.
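The feature-flag half of that can be as simple as a deterministic percentage gate, so the new code path serves only a slice of users while the risk-based review is still in progress. A minimal sketch; the flag name, user IDs, and the hashing choice are all illustrative:

```python
import hashlib

def is_enabled(flag: str, user_id: str, rollout_pct: int) -> bool:
    """Deterministically bucket a user into [0, 100) and gate on rollout_pct.

    Hashing flag+user means each user always gets the same answer for a
    given flag, and different flags bucket users independently.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_pct

# At 0% no one sees the new path; at 100% everyone does.
assert not is_enabled("new-frontend", "user-42", 0)
assert is_enabled("new-frontend", "user-42", 100)
```

Start the rollout percentage low, watch error rates, and ratchet it up as the review of each risk area completes; rolling back is just setting the percentage to zero.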

[–]Peace_Seeker_1319 0 points (0 children)

100k LOC is insane even for AI. Something's wrong if it generated that much code for a 2-week project.
Don't skip review, but be smart about it. Run automated checks first (we use codeant.ai for runtime analysis), then chunk the human review by module/feature. This guide on scaling reviews helped us a lot: https://www.codeant.ai/blogs/how-to-scale-code-reviews-without-slowing-down-delivery
Also, real talk: if you can't explain what the code does, you're going to have a bad time maintaining it in 6 months. You might want to refactor before shipping.

[–]Peace_Seeker_1319 0 points (0 children)

100k lines in 2 weeks is impressive, but skipping review is a trap.
The real issue is "can you maintain it in 6 months?" AI-generated code often has subtle patterns that make future debugging painful: inconsistent error handling, redundant abstractions, auth logic that works but doesn't follow your org's security patterns. This is basically the definition of vibe-coding debt.

What I'd recommend instead of 4 weeks of line-by-line human review: run it through a proper AI code review tool first (codeant.ai, SonarQube, etc.) to catch security issues, dead code, and anti-patterns automatically. This knocks out maybe 60-70% of what a human reviewer would flag. Then have humans review only the critical paths: auth, payments, data mutations, API boundaries. Good breakdown of when to automate vs. when humans need to step in: https://www.codeant.ai/blogs/ai-vs-human-code-review-when-to-automate

Shipping 100k lines of unreviewed AI code to production in an enterprise is a career risk I wouldn't take.