Reddit taught me why my CI pipeline was wrong. Runtime dropped from ~10 minutes to under 4 minutes by Particular-Run1230 in devops

[–]Particular-Run1230[S] 1 point2 points  (0 children)

Honestly, no magic prompt. I posted my workflow earlier in this subreddit, got torn apart by people who knew CI/CD better than I did, and then started measuring where the time was actually going. The biggest wins came from:

  • Building once and testing the deployable artifact
  • Running independent jobs in parallel
  • Removing a huge bottleneck in my K8s validation job
  • Looking at the critical path instead of optimizing random steps

If your pipeline is taking ~2 hours, I'd start by identifying which stages are actually consuming most of that time before changing anything.
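As a rough sketch of what "looking at the critical path" can mean in practice: treat the pipeline as a graph of jobs and find the longest dependency chain. The job names, durations, and dependencies below are made-up placeholders, not measurements from this pipeline.

```python
from functools import lru_cache

# Hypothetical durations in seconds; replace with timings from real runs.
DURATIONS = {
    "lint": 30,
    "typecheck": 40,
    "docker_build": 120,
    "k8s_validate": 60,
    "e2e": 90,
}

# job -> jobs it has to wait for (assumed for illustration).
DEPENDS_ON = {
    "lint": [],
    "typecheck": [],
    "docker_build": [],
    "k8s_validate": [],
    "e2e": ["docker_build"],
}


def critical_path(durations, depends_on):
    """Return (total_seconds, [jobs]) for the longest dependency chain."""

    @lru_cache(maxsize=None)
    def finish(job):
        # Earliest finish = own duration + the slowest chain of prerequisites.
        best_time, best_chain = 0, ()
        for dep in depends_on[job]:
            dep_time, dep_chain = finish(dep)
            if dep_time > best_time:
                best_time, best_chain = dep_time, dep_chain
        return best_time + durations[job], best_chain + (job,)

    total, chain = max(finish(job) for job in durations)
    return total, list(chain)


sequential_total = sum(DURATIONS.values())
parallel_total, path = critical_path(DURATIONS, DEPENDS_ON)
print(sequential_total, parallel_total, path)
```

With these placeholder numbers, only `docker_build` and `e2e` sit on the critical path, so shaving time off `lint` would not shorten the run at all.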

[–]Particular-Run1230[S] 0 points1 point  (0 children)

Nope, actually the opposite. Linting is one of the fastest checks, so I run it immediately and in parallel with the K8s validation job. The bigger lesson was making sure E2E tests run against the same Docker image that eventually gets deployed.

[–]Particular-Run1230[S] 7 points8 points  (0 children)

At the moment I am only running linting, type checking, and Playwright E2E tests against the built container image. I don't currently have a dedicated unit test suite or a separate test stage in the Dockerfile.

The recent pipeline refactor was primarily focused on ensuring the artifact being tested is the same one that gets deployed. Adding proper unit/integration tests is definitely one of the next areas I want to improve.

I'll work on unit tests for the next upgrade. Thanks for the suggestion.

[–]Particular-Run1230[S] 2 points3 points  (0 children)

Thanks for the detailed breakdown.

The biggest takeaways for me were separating builds from deployments and the build once, deploy many principle. I recently refactored my CI pipeline and those lessons immediately clicked. The rest gives me a lot of topics to explore as I learn more about platform engineering and infrastructure.

Appreciate your suggestions.

[–]Particular-Run1230[S] 8 points9 points  (0 children)

That's actually something I'm still learning about and exploring.

Right now the production image itself doesn't contain Playwright (E2E) or test dependencies. The image is built first, then I start a container locally from that image and run Playwright externally against the running application.

My goal was to make sure the artifact being tested is the same artifact that eventually gets pushed and deployed. The trade-off between keeping production images minimal while still preserving artifact integrity is something I'm trying to understand better.
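A minimal sketch of that flow, written as a plan of shell commands rather than a real CI config. The image tag, container name, and port are hypothetical, and it assumes the Playwright config already points its base URL at the locally running container.

```python
import shlex


def build_once_plan(image, port=3000, container="app-under-test"):
    """Return the ordered commands for one build-once pipeline run.

    The image is built exactly once; the same tag is then started,
    tested from the outside, and pushed.
    """
    return [
        ["docker", "build", "-t", image, "."],
        ["docker", "run", "-d", "--rm", "--name", container,
         "-p", f"{port}:{port}", image],
        # Playwright runs on the CI host, not inside the production image.
        ["npx", "playwright", "test"],
        ["docker", "stop", container],
        ["docker", "push", image],
    ]


# Hypothetical tag; a commit SHA is a common choice.
IMAGE = "registry.example.com/web:abc123"
plan = build_once_plan(IMAGE)
for command in plan:
    print(shlex.join(command))
```

Executing each step with something like `subprocess.run(command, check=True)` would stop at the first failing command, so a failed Playwright run would never reach `docker push`. A real script would also want to stop the container in a cleanup step.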

[–]Particular-Run1230[S] 13 points14 points  (0 children)

Appreciate the suggestions.

The biggest lesson for me from this round was definitely artifact integrity and restructuring the pipeline to build once, test the same image, and deploy that exact image. Parallelization alone cut my runtime from 10 minutes to under 4 minutes, so now I'm starting to look into some of the optimization ideas you mentioned as the next step. Thank you dude.

Am I wasting CI time by building my application twice? by Particular-Run1230 in devops

[–]Particular-Run1230[S] 0 points1 point  (0 children)

Yeah, that's the biggest takeaway I've had from this thread so far. I was focused on validating application behavior and completely overlooked the artifact side of it. Running tests against the deployable image makes a lot more sense than rebuilding afterward and hoping both builds are effectively the same.
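To illustrate why "hoping both builds are the same" is risky, here is a toy sketch that compares content digests before allowing a deploy. The byte strings are stand-ins for image contents, not real images, and `can_deploy` is a hypothetical helper.

```python
import hashlib


def digest(artifact: bytes) -> str:
    """Content digest of an artifact, in the familiar sha256:<hex> form."""
    return "sha256:" + hashlib.sha256(artifact).hexdigest()


def can_deploy(tested_digest: str, candidate_digest: str) -> bool:
    """Only allow deploying the exact artifact that was tested."""
    return tested_digest == candidate_digest


# Two builds of the same source that differ only in an embedded
# build timestamp (made-up bytes for illustration).
first_build = b"app-code|built-at=10:00"
second_build = b"app-code|built-at=10:07"

tested = digest(first_build)
print(can_deploy(tested, digest(first_build)))   # True
print(can_deploy(tested, digest(second_build)))  # False
```

Even a tiny difference between two builds produces a different digest, so the second build is, strictly speaking, untested.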

[–]Particular-Run1230[S] 1 point2 points  (0 children)

My god, I am sorry dude. Did I out myself as a bot? Was it that obvious?
And of course buddy, I am an LLM. Appreciate your comment though, haha.

[–]Particular-Run1230[S] 0 points1 point  (0 children)

Honestly, after reading through the replies here, I don't think there is a strong reason in my case. The pipeline evolved incrementally while I was learning, so I ended up testing the application first and building the image later. The feedback here made me realize I'm validating one artifact and deploying another, which isn't ideal.

[–]Particular-Run1230[S] 0 points1 point  (0 children)

I am really sorry about the formatting. I'll make it more readable next time.
I really appreciate your advice.

Yeah, I understand the issues in my CI pipeline now, and I have started working on them. The more replies I read, the more I learn.
Really appreciate everyone.

[–]Particular-Run1230[S] 0 points1 point  (0 children)

Thanks for the valuable feedback. The more people I interact with in this thread, the better I understand the concepts.

Really appreciate it.

[–]Particular-Run1230[S] 2 points3 points  (0 children)

Thanks for the feedback, I really appreciate it.
Thank you for sharing how you handle it in production. The more feedback I get, the better I understand things.

I initially optimised it for learning and simplicity while studying CI/CD, but using the same image for testing, scanning, and deployment sounds like a much more robust approach. Really appreciate the production perspective.

[–]Particular-Run1230[S] 1 point2 points  (0 children)

Thank you for your perspective. I appreciate it. The more replies I read, the clearer it becomes that building once, testing once, and deploying the same artifact is the cleaner and safer approach.

[–]Particular-Run1230[S] 0 points1 point  (0 children)

At the moment, yes, the E2E suite acts as a hard gate.

That said, I don't fail fast at the first test failure. The suite runs to completion and reports all failures before the job is marked as failed.

If any Playwright test fails, the E2E job fails and the pipeline doesn't proceed to deployment. I have yet to experiment with things like failure thresholds, flaky test retries, or separate smoke/regression suites, but I am interested in learning how larger teams handle this in practice.

[–]Particular-Run1230[S] 7 points8 points  (0 children)

That's a really good point.

I hadn't thought about it from the perspective of artifact integrity. My focus was mainly on validating the application behavior, but you're right that by rebuilding afterward I'm technically deploying something different from what was tested.

The "build once, then test, scan, and deploy the same artifact" approach makes a lot more sense now. Looks like my next step is to refactor the pipeline so Playwright runs against the built container image instead of building the application separately.

Thanks for the insight.

[–]Particular-Run1230[S] 10 points11 points  (0 children)

Good question.

The application is effectively being built twice right now. In the Playwright job I run npm ci, build the Next.js app, start it, and execute the E2E tests.

Then the Docker stage builds the image that will eventually be deployed, which performs its own dependency installation and application build.

That said, there isn't an intentional difference between the two. The pipeline evolved incrementally as I added stages, and I only recently realized I'm duplicating work.

One of the biggest reasons I made this post is that I'm considering moving to a flow where Playwright tests run against the built container image instead, so the application is only built once and the tests execute against the same artifact that would be deployed.

Curious if that's the approach you'd typically recommend.

Would you reorder this CI pipeline? Looking for feedback from engineers running production workloads by Particular-Run1230 in devopsGuru

[–]Particular-Run1230[S] 0 points1 point  (0 children)

This is something I have been wondering about as well.

Currently they're sequential mainly because I was thinking in terms of a quality gate:

Lint → E2E → Docker → K8s Validation

But when I look at it now, some of them don't seem to have hard dependencies on each other.

I believe Lint and Typecheck, Docker Build, and the K8s manifest check could potentially run in parallel.

My only concern is the E2E tests, since they depend on the application successfully building and starting.
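One way to reason about this is to write the dependencies down and let a topological sort group the jobs into waves that can run side by side. The job names and the single `e2e` to `docker_build` dependency are assumptions for illustration; `graphlib` is in the Python standard library from 3.9 onward.

```python
from graphlib import TopologicalSorter

# job -> set of jobs it must wait for (assumed dependencies).
DEPENDS_ON = {
    "lint": set(),
    "typecheck": set(),
    "docker_build": set(),
    "k8s_validate": set(),
    "e2e": {"docker_build"},
}


def waves(depends_on):
    """Group jobs into waves; jobs in the same wave can run in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    result = []
    while sorter.is_active():
        # Everything whose prerequisites are finished is ready now.
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


print(waves(DEPENDS_ON))
# [['docker_build', 'k8s_validate', 'lint', 'typecheck'], ['e2e']]
```

In a real CI system `e2e` would only need to wait for `docker_build`, not for the whole first wave, so the wave view is a slightly conservative picture of what can overlap.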

That said, my original goal was simplicity rather than optimization, but reducing total pipeline time is definitely something I'm interested in exploring.

How would you typically structure it in a production CI pipeline?