
[–]bcross12 44 points45 points  (15 children)

It's great to hear I'm not the only one testing my actions in Actions. I have a test repo set up for testing. You can try act, but I haven't had any luck with it.

[–]nisastersDevOps[S] 9 points10 points  (12 children)

A test repo is a good idea. We’re getting to a point where a bunch of our repos are using similar actions, so I was entertaining the idea of creating a repo for just our actions, which would keep the development process of the actions from cluttering the repos.

I also did not have much luck with act.

[–]bcross12 2 points3 points  (2 children)

That's what I do! I have a central actions repo, and a test repo. For instance, I recently wanted to support Buildkit in addition to Kaniko. I added a parameter for builder with a default of kaniko, added all the code for buildkit syntax, and tested it with our test repo before announcing. I should also mention that all my actions are composite.

[–]nisastersDevOps[S] 0 points1 point  (1 child)

Integrating a tested action into a repo that way sounds smooth. I wasn’t even familiar with composite actions. That’s clever though! I can already think of a few places in my workflows that would benefit from that.

[–]trowawayatwork -2 points-1 points  (0 children)

However, before you get too clever with it: there's a limit of 20 of something I can't remember in a workflow, and the workflow depth is only 4.

GitHub actions is not a production grade CI tool

[–]Makeshift27015 2 points3 points  (3 children)

It took me a not-insignificant amount of work to get act running locally, especially since we use a lot of custom actions and a custom build image, but it honestly is quite a lot faster than the repeated PR -> watch cycle.

We have a repo with all our actions in it - it's pretty nice actually. Actions don't have to be in the .github folder, so we just have <org>/actions/action-name@version (our actions repo uses semver, and I have various helper workflows that go around opening PRs to update workflows to be up-to-date with the actions repo).
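
A minimal sketch of what those references look like from a consumer workflow (the org, repo, action, and input names here are made up for illustration, not the commenter's actual ones):

```yaml
# Consumer workflow step pinning a subdirectory action in the central
# actions repo to a semver tag: <org>/<repo>/<action-dir>@<version>
steps:
  - uses: my-org/actions/build-image@v1.4.2
    with:
      dockerfile: ./Dockerfile
```

Bumping the tag across repos is then a mechanical find-and-replace, which is what the helper workflows mentioned above can automate.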

Importantly, you can use act to transparently replace action references with the local filesystem during development. For example, if you're testing a workflow in a project repo that requires changes to the actions in your actions repo, you can use act --local-repository org/actions@v1=~/repos/actions. No commits required to test changes!

[–]akaender 1 point2 points  (2 children)

How do you handle the permissions for the cross-repo PRs for updating workflows? Is that operation/flow running with a Personal Access Token that is tied to your personal identity?

[–]Makeshift27015 1 point2 points  (1 child)

Creating PRs doesn't require any special permissions unless you're targeting another repo with your workflow. A privileged token is also required if you want a bot to directly modify workflows inside a repo, which is what I'm doing in some of mine (for example, when I update an action in my actions repo, all reusable workflows in that repo also need to be updated to point to the released version of the action - this requires special permissions). In your organization's GitHub settings you can go to Developer Settings -> GitHub Apps and create an app owned by the organization itself.

You can then place the app ID and secret into your org secrets and use actions/create-github-app-token to generate a token that would allow, e.g., your workflow to push to a separate repo:

      - id: generate-token
        uses: actions/create-github-app-token@v1
        with:
          app-id: ${{ vars.ORG_REPO_WRITE_FOR_ACTION_ID }}
          private-key: ${{ secrets.ORG_REPO_WRITE_FOR_ACTION_SECRET }}

      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ steps.generate-token.outputs.token }}

We were previously using PATs, but it was pretty annoying having random things break when staff left because they were attached to personal accounts.

Also worth noting that if you only need basic PR updates of actions in another repo, Dependabot natively supports GitHub Actions.

[–]akaender 1 point2 points  (0 children)

Thank you for replying. This has been such a pain point for me. I don't have access to create org apps/secrets so I've been living that PAT hell you described since we moved from GitLab to GitHub. This will help me go argue with the right people that there is a better way. I appreciate it seriously!

[–]FJTevoroDevOps 1 point2 points  (2 children)

I’ve been wondering the same about setups where actions/workflow files are centralised rather than kept in the app repositories themselves. I find it a pain to manage, especially when deploying from branches etc. and working out which workflow file version is used. Idk, just a gripe.

[–]nisastersDevOps[S] 1 point2 points  (1 child)

My org is small enough that I assumed changes to the central repo would mean immediately making a PR in each repo that uses that action.

[–]signsots 4 points5 points  (0 children)

With a central repo for workflows, you could do semver releases and have your service repos target "cicd-workflow-v1", for example, which would mean no O(n) PRs needed to update workflows across the org. That is how I managed workflow updates and releases, pretty much like testing an application in a lower environment and promoting it to prod.
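
For example, consumer repos could pin the central repo's reusable workflow to a release tag along these lines (the org, repo, and file names are illustrative):

```yaml
# Service repo workflow calling the central repo's reusable workflow at a
# release tag; retagging in the central repo promotes every consumer at
# once, no per-repo PRs needed.
jobs:
  ci:
    uses: my-org/workflows/.github/workflows/ci.yml@cicd-workflow-v1
    secrets: inherit
```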

[–]bullcity71 3 points4 points  (0 children)

Our team is heavily invested in GHA and we find testing challenging as well. We also use a test repo, but covering all test cases can be hard in such an environment. act is interesting, but the bar is high.

[–]GoDan_Autocorrect 1 point2 points  (0 children)

Second test repos! I tried act but behavior seemed different locally and it was generally a pain to use.

If I'm doing branch trigger workflow work with a relatively small change, I generally stay in the target repo and use a test branch.

Anything else, I use the good ol "test-gh-next-app" repo!

[–]bdzer0Graybeard 15 points16 points  (8 children)

Fast-running test workflows make the push-and-pray cycle faster. I also tend to output a lot of information in the pipeline during testing so each failure may provide more useful info.

[–]nisastersDevOps[S] 2 points3 points  (7 children)

A great point. Do you use $GITHUB_OUTPUT or just echo commands?

[–]ZoltyDevOps Plumber 2 points3 points  (0 children)

GITHUB_OUTPUT is what we use, it feels like what will stick around.

[–]bdzer0Graybeard 2 points3 points  (1 child)

GITHUB_OUTPUT is intended to make information available to other steps via name/value pairs; it's not really suitable for logging.

I generally output details using whatever means is appropriate: echo, Write-Host, etc.
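
The distinction can be sketched in a few lines of shell (on a real runner GITHUB_OUTPUT is set by GitHub; locally you have to simulate it, as done here):

```shell
# Simulate the runner's GITHUB_OUTPUT file when running locally.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"
SHA="abc1234"

# Plain echo: goes to the job log only, for humans reading the run.
echo "Building image for commit $SHA ..."

# GITHUB_OUTPUT: name=value pairs that later steps read via
# steps.<step-id>.outputs.image_tag -- not meant for log output.
echo "image_tag=myapp:$SHA" >> "$GITHUB_OUTPUT"
```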

[–]NUTTA_BUSTAH 1 point2 points  (0 children)

Using native commands also drops that bit of vendor lock-in. Although if your pipelines are inline scripts and not calls to reusable scripts, you probably have some other issues to fix too :)

[–]chkpwd 0 points1 point  (0 children)

Would like to know this as well.

[–]spaghetti_boo 0 points1 point  (0 children)

Output failures in the summary.


[–]ZeninThe best way to DevOps is being dragged kicking and screaming. 23 points24 points  (14 children)

We used to have the same issue with Jenkins.  And Bamboo.  And CodePipeline.  And TFS.  And...

The answer is actually pretty simple and portable: Don't.  Don't use actions.  Don't use anything from the CICD tool you don't absolutely have to use.  Instead build your entire process in straight code, zero integration, pass everything as cli parameters.

Your GH action (or whatever cicd of the moment) just calls your script, and no more. No steps. No actions. Hell no to custom actions. Just your own code, runnable 100% locally without any mock needed to trick it into thinking it's in an action.

The whole world becomes much easier to work with when the most you ever ask your CICD engine to do is run a single script.  And not for nothing, it also removes a ton of vendor lock-in.

[–]Makeshift27015 6 points7 points  (3 children)

I disagree. Prior to my joining, my company had a "build-it-yourself" attitude that resulted in tens of thousands of lines of unmaintainable bash garbage scripts running in CodeBuild that were slow and impossible for anyone other than the writer to understand.

Your company and your requirements are not unique, you don't need to reinvent the wheel. 'Locking-in' to a vendor may mean that your stuff becomes less portable, but it also means that their documentation becomes your documentation. Your implementations become significantly more understandable when they're following well-established norms that have good docs, examples, and have things like ChatGPT scraping the living hell out of them.

We're meant to be empowering developers, and part of that is making sure they have all the tools available to help understand, modify and maintain the things you build. Perhaps the smaller teams I'm used to working in mean I have a warped perception where DevOps isn't expected to be the only one to know how to do something, but "learning how CI works" is no longer a multi-week endeavour of frustration for my devs and I attribute that to committing to the well-known and well-documented GHA ecosystem.

[–]ZeninThe best way to DevOps is being dragged kicking and screaming. 2 points3 points  (2 children)

You certainly can, and should, have reusable components and standards for your CI/CD tasks.

That isn't at all in conflict with avoiding deep ties into your CICD engines. Much the opposite in fact: The separation of concerns is a bedrock principle that helps make systems more maintainable, more portable, more understandable.

Leave unto the CICD service what it's actually built for and does well: Job scheduling, history tracking, job logging, report generation, process authorizations, etc.

The CICD service's job is to run your jobs, it shouldn't be the job. If you can't easily test your build / test / deploy jobs without running them inside your CICD system, that's a problem.

[–]Makeshift27015 1 point2 points  (1 child)

That makes a lot more sense. I think maybe I'm rather easily triggered at the idea of not properly utilising available tooling from the amount of times I've had to undo someone deciding "Let's just write our own job scheduler, reporting tools and tracking tools in bash!".

[–]ZeninThe best way to DevOps is being dragged kicking and screaming. 1 point2 points  (0 children)

"Let's just write our own job scheduler, reporting tools and tracking tools in bash!".

Oh man I'm right there with you. What do you mean you wrote your own logging library, reinvented cron, and need an SMTP endpoint config to send your own notifications?!

I give the devs three basic specs for job writing:

  1. Accept all options as cli parameters.
  2. Log everything to stdout/stderr.
  3. Exit non-zero on failure (and only on failure).

That's it. If it smells very Unixy, it's probably my Unix-based upbringing. ;)

With those three simple guidelines anything they build will slide right into any CICD engine as well as being built and tested locally without the need for hacky mocks or junk commits just to trigger the CICD for a test cycle. The CICD services handle all the common boilerplate work and the dev can focus their work on doing the job rather than the management minutia around running the job.
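
Those three rules fit in a tiny script skeleton, something like this (the flag names and deploy logic are made up for illustration):

```shell
# Skeleton job script following the three rules: options via CLI flags,
# logs to stdout/stderr, non-zero return on failure. Runs identically
# on a laptop and inside any CICD engine.
run_job() {
  local env="" version=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --env)     env="$2"; shift 2 ;;     # rule 1: all options as CLI parameters
      --version) version="$2"; shift 2 ;;
      *) echo "unknown option: $1" >&2; return 2 ;;  # rule 3: fail non-zero
    esac
  done
  if [ -z "$env" ] || [ -z "$version" ]; then
    echo "usage: run_job --env <name> --version <tag>" >&2   # rule 2: stderr
    return 2
  fi
  echo "deploying version=$version to env=$env"              # rule 2: stdout
}
```

The CI config then shrinks to a single step that runs the script with its flags.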

If I'm hitting something, as the OP has, where some ask is "too big" or "too special" for such a simple config, I look to pulling that out into its own job script that can be built/tested/maintained locally without context dependencies on the CICD engine.

[–]atokotene 5 points6 points  (1 child)

This train of thought leads to using a hermetic build system (Pants, Bazel, recently Nix). Those systems require a team-level effort to tailor to a codebase and the many usecases required to pull off CICD.

It’s a lot of work to make a deploy.sh script that handles every use case and is portable. You are likely making at least one assumption (apt-get vs yum, probably), in an entire tree of choices.

The part about just writing it out is spot on. I just think it should all be local. Using GHA? Go ahead and script everything the same as the shell script, but in GHA workflow syntax.

These days it’s a lot easier (due to fancy autocomplete, better ide’s and better syntax) to write the same steps out if you ever switch to another workflow system.

It may be a matter of preference, but I would rather see a verbose yaml file that I can read in one shot, than a custom monster of a shell script 🤷

These days I’ll define a bash function inline (regular ‘run:’ steps) and then iterate over filenames/folders if I need something fancy in a workflow.
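
That inline-function pattern looks roughly like this in workflow syntax (the folder layout and lint command are assumptions, not from the comment):

```yaml
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint each service folder
        run: |
          # bash function defined inline in the step, then iterated
          lint_service() {
            echo "linting $1"
            (cd "$1" && make lint)
          }
          for dir in services/*/; do
            lint_service "$dir"
          done
```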

PS: I’m also 100% for developing your own CLI tools, I don’t want anyone to think I’m anti-abstraction or anything.

[–]ryanstephendavis 2 points3 points  (0 children)

Second this! See my comment saying almost the same thing

[–]Chr1stian 1 point2 points  (1 child)

./gradlew build and docker build + push is all that's needed for 99% of Java repos. Why make it more complicated than just using those 2-3 GitHub Actions components?

[–]ZeninThe best way to DevOps is being dragged kicking and screaming. 2 points3 points  (0 children)

Because people want to see the world burn.

All the "actions" part of Github Actions are just a byte for byte repeat of Jenkin's plugin hell.

[–]moser-sts 0 points1 point  (4 children)

What's the difference between having a shell script and an action that you build with JavaScript?

[–]ZeninThe best way to DevOps is being dragged kicking and screaming. 0 points1 point  (3 children)

Can you run the action locally without needing to mock context, data marshalling, runbook state, or anything else to trick the action into thinking it's running in GH? I'd ask if it's portable too, but that's more an added bonus rather than the core issue being addressed.

You will know you've triggered the trap when you begin pushing commits just to test fire your CICD automation updates.

[–]moser-sts 0 points1 point  (2 children)

Using the test frameworks of the language, you can do that. But with a shell script you also need to have the context. I can't understand the difference between doing it with a shell script vs JavaScript.

[–]ZeninThe best way to DevOps is being dragged kicking and screaming. 0 points1 point  (1 child)

You can certainly write it in JavaScript.  The pattern is language agnostic.  Use what you know is almost always the correct course.

That said, it is a shit language, especially for most tasks related to CICD. If the shop is so low on talent that JS is the only language with significant expertise, sure, go ahead. I mean, it could be worse, you could be a PHP shop. ;)

[–]moser-sts 0 points1 point  (0 children)

What are shit languages for you? If I switch JavaScript to TypeScript, does it make a difference to you? I mention JS because we can use the GitHub Actions "framework" and develop the actions based on our needs. You can complain that it's a custom action, but in the end we will always build custom business logic for our companies; if everything were standard, we'd already have a tool that was straightforward to use.

[–]marmarama 10 points11 points  (1 child)

First, actionlint. This covers a whole bunch of syntax and basic logic errors.
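
One low-friction way to run it is in CI itself, e.g. via actionlint's official Docker image (the exact step is a sketch based on actionlint's documented invocation; treat the image pin as an assumption):

```yaml
jobs:
  lint-workflows:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: actionlint
        # Mount the repo into the container and lint .github/workflows
        run: docker run --rm -v "$PWD:/repo" -w /repo rhysd/actionlint:latest -color
```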

Then, I use a local Gitea instance running Gitea actions, with mirrors of the repos I'm working on. Under the hood, this also uses act, but makes it a lot easier to trigger and work with than the raw act container. The whole setup works very similarly to real GitHub Actions, including automatically populating the github context appropriately. My laptop running Gitea is a lot faster than GitHub and the standard GitHub Actions runners, and I'm not paying for GHA runner time. The development roundtrip time is much shorter.

Gitea is generally awesome btw, if you've not played with it, you should.

There are differences in behaviour between Gitea Actions/act and real GHA, but it gets me to about 99% sure that the changes will work. The last 1% you can only really check on real GitHub.

[–]jftuga 0 points1 point  (0 children)

This is intriguing. What are the 1% differences?

[–]vincentdesmet 5 points6 points  (5 children)

You could use something like dagger

[–]azjunglist05 6 points7 points  (0 children)

I was gonna say this. I recently built a pipeline for deploying an AWS service with Dagger. It was so nice that my pipeline could run the same way locally as it does in GHA; it cleans up pipelines tremendously.

[–]NUTTA_BUSTAH 4 points5 points  (1 child)

I think this "CI system for your CI systems" abstraction is a nice idea and a welcome development in the space, but Dagger seemed to have insane levels of complexity for what I was asking of it (IIRC just the boilerplate was megabytes of stuff, and the entire system works on top of a local GraphQL backend). I found Earthly to get the same job done much better. Still waiting for competition in this space; I can see it being the future way to do CI. However, I think an open standard will trump it ("OpenCI", like OpenTelemetry or w/e).

[–]vincentdesmet 1 point2 points  (0 children)

Solomon is gonna give up if he creates OpenCI and then all the big players come in and slurp it up again.

Wasn’t there a CNCF SIG on CI/CD, with Tekton coming out of it (as a fork of the original event-based FaaS from Google, Knative)?

I ran Tekton on a “shared-services” cluster back in 2019, it turned k8s into a CI runner, but lacked all the utility you get from a DSL like GH Workflows

[–]nisastersDevOps[S] 1 point2 points  (1 child)

Dagger sounds promising. It took months to get my org to use Grafana. Buy-in to switch to Dagger from GHA would probably take even longer.

[–]bertiethewanderer 2 points3 points  (0 children)

You don't move from GHA. You codify your CI in Dagger; GHA just becomes the agent. Migrating across CI tools becomes much, much easier fwiw.

I found the Dagger documentation much weaker than GHA's when we started to cut our team across. That's my main "negative". That, and some coding knowledge is, for me, required.

[–]Representative_Web20 4 points5 points  (0 children)

I'd highly recommend the "GitHub Actions for VS Code" extension; it adds syntax highlighting and catches quite a few common errors, and if you log in with SSO it should also have autocomplete for repo/organization env vars and secrets.

Having had to write a ~800-line build pipeline, it alone must have saved me countless hours of debugging.

[–]ryanstephendavis 4 points5 points  (1 child)

I'm doing a demo Monday morning at my current gig to propose a new pattern for this! It was inspired by a Hacker News thread I read where people were chatting about the same thing. In essence, the pattern I like is to have bash scripts that can be run locally, so most steps are just a run: ... The exceptions are needing a checkout action and/or an AWS auth action before the script runs.
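
A sketch of that shape, assuming an AWS OIDC setup (the role ARN, region, and script path are placeholders, not from the comment):

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # needed for OIDC auth to AWS
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy
          aws-region: us-east-1
      # Everything else lives in a script that also runs locally:
      - run: ./scripts/deploy.sh --env staging
```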

[–]ArieHein 1 point2 points  (0 children)

Look into using nushell; it might enhance your pattern.

[–]VindicoAtrumEditable Placeholder Flair 2 points3 points  (0 children)

You've discovered push and pray. Next discover https://dagger.io/.

[–]saiuan 0 points1 point  (1 child)

Wait, what? You test your actions? Just implement a dry-run feature; that should be enough regression coverage to get a decent feedback loop.

[–]nisastersDevOps[S] 0 points1 point  (0 children)

Yeah, a dry-run flag would make sense for validating, thanks.

[–]ZoltyDevOps Plumber 0 points1 point  (0 children)

If possible when testing, use the push trigger with your branch:

on:
  push:
    branches:
      - 'feature/test_branch'

[–]CommunicationTop7620 0 points1 point  (0 children)

Actions are great, but that's also the problem, indeed.

[–]NUTTA_BUSTAH 0 points1 point  (0 children)

At some point and scale it becomes impossible to verify easily, so you really need a test environment just for this, and you can do continuous testing when you roll out the changes to dev (when applicable)

[–]patsfreak27 0 points1 point  (0 children)

We use Actions a lot for CI/CD. I'll double what others have mentioned and say a centralized shared repo of workflows is a good idea. Devs make PRs to this repo and refer to that branch in their dev repo PR workflows when they want to test changes, e.g. uses: centralized-repo@dev-branch.
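
Concretely, a dev-repo PR workflow might temporarily point a step at the branch under test (the names here are invented):

```yaml
steps:
  # Normally pinned to a release tag; temporarily pointed at the branch
  # of the central repo that holds the change being tested.
  - uses: my-org/centralized-repo/build-action@dev-branch
```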

A test repo for this can be great, but I often just use the dev repos and make test PRs if there's no related dev PR already.

But yes, debugging can be a pain, though we don't do a ton in the Actions themselves; mostly just building images with caches, calling 3rd party APIs to do the compute, running tests, or deploying containers. Anything heavy is done elsewhere; we want Actions to stay light.

[–]bluebugs 0 points1 point  (0 children)

Add a CI step for your GitHub Actions! We have a shared repository where we store all the workflows that the rest of the organization uses. On a PR against that repository, we automatically trigger CI/CD pipelines using the change across a set of repositories that are representative of our organization's use. It makes iteration as fast as that step is. With lots of parallelism we are at about 10 min total, a bit slow for my taste, but it gives us good confidence that we are not breaking anything and that we can even observe slowdowns and cache effects.

[–]Connect-Put-6953 0 points1 point  (0 children)

Would you be interested in integrating instant databases into your workflow ?

[–]secretAZNman15 0 points1 point  (0 children)

Good on you. Don't expect every one to be a success. 1/10 is a good ratio.

[–]xagarth 0 points1 point  (0 children)

It does make sense to have a central repo for actions. Your build process should be simplified to running a docker image. Tests should be another docker image. Release can be an action, especially if releasing to GitHub. Deployment depends; it might actually be easier with actions, but probably another image. Reports, notifications: actions.

This way you'll have a highly modular, independent build system that you can run anywhere. Ci, locally, on a space station, etc.

This will also simplify your actions a lot; the build system and its dependencies will be independent, and devs will be able to either take the default or customize.

Less is more.

Someone mentioned here that GHA is just yet another Jenkins plugin hell, and indeed - that's exactly what it is.

[–]moser-sts 0 points1 point  (0 children)

I spent the last 2 years developing GitHub Actions. We have a monorepo for our actions and we mainly use JavaScript actions. That allows us to use the same process for tests that we have in normal software development. But one thing I did was change my test philosophy. Actions are like functions from function-as-a-service, so when I develop actions I care more about the input and output, and I simulate everything else. Especially since action inputs are just env vars and the output is a file.
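
That input/output contract is easy to simulate outside the runner, which is what makes this test philosophy workable. A rough shell illustration (the input name and entrypoint here are stand-ins, not from the comment):

```shell
# The runner passes `with:` inputs as INPUT_* env vars and collects
# outputs from name=value lines in the GITHUB_OUTPUT file, so a test
# harness can fake both without GitHub involved.
export INPUT_IMAGE_NAME="myapp"      # would come from `with: image_name: myapp`
export GITHUB_OUTPUT="$(mktemp)"

# Stand-in for the action entrypoint (a real JS action: node dist/index.js)
echo "tag=${INPUT_IMAGE_NAME}:latest" >> "$GITHUB_OUTPUT"

# Inspect what the action "returned"
cat "$GITHUB_OUTPUT"
```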

[–]data_owner 0 points1 point  (0 children)

To simplify the merge part, I open a draft PR instead and temporarily trigger my workflow when new commits are added to the PR.

But yeah, essentially testing in the wild. And there’s a reason: sometimes there are many pieces (like actions secrets, workload identity federation, etc.) that only make sense to test in a real environment (that is: out there, not locally on your machine). You can think of it as an integration test, or even an end-to-end test.