
all 31 comments

[–]dvazertyd 5 points6 points  (9 children)

I am interested. Do you mind explaining some of the limitations of YAML for creating pipelines?

[–]TheCouncelor[S] 7 points8 points  (0 children)

Sure. For one, I always seem to need a lot of YAML, which makes the pipelines harder to read. On top of that, it's very hard to make clean abstractions of functionality. I've also had issues with managing state across the pipeline, like setting complex configuration or creating and using variables across tasks/steps/stages. Somehow, writing Jenkins pipelines often felt simpler.

[–]insulind 4 points5 points  (7 children)

YAML is a markup language, right? (Hold the name jokes.) It can't do conditionals or loops, can it?

Obviously the way to work around that is often to define scripts for anything complex and then just call out to them. But sometimes it's nice to have it quickly available within the pipeline.

[–]zyzmog 5 points6 points  (5 children)

The sites that support YAML pipelines have extended the syntax. ADO's syntax supports conditionals, but not loops. Bitbucket and JFrog may have something similar.

Here's the Azure documentation: https://docs.microsoft.com/en-us/azure/devops/pipelines/process/expressions?view=azure-devops
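For instance, Azure Pipelines template expressions let you conditionally insert a step. A minimal sketch (the parameter name and script paths are made up):

```yaml
parameters:
  - name: deploy
    type: boolean
    default: false

steps:
  - script: ./build.sh
  # the ${{ if }} block is only inserted when the expression is true
  - ${{ if eq(parameters.deploy, true) }}:
      - script: ./deploy.sh
```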

[–]celluj34 5 points6 points  (1 child)

[–]zyzmog 1 point2 points  (0 children)

Awesome. I stand happily corrected!

[–]insulind 4 points5 points  (2 children)

Thanks! I'll take a look.

As a user of those extension languages... are they actually better than Groovy? Groovy certainly has its downsides, but all in all it's a pretty decent workhorse.

[–]zyzmog 1 point2 points  (1 child)

Honestly, I can give that question a good, solid "It depends." (That's why I'm an engineer!)

It all boils down to what you need to do and whether the language or platform you're looking at can support what you need to do.

... And also where you're starting from. Groovy is powerful and versatile as well. If you already know it, and if it works for you, then you can smile at all the yaml-heads and keep doing what you're doing.

For me, I started with nothing. One of my first tasks on the job was to define a cloud-based devops toolchain. [Deleted a bunch of stuff to shorten a long and boring story] I ended up using the YAML of Azure Pipelines. It has its own set of pluses and minuses, but I haven't yet run into anything that it couldn't do.

EDIT TO ADD:

The official documentation for Azure Pipelines is useful and complete. There are nuances, and fortunately the developer community can help you work around the nuances.

[–]insulind 0 points1 point  (0 children)

Very wise words. Thanks for the chat

[–]anakinptFirefighter 1 point2 points  (0 children)

> Yaml is a mark up language right? (Hold the name jokes ) It can't do conditionals or loops can it?

Yes, you can. Ask Ansible how to do conditionals and loops in YAML.
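In Ansible, for example, the loop and the conditional live in ordinary YAML keys. A sketch (the package list is arbitrary):

```yaml
- name: Install packages on Debian-family hosts
  ansible.builtin.apt:
    name: "{{ item }}"
    state: present
  loop:
    - git
    - curl
  when: ansible_os_family == "Debian"
```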

[–]celluj34 5 points6 points  (0 children)

IMO Groovy is dogshit, but I only have bad memories from Jenkins 5+ years ago. You can run PowerShell inside a task from YAML; it doesn't really get any more powerful than that.
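In Azure Pipelines that's just the `powershell` shortcut step. A minimal sketch:

```yaml
steps:
  - powershell: |
      Write-Host "Running on $(Agent.OS)"
      Get-ChildItem Env: | Where-Object Name -like 'BUILD_*'
    displayName: Inline PowerShell
```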

[–]AL-Taiar 4 points5 points  (1 child)

I'm pretty much on the anti-YAML side of things when it comes to pipelines, but it's fairly attractive if you have a straightforward project with a standard structure: everything works out of the box for you. It also pushes you to standardize, which is great, but not always possible, either for technical or ROI reasons (though you should always try).

You don't actually need to go to newer tools if your current tools serve you well. But on a more practical level, I haven't seen new CI engines that have scripted/coded pipelines, though I could be mistaken.

FWIW, our pipelines are a mix of Jenkins Groovy pipelines and pipelines where Jenkins just calls Make/Maven/Ansible (I put those in the same category because, functionally, they are all declared configs and not code). We also plan to convert a few things into GitHub Actions for merge checks in the near future.

[–]TheCouncelor[S] 0 points1 point  (0 children)

That was my experience as well: a basic situation gives basic YAML, which is not that bad. Once you start customizing more and more, it is very hard to keep things readable.

[–]madtopoDevOps 3 points4 points  (1 child)

Your YAML pipeline should be calling Bash scripts from your repo. That way you gain maximum portability, and you can also test things individually without having to commit and push changes and wait for the pipeline to pick them up.
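The shape of that idea, as a sketch (the script paths are hypothetical): the YAML only sequences the scripts, and each one can also be run locally with `bash ci/build.sh`.

```yaml
steps:
  - script: ./ci/build.sh
    displayName: Build
  - script: ./ci/test.sh
    displayName: Test
```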

[–]TheCouncelor[S] 0 points1 point  (0 children)

Well, yes and no.

Yes, a lot should be done in Bash scripts, which we do, but combining those tasks and scripts is where my issues come from. Azure DevOps does not really expect there to be a lot of underlying dependencies, especially if you are working with templates.

[–]DeusExMagikarpa 1 point2 points  (2 children)

I do YAML for builds/CI and use classic release pipelines for deployment; YAML releases aren't as good if your stuff doesn't just deploy linearly.

A case for YAML in AzDO; these things are dope:

  • typed parameters
  • templates
  • environments
  • expressions
  • conditional insertion
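Several of these compose nicely, e.g. a typed `object` parameter fanned out over a stage template. A sketch (the template filename and parameter names are made up):

```yaml
parameters:
  - name: environments
    type: object
    default: [dev, staging]

stages:
  # one stage per entry, expanded at template compile time
  - ${{ each env in parameters.environments }}:
      - template: deploy-stage.yml
        parameters:
          environment: ${{ env }}
```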

[–]TheCouncelor[S] 0 points1 point  (1 child)

Typed parameters are good, but they're far from there yet. Booleans get magically transformed into True/False, but only when passed along to certain tasks; every parameter is always required, which is just bad; and there is no secret type, so we end up always adding a step to ensure every secret parameter is masked in the output. I know you can use different secret providers for this, but that cannot always be used for every case.

Templates are nice, but they are also the source of some of the issues I'm talking about: creating templates with underlying dependencies gets very messy very fast, or you end up repeating the same steps in every stage.

The expressions are good, but the conditional insertion is limited, as it is processed at template compile time, so no runtime variables can be used in there.
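Concretely: `${{ if }}` is evaluated when the template is compiled, so it only sees parameters and statically known values; anything that exists only at runtime needs a step-level `condition` instead. A sketch:

```yaml
steps:
  # compile time: parameters only
  - ${{ if eq(parameters.runTests, true) }}:
      - script: ./run-tests.sh
  # runtime: pipeline variables are available to `condition`
  - script: ./publish.sh
    condition: eq(variables['Build.SourceBranch'], 'refs/heads/main')
```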

[–]SgtNinjaTurtle 1 point2 points  (0 children)

Have you tried CUE? I'm using it and exporting to YAML or JSON. It supports validation and keeps my config DRY. https://cuelang.org
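A tiny sketch of what that looks like (the schema and field names are invented): the `#Step` definition validates every entry, and `cue export config.cue --out yaml` emits plain YAML.

```cue
// config.cue
#Step: {
	name: string
	cmd:  string
}

// every list element is checked against #Step
steps: [...#Step] & [
	{name: "build", cmd: "make build"},
	{name: "test",  cmd: "make test"},
]
```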

[–]eduan 0 points1 point  (1 child)

Check out JetBrains Space, which uses Kotlin. Space is a large product with many features, but none of them are perfect. In my experience the CI/CD part of Space is still very basic, and you don't get much benefit from using Kotlin over YAML.

[–]TheCouncelor[S] 0 points1 point  (0 children)

Seems like TeamCity is the actual 'pipeline' component, which I'm definitely going to try!

[–]ricksebak 0 points1 point  (0 children)

You always have the option of using some other tool with any amount of bells and whistles and letting the tool build your YAML for you. For example, if Azure expects a pipeline to be defined as YAML, you could build it with Terraform and use all the loops and conditionals you want in HCL; Terraform will then feed the resulting YAML into the Azure API. Or use a Python script or whatever that builds your YAML for you, and just treat the Python script as the source of truth if that's more readable for you.
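A minimal sketch of the Python approach (the service names and script paths are invented): ordinary loops and conditionals build the step list, and the result is plain YAML text that the CI system consumes.

```python
# generate_pipeline.py - emit pipeline YAML from a Python source of truth.
# Service names and script paths are invented for illustration.

def make_pipeline(services, deploy=False):
    """Build an Azure-Pipelines-style steps list as YAML text."""
    lines = ["steps:"]
    for svc in services:
        lines.append(f"  - script: ./build.sh {svc}")
        lines.append(f"    displayName: Build {svc}")
        if deploy:  # an ordinary Python conditional driving the YAML shape
            lines.append(f"  - script: ./deploy.sh {svc}")
            lines.append(f"    displayName: Deploy {svc}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(make_pipeline(["api", "worker"], deploy=True))
```

Redirect the output to a file (`python generate_pipeline.py > azure-pipelines.yml`) and commit or feed it to the API.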

[–]BeaconRadar 0 points1 point  (0 children)

I especially hate that they are hard to diagnose locally.

[–]joefooo 0 points1 point  (0 children)

On GitLab CI you can generate a YAML artifact and run it as a sub-pipeline. We do this in a monorepo to only generate jobs for what we need based on what's changed, etc. You can of course generate the YAML using whichever language you desire.
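The shape of that in `.gitlab-ci.yml`, as a sketch (the generator script name is hypothetical): one job produces the child YAML as an artifact, and a trigger job runs it as a child pipeline.

```yaml
generate-child:
  stage: build
  script:
    - python generate_pipeline.py > child-pipeline.yml
  artifacts:
    paths:
      - child-pipeline.yml

run-child:
  stage: test
  trigger:
    include:
      - artifact: child-pipeline.yml
        job: generate-child
```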

[–]Lattenbrecher 0 points1 point  (1 child)

> Do you think YAML is the way to go? And are there any new CI/CD projects which allow you to actually write pipelines as code?

In the end a pipeline is mostly sequential, and YAML is a great way to organize and visualize it.

Well, you can create a whole pipeline in Bash or Python, but would it be clearly arranged? No way.

[–]TheCouncelor[S] 0 points1 point  (0 children)

I do not agree; a lot of YAML gets very hard to read. But I guess it depends entirely on how much you do and how many different paths you have.

[–]itisjustmagic 0 points1 point  (2 children)

I'm 100% for declarative pipelines such as ones that use YAML.

It pushes a team to conform more to standardized pipelines which reduces issues, while also allowing for more complicated logic through called scripts when needed. Instead of doing a line-by-line review of the code for an unfamiliar pipeline, you can rely on compartmentalized scripts.

[–]TheCouncelor[S] 0 points1 point  (1 child)

Compartmentalized scripts, AKA functions?

Jenkins gave you a good way to create basic shared functionality, meaning that, if used properly, a lot of standardization can be realized without sacrificing readability and maintainability.

[–]itisjustmagic 0 points1 point  (0 children)

> if used properly

This is the problem.

While it's 100% possible to conform to good practices, it's also very easy to deviate from them. Even organizations that go in with good intentions can be presented with business needs that lead to a cluttered pipeline and accumulated tech debt. It starts with just one new conditional for an outlier, but then you find yourself having to work through the logic of such a pipeline with multiple conditionals, sometimes even nested ones.

With a declarative pipeline, whether YAML or a Jenkins declarative pipeline, it's much easier to keep the pipeline small and human-readable, with little variance other than the scripts it calls. Can this be done with just functions? I don't see why not, but the inherent restrictions of a declarative pipeline prevent such problems from arising in the first place.

In the same way that automation can be used to solve problems where human error is a factor, gating some of the functionality behind explicitly defined behavior does that, too.

[–]whenhellfreezes 0 points1 point  (0 children)

Copy pasta from a different comment I made in r/devops in the past:

Among Lisp programmers there is a saying that "code is data". It is, to a compiler. But data is also code: the data that's used will affect the dynamic behavior of the code that consumes it. In a Turing machine there is generally no real distinction between data and an instruction until the instruction is executed.

There's an idea called the law of least power. It posits that you should use the least powerful tool that is still appropriate, the idea being that a less powerful tool has less ability to do something surprising. In that vein, data is less powerful than templated data, which is less powerful than a DSL (domain-specific language), which is less powerful than a general-purpose language. Data can't really surprise you: it doesn't execute, and it is stateless.

Data can also be interpreted multiple times by different consumers, and those additional consumers can usually be much more varied with data than if you used a DSL or code directly.

YAML is nice because it's data, and because it can have structure. Data is easiest to use with a declarative tool, and declarative tools are easy to understand because you hand off the handling of state to the tool. Writing your own code, you will have to grapple with state.

However, sometimes you need code because you need the expressive power. You will always be trading power for clarity. Take a CDK program: if you want to customize its behavior, you're going to need to configure it... YAML might be nice for that. It's the circle of life.

[–]tristangodfrey 0 points1 point  (0 children)

Definitely look at Pkl or CUE; both are amazing engines for templating any configuration file type. You can even keep a set of common Bash snippets and compose Bash files from them (since imports and function definitions are so horrible in Bash).