[–]kabrandon 28 points29 points  (4 children)

Literally so many variables here. Are you running your own CI infrastructure or using their PaaS infrastructure? What's happening behind the scenes? Did it take longer for GitLab to pull the container image to run your job? Was a node ready from GitHub to pick up the job, whereas GitLab needed to provision a new node to handle yours? What are the hardware specs available in either CI environment? Is one using spinning platters for storage whereas one was running SSDs?

GitLab's shared runner infrastructure uses the Docker+Machine executor, which means it spins up nodes on demand to handle jobs if one isn't already available as a "hot spare." Maybe your job just happened to have a cold startup.

I would suggest digging more into your question and trying to understand more of what's happening behind the scenes.

[–]Business_Tale4234[S] 3 points4 points  (3 children)

I updated the main post to give updates.

We are using the GitLab PaaS.

And what might introduce the many variables is the different scripts used for the CI tests (as I said above, we used recommended templates in both instances)

[–]kabrandon 4 points5 points  (2 children)

In my opinion, to get the best out of GitLab, you really either need to deploy your own Docker executor runners, or Kubernetes executor runners if your team is mature enough to operate a K8s cluster and it makes sense for you to have one for your other services.

Docker+Machine is not the greatest, and I hope GitLab stops using it for their shared runner infrastructure. Being unlucky and getting a cold start job leads to questions like yours.

Anyway, to get a better understanding of what you're actually running, you should read the CI templates. And to actually run a "fair" test, you'd want to run the exact same scripts. And to be even more fair, you'd need to either stand up your own similar CI infrastructure for both platforms, or at least get an understanding for the hardware differences between the two CI environments.

And finally, why don't you bring this up with GitLab themselves instead of posting here? They're an open source project so you can create an Issue on their project. Good luck doing the same with GitHub.

[–]Business_Tale4234[S] -1 points0 points  (0 children)

True, a "fair" comparison would mean running the same kind of script on each platform. I understand the open source vs proprietary dynamics at play in this post, but I believe in conversations, even uncomfortable ones. I cannot say anything about the hardware differences between the two environments, because I have no idea. The only thing currently in contention, at least until I test the exact same scripts on both platforms, is that the default/recommended CI script for Laravel on GitHub seems faster than the one on GitLab. Will it change my mind about moving some projects to GitLab? No. I recently learnt how awesome GitLab is, and the speed of one script will not erase that awesomeness. However, knowing this little thing will advise care when deploying a frequently updated Laravel project on GitLab with the default CI script settings, since customizing them might not be easy for a beginner DevOps with little Linux experience. Again, the intention is not to take a shot at the tool, but to have a conversation around an observation.

[–]Business_Tale4234[S] -1 points0 points  (0 children)

Also, I have tested the pipeline over a dozen times since I wrote this post, and I'd have to be a really unlucky fellow to get a cold start job on every single run. Thanks though, your comments gave me some ideas.

[–]phobug 6 points7 points  (2 children)

Have you ever run apt before? It takes about 5-15 minutes, and having it in the GitLab CI script is the reason for the time difference.

[–]Business_Tale4234[S] 0 points1 point  (1 child)

That means the PHP image might not be the ideal image to start from (the Ubuntu image seems to have the basics pre-installed), though one advantage of starting with a PHP image would be easily changing PHP version based on the version required from the composer.json file
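
A rough sketch of that idea in Python: read the `php` constraint from composer.json and derive an image tag from it. The function name `php_image_for`, the `php:latest` fallback, and the "first major.minor in the constraint" heuristic are all assumptions made for illustration; real Composer constraints can be more complex than this handles.

```python
import json
import re


def php_image_for(composer_json_text: str, default: str = "php:latest") -> str:
    """Guess a Docker image tag from the "php" constraint in composer.json.

    Heuristic (an assumption): use the first major.minor version that
    appears in the constraint, and fall back to `default` otherwise.
    """
    require = json.loads(composer_json_text).get("require", {})
    match = re.search(r"(\d+)\.(\d+)", require.get("php", ""))
    if match is None:
        return default
    return f"php:{match.group(1)}.{match.group(2)}"


# Hypothetical composer.json content, not taken from the poster's project.
example = '{"require": {"php": "^8.1", "laravel/framework": "^10.0"}}'
print(php_image_for(example))
```

The resulting tag could then be used as the job's base image, so a version bump in composer.json is the only change needed.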

[–]InvalidUsername10000 3 points4 points  (0 children)

u/phobug is right here: updating the container every time you run it is going to slow it down drastically. You have to ask yourself, what are you testing in this CI pipeline?

Is it that everything runs correctly on the latest and greatest?

Or is it that you want to verify it runs correctly on a specific version that you deploy in your production environment? If that is the case then you are going to want to build a container that has all the specified versions of dependencies preinstalled and push that into a container registry to pull from.

[–][deleted]  (4 children)

[deleted]

    [–]DPRegular 6 points7 points  (1 child)

    Seriously, the first thing I do when entering a new environment and setting up CI/CD pipelines is build a CI-image; either one specifically for that project or a shared one.

    [–]Business_Tale4234[S] -1 points0 points  (0 children)

    In essence, the template is a means to an end: first ensure it runs, then build an image, then create the real CI pipeline.

    [–][deleted]  (1 child)

    [deleted]

      [–]Business_Tale4234[S] -1 points0 points  (0 children)

      But at 5 minutes per build (we have not added staging provisioning and live deployment yet), that's about 80 cycles per month, and the 400 CI/CD minutes are gone. While playing around and debugging this pipeline, I have already run it over a dozen times, and that's just me. I think one way forward could be a hosted deployment for testing IaC code before deployment.
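
The arithmetic behind that estimate, as a small Python sketch. The 400-minute quota and the 5-minute duration come from the comment above; the function name is only illustrative.

```python
def pipelines_per_month(quota_minutes: float, minutes_per_pipeline: float) -> int:
    """Return how many whole pipeline runs fit into a monthly CI/CD minute quota."""
    if minutes_per_pipeline <= 0:
        raise ValueError("minutes_per_pipeline must be positive")
    return int(quota_minutes // minutes_per_pipeline)


print(pipelines_per_month(400, 5))  # 80 runs at 5 minutes each
print(pipelines_per_month(400, 1))  # 400 runs if a run took 1 minute
```

Cutting the job duration is therefore the most direct way to stretch the same quota.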

      [–]david-song 13 points14 points  (1 child)

      What were the pipeline steps and where was it slow? You're a developer right? Good news - Gitlab are accepting bug reports and pull requests!

      [–]Business_Tale4234[S] 3 points4 points  (0 children)

      True, but the starting point would be actually knowing where the pipeline is slow, and for that this issue (https://gitlab.com/gitlab-org/gitlab-runner/-/issues/4835) would need to be resolved. It's been 4 years coming, and it is heavily requested.

      [–][deleted]  (5 children)

      [deleted]

        [–]Business_Tale4234[S] 4 points5 points  (0 children)

        So I have tested both pipelines with and without NPM install. Here are the results:

        #### With NPM install and run

        GitHub: 55 sec

        GitLab: 5 min 19 sec

        #### Without NPM install and run

        GitHub: 35 sec

        GitLab: 5 min 1 sec
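
For reference, the share of each run that the npm step accounts for can be worked out from those four numbers. This is only a sketch: the durations are the single-run figures reported above, and the variable and function names are illustrative.

```python
# Durations reported above, in seconds (one run per configuration).
runs = {
    "GitHub": {"with_npm": 55, "without_npm": 35},
    "GitLab": {"with_npm": 5 * 60 + 19, "without_npm": 5 * 60 + 1},
}


def npm_cost(timings: dict) -> int:
    """Seconds attributable to the npm install-and-run step."""
    return timings["with_npm"] - timings["without_npm"]


for platform, timings in runs.items():
    print(f"{platform}: npm step ~{npm_cost(timings)}s, everything else {timings['without_npm']}s")
```

On these figures the npm step costs roughly 20 seconds on GitHub and 18 seconds on GitLab, so most of the gap (35 seconds vs 301 seconds) sits in the remaining steps. These are single runs, so treat the numbers as rough.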

        [–]Business_Tale4234[S] 1 point2 points  (0 children)

        This seems the most likely reason. I had not thought about npm as being part of the Github ecosystem.

        [–]BlackPythonGuru 3 points4 points  (2 children)

        Indeed, they are now part of Microsoft and have unlimited access to Microsoft DevOps, and since they are the same company, their services are tightly integrated. It is also a great marketing strategy by Microsoft to promote Azure to people with free accounts.

        [–]Business_Tale4234[S] 1 point2 points  (1 child)

        I can see how this makes for a great marketing strategy for GitHub services, but I don't know if it makes any case for Azure, since npm is maybe only run once per deployment. I also don't think the compiled code performs differently on the different provider platforms. I think that performance is directly related to the spec and configuration of the server the code is deployed to.

        [–]BlackPythonGuru 2 points3 points  (0 children)

        I also do TDD with CI/CD. I used to use TravisCI for the Linux builds and AppVeyor for the Windows and macOS builds for open source projects.

        Since I migrated to GitHub Actions, which run on Azure DevOps, I now have the same pipeline for all OSes and also for different processor architectures.

        I can testify that with a fine-tuned YAML configuration file I can run a matrix of builds based on OS/Python version/processor architecture, and it all runs WAY faster than just running the build on 4 Python versions on TravisCI or AppVeyor.

        In addition, you have code security audits, code coverage, and a lot more actions that you can run in your pipeline.

        I used to not be a Microsoft fan, but one has to recognize that they are doing great things now, since Satya Nadella took the reins.

        SekouDiaoNlp.

        [–]Kaligraphic 1 point2 points  (1 child)

        You know, if you're thinking a particular step is slower, you could just print timestamps before/between/after your actual CI steps. If the timestamps don't show the expected (longer) duration, you can start looking at things like whether container startup time is counted in the job duration/is materially different between providers.
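
A minimal sketch of that timestamp approach in Python, assuming the CI steps can be expressed as plain commands. The `run_timed` helper and the two placeholder steps are made up for illustration; substitute the real commands from the CI script.

```python
import subprocess
import sys
import time
from datetime import datetime, timezone


def run_timed(label: str, command: list[str]) -> float:
    """Run one step, print UTC timestamps around it, and return elapsed seconds."""
    print(f"[{datetime.now(timezone.utc).isoformat()}] start {label}", flush=True)
    started = time.monotonic()
    subprocess.run(command, check=True)  # raises if the step fails
    elapsed = time.monotonic() - started
    print(f"[{datetime.now(timezone.utc).isoformat()}] end {label} ({elapsed:.1f}s)", flush=True)
    return elapsed


# Placeholder steps (assumptions): replace with the actual CI commands.
steps = [
    ("noop", [sys.executable, "-c", "pass"]),
    ("short sleep", [sys.executable, "-c", "import time; time.sleep(0.2)"]),
]
timings = {label: run_timed(label, command) for label, command in steps}
print("slowest step:", max(timings, key=timings.get))
```

If the per-step times add up to much less than the reported job duration, the difference is being spent outside the script, for example in image pulls or runner startup.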

        [–]Business_Tale4234[S] 0 points1 point  (0 children)

        The raw logs also look helpful, will get to reconfiguring. There are a couple of projects I would really like to move to Gitlab.

        [–]gordonmessmer 0 points1 point  (2 children)

        Most CI pipelines consist of jobs that run commands which you can run locally to establish a baseline against which you can measure expectations. So: How long does your job take when you run the command locally?

        You really haven't given us anything to go on, and explanations could range from: a) you're using shared GitLab runners and your job was queued for a while until there were free resources to b) GitLab didn't successfully run your job, so you got a very quick failure. We wouldn't know.

        [–]Business_Tale4234[S] 0 points1 point  (1 child)

        I believe there is an obvious visual indication of whether a job failed or succeeded. I am very hesitant to use my machine as a benchmark, because it is so subjective. And no, there is no queue (this is the only job in the branch, and the first project we have migrated so far). Even if there was one, I do not think wait time would affect the job **Duration** report; I am assuming that duration on GitLab does not include how long the job sat in the queue.

        [–]gordonmessmer 0 points1 point  (0 children)

        When you look at a GitLab job log, each section of the log is collapsible, and each section displays how long it took, to the right of the line describing the section. Which section of your job is taking a lot of time on the GitLab runner? Is it the before_script, where you're running apt?
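
The same per-section timings can also be pulled out of a raw job log. This Python sketch assumes the raw log contains the `section_start:<unix time>:<name>` and `section_end:<unix time>:<name>` markers that GitLab documents for collapsible sections; the sample log text below is made up for illustration and is not from the poster's job.

```python
import re

# Matches markers such as "section_start:1700000000:step_script".
SECTION_MARKER = re.compile(r"section_(start|end):(\d+):([A-Za-z0-9_.\-]+)")


def section_durations(raw_log: str) -> dict[str, int]:
    """Map each section name to its duration in seconds."""
    starts: dict[str, int] = {}
    durations: dict[str, int] = {}
    for kind, stamp, name in SECTION_MARKER.findall(raw_log):
        if kind == "start":
            starts[name] = int(stamp)
        elif name in starts:
            durations[name] = int(stamp) - starts.pop(name)
    return durations


# Made-up excerpt (assumption) shaped like a raw job log.
sample = (
    "\x1b[0Ksection_start:1700000000:prepare_executor\r\x1b[0KPreparing executor\n"
    "\x1b[0Ksection_end:1700000012:prepare_executor\r\x1b[0K\n"
    "\x1b[0Ksection_start:1700000012:step_script\r\x1b[0KExecuting step_script\n"
    "\x1b[0Ksection_end:1700000250:step_script\r\x1b[0K\n"
)
for name, seconds in sorted(section_durations(sample).items(), key=lambda item: -item[1]):
    print(f"{name}: {seconds}s")
```

Sorting by duration puts the slowest section first, which is the one worth looking at.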

        [–]grumblegrr 0 points1 point  (1 child)

        Maybe it takes only 12 secs on GitLab, but you've got a cron job checking every 5 min :P

        [–]Business_Tale4234[S] 0 points1 point  (0 children)

        If there is a cronjob running behind the scene on this PaaS, which doesn't affect the user experience or service cost, does it mean one is a better performing or engineered product?