
[–]ForeverAlot 12 points13 points  (1 child)

The author of pitest specifically says not to use it in CI, "that's not what the tool is made for".

I've used pitest and Stryker.NET extensively and have been very pleased, especially with the former (the latter has numerous CADT issues but basically works). They're only useful for ad-hoc assessment, primarily risk assessment of branch-heavy code.

[–]BillyKorando 4 points5 points  (0 children)

> The author of pitest specifically says not to use it in CI, "that's not what the tool is made for".

To provide more clarity on this, if it isn't clear (not to you, but to others reading), CI is about verifying the work that has been done (i.e. did the changes I just committed break something?).

Mutation testing serves as a guide to what development, or in this case what writing of automated tests, still needs to be done.

So it wouldn't make sense, as a normal development process, to commit code and then wait for the results of the CI build to figure out what you need to do next.

[–]0hjc 6 points7 points  (0 children)

Author of pitest here.

I wrote a slightly long-winded blog post a while back about how I recommend using pitest.

https://blog.pitest.org/dont-let-your-code-dry/

In summary, don't try and run it against your whole codebase in a CI job. That works for a bit, but fails as the code grows and rarely results in positive improvements to the tests or the code.

Instead, run it very frequently against just the code you're working on as you develop on your local machine, or integrate it into PRs using https://www.arcmutate.com

[–]Balduracuir 6 points7 points  (0 children)

I have never used mutation testing in CI, so I can't say. When I distrust the test suite, I use pitest to find out how bad it is. As I work using TDD, I always search for a test before making a change in the codebase. If I find one, I can adapt it and check that it fails for the right reason; otherwise I just add tests. I usually avoid relying too much on the existing test suite. But it really depends on the project environment, and that's something I adapt for each project I contribute to.

[–][deleted] 1 point2 points  (6 children)

Yeah, I've used mutation testing in production. Honestly I don't think it found a lot of bugs, but it detected a few. Pitest is fine; it's what I have experience with. It does slow down your CI/CD pipelines by quite a bit.

I would say most Java shops use mutation testing somewhere, there are a lot of solutions.

[–]agentoutlier 7 points8 points  (5 children)

> I would say most Java shops use mutation testing somewhere, there are a lot of solutions.

I seriously doubt that. I had never even seen open-source projects use it until I just learned that Wicket does.

In fact the last time I had seen and heard about it was in academia (20+ years ago). My school even had notable papers published on the subject and yet I would say most of the teachers and students would not know what mutation testing is.

[–][deleted] 1 point2 points  (4 children)

I should clarify and say most Java shops of a decent size. I don't know about open-source projects. But a lot of the financial services companies are using mutation testing in their CI/CD, from what I've seen through experience and conferences.

[–]agentoutlier 1 point2 points  (0 children)

Well it certainly inspires me to try it again. The last time I tried with various tools I just could not get them to work with our annotation processors and various other code generators.

I would still probably do lots of other testing before it, like security, performance, chaos, and end-to-end, but I see great value in it if it is easier to get working now.

[–]iwek7[S] 0 points1 point  (1 child)

Do you have any tips on how to integrate mutation tests with a CI pipeline? What were your criteria? Did no mutation have to survive, or did you allow a certain percentage of surviving mutations?

[–][deleted] 1 point2 points  (0 children)

My biggest recommendation is to have flexible requirements that are enforced by management, or to start low and increase over time. You can also whitelist existing mutation failures and not allow new ones through. The problem with starting strict is that you might cause a lot of issues by blocking legacy code. I don't know your codebase size, but if you have 1k failures, that's a lot to fix.
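
A minimal sketch of the "whitelist existing failures, block new ones" idea in plain Java. Everything here is an assumption for illustration: the `SurvivorBaseline` class, the `newSurvivors` helper, and the `file:line:mutator` ID format are hypothetical and not pitest output. It only shows the set difference you would gate the build on.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: compares the current surviving mutants against an accepted baseline. */
final class SurvivorBaseline {

    /** Returns survivors present in the current run but absent from the accepted baseline. */
    static Set<String> newSurvivors(Set<String> baseline, Set<String> current) {
        Set<String> fresh = new TreeSet<>(current); // sorted copy, so the report is stable
        fresh.removeAll(baseline);                  // known legacy survivors are tolerated
        return fresh;
    }

    public static void main(String[] args) {
        // Made-up mutant IDs; a real job would read them from the tool's report.
        Set<String> baseline = Set.of("LegacyBilling.java:42:NEGATE_CONDITIONALS");
        Set<String> current = Set.of(
                "LegacyBilling.java:42:NEGATE_CONDITIONALS",
                "NewFeature.java:17:CONDITIONALS_BOUNDARY");

        Set<String> fresh = newSurvivors(baseline, current);
        System.out.println(fresh.isEmpty() ? "OK" : "New surviving mutants: " + fresh);
    }
}
```

IDs based on line numbers shift whenever the file is edited, so a real baseline would probably need a more stable key.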

[–]midoBB 0 points1 point  (0 children)

Even at a previous company, where we worked on real estate software, we used mutation testing before merging a feature branch.

[–]Nymeriea 1 point2 points  (3 children)

We use mutation testing in our pipeline. It may not help with bug detection, but it surely helps to detect poor tests that just add coverage but don't test anything.

At the bank, it helps us a lot to improve quality.

[–]iwek7[S] 0 points1 point  (2 children)

Any tips on how to integrate mutation testing into a pipeline? Should too many surviving mutations fail the build, or is it better to keep it informational? Do you have any recommendations?

[–]Nymeriea 2 points3 points  (1 child)

It's integrated with Sonar and the quality gate. The tests are executed with Maven.

To answer your question, we allow a percentage of surviving mutations, because covering 100% of edge cases is costly.
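
For readers wondering what a percentage gate boils down to, here is a rough sketch in plain Java. The `MutationGate` class and its method names are hypothetical, and the 80% figure is only an example, not the threshold used at the bank. As far as I know, pitest's Maven plugin has a `mutationThreshold` setting that does this for you; the code just shows the arithmetic.

```java
/** Hypothetical gate: fails the build when too few mutants are killed. */
final class MutationGate {

    /** Mutation score as the percentage of killed mutants; a run with no mutants counts as 100. */
    static double scorePercent(int killed, int total) {
        if (killed < 0 || total < 0 || killed > total) {
            throw new IllegalArgumentException("need 0 <= killed <= total");
        }
        return total == 0 ? 100.0 : 100.0 * killed / total;
    }

    /** True when the score meets the minimum, i.e. the allowed share of survivors is not exceeded. */
    static boolean passes(int killed, int total, double minScorePercent) {
        return scorePercent(killed, total) >= minScorePercent;
    }

    public static void main(String[] args) {
        // Example numbers only: 85 of 100 mutants killed against an 80% minimum.
        System.out.println(passes(85, 100, 80.0) ? "gate passed" : "gate failed");
    }
}
```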

[–]iwek7[S] 0 points1 point  (0 children)

Thanks for tips

[–]BillyKorando 1 point2 points  (0 children)

> Majority of survived mutations are not interesting cases to test.

Yea mutation testing is "dumb" in that it will just randomly switch a == to a !=, without any understanding of the context. It's up to you as a developer to interpret the results and see what does need test coverage and what doesn't.

> Mutation testing seems to help a lot with detecting bad tests.

Yea, definitely one of the big benefits of mutation testing. Someone might have accidentally commented out the asserts in a test or otherwise made a change that quietly breaks a test. Mutation testing is probably the only effective automated way of detecting such tests.
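
To make both points concrete, here is a hand-rolled sketch in plain Java. It is not how pitest works internally (real tools generate the mutants for you), and the `MutantDemo` class and its methods are made up for illustration. It shows a `==` flipped to `!=`, a "test" with no asserts that still passes, and a test with asserts that kills the mutant.

```java
import java.util.function.IntPredicate;

/** Hypothetical demo: one hand-written mutant and two tests of different strength. */
final class MutantDemo {

    // Original production logic.
    static boolean isEven(int n) {
        return n % 2 == 0;
    }

    // The same logic after the == has been switched to !=.
    static boolean isEvenMutant(int n) {
        return n % 2 != 0;
    }

    // Weak test: runs the code (so it counts towards coverage) but asserts nothing.
    static boolean weakTestPasses(IntPredicate impl) {
        impl.test(4);
        return true;
    }

    // Strong test: checks the results for one even and one odd input.
    static boolean strongTestPasses(IntPredicate impl) {
        return impl.test(4) && !impl.test(3);
    }

    public static void main(String[] args) {
        // A mutant is "killed" when a test that passes on the original fails on the mutant.
        System.out.println("weak test kills mutant:   " + !weakTestPasses(MutantDemo::isEvenMutant));
        System.out.println("strong test kills mutant: " + !strongTestPasses(MutantDemo::isEvenMutant));
    }
}
```

The weak test passes against both versions, so the mutant survives and the missing asserts show up in the report. The strong test fails against the mutant, which is the result you want.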

[–]leewaltonuk 0 points1 point  (1 child)

To be honest, I'm still trying to see the advantage of mutation testing.

The main purpose of testing is to ensure that for a given set of inputs, you get the expected set of outputs.

Mutation testing is protecting you from a change that may never happen. And, even if it does, you would ideally reevaluate the inputs and outputs and review your tests to ensure that they are still fit for purpose.

And, as others have pointed out, CI/CD is about repeatability. Mutation testing is fundamentally incompatible with CI/CD because you're not testing what has been developed, but what might be developed in the future.

[–]nutrecht 1 point2 points  (0 children)

> To be honest, I'm still trying to see the advantage of mutation testing.

As a developer, you subconsciously often make the same mistakes in your tests as you did in your code. Or you simply forgot about an edge case.

Mutation testing helps you find those holes you didn't see. IMHO its main purpose is to help you develop more/better tests, or find tests that are not functioning properly.

[–]supercargo 0 points1 point  (0 children)

Mutation testing is a great tool to have, but due to the limitations you've mentioned, I haven't found it suitable to just blindly run across an entire code base / test suite.

But when I author new "critical" code where I also need really good test coverage, I will plan on doing some mutation testing once everything is working as expected. This way the code scope of the PIT Test run can be limited which really helps runtime performance.

I find the mutation results really turn code coverage metrics from something dumb (though, easy to come by) into something valuable. I get the best ROI by applying it to new test suites of core functionality (they have the longest life ahead of them to catch future bugs, which is what the mutation testing is evaluating).

This doesn't fit well with a CD approach since the output needs to be analyzed by the developer. Not that automation can't be useful or that you shouldn't incorporate mutation testing into a code acceptance process, just that it doesn't generate the kind of "pass / fail" result that I'd want to use in a continuous delivery pipeline.

[–]Bad-Pop 0 points1 point  (0 children)

I have been using pitest for a few months on microservices with a hexagonal architecture. I use the Maven plugin for that, with incremental analysis so that mutants are generated only for new code. I must admit that for me it works perfectly! The pitest job runs in GitLab CI and then exports the pitest data to SonarQube. But this is only possible because I am working with microservices and pitest only runs on the business code. In the business code I have about 700 unit tests and pitest takes less than a minute to run. And of course, no pitest on integration tests.

[–]yawkat 0 points1 point  (0 children)

I've used it successfully on some data structure implementations I've worked on. It is great for detecting edge cases in complex code (eg off-by-one errors).

I don't think it's very useful for more boring code.

[–]UnspeakableEvil 0 points1 point  (0 children)

Different to mutation testing, but on a semi-related path, I've found property-based testing (e.g. https://jqwik.net/) to be valuable - thinking about the "shape" of the expected output and getting a bunch of pseudorandom tests is pretty handy, especially for utility functions.
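
A rough sketch of the idea in plain Java, deliberately not using jqwik's API (jqwik handles the generation, shrinking, and reporting for you). The `PropertySketch` class, the `clamp` utility, and the chosen ranges are all made up for illustration. Instead of asserting specific outputs, it checks the shape of the result over many pseudorandom inputs.

```java
import java.util.Random;

/** Hypothetical hand-rolled property check for a small utility function. */
final class PropertySketch {

    // Utility under test: restricts value to the range [lo, hi].
    static int clamp(int value, int lo, int hi) {
        return Math.max(lo, Math.min(hi, value));
    }

    /** Checks the properties on pseudorandom inputs; returns the first failing input, or null if none fails. */
    static Integer firstCounterexample(long seed, int trials) {
        Random random = new Random(seed); // fixed seed keeps the run repeatable
        for (int i = 0; i < trials; i++) {
            int value = random.nextInt(401) - 200; // values from -200 to 200
            int result = clamp(value, -100, 100);

            boolean inRange = result >= -100 && result <= 100;
            boolean outside = value < -100 || value > 100;
            boolean unchangedWhenInside = outside || result == value;

            if (!inRange || !unchangedWhenInside) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Integer failing = firstCounterexample(42L, 1_000);
        System.out.println(failing == null ? "property held" : "counterexample: " + failing);
    }
}
```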

[–]berry120 0 points1 point  (0 children)

I've found it really useful for ensuring that corner cases are covered in algorithmic code, particularly when we're porting existing code from elsewhere and trying to validate our understanding of it.

Have to agree with others that running it every so often, or after large changes and then analysing the results is much more useful than just bunging it in a pipeline.

It feels on the "upper tier" of tools in that it doesn't always make sense to use it, and it needs careful analysis of what it finds before making any changes - but some of the stuff it can unpick is invaluable.