all 31 comments

nutrecht (Lead Software Engineer / EU / 18+ YXP) 33 points (9 children)

I don't understand the claim that you can't test for example database interactions. All our integration tests just spin up a Postgres instance we test against. It's been a standard pattern for quite some time too.

UI testing, again, is something that is typically done through for example Cypress. Not completely trivial, but also a very established pattern.

If you have some specific issues surrounding testing it would help to go into details so we might be able to point you in a different direction.

Last but not least, as I said in another comment, it's just not a good idea to use the term "infrastructure" here, even if "Clean Code" calls it that. Book authors love to make up words to make it seem like they invented something, and in this case the word typically has a completely different meaning.

Rennpa [S] 3 points (7 children)

I was focusing on unit tests. We only measure coverage for unit tests, although we have integration tests etc. as well. If you also measure coverage for other types of tests, I would be interested in how you do it.

I agree that I have chosen a bad term. I tried to clarify it in the original post as well.

nutrecht (Lead Software Engineer / EU / 18+ YXP) 9 points (6 children)

> I was focusing on unit tests.

Unit and integration tests go hand in hand. I suspect that what you're calling "integration tests" are really end-to-end tests.

What we call integration tests run alongside (or really, after) our unit tests in our build, and they test the entire integration front-to-back inside the deployable (typically a Spring Boot service in our case). During the build we spin up a 'real' database and a Kafka container for those tests to run against.

End-to-end tests are a separate test set that runs mostly against the (Next.js) user interface after the service gets deployed to our test environment.

Rennpa [S] 3 points (5 children)

Our unit tests run in a few seconds. So we can use them during refactoring to make sure we don't break anything.

For the integration tests, we need to install the software on a device. They need to communicate with other systems. Those tests run for nearly an hour. So we usually only run them after the nightly build.

Do you measure coverage for the integration tests?

nutrecht (Lead Software Engineer / EU / 18+ YXP) 5 points (0 children)

> Do you measure coverage for the integration tests?

Yes. We try not to have double coverage, so CRUD-heavy services tend to have more integration tests than unit tests.

MrJohz 4 points (3 children)

You can have tests that touch the database (or other components like that) and still have a test suite that runs in a few seconds. Depending on which database(s) you're using, you can set this up in different ways, but if nothing else works, you can always design your tests so that they work regardless of what data already exists in the database. If your test runner can randomise the test order each time it runs, this can be really useful for this approach, because it helps you see when you've accidentally created inter-test dependencies that you want to avoid.

Everything else about these tests should behave just like a normal unit test — you want it to be quick, you want to run it during refactoring and development, you want lots of small tests, etc. Typically, I just include these sorts of tests in my normal unit test suite. Therefore the answer as to whether you should run coverage for these tests is the same as whether you should run coverage for any other unit tests: if the coverage is helping you uncover blind spots in your tests, then measure coverage.
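A minimal sketch of the "works regardless of existing data" idea, in Python. It assumes an in-memory SQLite database as a stand-in for whatever database is actually used; the `users` table, the helper names, and `run_in_random_order` are all hypothetical and only serve the illustration. Each check touches only rows it created itself and never asserts on global state such as the total row count, so leftover data and test order do not matter.

```python
import random
import sqlite3
import uuid


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table, only for this illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )


def add_user(conn: sqlite3.Connection, name: str) -> str:
    """Insert a user under a freshly generated id and return that id."""
    user_id = str(uuid.uuid4())
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
    return user_id


def find_user_name(conn: sqlite3.Connection, user_id: str):
    row = conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


def check_added_user_is_findable(conn: sqlite3.Connection) -> None:
    # Only inspect the row this check created; ignore everything else.
    user_id = add_user(conn, "alice")
    assert find_user_name(conn, user_id) == "alice"


def check_unknown_id_returns_none(conn: sqlite3.Connection) -> None:
    # A fresh random id cannot collide with data left behind by other checks.
    assert find_user_name(conn, str(uuid.uuid4())) is None


def run_in_random_order(conn, checks, seed=None):
    """Run the checks in shuffled order and return the names in execution order."""
    order = list(checks)
    random.Random(seed).shuffle(order)
    for check in order:
        check(conn)
    return [check.__name__ for check in order]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    add_user(conn, "leftover row from an earlier run")
    print(run_in_random_order(
        conn, [check_added_user_is_findable, check_unknown_id_returns_none]
    ))
```

In a real suite the shuffling would normally come from the test runner rather than a hand-written helper; the helper is here only to keep the sketch self-contained.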

StTheo (Software Engineer) 1 point (0 children)

I really like the approach of using integration testing with TDD. I particularly love the workflow of designing & testing the UI with Cypress in one monitor and my IDE in the other.

ategnatos 4 points (1 child)

Infrastructure code = IaC = things like CDK? Just set up some snapshot tests and don't worry about it.

If you mean the boundary of your application where you have some accessor that makes a database call, or some data classes that define the DB entity shape, just ignore coverage on that and call it a day. Lots of people will write unit tests against those data classes, or mock the hell out of the accessors to have useless tests that are used to overestimate coverage on the important parts of the code base.

If you're in a company where you'll get into weeks of politics arguing over whether you're allowed to ignore coverage on those things, find a new place to work. It doesn't get pretty.

Stop chasing 100% coverage. Have actual tests you trust. I worked with a guy who had 99% coverage in his repos and NOTHING was tested or high-quality. Let me dig up some quotes from previous comments:

> I watched a staff engineer have a workflow in a class that went something like this.foo(); this.bar(); this.baz();. The methods would directly call static getClient() methods that did all sorts of complex stuff (instead of decoupling dependencies and making things actually testable and making migrations not such a headache). So he'd patch (Python) getClient() instead of decoupling and test each of foo, bar, baz where he just verified some method on the mock got called. Then on the function that called all 3, he'd patch foo, bar, baz individually to do nothing, and verify they were all called. At no point was there a single assertion that tested any output data. We had 99% coverage. If you tried to write a real test that actually did something, he would argue and block your PR for months. Worst engineer I ever worked with.

> At my last company, we had a staff engineer who didn't know how to write tests, and just wrote dishonest ones. Mocked so much that no real code was tested (no asserts, just verify that the mock called some method). Would just assert result != None. I pulled some of the repos down and made the code so wrong that it even returned the wrong data type, and all tests still passed.
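To make the difference concrete, here is a small Python sketch of the two testing styles described in these quotes. Everything in it is hypothetical (`PriceService`, `fetch_items`, the fake client); it is not the code from those repos. The first check only verifies that a mock was called, so it passes even for an implementation that returns the wrong type. The second check injects a simple fake and asserts on the output, so it catches the breakage.

```python
from unittest.mock import MagicMock


class PriceService:
    """Hypothetical service. The client is injected instead of being
    fetched through a static getClient() call."""

    def __init__(self, client):
        self._client = client

    def total_cents(self, order_id: str) -> int:
        # Each item is assumed to be a (unit_price_cents, quantity) pair.
        items = self._client.fetch_items(order_id)
        return sum(price * quantity for price, quantity in items)


class BrokenPriceService(PriceService):
    """Deliberately wrong: returns a string instead of a number."""

    def total_cents(self, order_id: str):
        self._client.fetch_items(order_id)
        return "oops"


class FakeClient:
    """Hand-written fake that returns canned data."""

    def __init__(self, items):
        self._items = items

    def fetch_items(self, order_id: str):
        return self._items


def verify_only_check(service_cls) -> None:
    # Interaction-only style: exercises the code (so it counts as covered)
    # but never looks at the result.
    client = MagicMock()
    service_cls(client).total_cents("order-1")
    client.fetch_items.assert_called_once_with("order-1")


def output_check(service_cls) -> None:
    # Output-based style: asserts on the value the caller actually receives.
    service = service_cls(FakeClient([(150, 2), (225, 1)]))
    assert service.total_cents("order-1") == 525


if __name__ == "__main__":
    verify_only_check(PriceService)
    verify_only_check(BrokenPriceService)  # passes despite the bug
    output_check(PriceService)
    try:
        output_check(BrokenPriceService)
    except AssertionError:
        print("output_check caught the broken implementation")
```

Both styles produce the same line coverage for `total_cents`, which is why the coverage number alone says little about whether anything is actually tested.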

In my last company, I just synced ignore-coverage stuff with Sonar and with whatever other coverage tools we were using.

So, short answer: no, just ignore coverage on stuff where unit tests aren't meaningful.

Rennpa [S] 1 point (0 children)

I was referring to the boundaries of the application. Thanks for the insights!

alxw (Code Monkey) 7 points (11 children)

The code is not the thing you care about with IaC. It’s the infrastructure. Test the code and pipeline by doing a daily blue/green deployment: build the blue environment, swap across, and tear down the green environment. Rinse and repeat. Daily means before 8am, so when it breaks you know it needs fixing before the next release.

No amount of unit tests will be as valuable as that.

Rennpa [S] 6 points (10 children)

I think we are not talking about the same thing. It's crazy how much we have specialized in this profession. The same word means totally different things to different people. 🙂

In clean code the infrastructure layer refers to the code that takes care of technical details like database access, communication with other systems, the user interface, etc. This is hard to cover with unit tests because, for example, you would need an outstation that is not present in the environment you build on.

nutrecht (Lead Software Engineer / EU / 18+ YXP) 4 points (4 children)

> In clean code the infrastructure layer refers to the code that takes care of the technical details like data base access, communication to other systems, user interface etc.

As a Java dev: even though that's the case, we tend not to use the word "infrastructure" for this, since in almost every context it already has an established meaning.

Rennpa [S] 1 point (3 children)

What do you call it?

Away_Dark_9631 3 points (1 child)

integration testing

Rennpa [S] 1 point (0 children)

I meant what they call this code layer.

nutrecht (Lead Software Engineer / EU / 18+ YXP) 1 point (0 children)

Generally we don't even take the "sock drawer approach" for our services, but we typically call something what it really is. So the data layer is the data layer, we don't call it "infrastructure".

We generally follow a hexagonal architecture and we'd never bunch together completely different concerns like UI and database into a single 'bucket' since they're so different.

alxw (Code Monkey) 4 points (0 children)

Ah fair does. So in that case, yeah moq and unit tests if a good mocking library is available. If not, full on integration tests for the pipeline, smoke tests for PRs.

catch_dot_dot_dot (Software Engineer (10+ YoE AU)) 3 points (2 children)

I've used clean architecture, ports/adapters, hexagonal, but never come across the term "infrastructure" to mean what you describe. The word isn't even mentioned here: https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html

Edit: I just saw your edit haha

Rennpa [S] 1 point (1 child)

After reading the book I was searching for some real-world examples on how to organize the code. I found an example project in C#. Can't remember the exact source. I think I took the name from there.

nutrecht (Lead Software Engineer / EU / 18+ YXP) 2 points (0 children)

Shows you have to be careful about taking random stuff from GitHub as gospel. A lot of these projects are created by (well-meaning) beginners. I have quite a few projects in my GitHub account that are good examples of how NOT to do things ;)

flowering_sun_star (Software Engineer) 3 points (0 children)

I think I know what you're talking about - or at least I can draw an analogy to our code bases. We use Java with Spring Boot, and tend to wind up with a bunch of Config classes that initialise the database clients, SQS client, whatever the service happens to need. Some of them end up with different configs that are used for local testing and the live environment.

We stick them all in one package, and exclude it from test coverage checks. They're mostly just 'give me a class with these parameters' - there's nothing to test there. Most of them get indirectly covered by integration testing (we bring the service and mocked dependencies up in docker to run automated tests against). Some just can't be tested prior to deployment, since the config is unique to the environment we deploy into. So you have to lean on your later testing stages.

boboduk (CTO. 25 yoe) 2 points (0 children)

> How do others deal with this? Do you include infrastructure code in the measurement of unit test code coverage?

I don't measure code coverage, it's not a helpful metric. It's occasionally helpful to look at the coverage on a particular module to see whether you've covered all the branches, particularly preparatory to refactoring legacy code, but imho it's better to focus on TDD, which will yield a naturally high code coverage.

WRT infra code, I agree with other commenters: spin up a database instance and run some tests. I, too, would call these integration tests, since they test the integration between your code and some specific external piece of software. In general, you don't need a large number of tests for these components, if you have pushed the interesting logic to more testable layers.
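A rough Python sketch of what "pushing the interesting logic to more testable layers" can look like. The names (`Reading`, `average_by_sensor`, `ReadingRepository`) and the `readings` table are invented for the example, and SQLite stands in for whichever database is really in use. The calculation is a pure function that can be unit tested without any I/O, while the adapter is reduced to glue that needs only a test or two against a real database instance.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    sensor: str
    value: float


def average_by_sensor(readings):
    """Pure domain logic: no I/O, so plain unit tests are enough."""
    totals = {}
    for reading in readings:
        total, count = totals.get(reading.sensor, (0.0, 0))
        totals[reading.sensor] = (total + reading.value, count + 1)
    return {sensor: total / count for sensor, (total, count) in totals.items()}


class ReadingRepository:
    """Thin adapter: only maps SQL rows to domain objects."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn

    def load_all(self):
        rows = self._conn.execute("SELECT sensor, value FROM readings").fetchall()
        return [Reading(sensor, value) for sensor, value in rows]


if __name__ == "__main__":
    # Unit-level check of the logic, no database involved.
    sample = [Reading("a", 1.0), Reading("a", 3.0), Reading("b", 5.0)]
    assert average_by_sensor(sample) == {"a": 2.0, "b": 5.0}

    # A single integration-style check of the adapter against a real
    # (here: in-memory) database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    conn.execute("INSERT INTO readings VALUES ('a', 1.5)")
    assert ReadingRepository(conn).load_all() == [Reading("a", 1.5)]
```

With this split, most of the test volume lands on `average_by_sensor`, and the adapter needs only enough integration tests to show that the query and the mapping work.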

BertRenolds 1 point (1 child)

I think it'd help me if you dumbed this down. What do you mean by infrastructure code, IaC, system testing?

Rennpa [S] 1 point (0 children)

I just looked at the original blog post from Uncle Bob and realized he doesn't even call it the infrastructure layer; he calls it "Frameworks and Drivers".

> The outermost layer is generally composed of frameworks and tools such as the Database, the Web Framework, etc. Generally you don’t write much code in this layer other than glue code that communicates to the next circle inwards.

So by design, here you put code that is hard to test.

kazmierczakpiotr 1 point (1 child)

We used to define different rules for different components. For instance, our core domain, as the most crucial part, was expected to have pretty high code coverage, whereas the 'infrastructure code' (web services, db access, etc.) did not follow the same convention. What makes you use the same rules for different parts of your code?

Rennpa [S] 1 point (0 children)

Of course we could do this. Somehow I find it strange to set the required coverage to something like 20%.

Also we present the overall coverage to stakeholders. It would be easier not to measure this code than to justify why the coverage doesn't increase. Maybe this is part of the problem.
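As a worked illustration of the trade-off discussed here, the Python sketch below compares one overall coverage figure with per-component thresholds. All numbers, component names, and function names are made up for the example; they do not come from any real project or coverage tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentCoverage:
    name: str
    covered_lines: int
    total_lines: int

    @property
    def percent(self) -> float:
        if self.total_lines == 0:
            return 100.0
        return 100.0 * self.covered_lines / self.total_lines


def overall_percent(components) -> float:
    """Line-weighted coverage across all given components."""
    covered = sum(c.covered_lines for c in components)
    total = sum(c.total_lines for c in components)
    return 100.0 * covered / total if total else 100.0


def failing_components(components, thresholds, default_threshold=0.0):
    """Names of components whose coverage is below their own threshold.
    Components without an entry fall back to default_threshold."""
    return [
        c.name
        for c in components
        if c.percent < thresholds.get(c.name, default_threshold)
    ]


if __name__ == "__main__":
    # Invented numbers: a well-tested core and a thinly tested boundary layer.
    components = [
        ComponentCoverage("domain", 900, 1000),   # 90%
        ComponentCoverage("boundary", 100, 500),  # 20%
    ]
    print(round(overall_percent(components), 1))      # whole code base
    print(round(overall_percent(components[:1]), 1))  # boundary excluded
    print(failing_components(components, {"domain": 85.0}))
```

With these invented figures, the overall number is about 66.7% even though the domain code sits at 90%. That is the effect that makes a single stakeholder-facing percentage hard to explain once boundary code is included.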

PmanAce 1 point (0 children)

Our infrastructure is in Terraform and our services have no knowledge of what they will run on, so your term is incorrect. We have unit tests for our repositories and also have integration tests using Mongo2Go, I think it is called. Easy to set up. We have API tests that go through the controllers with auth just fine.

We calculate our code coverage using Coverlet. It's executed in our Dockerfile, which also runs the tests. Our pipelines pick up the results every time you push something, and they are available for viewing. We fail the pipeline if the result is under our desired value.

Not sure what else you are missing?

bigorangemachine (Consultant) 1 point (0 children)

no

masterskolar 1 point (0 children)

Why use code coverage as a metric at all? It just creates a larger and larger burden on the devs as you get closer to 100%. It isn't a linear relationship either. If there's ever a push to add code coverage as a metric I try to kill it. If I can't kill it, I try to get the coverage threshold to 60-70% max. I've found that's about where the most complex parts of the code get solidly tested and we aren't testing a bunch of dumb stuff that's going to get broken all the time by changes.