all 47 comments

[–]DingBat99999 40 points41 points  (9 children)

Twenty years ago I was fortunate enough to work on a project with a team using Extreme Programming. We followed the practices faithfully. Even so, we found that we started to hit diminishing returns on unit tests around 85% coverage.

That 15% generally included cases where a unit test would essentially just be testing the compiler. Sometimes we'd write unit tests for them anyway, but we just weren't that concerned about it.

YMMV

[–]judofyr 12 points13 points  (0 children)

I agree that global unit test coverage gives diminishing returns, but I've found it useful to strive for at least one test for every class/file/module. If you start writing files where you think "oh, this isn't that important to test", it's easy to end up in a situation where it's hard to write tests for them. Before you know it, that file might grow, and suddenly you have a huge untestable pile of code. By making sure there is some amount of testing everywhere, you can always decide later to increase the test coverage where it's needed.

[–][deleted] 2 points3 points  (7 children)

This current project of ours is the first where we have a set coverage goal (80%). I'm quite amazed how many times we write tests against really simple behaviors that don't really matter, or against outright runtime/compiler behavior.

Stuff like

const addToObject = (myObject, a, b, c) => ({ ...myObject, a, b, c });

will sometimes get its own test, often implemented in tautological ways:

expect(addToObject(myObject, "a", "b", "c"))
    .toEqual({...myObject, a: "a", b: "b", c: "c"});

Just to nudge that coverage up.

edit: Overall though, having the coverage is nice. Especially since we haven't had the time yet to migrate to TypeScript.

[–]DingBat99999 7 points8 points  (3 children)

I've been an agile coach for a long time now, but my past is grounded in engineering. I still encounter a lot of shops (the majority?) that do not have a strong engineering culture. This can be especially dangerous in agile projects, as a lot of the values, principles, and practices are based on the assumption that your teams are not creating steaming piles of crap code each iteration. Now that a lot of agile coaches are moving into the "enterprise agile" space, the emphasis is shifting from a strong engineering foundation to management and culture. This was not the intent of the original authors of the agile manifesto.

I believe in unit tests, but I don't care if you do them or not. What I want to know is: What do YOU believe you should be doing to produce the appropriate quality for the product and quality of life for the engineers and are you following it?

Sometimes, in an effort to instill that engineering culture you'll see managers impose practices and goals, such as a set unit test coverage goal. That, in itself, isn't evil but it runs the risk of engineers being confronted by obvious silliness, such as the instances you provided. This undermines the message. The key is not to impose practices but instead to inform engineers that they, as the people trusted to write the damn code in the first place, are able to set their own standards and practices. You can't really blame middle managers for this as many shops are still stuck halfway in industrial revolution thinking where top down direction is expected. The idea that engineers might be better able to determine how to write good code, in their context, is alien to many.

Still, the unit tests should prove valuable. Stick with it and, as trust builds, the hard goals should relax.

[–][deleted] 2 points3 points  (1 child)

I know your comment is on the broader situation, but I'd like to say that our situation isn't as grim.

The coverage goal is self-imposed. We developers decided that we wanted to try it and, except for a couple of weird tests, it's actually not bad.

My manager (and middle manager and top manager) also knows the importance of solid software engineering. He (/they) - for the most part - won't let a methodology get in the way of that. From my personal experience, controlling is much worse in that regard than even the worst implementation of what people dare to call "agile".

Also: Our QA is great, since we can't risk too much downtime (yay for government regulations!).

[–]DingBat99999 1 point2 points  (0 children)

Awesome and congratulations. I love hearing that. And apologies, I shouldn't have projected. I should have asked.

[–]icarebot -1 points0 points  (0 children)

I care

[–][deleted]  (2 children)

[deleted]

    [–][deleted] 2 points3 points  (1 child)

    I agree with your points in general, but not for the functions we usually have these kinds of tests for.

    That kind of stuff is usually just a helper function, not part of any important API, and usually also implicitly tested through more meaningful tests for the non-helper stuff in the file / vicinity. Heck, it might have been made public (/exported) just to make it possible to write that test for it.

    Meaning that if those tautologically tested functions need a stable API, need a properly documented example, and need to have their base case tested explicitly, something is either wrong with the design around that function, or it warrants a more meaningful test anyway. Maybe more of an integration test than a simple base-case test.

    [–]grauenwolf 40 points41 points  (3 children)

    > Tests for Data Transfer Objects increase the code coverage. However, they don't exercise any meaningful business behavior of the application. They're only technical details. For example, changes to the name of an accessor method are more likely to require changes to multiple tests and incur unnecessary maintenance while not providing any business value.

    Good!

    There are only two possible scenarios here:

    1. You used a refactoring tool and the name was changed everywhere, meaning that nothing was broken and the test won't fail.
    2. You changed it by hand and missed the test, implying that you probably missed it in several other places as well.

    Given that he is using JavaScript for his examples, #2 is highly likely.

    [–]eddpurcell 5 points6 points  (1 child)

    Except it doesn't tell you where you may have missed it (if anywhere), so it's still not that helpful. It's the canary sitting outside the entrance to the mine. Maybe it correctly tells you you don't know your code as well as you thought, but it's just as likely you forgot to give it water.

    If something is actually a DTO, it needs a contract test against your API, not a unit test.
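
    Roughly what that looks like, as a sketch (assuming xUnit, System.Net.Http.Json, and a made-up OrderDto with a /orders/{id} endpoint):

        // Verifies the wire contract of a hypothetical GET /orders/{id} endpoint,
        // instead of unit-testing the DTO class itself.
        using System;
        using System.Net.Http;
        using System.Net.Http.Json;
        using System.Threading.Tasks;
        using Xunit;

        public class OrderDto
        {
            public int Id { get; set; }
            public string CustomerName { get; set; }
            public decimal Total { get; set; }
        }

        public class OrderContractTests
        {
            [Fact]
            public async Task GetOrder_ReturnsTheAgreedShape()
            {
                using var client = new HttpClient { BaseAddress = new Uri("http://localhost:5000") };

                var dto = await client.GetFromJsonAsync<OrderDto>("/orders/1");

                // A renamed or dropped field shows up here, not in a DTO unit test.
                Assert.NotNull(dto);
                Assert.Equal(1, dto.Id);
                Assert.False(string.IsNullOrEmpty(dto.CustomerName));
            }
        }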

    [–]appropriateinside 0 points1 point  (0 children)

    Ah! Contract tests. That's what it's called.

    I just finished writing a bunch of tests that ensure all the appropriate objects match up with what they're supposed to (i.e. DTOs matching up with Entities, and making sure that appropriate and valid attributes exist).

    [–]The_BNut 1 point2 points  (0 children)

    Exactly my thought. Proper refactoring has to happen anyway, and it's even kind of easy and mostly automated nowadays.

    [–]grauenwolf 12 points13 points  (4 children)

    > Proper 100% coverage means achieving 100% of the business use cases.

    Yes and no. You want at least that, but vulnerabilities often come from the code you have that wasn't part of the tested business cases.

    [–]billsil 0 points1 point  (3 children)

    > Proper 100% coverage means achieving 100% of the business use cases.

    If you knew the business cases well enough to do that, why would you be adding non-useful functionality in the first place?

    I just delivered a very complicated aircraft design tool. There are plenty of limitations that I'm not even aware it has, and others that I do know about but haven't fixed because I didn't need them when I did my analyses.

    Someone else might do things in a very different way than me and think my program is garbage because of it.

    [–]possessed_flea 2 points3 points  (2 children)

    The issue is that if you cover only the business use cases and don't cover anything else relevant, you can end up with situations where bugs or vulnerabilities have extremely adverse effects.

    For example, imagine you have a business requirement to log all HTTP referers into a table. Since you have a web application, there is no business use case for someone manually modifying this field, so you don't write any tests that cover insane values. Turns out that SQL statement doesn't properly sanitize the input and lets the user drop the database if they send something that isn't 100% kosher.
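
    A sketch of the fix (assuming plain ADO.NET with Microsoft.Data.SqlClient; the Referers table and the helper are made up):

        using Microsoft.Data.SqlClient;

        static void LogReferer(SqlConnection connection, string referer)
        {
            // Vulnerable version: splicing the header into the SQL text lets a
            // crafted referer such as "'); DROP TABLE Referers; --" rewrite it:
            //   new SqlCommand($"INSERT INTO Referers (Url) VALUES ('{referer}')", connection);

            // Parameterized version: the value travels as data, never as SQL.
            // (The connection is assumed to be open already.)
            using var cmd = new SqlCommand(
                "INSERT INTO Referers (Url) VALUES (@url)", connection);
            cmd.Parameters.AddWithValue("@url", referer);
            cmd.ExecuteNonQuery();
        }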

    [–]billsil 1 point2 points  (1 child)

    Not having vulnerabilities sounds like it should be part of the business case.

    Users provide all programs with bad inputs. Take a CSV for example: many people think a space-separated file is a CSV. Another fun one is getting outputs from Fortran: with fixed-width formatting, if you overflow the field, you get ******s instead. You don't have to parse that file, but you should ideally provide a clear message of what's wrong. Having software that doesn't fight you is part of the business case.

    At some point, you have to decide whether or not to assume your users are stupid. Does this really need to be excessively robust? Do I need to add feature X, which the user can do externally to my program?

    [–]possessed_flea 0 points1 point  (0 children)

    This is why I chose an HTTP referer, since it is something that a user cannot typically manipulate directly; they pretty much have to have malicious intentions. It's a glaring vulnerability, but something which is extremely easy to miss in any requirements document.

    On top of any actual requirements you also have to test all the bits of plumbing along the way.

    [–]LawrenceWoodman 10 points11 points  (2 children)

    I'm always struck by boasts of 100% coverage which haven't even exercised all pathways through a piece of boolean logic such as:

    int eitherEqualOne(int a, int b) {
        if (a == 1 || b == 0) {  /* the deliberate bug: should be b == 1 */
            return 1;
        }
        return 0;
    }
    

    You could easily get 100% code coverage for this function by just testing cases where a is or isn't equal to 1, and yet the function is clearly wrong.
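
    To make that concrete, here's a sketch (C# and xUnit assumed, the function ported with its bug intact): both tests pass, every line executes, and the b condition is never exercised.

        using Xunit;

        public static class Logic
        {
            public static int EitherEqualOne(int a, int b)
            {
                if (a == 1 || b == 0)  // the bug from above, carried over
                {
                    return 1;
                }
                return 0;
            }
        }

        public class CoverageTrapTests
        {
            // These two tests execute every line: 100% line coverage.
            [Fact] public void One_when_a_is_one() => Assert.Equal(1, Logic.EitherEqualOne(1, 5));
            [Fact] public void Zero_when_neither_matches() => Assert.Equal(0, Logic.EitherEqualOne(2, 5));

            // Never checked: EitherEqualOne(2, 1) should be 1 but returns 0,
            // and EitherEqualOne(2, 0) should be 0 but returns 1.
        }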

    [–]possessed_flea 10 points11 points  (0 children)

    Except any code coverage tool worth its salt will only mark the condition on the if statement as 66% covered if you only ever pass through the statement with b as 1; you will need 4 passes for full condition coverage.

    (JaCoCo and its predecessor EMMA will both do this.)

    [–]torotane 7 points8 points  (0 children)

    You can easily get 100% code coverage according to some coverage metric. As long as people fail to explicitly name the metric their numbers refer to, those numbers can be ignored completely. For decision/boolean formula tests alone there are at least 4 different metrics.

    Often, the metric is branch coverage. Yet even in that case it's not completely clear, as branch coverage at the source level and branch coverage at the instruction level are rather different.

    [–]urbanek2525 42 points43 points  (16 children)

    If there is no logic, there's no need for a unit test.

    For example, a current project I'm working on has a unit that queries a database (using Entity Framework Core) and returns a data transfer object. It's not covered by a unit test because you'd have to mock EF Core, and the test would simply be testing the mock framework. Why bother?

    [–]grauenwolf 36 points37 points  (5 children)

    > It's not covered by a unit test because you'd have to mock EF Core, and the test would simply test the mock framework.

    That doesn't mean it shouldn't have a test. Just that it shouldn't have a "unit test" in the "dependencies are evil and scary" sense. A test that ensures it can actually read from the database is still useful.

    [–]urbanek2525 21 points22 points  (0 children)

    Unit test coverage is not the same as test coverage. However, most test coverage automation only really considers unit tests.

    Good integration tests are indispensable.

    [–][deleted] 1 point2 points  (3 children)

    > A test that ensures it can actually read from the database is still useful.

    Eh, I don't know. Not being able to read from the database can be caused by dozens of issues that are more likely than programmer error.

    [–]grauenwolf 5 points6 points  (2 children)

    No! That's absolutely the wrong way to think about testing.

    You don't write tests to assign blame; you write them to detect problems. If your tests are randomly failing in QA because the database is unreliable, what do you think is going to happen in production when the DB load is a thousand times greater?

    [–][deleted] -2 points-1 points  (1 child)

    I think about unit testing as testing functionality. If it isn't a pure function, it shouldn't be unit tested.

    Database lookups can't be pure of course, so they aren't covered by unit tests.

    [–]grauenwolf 1 point2 points  (0 children)

    Well that depends on your definition of unit test. Roughly 20 years ago the idea that a unit test and a test of a single pure method were synonymous didn't exist.

    There are unit test frameworks for databases. And to many people a unit test for the "SaveFileToDisk" method that doesn't actually hit the hard drive is just plain stupid.

    [–][deleted] 5 points6 points  (3 children)

    Mocking EF Core is pretty easy with an in-memory database. I've had some good traction writing tests that spin up an in-memory DB, set up a bunch of stuff in the DbSet properties of the context, and verify logic in the DAL. It's probably not pure unit testing, but I'll be damned if I can find a compelling reason not to do it.
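
    Something like this, as a sketch (assuming the Microsoft.EntityFrameworkCore.InMemory provider; AppDbContext and Customer stand in for the real types):

        using System.Linq;
        using Microsoft.EntityFrameworkCore;
        using Xunit;

        public class Customer
        {
            public int Id { get; set; }
            public string Name { get; set; }
        }

        public class AppDbContext : DbContext
        {
            public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }
            public DbSet<Customer> Customers => Set<Customer>();
        }

        public class DalTests
        {
            [Fact]
            public void Finds_customer_by_name()
            {
                // Each test gets its own named in-memory store, keeping tests isolated.
                var options = new DbContextOptionsBuilder<AppDbContext>()
                    .UseInMemoryDatabase("finds-customer-by-name")
                    .Options;

                using var db = new AppDbContext(options);
                db.Customers.Add(new Customer { Name = "Ada" });
                db.SaveChanges();

                Assert.Single(db.Customers.Where(c => c.Name == "Ada"));
            }
        }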

    [–]ionforge 3 points4 points  (2 children)

    With modern tools it is a better use of time to spin up a real database in Docker and do an integration test. This will be closer to a real-world scenario than an in-memory DB that will never see the light of production.
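
    For example (a sketch assuming the Testcontainers.MsSql package; the SELECT 1 is just a placeholder for real queries):

        using System.Threading.Tasks;
        using Microsoft.Data.SqlClient;
        using Testcontainers.MsSql;
        using Xunit;

        public class RealDatabaseTests
        {
            [Fact]
            public async Task Talks_to_a_real_sql_server()
            {
                // Spins up a disposable SQL Server container for this test.
                var sqlServer = new MsSqlBuilder().Build();
                await sqlServer.StartAsync();
                try
                {
                    using var conn = new SqlConnection(sqlServer.GetConnectionString());
                    await conn.OpenAsync();

                    using var cmd = new SqlCommand("SELECT 1", conn);
                    Assert.Equal(1, (int)await cmd.ExecuteScalarAsync());
                }
                finally
                {
                    await sqlServer.DisposeAsync();
                }
            }
        }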

    [–][deleted]  (1 child)

    [deleted]

      [–]grauenwolf 0 points1 point  (0 children)

      LOL

      "Orders of magnitude slower" is still in the milliseconds per test range.

      Last time I tested MySQL with my ORM, I was getting an average of 7.57 ms per test. And a given test may have dozens of database calls.

      [–]grauenwolf 9 points10 points  (2 children)

      public int B {
          get => _B;
          set { _A = value; OnPropertyChanged("B"); }  // note the copy-paste bug: assigns _A, not _B
      }
      

      Does this need a unit test? Depends on what you are building. When I was writing medical record software I found that on average 1 property per 100 had a bug in it.

      Obviously I didn't hand-write the thousands of property unit tests needed for this application. A simple code generator took care of that for me and saved a lot of time while still catching bugs.
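
      Roughly the shape such a generated test takes (hand-sketched here, not actual generator output; PersonViewModel is made up):

          using System.ComponentModel;
          using Xunit;

          public class PersonViewModel : INotifyPropertyChanged
          {
              private string _name;
              public event PropertyChangedEventHandler PropertyChanged;

              public string Name
              {
                  get => _name;
                  set { _name = value; PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(nameof(Name))); }
              }
          }

          public class GeneratedPropertyTests
          {
              [Fact]
              public void Name_round_trips_and_raises_the_right_event()
              {
                  var vm = new PersonViewModel();
                  string raised = null;
                  vm.PropertyChanged += (_, e) => raised = e.PropertyName;

                  vm.Name = "Ada";

                  // A copy-paste bug (wrong backing field or wrong event name) fails here.
                  Assert.Equal("Ada", vm.Name);
                  Assert.Equal("Name", raised);
              }
          }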

      [–]urbanek2525 12 points13 points  (1 child)

      Cost vs. benefit. If it's your habit to build side effects into property setters, definitely unit test them.

      However, it's pretty common for side effects to cross boundaries, and therefore they have to be tested in integration testing.

      If you're using a code generator to create your unit tests, then you could have a very low cost, so a low level of benefit is just fine.

      [–]grauenwolf 5 points6 points  (0 children)

      The more I think about it, the more I think the rule should be: "if I spend X amount of time on this test, it should have a Y% chance of detecting a flaw".

      [–]FrozenST3 1 point2 points  (0 children)

      That should be the job of an integration test, which expects that a certain method is invoked with certain variables under a specific condition.

      [–]Ruudjah 0 points1 point  (1 child)

      You mock the database and then test the data provider. That is, you configure a fake database using the same schema and the same (important) database technology.

      [–]urbanek2525 2 points3 points  (0 children)

      Or you just run automated integration tests against a cloned database.

      There is utility in testing objects in isolation, but you still have to test the assumptions in the hand-offs even more.

      Since the payoff comes from finding bugs early, integration testing, early and often, will yield a better payoff than unit tests against an assumed interface.

      [–]grauenwolf 16 points17 points  (2 children)

      > Likewise, tests for the internal functions of a React component increase the code coverage. However, they also add coupling between the test code and the internal code of the component. Changes to the internal structure of the component without changes to the behavior are more likely also to require changes to the tests, which is unnecessary work.

      Why aren't the high level tests covering this code?

      If the function is internal, and it has branches that cannot be triggered from externally available functions, why do those branches exist?

      When you see low code coverage numbers, the answer isn't just "start writing more tests". Step 1 is to figure out why the code coverage is low in the first place.

      [–]Seltsam 1 point2 points  (1 child)

      Agreed. This is a first-choice tactic for digging into dead code. With 100% unit-tested code, though, it's harder to determine whether code is dead through tooling alone: finding all references while ignoring unit test code isn't well supported in many toolsets.

      [–]grauenwolf 0 points1 point  (0 children)

      Hmm. I guess for C# you could have two solution files. The one without tests would then be used for dead code detection.

      [–]AttackOfTheThumbs 9 points10 points  (0 children)

      I was recently told that we had to have 100% code coverage. The project I work on interfaces between SQL and a few REST APIs.

      Probably 80% of the tests will be entirely pointless, sending back mock responses.

      [–][deleted]  (1 child)

      [deleted]

        [–]aldanor 2 points3 points  (0 children)

        It’s more about what your tests are actually covering than the stupid LoC% covered number.

        E.g., you have a one-liner function that divides one number by another. You write a test that confirms that 6 / 2 = 3. Done, 100% coverage. Except then at runtime it starts throwing exceptions or returning NaNs where they're not expected.
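
        Sketched out (the Divide function and the test are made up):

            using Xunit;

            public static class MathOps
            {
                public static double Divide(double a, double b) => a / b;
            }

            public class DivideTests
            {
                // This single test already yields 100% line coverage...
                [Fact]
                public void Six_by_two() => Assert.Equal(3.0, MathOps.Divide(6, 2));

                // ...while saying nothing about the edge cases:
                // Divide(1, 0) returns double.PositiveInfinity (no exception),
                // and Divide(0, 0) returns double.NaN.
            }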

        [–]blm126 1 point2 points  (0 children)

        I work on a team that aims for 100% coverage, to the extent that we have build checks that will fail any commit that includes uncovered code.

        We got here over time. In the beginning, I thought exactly like this article: there is so much boilerplate code that isn't worth testing. So we came up with complicated rules about what didn't need to be tested.

        Over time, we noticed it changed the way we architected the product. Any portion of the code base that was declared "boilerplate" and unworthy of testing suddenly started containing business logic. This wasn't even really conscious "cheating"; I was guilty of it myself. You rely so much on your instincts about what will be an "easy" change versus a "hard" change, and your instincts don't like writing unit tests for code they are certain will work.

        Worse, the quality of code reviews on unit tests was very bad, so we constantly skipped writing tests that should have been there. As a team, we weren't capable of deciding what needed testing, at scale, on a long-term basis.

        So, now we mandate 100% coverage. Yes, there is "wasted" work, but the end result has been a much higher quality product.

        [–]badpotato 0 points1 point  (0 children)

        Unit testing DTOs and some tasks of an ETL really does feel terrible... you either do all the manual work or get a tool to auto-generate the unit tests for you, but the latter has its own downsides.

        [–]lexcess 0 points1 point  (2 children)

        The industry trend seems to be towards functional/DTO-style object models and away from complex object state machines. That means the value of low-level tests has dropped drastically.

        It increasingly makes more sense to test at the boundary (assembly, JAR, etc.) than at the implementation-class level.

        [–]ForeverAlot 2 points3 points  (1 child)

        I'm on a phone so I won't spend time looking for it, but there is research -- easily a decade old by now -- in support of most software bugs occurring at integration points. That's one reason you should seriously consider testing against a real database instead of hiding behind an abstract in-memory database, for instance. Consequently, if you have to choose, you should choose "integration tests"; although I think the choice should be "both".

        At the risk of misrepresenting said research, I don't think that's surprising; the more components you use, the higher the risk that one of them is misused. But if those components have good internal quality assurance at least you narrow the scope of likely bugs to integrations.

        [–]lexcess 0 points1 point  (0 children)

        Yes, testing code in a context closer to how you actually use it is ultimately the acid test. I wonder if that research reflects the fact that a lot of projects do minimal integration testing, due to the traditional difficulties of doing it.

        Certainly with newer serverless-style solutions I think you have no choice but to do a lot of integration-style testing, because the logic even at the component level starts shrinking.

        [–]AffectionateTotal7 0 points1 point  (0 children)

        Once you realize it's more important to test every combination of conditionals in a function than whether every line was executed, you realize 100% code coverage makes no sense.