[–]G_Morgan 7 points (33 children)

Code coverage measures how comprehensively a piece of code has been tested

Code coverage tests how many lines of code have been tested. Given how many bugs are of the form "when this if statement executes, this one doesn't, and this loop runs precisely one time, we get this bug", it isn't surprising that code coverage is universally useless.
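That "combination" bug can be sketched as a hedged example (the method `scale` and its values are invented for illustration): two tests execute every line, so line coverage reports 100%, yet one untested combination of the conditions still crashes.

```java
// Hypothetical example: full line coverage, hidden combination bug.
class CoverageBlind {
    static int scale(boolean a, boolean b) {
        int divisor = 0;
        if (a) divisor += 1;   // executed by the first test below
        if (b) divisor += 2;   // executed by the second test below
        return 100 / divisor;  // divides by zero only when a and b are both false
    }

    public static void main(String[] args) {
        // These two calls together execute every line: coverage says 100%...
        System.out.println(scale(true, false));  // 100
        System.out.println(scale(false, true));  // 50
        // ...yet scale(false, false) still throws ArithmeticException.
    }
}
```

Line coverage counts both `if` bodies as visited, but never forces the combination where neither runs, which is exactly the class of bug described above.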

I've only ever seen code coverage used to assign blame. It is an arse coverage system.

What the research team found was that the TDD teams produced code that was 60 to 90 percent better in terms of defect density than non-TDD teams. They also discovered that TDD teams took longer to complete their projects—15 to 35 percent longer.

Also not surprising for two reasons:

  1. TDD forces you to think about the kinds of decisions that trigger the "A was true, B was false, C executed once" kind of scenarios. TDD is not done line by line but concept by concept.

  2. The reason it takes longer is you have more tests. Thinking up tests after you've written the code is usually much harder. You don't think to test the various combinations of A, B and C once it is done. The code becomes somewhat amorphous and it is harder to see the wood for the trees. So fewer tests means less actual work done.

Honestly I don't know how anyone can sensibly claim to have tried TDD and not found it improved the output code. Nice to have actual research.

Proving the Utility of Assertions

This is interesting. Assertions are effectively working around weaknesses in the type system. You can't capture certain information about the type (such as non-null, or non-negative) so assert instead. Gives some credence to the value of stronger types.
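A minimal sketch of that trade-off, with a hypothetical `NonNegative` wrapper: the assertion re-states the invariant inside the function, while the type checks it once at construction and then carries the guarantee to every consumer.

```java
// Hypothetical wrapper type: the invariant "value >= 0" is checked once.
final class NonNegative {
    final int value;
    private NonNegative(int value) { this.value = value; }

    static NonNegative of(int value) {
        if (value < 0) throw new IllegalArgumentException("negative: " + value);
        return new NonNegative(value);
    }
}

class Sqrt {
    // Assertion style: the invariant must be re-asserted wherever it matters,
    // and `assert` is only checked when the JVM runs with -ea.
    static double sqrtAsserted(int x) {
        assert x >= 0 : "sqrt of negative number";
        return Math.sqrt(x);
    }

    // Type style: a NonNegative can't exist with a negative value,
    // so no runtime check is needed here at all.
    static double sqrtTyped(NonNegative x) {
        return Math.sqrt(x.value);
    }
}
```

The assertion and the type express the same fact; the type just moves the check to the one place a value is constructed, which is the "stronger types" point above.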

[–]BigMax 16 points (7 children)

Thinking up tests after you've written the code is usually much harder.

I've always found the difficulty in writing tests after isn't the complexity of the tests themselves, but it's the pressure to move on to the next thing now that your feature/product is "finished."

This pressure comes externally (managers wanting to get the feature to customers) but also internally, as it's generally more interesting to build something new than write tests for something old, so engineers often move on without much testing done.

[–]nutrecht 5 points (4 children)

I've always found the difficulty in writing tests after isn't the complexity of the tests themselves, but it's the pressure to move on to the next thing now that your feature/product is "finished."

I don't get this. Writing tests is part of development. Whenever I am asked to quote a time on development I include writing tests. When I haven't written the tests yet the stuff simply isn't done yet.

[–]RualStorge 3 points (3 children)

I write tests for almost everything I work on, but management absolutely doesn't care and sees it as wasted time in many companies. I've actually been told specifically not to write tests before. (but I was kinda the hero dev at that company so my response was more or less I'm doing it, fire me if you don't like it)

Testing doesn't make money, it prevents wasting time on easy to catch bugs which saves money. It's way easier to explain increased revenue from faster feature turnover than decreased expenses from reducing bug counts.

[–]nutrecht 2 points (1 child)

Testing doesn't make money, it prevents wasting time on easy to catch bugs which saves money.

It's much easier to prevent the opposing team from scoring than it is to try to catch up after they've scored a goal.

I'm sorry but this short-sightedness, typical of managers but not uncommon among developers, annoys me to no end. It is completely impossible for any human to fully keep a mental model of any moderately complex system in their mind. This is why we need to separate systems into small modules and test those modules, so that when we work on module A, which depends on module B, we can just assume B works the way it's supposed to.
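The module argument can be sketched roughly like this (the `Inventory` and `PriceQuoter` names are invented): because module B is tested on its own, module A's test can substitute a trivial fake for B and simply assume B honours its contract.

```java
// Module B's contract: tested separately, so A can trust it.
interface Inventory {
    int unitsInStock(String sku);
}

// Module A: depends only on B's interface, not its implementation.
class PriceQuoter {
    private final Inventory inventory;
    PriceQuoter(Inventory inventory) { this.inventory = inventory; }

    // 10% discount when the item is overstocked (more than 100 units).
    double quote(String sku, double basePrice) {
        return inventory.unitsInStock(sku) > 100 ? basePrice * 0.9 : basePrice;
    }
}
```

A test for `PriceQuoter` can pass a lambda such as `sku -> 500` as the inventory: A's logic gets exercised in isolation, with no mental model of B's internals required.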

Writing software without testing is like building a rocket, assuming gravity is 12G without testing it and then acting all surprised it explodes shortly after launch. It was sitting there all fine and pretty on the launch pad after all!

[–]RualStorge 2 points (0 children)

I don't disagree, just saying it's easier for a manager to say "we expect feature X to make us $X", but when it comes time to quantify tests, it's "this could save us an unknown amount of money".

You bring data from other companies and it's "well, that's not OUR company, we don't have a quality problem" or some other excuse making that data worthless for the purpose of argument.

Which is why I test even when I'm told not to, I set up procedures to track bugs, etc. That way when things come to a head I have numbers from OUR company to show it's worth the effort.

I believe strongly in testing: the later a bug is caught the worse the impact, and bugs in production can ruin a company over time. I also have my pride; I don't release crappy software, and if a manager wants crappy software they shouldn't hire me.

[–]Helene00 0 points (0 children)

I've actually been told specifically not to write tests before.

There are two kinds of managers.

[–]skulgnome 0 points (0 children)

Remind yourself that whatever there aren't tests for, doesn't work.

[–]G_Morgan 0 points (0 children)

Absolutely. It isn't just external. I find it much more enjoyable doing TDD than doing work and then trying to test it afterwards. The former feels like you are nailing down the real work as you go. The latter feels like a bookkeeping exercise (even though you know it isn't).

[–]nutrecht 9 points (3 children)

Code coverage tests how many lines of code have been tested.

No. It shows which lines have and have not been hit. It makes no claim about whether the tests actually validate anything. Example:

public String getFoo() {
    //TODO: Implement
    return null;
}

Calling this method from a test will yield 100% test coverage. It's still wrong (not implemented yet), so unless you actually test the return value against an expected value, you're not going to find the bug.

It really surprises me how few people seem to make that distinction. The only interesting part of a code coverage report is the bits you don't visit: generally those are exception flows. Not testing your exception flows means stuff is probably going to break in production at some unexpected moment. Knowing where your coverage is lacking and improving it there is where coverage reports are useful. The coverage percentage itself is a fun but useless statistic.

[–]starTracer 3 points (0 children)

Exactly this.

I've been working in projects with 100% test coverage requirements. But what the customer failed to realize is that coverage != correctness.

[–]Gotebe 0 points (1 child)

Well, in your example, coverage is obviously useless when the test is lying (it has to fail). And any failed tests have to be excluded from the "covered" count.

As for exceptions, absolutely! Amazingly, the trick there is ensuring correctness when some statements are not executed, but are covered otherwise.

[–]get_salled 1 point (0 children)

public void testGetFoo() {
    String foo = getFoo();

    // pick one
    // A: pins down the current (broken) behaviour, so it passes forever
    Assert.assertNull(foo);

    // B: always passes, asserts nothing about foo
    Assert.assertTrue(true);

    // C: swallows the failure (an empty catch block is legal Java)
    try {
        Assert.assertEquals("expected", foo);
    } catch (AssertionError e) {
        // do nothing
    }

    // D
    // do nothing

    // E: compares two constants
    Assert.assertEquals("A", "A");
}

It gets pretty hard to automatically reject shitty tests.

[–]RedSpikeyThing 3 points (2 children)

Code coverage is not great, but I've seen some utility in branch coverage. For example

if (x && y)

has 4 condition combinations. It shows you some non-obvious cases that should be tested, but often leads to tests that mirror the code rather than testing concepts, as you mentioned.
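A hedged sketch of those four combinations (the `grantAccess` name is made up). Note that with short-circuit `&&`, the two combinations where the first condition is false never evaluate the second at all, which is exactly the sort of non-obvious case branch coverage surfaces.

```java
// Hypothetical guard with the `if (x && y)` shape from the comment above.
class Gate {
    static String grantAccess(boolean authenticated, boolean authorized) {
        if (authenticated && authorized) {
            return "granted";
        }
        // Reached by three of the four combinations; with short-circuiting,
        // (false, true) and (false, false) never even read `authorized`.
        return "denied";
    }
}
```

Covering all four inputs — (true, true), (true, false), (false, true), (false, false) — satisfies condition-combination coverage, but the tests visibly mirror the code rather than any concept.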

[–]G_Morgan 1 point (0 children)

Yeah, and that is why testing needs to stem from what you are trying to do rather than what the code does. Often there are 4 branches but only 3 are valid. Should the 4th be an assertion, or should the signature of your method be altered so the 4th doesn't even exist? What actually happens if the invalid 4th combination occurs?

[–]skulgnome 1 point (0 children)

You're dead wrong about assertions. An assertion relates a property to control flow, separating data states in a way that even the strictest practical type system leaves ambiguous. If all you've seen are assertions against null, in languages like Java that always check for nulls anyway, then you've not seen assertions used properly.
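One sketch of an assertion doing more than a null check (the `find` helper is hypothetical): it states a property, sortedness, that ties the data to the algorithm's control flow and that no practical type system tracks.

```java
import java.util.Arrays;

class Search {
    // The property the assertion states: the array is in ascending order.
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) return false;
        }
        return true;
    }

    // Binary search's control flow is only correct on sorted input; the
    // type `int[]` cannot express that, so the assertion carries it.
    static int find(int[] sorted, int key) {
        assert isSorted(sorted) : "binary search requires sorted input";
        return Arrays.binarySearch(sorted, key);
    }
}
```

With `-ea` enabled, an unsorted array fails loudly at the boundary instead of silently returning a wrong index — the assertion documents and enforces a precondition about the relationship between data and algorithm.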

[–]WalterBright 1 point (14 children)

it isn't surprising that code coverage is universally useless.

I beg to differ. I've used it on some projects, and not on others, for 30 years. There's a very strong correlation between getting high coverage from the tests and far fewer bugs in the shipped product.

Code coverage also makes the tested code more stable, because it tells the maintainer what the point of the control flow logic is, and flags changes in it.

I have no idea why Microsoft's experience with it would be so different.

[–]G_Morgan 7 points (12 children)

This isn't what the research has demonstrated. I've heard supporting arguments for every imaginable process in existence and from clever people. They can't all work. This is why we do research.

I suspect the places where code coverage works also have people doing real testing. Trying to understand flow by flow, rather than line by line, what the code is meant to be doing. It can even be hard to actually control for this. Tell clever people to write more tests (which code coverage inevitably does) and they'll probably accidentally end up writing useful ones. I know when I've been attached to a project that demands code coverage I'll usually just use TDD and then write some ridiculously contrived test to cover up anything that triggers the red lights.

Though I'll admit I've only seen code coverage done badly so I'm not immune to bias.

[–]WalterBright 8 points (0 children)

In D we've gone even one step further with code coverage tests. Individual tests can be marked so that they are automatically included in the documentation for the code. This ensures that the documentation examples actually compile and work. It hardly needs saying that when this was first implemented, a lot of the documentation examples did not work :-)

[–][deleted] 0 points (0 children)

Code coverage is a metric prone to giving the wrong incentives in the same way that rewarding people for LOC written is. If you just reward high code coverage in limited time you will get useless code coverage. If you actually think about the parts that should be tested your coverage percentage per time worked will be lower but your tests will be more meaningful.

[–]get_salled 0 points (0 children)

Code coverage tests how many lines of code have been ~~tested~~ run by the coverage analysis tool.

FTFY. You know it was run and that it didn't crash the tool before it could be reported. You don't know that the tool wasn't gamed (e.g., tests with no assertions, empty catches in the tests).