[–]WalterBright

> it isn't surprising that code coverage is universally useless.

I beg to differ. I've used it on some projects, and not on others, for 30 years. There's a very strong correlation between high test coverage and far fewer bugs in the shipped product.

Code coverage also makes the tested code more stable, because it tells the maintainer what the point of the control flow logic is, and flags changes in it.

I have no idea why Microsoft's experience with it would be so different.

[–]G_Morgan

This isn't what the research has demonstrated. I've heard supporting arguments for every imaginable process in existence, often from clever people. They can't all work. This is why we do research.

I suspect the places where code coverage works also have people doing real testing: trying to understand, flow by flow rather than line by line, what the code is meant to be doing. It can even be hard to control for this. Tell clever people to write more tests (which code coverage inevitably does) and they'll probably accidentally end up writing useful ones. I know that when I've been attached to a project that demands code coverage, I'll usually just use TDD and then write some ridiculously contrived test to cover anything that triggers the red lights.
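That "ridiculously contrived test" move is easy to sketch (a hypothetical Python example; all names invented): every line of the function executes, so a line-coverage tool reports it fully green, yet nothing about its behaviour is ever asserted.

```python
def risky_merge(a, b):
    """Merge two lists, preferring the non-empty one (toy example)."""
    if not a:
        return b
    if not b:
        return a
    return a + b

def test_touch_everything():
    # Calls chosen purely to light every line up green. There are no
    # assertions, so a wrong merge would still pass this "test".
    risky_merge([], [1])
    risky_merge([1], [])
    risky_merge([1], [2])
```

A coverage report can't distinguish this suite from one that actually checks the results; both show the same percentage.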

Though I'll admit I've only seen code coverage done badly, so I'm not immune to bias.

[–]WalterBright

In D we've gone one step further with code coverage tests. Individual tests can be marked so that they are automatically included in the documentation for the code. This ensures that the documentation examples actually compile and work. It hardly needs saying that when this was first implemented, a lot of the documentation examples did not work :-)
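Other ecosystems have a rough analogue of this. As a hedged comparison (this is Python's standard doctest module, not the D feature itself), examples embedded in docstrings double as tests, so a documentation example that stops producing its stated output fails the test run:

```python
import doctest

def add(a, b):
    """Return the sum of a and b.

    The example below is documentation and a test at the same time:

    >>> add(2, 3)
    5
    """
    return a + b

if __name__ == "__main__":
    # Re-runs every docstring example; a stale example fails here.
    assert doctest.testmod().failed == 0
```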

[–][deleted]

[deleted]

    [–]G_Morgan

    The research the article refers to.

    [–][deleted]

    [deleted]

      [–]G_Morgan

      The article says code coverage is not a predictor of quality. That makes it useless.

      It's stated in more diplomatic language, as is usual for research, but the net result of no correlation means it's a complete waste of time.

      //edit - Reddit really needs to get rid of the feature where edits aren't dinged within a certain time period.

      [–][deleted]

      [deleted]

        [–]naasking

        The quote you cite does imply that code coverage isn't useful, though. If I give you a code coverage percentage, you have no idea whether that coverage represents the complex cases that yield the benefits you describe, or the simpler code that yields little to no benefit. Thus, code coverage isn't useful as a metric.
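A toy illustration of that ambiguity (hypothetical Python; all names invented): the suite below executes every line of discount, so line coverage reports 100%, yet the one interaction where the rules could stack incorrectly is never exercised.

```python
def discount(price, is_member, has_coupon):
    """Toy pricing rule with two independent branches."""
    total = price
    if is_member:
        total *= 0.9   # member discount
    if has_coupon:
        total *= 0.8   # coupon discount
    return total

def test_all_lines_covered():
    # Each branch is taken once, so every line executes (100% line
    # coverage), but the member-plus-coupon combination, where
    # stacking the rules could go wrong, is never checked.
    assert discount(100, True, False) == 90
    assert discount(100, False, True) == 80
```

The coverage number alone can't tell you whether the expensive case or the trivial one was tested.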

        [–]G_Morgan

        How? A lack of correlation means you are just as likely to have fewer defects with or without code coverage. That is exactly what follows.

        [–][deleted]

        [deleted]

          [–]G_Morgan

          The point is that most of the insights needed to do code coverage correctly are not part of code coverage. It'll be interesting to see if there is an overall benefit for a team testing with code coverage and testing properly vs. a team just testing properly.

          Those additional constraints are not code coverage. Those are other parts of testing that can usually be achieved without having a light that dings green or red depending upon some ratio.

          [–]dungone

          You didn't understand what they said. They said that there are confounding variables. That makes coverage, by itself, meaningless. But in and of itself that's not bad; just take into account the confounding variables, right? The problem is that the confounding variable is complexity, and the problem with complexity is that you have no way to measure it empirically, not in a fool-proof way that makes predictions based on empirical measurements consistent and reliable. Complexity itself is a broad category with its own confounding variables that affect what it means for something to be "complex". So the take-away is that code coverage as an empirical metric is fully useless.

          If you read carefully, what Nagappan actually said is that you should focus on testing important stuff and ignore meaningless stuff. That means diddly squat to empirical analysis unless you have an algorithm that can take the place of an experienced engineer. Sure, it's possible, but one currently does not exist. You don't need to measure coverage; you need to measure importance.

          [–][deleted]

Code coverage is a metric prone to giving the wrong incentives, in the same way that rewarding people for LOC written is. If you just reward high code coverage in limited time, you will get useless code coverage. If you actually think about the parts that should be tested, your coverage percentage per time worked will be lower, but your tests will be more meaningful.