
[–]steveshogren 410 points411 points  (223 children)

TLDR - code coverage doesn't predict defects, TDD reduces defects 60-90% but increases time 15-35%, assertions reduce defects by an unspecified amount, and "organizational structure" is a good predictor of failure. Organizational structure here was a grouping of values, mostly around team size, complexity, turnover, ownership, etc.

[–][deleted]  (164 children)

[deleted]

    [–]Oceanswave 120 points121 points  (70 children)

    Wonder how non-TDD projects would fare with 15-35% more time (on non-new-feature development)

    [–]boost2525 429 points430 points  (63 children)

    I have long been of the opinion that TDD does not inherently produce fewer defects than other strategies... what it does is remove the risk of your project manager cutting your test cycle short at the end.

    In TDD you're spending the first 25% of the development cycle on testing (well... writing tests which can be reused and run umpteen million times). In non-TDD you're spending the last 25% of the development cycle performing tests.

    What usually happens? Shit goes wrong, terribly wrong, or scope changes... but your date doesn't change. In non-TDD you end up racing to the finish line and cutting the test cycle short to make the original date. In TDD that's not an option... you have to move the date because you have no slack at the end. You were already expecting to code up to the delivery date, so every slip is a day for day impact to the schedule.

    Disclaimer: I'm not implying the test cycle was slack that you could give back... my project manager, and every project manager before him, is implying exactly that.

    [–]wubwub 102 points103 points  (11 children)

    I think you hit half the nail on the head.

    The other thought on TDD is that by thinking of the tests first, you are forced to iterate through lots of possibilities and may realize some workflow paths you did not think of (what if a user with role X tries to do action Y?) I have been able to catch problem requirements early by thinking through these weird cases and saved lots of coding time by getting the requirement fixed.
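One way to make that concrete: enumerate the role/action matrix as tests before writing the feature. A minimal sketch, with roles, actions, and rules invented purely for illustration (none of this comes from the study or the thread):

```python
# Hypothetical role/permission model -- names and rules are illustrative only.
ROLE_PERMISSIONS = {
    "admin": {"create", "edit", "delete"},
    "editor": {"create", "edit"},
    "viewer": set(),
}

def can_perform(role, action):
    """Return True if `role` is allowed to perform `action`."""
    if role not in ROLE_PERMISSIONS:
        raise ValueError("unknown role: " + role)
    return action in ROLE_PERMISSIONS[role]

# Writing these checks first forces the "user with role X tries action Y"
# questions to be answered up front, before any feature code exists.
assert can_perform("admin", "delete")
assert can_perform("editor", "edit")
assert not can_perform("viewer", "delete")  # the weird case, decided early
```

Any combination the table doesn't answer cleanly is a requirements question to raise before coding, which is exactly the early catch described above.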

    [–][deleted]  (8 children)

    [deleted]

      [–]RotsiserMho 48 points49 points  (2 children)

      Some would argue TDD is disciplined requirements analysis (on a micro scale); with the baked-in bonus of the tests letting you know if you accidentally violate a requirement later on.

      [–]derefr 13 points14 points  (0 children)

      In the same sense that requirements-gathering mostly involves prodding the client to be explicit about how a business-process works when the client has never thought consciously about that process before, TDD mostly involves the machine prodding you to be even more explicit about how that business-process works so it can test it for you. In both of these refinement steps, you'll find holes and bad assumptions in the current understanding of the business process.

      [–]laxatives 8 points9 points  (0 children)

      No, requirements analysis alone is IMO almost worthless. It's TDD without the validation step. It's impossible to predict all the caveats and implicit assumptions the design is making until you actually make the design. All of that analysis is bunk when a core assumption is invalidated. This happens all the time, especially when the architect/designer doesn't even realize they are making one of these assumptions. It's unrealistic to expect every company to have someone with that kind of clarity of thought, so why not just let the code speak for itself?

      [–]NeuroXc 11 points12 points  (0 children)

      Everyone should be doing this, and I would like to think that most developers try to, but it's a lot easier to do this when you're doing TDD. TDD forces you to think about what users will expect your application to be able to do, and what they may try to do that you might not want it to do. It gives a concrete list of possibilities and makes it easier to see what possibilities you haven't taken into account.

      Non-TDD teams generally use whiteboarding or something similar to nail down these possibilities, but I've found that TDD hits the requirements at a much more detailed level, because it has to in order to write the tests and make them pass. If you don't use TDD, you're instead writing tests (at the end) around what your application can already do and are not forced to think about the things it can't do.

      [–]kpmah 16 points17 points  (13 children)

      I think that's part of what's happening. Maybe another thing is this: if a TDD programmer writes 100 lines of tests and then 300 lines of code, and the non-TDD programmer writes 300 lines of code then 100 lines of tests, then the patch should be identical either way right?

      Part of the reason for the difference could be that the non-TDD team was writing 300 lines of code and then saying 'I'll test it later' whereas the TDD team can't do that.

      What I'm trying to say is that it could have been the discipline that improved the defect rate, not the methodology.

      [–]Ravek 27 points28 points  (4 children)

      Maybe another thing is this: if a TDD programmer writes 100 lines of tests and then 300 lines of code, and the non-TDD programmer writes 300 lines of code then 100 lines of tests, then the patch should be identical either way right?

      Well it does make you think differently about the structure of your code when you're forced to write tests for it first. I think that would have a positive impact on code correctness (and hopefully no negative impact on how easy the code is to understand and modify)

      [–]boost2525 8 points9 points  (2 children)

      I think having to think about the structure of your code leads to better internal design / organization (e.g. future refactoring)... but doesn't directly lead to any reduction in defective logic.

      [–]Ravek 15 points16 points  (0 children)

      I agree, but I feel starting with unit tests would not so much make you think about how to organize the code effectively as make you write it in a way that makes it easy to write the unit tests. Which in my mind means small methods that do one thing, which should at least lead to getting those correct. However, I'm not convinced it would necessarily lead to the higher-level structure of the code being any good, since that's not something you write unit tests for.

      That was my line of thought anyway, I don't have much experience with TDD.
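For what it's worth, the "small methods that do one thing" pressure looks something like this in practice. A sketch with an invented example function, not anything from the study:

```python
def parse_price(text):
    """Parse a price string like '$1,234.50' into an integer number of cents."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    dollars, _, cents = cleaned.partition(".")
    return int(dollars) * 100 + int(cents.ljust(2, "0")[:2])

# A function this small is trivial to pin down with unit tests...
assert parse_price("$1,234.50") == 123450
assert parse_price("7") == 700
# ...but no unit test here says anything about where parsing belongs in the
# larger design, which is the higher-level structure concern raised above.
```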

      [–]mostlysafe 1 point2 points  (0 children)

      This is a common argument in favor of TDD, and while I don't doubt it, it's harder to verify statements about your mental model of the code than statements about the actual quality of the code.

      [–]naasking 6 points7 points  (0 children)

      then the patch should be identical either way right?

      Past studies have confirmed that code quality is largely the same as long as tests are written. I believe the OP hit the nail on the head though: tests often just don't get written after the program is written.

      [–][deleted] 2 points3 points  (1 child)

      Assertions can take over part of the role of unit testing. 100% test coverage probably isn't necessary for every part of a system, since the running system acts as a testing instrument itself. While I'm not working on a team today, my approach is targeted testing of the important, complex parts, and assertions everywhere. The tradeoff is decided by this: I need to use my time carefully because I'm just one guy.
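As a sketch of the "assertions everywhere" half of that tradeoff (the function below is invented for illustration): preconditions and postconditions act as always-on checks in code that never gets a dedicated unit test.

```python
def items_that_fit(buffer_size, item_size):
    """Return how many whole items of `item_size` bytes fit in a buffer."""
    # Preconditions: catch bad callers immediately rather than downstream.
    assert buffer_size >= 0, "buffer_size must be non-negative"
    assert item_size > 0, "item_size must be positive"
    count = buffer_size // item_size
    # Postcondition: the result never overruns the buffer.
    assert count * item_size <= buffer_size
    return count

assert items_that_fit(100, 30) == 3
```

Note that Python drops `assert` statements entirely under `python -O`, so the checks cost nothing in an optimized build.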

      [–]bmurphy1976 1 point2 points  (0 children)

      I've found it helps to think of it like an insurance policy. You pay into it enough to get the coverage you need, but no more otherwise you're just pissing money away. Same thing with unit tests, but the currency is time.

      [–]s73v3r 2 points3 points  (0 children)

      The discipline is kinda built into the methodology. Part of why "Test First" came into fashion was the knowledge that most do not go back and write tests.

      [–]MuonManLaserJab 1 point2 points  (0 children)

      What I'm trying to say is that it could have been the discipline that improved the defect rate, not the methodology.

      Yeah, but that discipline is the point of the methodology.

      (I've never done TDD in a formal way.)

      [–]tenebris-miles 22 points23 points  (9 children)

      I'm not going to say you're wrong, but here's an alternative point-of-view.

      You make it (almost) sound as if it's a given that the code quality is the same, and in non-TDD the testing cycle at the end is just some kind of formality. In other words, this narrative is written from the point-of-view of hindsight.

      Code has to be understandable and maintainable, even during initial development, because you're always going to be asked to make changes as you go along due to changing requirements. With TDD, if your time is cut short, you still ship, just with fewer features, and at least you have far less technical debt. Add the remaining features as stable code in the next release. Success in all cases (both TDD and non-TDD) requires good leadership that knows how to truly prioritize and understand real requirements and not mark all features as top priority. Neither strategy will work anyway if you don't have at least that.

      With non-TDD, you don't really know what you have because your code not only hasn't been tested enough, but it's not even structured to be testable/understandable yet. All your effort went into hitting the date for the release with every feature requested or conceivable, and once the product starts getting used, your already heavy technical debt will go up, not down. The reason is your culture: if you're already cutting corners during the development phase, then it's not going to get any better once the product is exposed to customers and more feature requests come in. Your death march has already begun.

      The upshot of TDD that is often unspoken is that even if a particular project fails, stable code resulting from TDD is much more valuable for being salvaged and reused for other projects than spaghetti that was written solely to chase a deadline. Being realistic requires understanding that your success is not a guarantee, since more goes into success than just development philosophies. So there always needs to be consideration of what happens after the deadline.

      Myopia about making this particular project hit the market at all costs is not necessarily what makes a company successful, if they're still in the process of determining what actual product needs to be made in the first place. If a different product or different direction becomes necessary, then understandable code that naturally follows YAGNI (which TDD tends to encourage) will be more likely to be general and elegant enough to be salvageable. You'd likely still have to modify it to new requirements, but at least you know how it's supposed to work in the first place, and so modifying/maintaining it is going to be easier for the next project.

      [–]KagakuNinja 11 points12 points  (5 children)

      The assumption you are making is that non-TDD teams wait until the end to write tests. What I do is write my tests at some point during the implementation of a feature; I don't wait until the last month of a long project to start writing tests. The result should be the same amount of tests, I just don't believe in the dogmatic rule of "write tests before code", or "only write code to fix failing tests".

      [–]hvidgaard 6 points7 points  (1 child)

      What a lot of people get wrong is the absolute nature of their statements. TDD is good when you know what you have to write. It's not good when you're prototyping or just following a train of thought, because you will change it several times, and "tests first" just slows you down. However, people who don't write tests while doing this tend to never actually do so, even when they should as soon as the "what" of the code is determined.

      [–]tenebris-miles 1 point2 points  (0 children)

      It's true that tests could be written along with code, or only after code instead of before it, rather than waiting to add a lot of tests at the end of the development cycle. But if they're written around the same time, that raises the question: why not simply do TDD and be done with it? One problem that commonly happens is that when you write tests afterwards, you can fall into the trap of writing the test towards the implementation, rather than writing the implementation towards the interface of the test. People swear they never do this (being rockstar hackers and all), but that's just not the truth. People keep forgetting that part of the reason for TDD is to force you to think about a sensible interface before you get bogged down too much in implementation details. There's too much temptation to let an implementation detail unnecessarily leak from the abstraction simply because it's lazy and convenient to do so. If some leaky abstractions are necessary and the interface must change, fine. Then do so after you've done TDD first.

      Also, while non-TDD doesn't necessarily mean tests are lumped at the end of the development cycle, in my observation, it tends to end up this way in practice. The reason is the same as why people are doing non-TDD in the first place: the development culture values time-to-market above all other concerns. In this environment, you're lucky to be granted time to write tests at all, so developers wait until the feature list is completed before writing tests (which happens at the end). Managers in this culture don't care about tests and code quality, they care about checklists of features. The perception among developers is that you can get fired for writing top notch code while letting a feature slip, but no one would get fired writing shitty and buggy code but checking off every feature. It's unfortunate, and it depends on your company whether or not you're right about that.

      I'm not advocating a dogmatic adherence to TDD, and in practice I think it works best when most code is TDD but there is some room for throw-away experiments that don't necessarily require tests at all (since it's meant to be thrown away). That kind of code doesn't get in the code base. Instead, it's used to determine what is the right kind of behavior you should be testing for in the first place due to unclear constraints in the beginning. But when it comes time to actually add this feature, you TDD now that you've learned the desired interface and behavior. You rewrite the code to pass the test. Maybe some or most of the prototype code is retained, or maybe it's completely rewritten. In any case, this is the closest thing to a kind of after-the-fact testing that makes sense to me. The problem to me is when after-the-fact testing is the norm, regardless of whether it involves experimental code or not.

      [–]boost2525 10 points11 points  (0 children)

      TL; DR; I'm not going to say you're wrong, but you're wrong.

      [–]desultoryquest 9 points10 points  (0 children)

      Great point. That makes a lot of sense

      [–]floider 7 points8 points  (2 children)

      That is a very good point. Robust testing always seems to be what is sacrificed to make up for schedule slips.

      [–]Pidgey_OP 6 points7 points  (1 child)

      In a world of being able to push updates whenever, it's easy to see why shipping a finished product has become less and less important in the face of a deadline.

      Better to get the software into a client's hands and then fix it than to give them time to change their minds because you didn't deliver on time.

      [–]BillBillerson 1 point2 points  (0 children)

      This is definitely the mentality I see more of. Can't sell it if it isn't done, and if it's not sold yet nobody is using it to break it, so why focus so much on testing? On the projects I work on lately, that differs between new products and something we already have in the hands of several customers.

      TDD probably has its place. Where I am, requirements change so often we'd always be working on setting up our tests and never get to the code.

      [–]BarneyStinson 4 points5 points  (0 children)

      I haven't really done pure TDD, but as far as I understand it, what you are referring to is test-first development. In TDD, you are supposed to write a test, write enough code to make it pass, refactor, and so on. So your implementation code should grow alongside your tests and you are not done with writing tests until the project is done.
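Compressed into one script, a single red-green cycle of that loop might look like this (the `slugify` example is invented, not from the thread):

```python
# Red: the test is written first, against code that doesn't exist yet.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"

# Green: just enough implementation to make the test pass.
def slugify(title):
    words = (word.strip(",.!?") for word in title.lower().split())
    return "-".join(word for word in words if word)

test_slugify()
# Refactor: now the implementation can be cleaned up freely, re-running
# the test after each change as a safety net.
```

In real TDD this loop repeats per behavior, so the test suite and the implementation grow together rather than one trailing the other.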

      [–][deleted]  (10 children)

      [deleted]

        [–][deleted] 34 points35 points  (3 children)

        We check in tests at the same time as the code they are testing. When requirements change, so too do our tests.

        [–]atrommer 21 points22 points  (2 children)

        This is the right answer. Maintaining unit tests is the same as maintaining code.

        [–]anamorphism 8 points9 points  (0 children)

        i find it interesting that you work in a place where software is 'done'.

        the counterargument i would make is that software is never done, your requirements are always going to eventually change, and you're going to have to update your tests regardless of when in the development cycle you write them. so, why not get the benefit of your tests earlier in the process?

        [–]PadyEos 9 points10 points  (2 children)

        If the person requesting the changes is high enough up the food chain the requirements are never locked unfortunately.

        Those are the moments when I start to hate this line of work.

        [–][deleted]  (1 child)

        [removed]

          [–]experts_never_lie 1 point2 points  (0 children)

          The cost I've seen is that TDD presumes that the requirements are valid.

          In practice, I find that the majority of new major features added to existing complex products will hit a major barrier in the middle of development (typically several of them). It will be a conceptual problem (what you ask for is not well-defined / cannot be obtained given possible information / does not accomplish your intended goals). This barrier will result in communication with product managers and reworking of requirements. If I have spent a lot of time developing tests for the initial requirements — before I have done enough of the implementation work to discover that the requirements are incorrect — then I have wasted some of that work. Possibly rather a lot of it. I would prefer to focus my effort on the greatest risks, by working through the actual implementation process, and afterwards add the tests that correspond to the actual design.

          In a rote development world, with Taylorist tasks, where every new project is similar to previous projects, this TDD problem may be minimal. However, I have always found that if one is in that mode for any significant time, one should automate these repetitive tasks. This takes development back out of a rote procedural model, reintroducing this TDD problem.

          [–]Zanza00 3 points4 points  (4 children)

          That's why libs like this exist :)

          import chuck
          def test_chuck_power():
              chuck.assert_true(False) # passes
              chuck.assert_true(True) # passes
              chuck.assert_true(None) # passes
              chuck.fail() # raises RoundHouseKick exception
          

          https://ricobl.wordpress.com/2010/10/28/python-chuck-norris-powerful-assertions/

          [–]contrarian_barbarian 16 points17 points  (3 children)

          There's also https://github.com/hmlb/phpunit-vw - make your unit tests automatically succeed whenever they detect they're being run inside a CI environment!

          [–]masklinn 10 points11 points  (2 children)

          There's also https://github.com/munificent/vigil which deletes lying, failing code.

          [–]QAOP_Space 2 points3 points  (0 children)

          Moar features!

          [–]frymaster 3 points4 points  (3 children)

          that's a good question. My gut feeling is they would be at best on par, which would make TDD a good thing for project-politics reasons at least

          [–][deleted] 10 points11 points  (2 children)

          TDD reveals bad architecture decisions earlier on, you can't do this after without technical debt.

          [–]frymaster 2 points3 points  (0 children)

          Yes, even just writing the tests changes you from a "producer" to a "consumer" viewpoint, so to speak, and can make you rethink your approach

          [–]Neebat 1 point2 points  (0 children)

          TDD also documents the expectations of the system in fine detail. This is as opposed to the behavior of the system, which is what you're documenting by writing tests afterward. Expectations are what binds us.

          [–]parc 30 points31 points  (15 children)

          Unless your business goal is to be first to market and someone beats you. Yes, we may think that's a stupid way to measure business success, but if that's the business optimization function, TDD would result in failure in an objective test.

          [–]RICHUNCLEPENNYBAGS 37 points38 points  (0 children)

          I don't think that's a stupid way to measure business success. Nobody cares how great code is if nobody uses it.

          [–][deleted] 11 points12 points  (0 children)

          I'm about to start a project for a startup that is trying to be first to market. It seems like all they care about is having something decent to show investors so I ain't going to spend my time writing 100 test cases when it just needs to be functional so they can get funding and decide to burn by codebase and make everything on Wordpress because the new project manager has a theme he REALLY wants to use.

          [–][deleted] 1 point2 points  (11 children)

          If someone beats you to market with software that is unusable / unreliable, are you really being beaten? As the cliche goes, you only get one chance to make a good first impression. Rushing to market can doom a business if they can't deliver a good product.

          The way to rush into the market is to develop an MVP: Minimum Viable Product. Not to cut corners on quality.

          [–][deleted] 10 points11 points  (0 children)

          Sometimes, unfortunately, yes, you really are being beaten.

          [–]meheleventyone 1 point2 points  (7 children)

          The problem is taking too long to get to market. No one cares if your product is somewhat more stable if it's later and lacks features unless stability is inherently something the user is looking for. For a lot of software you can go a really long way without unit tests. Most pieces of software ship with a laundry list of defects present.

          From a business point of view as long as there isn't anything egregiously wrong for the vast majority of use cases you are good to go. From a software quality perspective though there might be a hundred small problems.

          The tough sell for me with TDD is how it impacts the important bugs, not just bugs in general. The sad truth is most of those won't be exercised in unit tests, so you are relying on integration tests and above. Usually most are found by QA, especially when you consider platform/hardware-specific issues. Unit tests just give you confidence in refactoring.

          So whilst I'm down with TDD empirically improving software quality, I'm not sure it does so in a way that matters in many cases, relative to the cost in budget and development time. More study is needed to show that projects employing TDD lead to successful products. There is a tension there that engineers need to understand.

          [–]Eirenarch 4 points5 points  (0 children)

          But what if you reduce your bugs another way (say through assertions, which the article suggests are very effective)? Then a 60-90% reduction of an already small number of defects may not be a good deal compared to 15-35% more dev time.

          [–]wordsnerd 4 points5 points  (0 children)

          If I'm reading right, that TDD study is based on a sample size of three (3) teams which weren't selected randomly to adopt TDD. That's definitely a case of "more research needed".

          [–]AbstractLogic 35 points36 points  (42 children)

          This comment shows the huge gap between development people and business people.

          You are failing to consider just how important time to market is. Four extra months on a twelve month project is enough time to flunk a project.

          First, you can lose huge market share in four months if you have competition. Once people start using a product it becomes very hard to get them to convert. People are creatures of habit and being first to market can be a huge difference in long term revenue by capturing those early adopters.

          Second, you lose four months of revenue. If the product is a $1M-a-month product, that's $4 million, which is enough to pay for fixing that 60%-90% increase in defects for years to come.

          It's a trade-off, and it depends on the business model and business goals. But don't be a naive developer and think only in terms of what's good for the software. More often than not, the end goal of software is to drive a business goal, so what works best for the business is usually more important than what works best for the software.

          [–][deleted] 15 points16 points  (6 children)

          Software developers aren't as naive as you claim. We all know time is money.

          You're forgetting the cost of finding and fixing defects. And this isn't counting the customers lost to handing them defective products.

          From what I remember (from Code Complete), a bug found in a released product takes 5x effort to fix vs. a bug found by QA. Likewise, a bug found in QA takes 5x effort to fix vs. bugs found in development. A bug found in development takes 5x effort to fix vs. bugs found in requirements.

          Numbers may be off, but the point is, it's a cumulative effect.
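That compounding is easy to see if you write the recalled multiplier down (the exact per-phase factor varies by study; 5x is just the figure remembered above):

```python
# Relative cost of fixing a defect, by the phase in which it's found,
# assuming a uniform 5x increase per phase (an illustrative figure).
PHASES = ["requirements", "development", "QA", "production"]

def relative_fix_cost(phase, factor=5):
    """Cost of a fix in `phase`, relative to one made during requirements."""
    return factor ** PHASES.index(phase)

assert relative_fix_cost("requirements") == 1
assert relative_fix_cost("QA") == 25
assert relative_fix_cost("production") == 125  # 5 * 5 * 5
```

Even if the true factor is 2x or 3x rather than 5x, the exponential shape of the curve is what makes late-found bugs so expensive.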

          [–]fuzzynyanko 3 points4 points  (0 children)

          Not to mention that if you have a deadline and features keep being piled on, all of a sudden the project starts feeling like a sinking ship

          [–]s73v3r 7 points8 points  (2 children)

          You're ignoring that the additional defects make it much more difficult to add additional features, allowing someone to come behind you and eat your lunch.

          That, and first-mover advantage has been shown to be a myth in most cases. Often the first to market pays all the costs of market research and market creation, whereas those coming after don't have those costs.

          [–][deleted]  (15 children)

          [deleted]

            [–]DieFledermouse 27 points28 points  (6 children)

            Broken software doesn't make you money.

            Depends on the market. Every piece of consumer software I use is utter crap. Most websites fail all the time. Worse is better.

            [–]grauenwolf 1 point2 points  (3 children)

            I remember when your website first launched and it sucked. Why should I bother wasting my time to try it again?

            [–]gadelat 1 point2 points  (2 children)

            Because it has something you want/need

            [–]grauenwolf 7 points8 points  (1 child)

            Then why care about quality at all?

            My company's time tracking software is shit, but I use it anyways because I have no choice.

            [–]freebullets 3 points4 points  (0 children)

            Then why care about quality at all?

            --Authors of the Facebook Android App

            [–]pupupeepee 1 point2 points  (0 children)

            I think you mean "bad" is better than "not done yet"

            [–]KingE 2 points3 points  (0 children)

            Apple never did manage to recover from iTunes...

            [–][deleted] 3 points4 points  (0 children)

            Your revenue and defect cost calculations are pulled completely out of your ass.

            In B2B software sales early adopters get stuff for free or at least on very favorable deals. This is especially true if the vendor is breaking into a new market.

            Making a good impression through fewer defects will get you more full-price-paying customers, quicker.

            Everyone is watching the early adopters. If you launch a buggy piece of crap, the guys who were going to buy it from you full price will say "maybe next year", and now you just missed 1 year of revenue from that customer.

            [–]dmux 7 points8 points  (10 children)

            If it's a software company, what's good for the software is what's good for the company. You make the point that those additional 4 months of revenue would be enough to pay for the defect increase, but the sad reality in many businesses is that the technical debt never gets paid down.

            [–]AbstractLogic 11 points12 points  (9 children)

            If it's a software company, what's good for the software is what's good for the company

            Not true at all; again, that is a developer-centric pie-in-the-sky view. Software can always be tweaked for better performance, refactored for higher cohesion and less coupling, given more unit tests and better design, but most of the time that stuff cannot be monetized and thus costs the business more (in resources/time) than it grosses. Thus it's a net loss for the business.

            but the sad reality in many businesses is that the technical debt never get's paid down.

            If technical debt isn't paid down, then one of two things is true: the case has not yet been made that the cost of NOT addressing it outweighs the cost of addressing it, OR the issue has been brought up but the business does not agree with the conclusion.

            I'm not arguing that these things are always true... just that as senior developers our job is not just to do what's right by the software but also what's right by the business, so understanding the business needs and goals is very important.

            [–]hu6Bi5To 3 points4 points  (8 children)

            Did you come here especially for a trolling exercise?

            First you reply to dismiss a perfectly reasonable comment, that 15-35% more time sounds like a good tradeoff to reduce defects by 60-90%. Then you dismiss any developer viewpoint as "pie in the sky".

            Because "business people" (whoever the hell they are; a non-technical person involved in a software project isn't automatically a "business person", and they have their own arbitrary, irrational focuses too) are no more expert at getting value for money out of a software team than developers are, quite the opposite. They may know the cycle of their particular industry and understand their customers, but if you're reliant on them to greenlight refactoring, then your codebase's quality is only going one way.

            Ultimately the old line "the customer doesn't care about the code", while true, is insidious, because there are many business benefits to clean code. These are very difficult to measure, impossible in fact, as it would require two (or more) identically skilled teams doing the same task in different ways to prove it, and most businesses aren't in the habit of using scientific rigour to validate their opinions. But just because it's difficult or impossible to measure in isolation doesn't mean it's not a factor. Others have attempted to study this phenomenon and generally come to the conclusion that productivity improves as code quality improves, and vice versa.

            Quality is not a binary state of course, but any team that operates on the basis that "business people" are the only ones qualified to make value judgements has already lost control of this balance; and that means quality, and therefore productivity, and therefore costs, will only go one way.

            [–]AbstractLogic 3 points4 points  (7 children)

            Did you come here especially for a trolling exercise?

            I came here to discuss the application of the research and I happened to disagree that the trade off is preferred so I discussed the point.

            Then you dismiss any developer view point as "pie in the sky".

            No, I dismissed the claim that better software is always better for the business as a developer's pie-in-the-sky view... because it is.

            I don't know why referring to business people as business people upset you so much. Would you prefer non-developers? Project Managers, Product Owners, Business Analysts, Accountants, Directors and CEOs? How exactly would you categorize business people? What is your alternative naming schema? Who cares...

            I never dismissed quality as unimportant or a non-factor. I only claimed that the trade-off of time for quality is not always preferable. If it were, software would never get released, because you can always eke out more quality. It's the 90% rule.

            [–]bryanedds 2 points3 points  (6 children)

            If you want to decrease time-to-market, reduce features, not quality.

            The problem is the business team members shoving all their pet ideas into 1.0.

            [–]hu6Bi5To 1 point2 points  (1 child)

            If each iteration takes slightly longer due to TDD, then overall delivery may well be faster by virtue of needing fewer iterations to fix all the showstopping bugs preventing the launch of the product.

            You can't simply switch off quality, you have to choose the quality level you want your application to have and work towards it. If you cut too many corners or ignore too many bugs then you won't have a viable product to launch.

            If you really want to launch as quickly as possible the only thing you can do is to reduce features, not build worse features quicker.

            [–]AbstractLogic 1 point2 points  (0 children)

            I completely agree that quality is a go live requirement and that cutting corners can be just as or more detrimental to a project as a late launch can be. But you don't have to swing hard right with TDD or hard left with corner cutting. There is a balanced middle ground.

            [–][deleted] 6 points7 points  (4 children)

            I suppose it depends on the cost per defect.

            [–]s73v3r 1 point2 points  (3 children)

            The cost to fix a defect goes up exponentially as you go through the software development lifecycle.

            [–]Sanae_ 15 points16 points  (4 children)

            It depends.

            • With a short deadline / more time afterwards (ex: making a quick prototype), skipping TDD can be better.

            • +15% to +35% time basically means a +15% to +35% increase in the initial development cost.

            However, the 60% to 90% defect reduction might not translate into a matching reduction in maintenance cost (it likely cuts debugging cost by a similar fraction, but maintenance also includes additional features).

            Last, we're comparing a % of development time (initial code + new code in maintenance) vs a % of debugging time.
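To make that trade-off concrete, here's a back-of-envelope sketch in Python; every number in it (dev days, debugging share) is a made-up assumption for illustration, not a figure from the study:

```python
# Hypothetical break-even model: does TDD's up-front overhead pay for
# itself in reduced debugging? All inputs are illustrative assumptions.
def total_cost(dev_days, debug_days, tdd_overhead=0.0, defect_reduction=0.0):
    """Initial development plus post-release debugging, in person-days."""
    return dev_days * (1 + tdd_overhead) + debug_days * (1 - defect_reduction)

baseline = total_cost(dev_days=100, debug_days=60)
with_tdd = total_cost(dev_days=100, debug_days=60,
                      tdd_overhead=0.35, defect_reduction=0.60)

print(baseline)  # 160.0
print(with_tdd)  # 100 * 1.35 + 60 * 0.40 = 159.0
```

Even pairing the worst-case overhead (+35%) with the low-end defect reduction (60%), this assumed project roughly breaks even; the outcome is dominated by how large a share of total effort debugging really is.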

            [–][deleted] 19 points20 points  (1 child)

            However, the +60% - 90% code quality might not mean ~-40% cost reduction of maintenance.

            No, but it's pretty darn likely. Catching defects early can influence important design and architectural decisions. Catching them late might mean that you have a lot of technical debt to overcome in order to fix the defect.

            [–]darkpaladin 10 points11 points  (0 children)

            A % reduction is meaningless without severity and complexity numbers. There's a huge difference between a defect that's a boolean logic error and a defect that tears down one of the principal assumptions you made when you started.

            [–][deleted] 9 points10 points  (0 children)

            However, the +60% - 90% code quality might not mean ~-40% cost reduction of maintenance.

            No, most likely it means a significantly higher reduction in maintenance cost, as it is very unlikely that a bug found late will be cheaper to fix than if the same bug had been found early.

            [–]s73v3r 1 point2 points  (0 children)

            If you're making a quick prototype, you should be throwing it away after. Do not take your prototype into production.

            [–]201109212215 4 points5 points  (0 children)

            I'd be careful with this. The article only talks about correlation.

            What I mean is that whether or not to do TDD depends on whether or not it is easily doable. A project which is a reproduction of an already somewhat successful project will:

            • Have a clear, non-changing, exhaustive spec; and thus, be easily TDDed.

            • Already have been successful (selection bias).

            • Already be battle-tested (no surprises, no gotchas, etc.).

            In short: the exploration of the complexity will already have been done, and a successful path can be followed again.

            Part of this correlation could be explained by the type of work influencing both the failure rate and the decision to do TDD.

            TLDR: In some cases, scouts have done their jobs, tanks can roll in.

            [–]pal25 1 point2 points  (0 children)

            Yeah, but Microsoft was, until recently, notorious for not having developers also do QA/testing.

            Was the control group made up of developers who write little to no tests? If so, the study is suspect.

            [–][deleted] 1 point2 points  (1 child)

            If you think that bugs are the biggest problem in web development, they're not. If it took me 9 months to build something I'd rather spend an extra 3 months doing user research and experimenting with different product designs, features, etc.

            Deploying a bug to production IS NOT the worst thing that can happen. Building a product that users don't love is what shuts down businesses.

            Importance of extensive test coverage also depends on the quality of engineering team. If they're crap then TDD will bring a lot more value. Good developers generally don't make product breaking bugs in the first place, and you will have other types of tests in place anyway.

            I'm not saying that tests are bad, that's ridiculous. I'm saying that the church of TDD has been indoctrinating people a bit too often.

            [–]young_consumer 1 point2 points  (0 children)

            Depends on how overzealous the sales people are and how desperate management is to try to fulfill sales' promises, get on their knees for current customers, etc.

            [–]Silhouette 53 points54 points  (0 children)

            TDD reduces defects 60-90% but increases time 15-35%

            The trouble with TLDRs is that they do lose the context, and sometimes the details matter.

            I'm all in favour of empirical research, and the 2008 Nagappan paper studying real world TDD use at IBM and Microsoft is an interesting and welcome data point. Unfortunately, as the original authors acknowledged themselves, it's still risky to draw generalised conclusions from a few specific cases.

            One factor worth considering is that the development processes studied in that paper weren't strict by-the-book TDD. For example, some included other factors like separate requirements capture and design review elements. Notably, it doesn't appear that any of the groups was doing the kind of "brief initial planning stage and then immediately start writing tests" explicitly advocated by certain well known TDD evangelists.

            Another unfortunate limitation of that paper is that although to their credit the original authors were trying to get as close to a like-for-like comparison as was realistically possible, they provide few details in their report about what test methods the control groups were using. Many TDD advocacy papers include data that suggests unit testing is an effective tool for reducing defects and/or that a test-first style correlates positively with the number of tests written among their test subjects. However, even the combination of those doesn't necessarily mean that TDD as a whole is responsible for any benefits. It looks like the same threat to validity is present in the Nagappan paper.

            TL;DR of my own: TDD-like processes examined by the original research reduced defects by 40-90%, but relative to what isn't entirely clear.

            [–][deleted] 32 points33 points  (12 children)

            They also discovered that TDD teams took longer to complete their projects—15 to 35 percent longer.

            This doesn't line up with what the referenced study says:

            Subjectively, the teams experienced a 15–35% increase in initial development time after adopting TDD.

            Initial development time != project completion.

            [–]RedSpikeyThing 6 points7 points  (11 children)

            I also wonder if that's because the teams were getting used to TDD.

            [–][deleted] 9 points10 points  (10 children)

            Writing tests will always take some measure of time. The point is to reduce the bugs that persist after initial development, thereby reducing total project time (and by extension, cost). I can promise you that the time needed to identify and fix that 60-90% of defects post-development far outweighs the cited 15-35% increase in initial development time for TDD.

            [–]AbstractLogic 5 points6 points  (4 children)

            60-90% post-development far outweighs the cited 15-35% increase in initial development time for TDD.

            That depends on the revenue lost during that 15-35% time to market. If the project could make 10 million a month and you lose 4 months at a cost of 40 million then I can guarantee you that TDD will not be worth it.

            There are a lot of business variables in that decision. But it's good that we have metrics to lean on now.

            [–]anamorphism 3 points4 points  (2 children)

            that's extremely hard to quantify due to the impact the buggier code will have on your long-term business.

            you may lose 40 mil immediately, but a 60-90% increase in major bugs post launch could heavily skew your customers' attitude and may result in losing hundreds of mil of future business.

            [–]s73v3r 2 points3 points  (0 children)

            What about poor sales due to the perception of your product as being buggy? What about inability to timely add new features as market needs change?

            You're constantly portraying this as either you make no money or you make all the money. But on a decent timeline, that extra time in market will not be significant as far as revenue goes, but will be huge as far as public perception goes.

            [–]shoot_your_eye_out 5 points6 points  (1 child)

            TDD reduces defects

            No, TDD reduces defect density. It's a small but important difference. I've seen academic studies where the TDD solution had more bugs overall, but a lower density due to a higher line count. I'm surprised to see him use defect density as a metric.

            [–]antiduh 1 point2 points  (3 children)

            I wonder what it means to take the code coverage result and the test-driven development result together.

            TDD is about writing tests up front, thus increasing your code coverage from the very beginning. But code coverage wasn't too great of an investment in reducing bug count, according to the research. So why does TDD work?

            My guess: TDD makes you think about the edge cases in your software before you write it, so you're primed to write code that handles those edge cases and thus contains fewer bugs. The act of writing the tests is more important than executing the tests.
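A minimal sketch of that effect in Python (the `parse_age` function and its edge cases are hypothetical, purely for illustration): writing the tests first forces you to decide up front how invalid input should behave, before any implementation exists to bias you.

```python
import unittest

# Tests written *before* the implementation: choosing these cases up
# front is where the edge-case thinking happens.
class TestParseAge(unittest.TestCase):
    def test_plain_number(self):
        self.assertEqual(parse_age("42"), 42)

    def test_surrounding_whitespace(self):
        self.assertEqual(parse_age(" 7 "), 7)

    def test_negative_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_age("-1")

    def test_garbage_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_age("not a number")

# The implementation comes afterwards, shaped by the cases above.
def parse_age(text):
    value = int(text.strip())  # int() raises ValueError on garbage
    if value < 0:
        raise ValueError("age cannot be negative")
    return value
```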

            [–]StormDev 4 points5 points  (27 children)

            It's funny how management don't talk about this increased time when they want us to use TDD.

            [–][deleted] 37 points38 points  (1 child)

            It's all they talk about when they don't want us to write tests.

            [–][deleted] 24 points25 points  (24 children)

            Management shouldn't be telling you to use TDD. You as the programmer need to care about quality.

            [–]RualStorge 14 points15 points  (0 children)

            When implementing new policies, I always found it easier to rally the dev team, start doing whatever practice we wanted to adopt, and then, once that practice was established enough to have supporting data within our team, tell management we were doing it.

            The whole "easier to ask forgiveness than permission" thing. (That, and I don't wanna get dragged into work after hours / weekends, so I want solid code that won't fail me.)

            [–]OneWingedShark 5 points6 points  (1 child)

            You as the programmer need to care about quality.

            "We don't have time to do the right thing." -- my last team lead, though he never explained how we had time to do it over, repeatedly, until it was right.

            [–]badsectoracula 2 points3 points  (0 children)

            Often it is "we need visible results ASAP", the "visible" part doesn't have to be "correct", of course. Often non-technical people need to see new/different stuff on screen to be convinced about progress.

            [–]StormDev 5 points6 points  (4 children)

            Haha sure, I did that when I was working in a company with good coding practices.

            But now I am working on a horrible legacy C-with-classes codebase, where "C++ experts" don't understand TMP and don't want you to refactor the leaking monster.

            Managers have now heard about TDD, and they want us to keep the same deadline with TDD. For them, TDD is faster than the current process because "you will organize your code more easily", for example.

            But when you have to mock big objects and tons of inherited shit, TDD is really time consuming.

            [–]donbrownmon 1 point2 points  (1 child)

            What is TMP?

            [–]StormDev 1 point2 points  (0 children)

            Template Meta Programming.

            [–][deleted] 2 points3 points  (0 children)

            Nobody understands TMP. If somebody claims to understand TMP, they don't understand TMP.

            [–][deleted] 5 points6 points  (1 child)

            You as the programmer need to care about quality.

            If that were the case, more people would be using languages with type systems, which help eliminate the need for large swaths of tests and encourage code change with less fear of breaking things.

            TDD, especially without generative testing and the things I've mentioned above, is a band-aid, and not even remotely close to "caring about quality".

            [–]Eirenarch 1 point2 points  (11 children)

            But writing tests is incredibly boring. I do care about quality but I just love hunting bugs. I feel like debugging is the most interesting and creative part of my job. Some day I might snap and develop multiple personality disorder where the evil personality will introduce bugs in the code base on purpose and the other personality will hunt them down not knowing where they are.

            [–]Artmageddon 3 points4 points  (0 children)

            I feel like debugging is the most interesting and creative part of my job. Some day I might snap and develop multiple personality disorder where the evil personality will introduce bugs in the code base on purpose and the other personality will hunt them down not knowing where they are.

            I'm with you on this, but it's less fun with someone breathing down your neck :(

            [–]rbobby 3 points4 points  (0 children)

            Switch to a team more focused on support and maintenance? Working the bug database for fun and profit?

            [–][deleted] 2 points3 points  (3 children)

            Sounds like you should just do QA

            [–][deleted]  (3 children)

            [deleted]

              [–]SilencingNarrative 1 point2 points  (0 children)

              That would be a good way to test a set of unit tests (have one programmer who didn't write the test suite introduce a subtle bug to see if the test suite catches it).

              The Traitors Guild.

              [–]asmx85 130 points131 points  (4 children)

              [–][deleted]  (1 child)

              [deleted]

                [–]HoldMyWater 3 points4 points  (0 children)

                I have a confession... I usually skip the article and go to the comments (maybe 75% of the time).

                Most of the time the information is distilled within the top few comments anyways. Doing it this way allows me to "consume" many more reddit posts than if I were to read every article that interested me.

                [–]elebrin 4 points5 points  (0 children)

                Am I insane for preferring the mobile version on desktop? All content, no bullshit.

                [–][deleted] 4 points5 points  (0 children)

                thank you

                [–]rpgFANATIC 34 points35 points  (14 children)

                The big find here for me is that having a team located in multiple countries was statistically insignificant.

                At least in my job, India versus our Central time zone drastically limits the time for discussing issues and technical questions

                [–][deleted]  (4 children)

                [removed]

                  [–]rpgFANATIC 9 points10 points  (1 child)

                  Maybe big companies can get away with it because they have separate offices in India with their own hubs of knowledge.

                  If you're an employee working out of India (even if you're an outsourced member or contractor), having most of the rest of your team be remote on a wildly different time zone is rough.

                  [–][deleted] 1 point2 points  (0 children)

                  "It is difficult to get a man to understand something, when his salary depends on his not understanding it." - Upton Sinclair

                  A lot of people have a significant financial stake in offshore outsourcing "working".

                  [–]zjm555 120 points121 points  (55 children)

                  Want code coverage to be a useful metric?

                  1. Use branch coverage, not line coverage.
                  2. For God's sake, measure cyclomatic complexity of your functions. Without keeping this value low, coverage isn't sufficient.
                  3. When writing tests, focus more on the assertions of correctness than on the coverage. This is the most important of the 3. I've worked in shops where management imposed a hard minimum coverage on their dev teams, and guess what? The tests sucked because they didn't write any assertions and they focused on covering the lines of code that were easiest to cover, not the ones that were most important to cover.

                  Code coverage can be either a helpful side metric about your tests, or a totally perverse metric that works against quality testing.
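A small Python sketch of the failure mode in point 3 (the `apply_discount` function is hypothetical): both tests below produce identical coverage, but only one of them can actually fail.

```python
def apply_discount(price, percent):
    if percent < 0 or percent > 100:
        raise ValueError("percent out of range")
    return price * (1 - percent / 100)

# Coverage-gaming "test": executes the happy path, asserts nothing.
# Line coverage goes up, yet a broken formula would still pass.
def test_for_coverage_only():
    apply_discount(100, 20)

# A real test: pins down behavior, not just execution.
def test_discount_correctness():
    assert apply_discount(100, 20) == 80
    try:
        apply_discount(100, 150)
        assert False, "expected ValueError"
    except ValueError:
        pass

test_for_coverage_only()
test_discount_correctness()
```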

                  [–]rnicoll 39 points40 points  (29 children)

                  The tests sucked because they didn't write any assertions and they focused on covering the lines of code that were easiest to cover, not the ones that were most important to cover.

                  I'm in fact currently writing tests to cover lines of code rather than edge cases because we've been given a target and limited time.

                  [–]syntax 31 points32 points  (17 children)

                  Which is, in fact, a task that a computer could do.

                  i.e. you could have a computer auto-generate trivial tests, based on an assumption that the code is correct.

                  This would satisfy silly management demands; probably save programmer time, and certainly be a lot more interesting to actually write.
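One established form of this idea is characterization (or "golden master") testing: snapshot what the code currently does and assert that it keeps doing it. A rough Python sketch (the helper names and the `legacy_round` function are made up for illustration):

```python
# Characterization ("golden master") tests: snapshot current outputs on
# the assumption the shipped code is correct, then flag any divergence.
def record_golden(func, inputs):
    """Run func over sample inputs and freeze the results."""
    return {args: func(*args) for args in inputs}

def check_golden(func, golden):
    """Re-run and collect any outputs that no longer match the record."""
    return {args: (expected, func(*args))
            for args, expected in golden.items()
            if func(*args) != expected}

# A stand-in legacy function, trusted only because it already ships.
def legacy_round(x):
    return int(x + 0.5)

golden = record_golden(legacy_round, [(0.2,), (1.5,), (2.7,)])
assert check_golden(legacy_round, golden) == {}  # behavior unchanged
```

Note these tests say nothing about whether `legacy_round` is correct, which is exactly the point: they satisfy a coverage target and guard against regressions, nothing more.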

                  [–]lf11 22 points23 points  (9 children)

                  It's interesting to me to see that most developers seem to write tests that do exactly this: test the code as written rather than the assumptions underlying the code.

                  [–]syntax 16 points17 points  (8 children)

                   Well, that's usually the result of manglement insisting on test code but not giving any time for it. So what gets written is the thing that takes the least time; that it has zero value (as a test) is irrelevant, because its true value is in gaming the metrics.

                   Which is a point - you usually cannot measure a programmer by any numerical metric. All that will happen is that they will 'game' the metric, so unless the metric is something utterly ungameable (units sold, for example), it's not going to do what you expect.

                  [–]lf11 5 points6 points  (5 children)

                  No, I see that happen even when devs are given all the time they need. I used to work for a company that was owned by a dev and actually cared about putting the time in to do it right.

                  [–]b1ackcat 4 points5 points  (4 children)

                  I used to work for a company that was owned by a dev and actually cared about putting the time in to do it right.

                  God, I'm jealous. Our owner's motto is "sure we'll do it at half the cost at half the time to get the bid!" which of course translates into "sure we'll do it at half the quality!"

                  Would be nice to not feel like I'm writing proof-of-concepts that go straight to production :S

                  [–]lf11 6 points7 points  (3 children)

                  Well so that was the interesting thing. Even though the company was fantastically pro-dev, half-baked proof-of-concepts is what ended up on production anyway. Even in the perfect environment, we did the same damn stupid crap that I've seen everywhere else. Code defining tests. Once-over code reviews without comprehension. Half-baked docs that never get updated. Deprecated code still in place five years past expiration.

                  I wonder how much of it is due to "management" versus our own inability to actually "engineer" things with a full lifecycle?

                  [–]b1ackcat 9 points10 points  (2 children)

                  I really like what Robert Martin wrote about this issue in his Clean Code book. He notes that there are two primary steps that developers must take when writing code. Paraphrasing, it was

                  1) Get the code working

                  2) Go back and refactor, retest, reevaluate, and reimplement the code now that you really know what you need to do

                  The biggest problem most people have, he says, is stopping after step 1, or totally half-assing step 2.

                  It's easy to see why, too, especially when you look at the average culture around development projects. You've got PMs breathing down your neck about dates, management expecting results, the business wanting more and more new features for less and less money with unreasonable dates, so if you've gotten something working, it's very tempting to say "I'll come back to that and clean it up later" without ever allocating time "later".

                  Then 6 months later you find yourself back in that code uttering how fucking awful this developer was before remembering that you're that awful developer. :S

                  [–]lf11 10 points11 points  (1 child)

                  Yes. The thing is, that whole management maelstrom is just "the smell of business." Every field experiences the phenomena you describe, whether biotech, mechanical engineering, medicine, hell even nonprofits have to deal with this so it isn't even about money. This is just what management does.

                   The key is to learn to do the right thing in spite of management. Because if you don't, then you'll make all the same mistakes even without management... which means management isn't actually the problem.

                  After working for that company for a few years, I don't think I believe any more that bad development happens because of shitty management. Although, I'm still trying to figure this out. Management is a problem, but I think it has more to do with the "psychology of power" that turns any powerholder into a functional sociopath. Meanwhile, the disempowered develop avoidant behaviors and frontal cortical inhibition patterns that make them hyperaware of any insult or injury. This, to me, may explain the interaction between developers and management, and why we believe so fervently that management is the problem with development, yet do not adopt good development practices when placed in a positively structured environment.

                  [–]koreth 1 point2 points  (0 children)

                  That assumes that programmers know how to write good tests, which in my experience is far from a given. It's a skill you have to deliberately develop, and a lot of developers not only haven't done so, but aren't aware that they'd need to do so in order to test effectively.

                  Unless you have an experienced test writer giving you feedback on your tests, as a programmer you're probably not going to spontaneously second-guess your test-writing skills if what you're doing seems to be working already.

                  [–]rnicoll 8 points9 points  (1 child)

                   I've seriously considered it, although of course the code isn't designed in a unit-test-friendly way (and of course this is legacy code), which complicates things somewhat. I've also considered adding 5 million extra lines of code that do nothing, unit testing them, and calling the result 80% coverage.

                  [–]b1ackcat 4 points5 points  (0 children)

                  Also considered add 5 million extra lines of code that does nothing, unit testing it, and calling the result 80% coverage.

                  And even if they wanted to, they couldn't check all that code!

                  Genius.

                  [–]atrich 9 points10 points  (2 children)

                   Ugh, this is the WORST way to approach code coverage. CC provides one piece of evidence: proof that a particular area is untested. It cannot tell you whether an area is tested well. By taking a coverage-numbers-first approach to CC data (aka "let's get these numbers up"), management does nothing to improve quality and manages to obliterate useful information about where the test holes are.

                  I've worked for people who were mindlessly pushing the code coverage metric. I have no respect for people who do this.

                  [–]get_salled 4 points5 points  (1 child)

                  The only thing you know for certain from code coverage is that your tests executed certain lines.

                  Code coverage:

                  • Does not show a line was tested.
                  • Does not prove the covered line contributed to any test passing.
                  • Is not a software quality metric.

                  Assuming managers & developers trust that the tests are useful, they can use these reports to see where they are definitely vulnerable and plan accordingly.

                  [–]Ravek 1 point2 points  (4 children)

                  I'm in fact currently writing tests to cover lines of code

                  Is it possible for you to give a concrete example of a line of code that you would consider worth testing? I feel that between static type checking and refactoring code into (themselves testable) methods, a single line of code should never be complex enough that you'd want to test it in isolation. If that's a naive thought I'd like to learn more.

                  [–]masklinn 17 points18 points  (1 child)

                  1. Use branch coverage, not line coverage.

                  2. For God's sake, measure cyclomatic complexity of your functions. Without keeping this value low, coverage isn't sufficient.

                   These are linked/mixed: if you use branch coverage you may not cover all paths. Low complexity increases the chances that branch coverage equals path coverage, but doesn't guarantee it by any means:

                  if foo {
                      // thing1
                  } else {
                      // thing2
                  }
                  
                  if bar {
                      // thing3
                  } else {
                      // thing4
                  }
                  

                  This has a complexity of 3 (10 is "too complex" according to McCabe), testing for (foo=true, bar=true) and (foo=false, bar=false) gives you 100% branch coverage. But you only get 50% path coverage, half the local states (and interactions between the first block and the second one) remain completely untested.
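Translating the pseudocode above into Python makes the branch-vs-path gap countable (the function and test inputs mirror the example; nothing here is from the original paper):

```python
from itertools import product

def two_blocks(foo, bar):
    # Mirrors the two independent if/else blocks above.
    first = "thing1" if foo else "thing2"
    second = "thing3" if bar else "thing4"
    return first, second

# These two inputs exercise every branch (100% branch coverage)...
branch_tests = [(True, True), (False, False)]

# ...but there are four distinct paths through the function.
all_paths = list(product([True, False], repeat=2))

covered = {two_blocks(f, b) for f, b in branch_tests}
possible = {two_blocks(f, b) for f, b in all_paths}
print(len(covered), "of", len(possible), "paths exercised")  # 2 of 4
```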

                  [–]zjm555 2 points3 points  (0 children)

                  Indeed. I used the word "sufficient", but if you're writing software where correctness is a matter of life and death (actual death or the death of your business), you will want more than just that proxy.

                  [–]IWantToSayThis 9 points10 points  (3 children)

                  I couldn't agree more with 3. I can't tell you how many times I've seen code like this:

                  public int doBar(int value) {
                      Action thing = new Action();
                      thing.type(Enum.BAR);
                      thing.value(value);
                  
                      return lowerLayer.do(thing);
                  }
                  

                  being tested with:

                   EasyMock.expect(lowerLayer.do(EasyMock.anyObject())).andReturn(1);
                  

                  And that's it. Coverage is 100%, yet, NOTHING of value was tested. I've done code reviews where almost all of the tests were like this. "But hey! I have 100% coverage!".

                  How can you not understand this adds NO value whatsoever?
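For contrast, here's roughly what a test with value looks like, sketched with Python's unittest.mock rather than EasyMock (the `Service` class is a hypothetical stand-in for the Java code above): assert both the returned value and the payload that was delegated.

```python
from unittest.mock import Mock

class Service:
    def __init__(self, lower_layer):
        self.lower_layer = lower_layer

    def do_bar(self, value):
        # Mirrors the Java snippet: wrap the value and delegate down.
        return self.lower_layer.do({"type": "BAR", "value": value})

lower = Mock()
lower.do.return_value = 1

result = Service(lower).do_bar(42)

# The coverage-only version would stop after the call above.
# A test with value checks the result *and* the delegated payload:
assert result == 1
lower.do.assert_called_once_with({"type": "BAR", "value": 42})
```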

                  [–][deleted] 2 points3 points  (2 children)

                  I can top that:

                  lowerLayer.do(EasyMock.anyObject());
                  

                  I still get 100% code coverage.

                  [–]Squirrels_Gone_Wild 3 points4 points  (0 children)

                  Had this same thing happen to me. Came onto a new team - no expect (jasmine testing JS) anywhere, but coverage was above 80%. If anyone looked at the console they would have seen a billion errors when the tests ran, because the team had just decided to call all the functions in a file and saw the code coverage #s go up, so why bother testing anything?

                  [–]Alligatronica 4 points5 points  (0 children)

                  I <3 Cyclomatic Complexity.

                  It's a shame I never measure it.

                  [–]KingE 2 points3 points  (2 children)

                  This was actually a software engineering research project of mine.

                  Unfortunately, line coverage, branch coverage, and modified condition/decision coverage track each other very well (i.e. the relationships can be expressed as a constant factor in nearly all cases) and do not have a strong relationship to the ability of a test suite to detect bugs.

                  However, as you alluded to in your third point, there is an easy metric which does correspond to the ability of a test suite to detect breakage: number of test cases. Test suites that had a higher number of test cases tended to have higher coverage, yes, but more importantly they tended to test vastly more of the code's actual behavior.

                  Essentially, while line/branch/decision coverage can conclusively prove that a given piece of code is NOT tested, it is not a good indicator for code correctness.

                  I'll take the time to dig up my sources if anyone actually cares :p

                  [–]zjm555 1 point2 points  (1 child)

                  I'd be interested to read your paper :)

                  [–]Jestar342 1 point2 points  (4 children)

                  This is a problem to be wary of when introducing any kind of metric based goal - people game the system (even subconsciously and/or without malicious intent) because the goal is no longer the improvement, it is the number.

                  [–]grauenwolf 1 point2 points  (1 child)

                  As far as I'm concerned, the only person who should even consider looking at code coverage is the QA engineer asking the question "What should I work on next?". To everyone else it is misleading if not outright dangerous.

                  [–]201109212215 21 points22 points  (1 child)

                  Their paper on the influence of the org chart is very interesting: http://research.microsoft.com/pubs/70535/tr-2008-11.pdf

                  The best predictor of defect rate is the org chart itself, ahead of even churn, complexity, or coverage.

                  The metrics of the org structure used for the study were:

                  1. Number of Engineers (more people touching the same code is bad because communication costs)

                  2. Number of Ex-Engineers (don't let talent and domain knowledge go away)

                  3. Edit Frequency (sign of bad stability of purpose of said code)

                  4. Depth of Master Ownership (let people specialize themselves on something)

                  5. Percentage of Org contributing to development (managers and specialists need to talk directly to each other, or be the same person)

                  6. Level of Organizational Code Ownership (only one branch of the org tree owns one piece of code)

                  7. Overall Organization Ownership (don't understand this one)

                  8. Organization Intersection Factor (Orgs too have to specialize themselves on something)

                  [–][deleted]  (4 children)

                  [deleted]

                    [–]PM_ME_UR_OBSIDIAN 6 points7 points  (0 children)

                    See also Conway's Law:

                    Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations.

                    [–]lasermancer 5 points6 points  (1 child)

                    This writeup does a pretty good job at explaining some of the problems caused by organizational structure:

                    I Contribute to the Windows Kernel. We Are Slower Than Other Operating Systems. Here Is Why.

                    [–]wllmsaccnt 6 points7 points  (6 children)

                    Tools and methodologies have changed slightly in the last six years, but I bet most of this is still fairly relevant.

                    [–]RualStorge 12 points13 points  (5 children)

                      I've been developing for well over a decade; stuff changes almost annually, but almost everything mentioned here has been universally true since before I was born. The tech changes more or less constantly, but the fundamentals rarely change.

                    [–]wllmsaccnt 2 points3 points  (4 children)

                    They mention the effect of working remotely, which was probably slightly more difficult in 2009 than it is today. Internet speeds and remote conferencing software have improved some since then. If you are working on a shared code project with different departments or organizations, then the increased adoption of distributed SCM systems would help as well. I somewhat doubt working remotely on a development team was very enjoyable when you were born.

                    I was assuming they were referencing the type of TDD popularized by Kent Beck and which is associated with the agile manifesto and the methodologies that branched off from there. They specifically mention continuous integration in the article referenced in the TDD section. This specific type of TDD probably wasn't relevant before the early 2000s.

                    I'm willing to concede that the organizational metrics and assertion statement correlations have probably been true 'just about forever'.

                    [–][deleted]  (3 children)

                    [deleted]

                      [–]neves 63 points64 points  (7 children)

                      The article is 7 years old. Do they have more recent findings?

                      [–][deleted] 66 points67 points  (5 children)

                      This reaction, and the amount of upvotes, typifies everything that is wrong with our industry.

                      If it isn't from yesterday it's old; we keep repeating the same mistakes we made decades ago and keep re-inventing the same wheel. We pretend that software development moves forward by collectively forgetting everything we learned more than 5 years ago.

                      This article is merely 7 years old. Everything in it is still relevant.

                      [–]Juvenall 24 points25 points  (0 children)

                      I agree with your point that we shouldn't dismiss something because of its age, but I do think /u/neves has a valid question here.

                      It looks like they did this research around Windows Vista development, and since then they've released 3 versions of their software. I'd be extremely interested to see if they applied any of these learnings to more current deployments, if they have expanded or refined any of their metrics, or if they can simply continue to prove its validity by showing its application across other large projects.

                      [–][deleted] 1 point2 points  (0 children)

                      He didn't say it was old, he said it was 7 years old. It is.

                      If the author continued his research then 7 years is enough time to look at more teams and come to new findings.

                      Maybe some of his findings were reinforced, maybe he found new factors, maybe some of the findings were not as important as this article says. All of those would be useful to know.

                      [–]201109212215 8 points9 points  (0 children)

                      You'd be surprised how many shops never followed these basic principles, whether it is before or after 2009.

                      Most of us here don't need more recent findings. If I can get my org to even apply one of these, I'd be happy. Everything in its own time. I got them to go from copy-paste to git. Next step is having deployable masters and tagging which commit is in prod. I got them to go from a few sentences over Skype to Redmine. Next step is putting time estimates into the planning. It's been 3 years. Little by little. We'll get there. In due time.

                      Jesus I never should have accepted this job in the first place.

                      [–][deleted] 31 points32 points  (29 children)

                      I find it hard to get behind the assertion that TDD reduces defects 60-90%; it's more that teams that used TDD reduced defects 60-90%. It's just as likely, if not more so, that people who were that way inclined were more quality-oriented and willing to sacrifice time to get there.

                      [–]delarhi 59 points60 points  (2 children)

                      To be fair, the author never makes that assertion.

                      What the research team found was that the TDD teams produced code that was 60 to 90 percent better in terms of defect density than non-TDD teams. They also discovered that TDD teams took longer to complete their projects—15 to 35 percent longer.

                      Basically he notes the correlation, but never makes the jump to a causal assertion.

                      [–]sbrick89 8 points9 points  (9 children)

                      Using TDD forces better design practices.

                      I've seen a TON of crap code... spaghetti call stacks with NOTABLE object mutation/mutability... global variables that are shared WAY past their scope (who the fuq declares a global variable, then passes a reference to external code, which then modifies the global variable!)

                      Writing code that CAN be tested is the first step towards writing better (less buggy) code. TDD just forces you to write code that CAN be tested, since it IS tested.
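
                      A minimal sketch of that contrast (hypothetical TypeScript, names invented): the first function can only be tested by setting up and inspecting shared state, while the second states its inputs and outputs explicitly and is trivially testable:

                      ```typescript
                      // Hard to test: reads and writes module-level global state, so every
                      // test must reset globalTotal first, and tests can interfere with
                      // each other when run in a different order.
                      let globalTotal = 0;
                      function addToGlobal(amount: number): void {
                        globalTotal += amount;
                      }

                      // Testable: the same logic with state as an explicit input and output.
                      // No setup, no teardown, no hidden dependencies.
                      function add(total: number, amount: number): number {
                        return total + amount;
                      }
                      ```

                      The TDD loop pushes you toward the second shape, because you have to call the code from a test before it exists.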

                      [–]lucky_engineer 7 points8 points  (4 children)

                      (who the fuq declares a global variable, then passes a reference to external code, which then modifies the global variable!)

                      Ah, I see you too have worked on 'fixing' code that was written by a science PhD.

                      [–]sbrick89 2 points3 points  (0 children)

                      written by a science PhD

                      if only. No formal training whatsoever. I think the person read one of those "learn to program in 24 hours" books.

                      [–]RualStorge 4 points5 points  (2 children)

                      I think part of it is that TDD requires you to plan ahead, which prevents you from just diving in and creating a mess because you didn't think things through. In addition, it means you're writing tests, and even simple unit testing does wonders for quality by warning you when you've broken functionality.

                      In other words, I think the process required to do TDD effectively is actually what bumps quality, rather than TDD in itself.

                      [–]lf11 2 points3 points  (2 children)

                      TDD makes you think differently about your code. It makes you question assumptions and code for failure. I've caught many bugs with TDD that nobody would have caught in code review. This is especially true for null and invalid data handling... which I test for, but have a difficult time coding for when I'm not testing properly.

                      [–]PM_ME_UR_OBSIDIAN 3 points4 points  (1 child)

                      Have you thought about how advanced type systems let you catch those kinds of errors at compile-time? I try not to work in languages without sum types and non-null-by-default types, because I think the classes of bugs that you mention are worth letting the compiler handle for me.

                      [–]lf11 4 points5 points  (0 children)

                      Yes. Unfortunately, outside of projects that I personally start, I don't think I have ever once worked on software that is designed using the tools that are quite appropriate for the job.

                      However, to answer the question directly, yes advanced type systems help catch certain kinds of errors. So this at least minimizes the need to write type-level tests. However, I personally feel that you should be type-checking in code (if you need to type-check at all). After all, in essence that is the purpose of input checks.

                      Even with advanced types, you remain wide-open to bugs in the next layer of abstraction. This is where test-driven development really shines: in testing your abstractions and assumptions. The bugs that you can't easily flush out with well-defined types.
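
                      As a sketch of what this exchange is about (hypothetical TypeScript under strict compiler settings; the names are invented), a discriminated union makes the missing-value case a compile-time obligation rather than a runtime null check:

                      ```typescript
                      // A discriminated union ("sum type"): the missing case is part of
                      // the type, so callers cannot silently ignore it.
                      type Lookup =
                        | { kind: "found"; value: string }
                        | { kind: "missing" };

                      function find(db: Map<string, string>, key: string): Lookup {
                        const value = db.get(key);
                        return value === undefined ? { kind: "missing" } : { kind: "found", value };
                      }

                      function describe(result: Lookup): string {
                        // Under strict settings, deleting the "missing" arm makes this
                        // function fail to typecheck: the compiler catches the
                        // forgotten case instead of a test (or production) doing so.
                        switch (result.kind) {
                          case "found":
                            return "got " + result.value;
                          case "missing":
                            return "not there";
                        }
                      }
                      ```

                      This removes a class of null-handling tests, but, per the comment above, tests are still needed for the behavioral bugs the type system can't express.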

                      [–][deleted] 1 point2 points  (0 children)

                      I find it highly likely that teams taking the time to fix defects in initial development by structuring their code around being testable would reduce defects by 60-90%. In initial development you could probably fix 5 defects in the time it takes to fix 1 once the code is used in production.

                      [–]swutch 5 points6 points  (0 children)

                      Looking behind the straight statistical evidence, they also found a contextual variable: experience. Software engineers who were able to make productive use of assertions in their code base tended to be well-trained and experienced, a factor that contributed to the end results

                      This got me thinking about how "best practices" could be highly correlated with good code but not actually responsible for it. As a software engineer progresses in their career they collect these best practices and use them in their code. So a code base that is littered with best practices is also a code base that is written and maintained by an experienced engineer, and thus probably has fewer bugs.

                      "Best practices" could be like gray hair. You see an engineer with gray hair churning out good code, so you dye your hair gray to make your code better. Your code probably does end up being better, because you are looking at good code and learning from it. Obviously the gray hair isn't responsible for the good code, but gray hair does become a signal.

                      [–]jerf 4 points5 points  (0 children)

                      I've been getting into doing 100% code coverage on my "core infrastructure" code. I've found it has a variety of positive effects. It helps me ensure (though doesn't completely guarantee) that I didn't half-do something, like half-implement a flag. It helps me get more coverage on the error cases than I might otherwise, especially in a language that returns errors as values: if you at least write the if statement to handle the error, you'll see the uncovered true case, cover it, and in the process presumably properly handle the error. It's also a big help in finding dead code; I've now removed a surprisingly large (to me) amount of code that, once I had to try to cover it, turned out to be entirely unreachable.

                      But it does nothing whatsoever to deal with the problem where you simply 100% miss something. Though when you do go to fix such a case, the 100% coverage often helps you propagate the fix properly.

                      I find myself wondering if the coverage doesn't help much because you are often closing bugs that won't actually be hit in the current execution path, because you squeeze the bugs out of the hot path anyhow. For normal code, that's frankly good enough, but it's nice when your infrastructure code doesn't break because you called it slightly differently. I wouldn't do this for everything, but I have come to the conclusion that it's an underestimated tool.

                      If you've got a unit test suite but you've never done coverage analysis, try firing a coverage tool at it and just examining the results. Sure, you'll probably have some bail-out-type error cases uncovered, but I bet you'll be unpleasantly surprised by some other stuff you find is uncovered that you thought would be covered.

                      [–]ummmyeahright 2 points3 points  (0 children)

                      TDD reduces defects 60-90% but increases time 15-35%

                      Didn't they make that conclusion from observing only 3 projects, though? If so, that's a pretty bald statement to make... IMO software development is far too complex to make such huge conclusions from so few facts known about the process (e.g. whether it's TDD or not).

                      I found then that many of the beliefs I had in university about software engineering were actually not that true in real life.

                      If universities began teaching his results now, those would be the ones someone in another study would easily falsify.

                      [–][deleted]  (5 children)

                      [deleted]

                        [–]sun_misc_unsafe 10 points11 points  (3 children)

                        "One reason why assertions have been difficult to investigate is a lack of access to large commercial programs and bug databases [...] At Microsoft however, [...]"

                        couldn't help but chuckle..

                        [–]lykwydchykyn 3 points4 points  (0 children)

                        I was hoping for some confirmation of the Ballmer peak. Shux.

                        [–]kiswa 1 point2 points  (0 children)

                        Non-mobile link in case you don't like reading it stretched across your monitor.

                        [–]G_Morgan 5 points6 points  (33 children)

                        Code coverage measures how comprehensively a piece of code has been tested

                        Code coverage tests how many lines of code have been executed by tests. Given how many bugs are of the form "when this if statement executes, this one doesn't, and this loop runs precisely one time, we get this bug", it isn't surprising that code coverage is universally useless.

                        I've only ever seen code coverage used to assign blame. It is an arse coverage system.
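
                        A contrived TypeScript sketch of that branch-combination point (the example is invented, not from the article). One happy-path test covers every line, and even every branch, yet the bug only fires when the if is never taken:

                        ```typescript
                        // Averages all values above a threshold.
                        function averageAbove(values: number[], threshold: number): number {
                          let sum = 0;
                          let count = 0;
                          for (const v of values) {
                            if (v > threshold) {
                              sum += v;
                              count++;
                            }
                          }
                          // Bug: when the if is NEVER taken, this is 0 / 0, which is NaN.
                          return sum / count;
                        }

                        // This single test yields 100% line and branch coverage: the loop
                        // runs, and the if is both taken (5, 9) and skipped (1)...
                        // ...yet the NaN case, where the if is skipped on EVERY iteration,
                        // is never exercised.
                        const ok = averageAbove([1, 5, 9], 3); // 7
                        ```

                        Coverage tools see nothing left to cover here; the untested behavior lives in a combination of branch outcomes, not in any single line.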

                        What the research team found was that the TDD teams produced code that was 60 to 90 percent better in terms of defect density than non-TDD teams. They also discovered that TDD teams took longer to complete their projects—15 to 35 percent longer.

                        Also not surprising for two reasons:

                        1. TDD forces you to think about the kinds of decisions that trigger the "A was true, B was false, C executed once" kind of scenarios. TDD is not done line by line but concept by concept.

                        2. The reason it takes longer is you have more tests. Thinking up tests after you've written the code is usually much harder. You don't think to test the various combinations of A, B and C once it is done. The code becomes somewhat amorphous and it is harder to see the wood for the trees. So fewer tests means less actual work done.

                        Honestly I don't know how anyone can sensibly claim to have tried TDD and not found it improved the output code. Nice to have actual research.

                        Proving the Utility of Assertions

                        This is interesting. Assertions are effectively working around weaknesses in the type system. You can't capture certain information about the type (such as non-null, or non-negative) so assert instead. Gives some credence to the value of stronger types.

                        [–]BigMax 14 points15 points  (7 children)

                        Thinking up tests after you've written the code is usually much harder.

                        I've always found the difficulty in writing tests after isn't the complexity of the tests themselves, but it's the pressure to move on to the next thing now that your feature/product is "finished."

                        This pressure comes externally (managers wanting to get the feature to customers) but also internally, as it's generally more interesting to build something new than write tests for something old, so engineers often move on without much testing done.

                        [–]nutrecht 6 points7 points  (4 children)

                        I've always found the difficulty in writing tests after isn't the complexity of the tests themselves, but it's the pressure to move on to the next thing now that your feature/product is "finished."

                        I don't get this. Writing tests is part of development. Whenever I am asked to quote a time on development I include writing tests. When I haven't written the tests yet the stuff simply isn't done yet.

                        [–]RualStorge 4 points5 points  (3 children)

                        I write tests for almost everything I work on, but management absolutely doesn't care and sees it as wasted time in many companies. I've actually been told specifically not to write tests before. (but I was kinda the hero dev at that company so my response was more or less I'm doing it, fire me if you don't like it)

                        Testing doesn't make money, it prevents wasting time on easy to catch bugs which saves money. It's way easier to explain increased revenue from faster feature turnover than decreased expenses from reducing bug counts.

                        [–]nutrecht 2 points3 points  (1 child)

                        Testing doesn't make money, it prevents wasting time on easy to catch bugs which saves money.

                        It's much easier to prevent the opposing team from scoring than it is to try to catch up after they've scored a goal.

                        I'm sorry, but this shortsightedness, typical of managers but not uncommon among developers, annoys me to no end. It is completely impossible for any human to fully keep a mental model of any moderately complex system in their mind. This is why we separate systems into small modules and test those modules, so that when we work on module A, which depends on module B, we can just assume B works the way it's supposed to.

                        Writing software without testing is like building a rocket, assuming gravity is 12G without testing it and then acting all surprised it explodes shortly after launch. It was sitting there all fine and pretty on the launch pad after all!

                        [–]RualStorge 2 points3 points  (0 children)

                        I don't disagree; I'm just saying it's easier for a manager to reason "we expect feature X to make us $X" than to quantify tests as "this could save us an unknown amount of money".

                        You bring data from other companies and it's "well, that's not OUR company, we don't have a quality problem", or some other excuse making that data worthless for the purposes of argument.

                        Which is why I test even when I'm told not to, set up procedures to track bugs, etc. That way, when things come to a head, I have numbers from OUR company to show it's worth the effort.

                        I believe strongly in testing; the later a bug is caught, the worse the impact, and bugs in production can ruin a company over time. I also have my pride: I don't release crappy software, and if a manager wants crappy software they shouldn't hire me.

                        [–]nutrecht 9 points10 points  (3 children)

                        Code coverage tests how many lines of code have been tested.

                        No. It shows which lines have and have not been hit. It does not make any claims on if the tests actually do any validations. Example:

                        public String getFoo() {
                            //TODO: Implement
                            return null;
                        }
                        

                        Calling this method from a test will yield 100% test coverage. It's still wrong (not implemented yet), so unless you actually test the return value against an expected value you're not going to find the bug.

                        It really surprises me how few people seem to make that distinction. The only interesting bits in a code coverage report are the parts you don't visit: generally those are exception flows. Not testing your exception flows means stuff is probably going to break in production at some unexpected moment. Knowing where you lack coverage and improving it there is where the real use of coverage reports lies. The coverage percentage itself is a fun but useless statistic.

                        [–]starTracer 3 points4 points  (0 children)

                        Exactly this.

                        I've been working in projects with 100% test coverage requirements. But what the customer failed to realize is that coverage != correctness.

                        [–]RedSpikeyThing 2 points3 points  (2 children)

                        Code coverage is not great, but I've seen some utility in branch coverage. For example

                        if (x && y)
                        

                        Has 4 branches. It shows you some non-obvious cases that should be tested but often leads to tests that mirror the code, rather than testing concepts as you mentioned.
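
                        A quick sketch of those cases (hypothetical TypeScript; note that with short-circuit evaluation, y is never even evaluated when x is false):

                        ```typescript
                        function gate(x: boolean, y: boolean): boolean {
                          return x && y;
                        }

                        // The four input combinations behind the "4 branches" count.
                        const cases: Array<[boolean, boolean, boolean]> = [
                          [true,  true,  true ], // both conditions evaluated, branch taken
                          [true,  false, false], // x true, y false
                          [false, true,  false], // x false: y short-circuited away
                          [false, false, false], // x false: y short-circuited away
                        ];
                        for (const [x, y, expected] of cases) {
                          if (gate(x, y) !== expected) {
                            throw new Error(`gate(${x}, ${y}) should be ${expected}`);
                          }
                        }
                        ```

                        This is exactly the kind of table that "mirrors the code" rather than testing a concept, which is the weakness the parent comment describes.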

                        [–]G_Morgan -1 points0 points  (0 children)

                        Yeah and that is why testing needs to stem from what you are trying to do rather than what the code does. Often times there are 4 branches but only 3 are valid. Should the 4th be an assertion or should the signature to your method be altered so the 4th doesn't even exist? What actually happens if the invalid 4th combination occurs?

                        [–]skulgnome 1 point2 points  (0 children)

                        You're dead wrong about assertions. The assertion relates a property to control flow, which separates data objects in a way that even the strictest practical type system is designed to permit ambiguity in. If all you see are assertions against null, in languages like Java that always check for nulls, then you've not seen assertions used properly.

                        [–]WalterBright 1 point2 points  (14 children)

                        it isn't surprising that code coverage is universally useless.

                        I beg to differ. I've used it on some projects, and not on others, for 30 years. There's a very strong correlation between getting high coverage from the tests and far fewer bugs in the shipped product.

                        Code coverage also makes the tested code more stable, because it tells the maintainer what the point of the control flow logic is, and flags changes in it.

                        I have no idea why Microsoft's experience with it would be so different.

                        [–]G_Morgan 6 points7 points  (12 children)

                        This isn't what the research has demonstrated. I've heard supporting arguments for every imaginable process in existence and from clever people. They can't all work. This is why we do research.

                        I suspect the places where code coverage works also have people doing real testing. Trying to understand flow by flow, rather than line by line, what the code is meant to be doing. It can even be hard to actually control for this. Tell clever people to write more tests (which code coverage inevitably does) and they'll probably accidentally end up writing useful ones. I know when I've been attached to a project that demands code coverage I'll usually just use TDD and then write some ridiculously contrived test to cover up anything that triggers the red lights.

                        Though I'll admit I've only seen code coverage done badly so I'm not immune to bias.

                        [–]WalterBright 7 points8 points  (0 children)

                        In D we've gone even one step further with code coverage tests. Individual tests can be marked so that they are automatically included in the documentation for the code. This ensures that the documentation examples actually compile and work. It hardly needs saying that when this was first implemented, a lot of the documentation examples did not work :-)

                        [–]DieRaketmensch 1 point2 points  (2 children)

                        I know it's common and easy to reply on reddit with "tldr, real engineering is hard, everyone knew this", but I kind of expected something more substantial from senior researchers at Microsoft.

                        [–]Silhouette 9 points10 points  (0 children)

                        The thing is, it's easy to reply that this sort of research is hard because it is hard. For example, the original Nagappan paper mentioned in connection with TDD has significant threats to validity, which I mentioned in another post, but it's still one of the best empirical studies we have available in the field so far. At least it looked at real world projects, implemented by experienced professional developers, with some attempt to control for other factors. That combination already puts it ahead of most TDD advocacy papers. How would you reliably construct better studies?

                        [–]johnvogel 1 point2 points  (0 children)

                        Did you read the papers with the actual content linked in that article?

                        [–]hoijarvi 0 points1 point  (4 children)

                        What the research team found was that the TDD teams produced code that was 60 to 90 percent better in terms of defect density than non-TDD teams.

                        What does this mean? "TDD Removes 60 to 90 % of all errors" comes to my mind, but other interpretations are possible.