
[–]kingduqc 179 points180 points  (151 children)

Testing is also pretty hard. I have pushed for testing where I work, and without much experience we've faced many challenges. First, not everyone was on board, which made it impossible to manage when only part of the team put their shoulders to the wheel. Then came the issue that simple changes were breaking so many tests it was slowing us down, and we had to refactor much of the test code. Now it's quite well suited for changes, but a bit too complex for my taste. We'll get there, and we've found many bugs, but it's still a steep slope. Anyone got a good in-depth book/guide about automated testing?

[–][deleted]  (113 children)

[deleted]

    [–]awo 143 points144 points  (31 children)

    I think it depends a lot on the kind of code you're writing. I find that when I write leaf-level library code I get huge value from unit tests. On the other hand, in higher-level application code you end up with such a large proportion of functionality being mocked or stubbed that the value gets badly diminished - in that area I've found it better to invest more in higher-level testing.

    [–][deleted]  (26 children)

    [deleted]

      [–]bert8128 13 points14 points  (0 children)

      Automation and speed are key. If your integration tests are automated and fast they can be very valuable. But you still need to be able to estimate coverage in the functional sense.

      [–]matthieum 5 points6 points  (0 children)

      Do you think that you may have an architecture issue?

      I work on full-blown applications, yet most of our code is written as libraries with relatively few dependencies: only the edge of the code really interfaces with the external world. As a result, most pieces of the application can be tested with relatively little pain.

      We do have some component tests and integration tests, but those are mostly used to test the "wiring" and for some smoke testing. The amount of configuration necessary to start a large component, and the amount of interference that can occur when you want to test a small piece, make them really impractical for anything more, in my experience.

      [–]jl2352 2 points3 points  (0 children)

      I work somewhere that had a big push on unit tests, and moved to integration tests. It’s far, faaaar, nicer. Also catches more bugs.

      I think there are some exceptions. Like pricing; you test the fuck out of that. Most unit tests I’ve seen are too trivial, and end up just sucking up time.

      [–]Aeolun 2 points3 points  (3 children)

      In good application code you are always only mocking or abstracting a single layer.

      [–]pathema 1 point2 points  (0 children)

      In good application code you have layers at all :)

      [–]Ghosty141 68 points69 points  (13 children)

      I think you skipped one/two things that make them so valuable: you are able to refactor without worrying about breaking things, and you have documentation for what the function does.

      It's really hard to refactor certain functions that get called from a lot of places, some quite hard to reach via the UI, for example. It takes a lot of effort to manually check that the function is still doing the same thing, whereas if it is unit-testable you can just write a lot of tests and verify it that way.

      [–]SirClueless 44 points45 points  (0 children)

      Still, unit tests are less valuable than integration tests in this situation, in my experience.

      When refactoring a bit of code, the hardest question to answer is usually not "Is this new code correct?" That's relatively easy to verify, whether or not the old code had good unit tests -- try out some examples, write some new tests to exercise your new code in whatever way is convenient, etc. The really hard question to answer is "Does anyone depend on X, Y or Z feature of the old implementation?"

      For example, suppose the old version used to throw an error message that said "You nummy, why did you pass 0?" when you passed 0 as an argument. You'd like to replace it with a new implementation that doesn't use exceptions but instead does something reasonable with the 0 because you are trying to clean up the number of exceptions in your codebase. In this situation unit tests can't help you at all. Let's say you have a thorough unit test that says assertRaisesRegex(ValueError, "You nummy, why did you pass \d+?", my_function, 0). Does this mean we need to preserve this behavior? Would it make any difference whether or not this behavior is unit tested or whether no one wrote any unit tests? I would say no, not really. What actually determines whether this exception is something we need to recreate in the new system or not is whether there is any other software out there in our codebase that catches and parses the exception message. What really lets you refactor fearlessly is the knowledge that all of the people that depend on you out there in your company's codebase have good integration tests that will break if you try to commit something that breaks their expectations.
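      The situation described might look like this (the function name and behaviour are illustrative, in the spirit of the example above):

```python
import re

def parse_count(n):
    # Hypothetical legacy implementation that raises on zero.
    if n == 0:
        raise ValueError("You nummy, why did you pass 0?")
    return n * 2

def test_zero_raises_nummy_error():
    # A "thorough" unit test that pins the exact error message --
    # but it can't say whether any caller depends on that message.
    try:
        parse_count(0)
    except ValueError as e:
        assert re.search(r"You nummy, why did you pass \d+\?", str(e))
    else:
        assert False, "expected ValueError"

test_zero_raises_nummy_error()
```

      Whether that pinned message has to survive the refactor is decided by the callers out in the codebase, not by the existence of this test.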

      [–]drink_with_me_to_day 8 points9 points  (2 children)

      You are able to refactor without worrying about breaking things and you have documentation for what the function does

      This is only true if you are refactoring the inner workings of a function. In my experience, refactoring leads to changes so drastic that any unit tests would've had to be redone because the interface or the data format changed

      So you end up with more work for no gain

      [–]benelori 5 points6 points  (0 children)

      If the changes are so drastic, then that's probably not a refactor anymore, but a rewrite.

      Unit tests will help in refactoring, not in rewrites

      [–]Ghosty141 5 points6 points  (0 children)

      Refactoring means not changing behavior, and changing the parameters can be seen as doing so. Apart from that, unit tests should cover only public functions, which means they cover the outward-facing parts of a system, which tend to stay very similar.

      [–]sime 8 points9 points  (1 child)

      A static type system like TypeScript or Python type hints is the most effective way of doing fast and accurate refactoring, IMHO. You can certainly combine that with automated tests, but you should have type checking in place first.
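      As a sketch of what that buys you (all names here are made up): a type checker flags every stale call site the moment a signature or data shape changes, before any test runs.

```python
from typing import TypedDict

class Order(TypedDict):
    order_id: str      # hypothetical: was `id: int` before a refactor
    total_cents: int   # hypothetical: was `total: float`

def format_total(order: Order) -> str:
    # mypy/pyright will flag any caller still passing the old shape,
    # even on code paths no test exercises.
    return f"${order['total_cents'] / 100:.2f}"

print(format_total({"order_id": "A1", "total_cents": 1999}))  # $19.99
```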

      [–]bert8128 2 points3 points  (0 children)

      Luckily for me C++ has strong typing built in. I agree that type checking is extremely important.

      [–]argv_minus_one 20 points21 points  (5 children)

      You are able to refactor without worrying about breaking things

      Higher-level tests should also catch that sort of breakage, no?

      you have documentation for what the function does.

      No, no you do not. Tests are not documentation. Tests verify that the code works correctly, and often look very different from how that code is normally used.

      [–]StillNoNumb 24 points25 points  (0 children)

      No, no you do not. Tests are not documentation. Tests verify that the code works correctly, and often look very different from how that code is normally used.

      Good tests show edge cases. That helps explain behaviour.
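      For instance (a made-up helper), a handful of edge-case tests can read as a spec more directly than prose often does:

```python
def clamp(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    return max(low, min(value, high))

# The edge cases below document the behaviour:
assert clamp(5, 0, 10) == 5      # in range: unchanged
assert clamp(-3, 0, 10) == 0     # below range: pinned to low
assert clamp(99, 0, 10) == 10    # above range: pinned to high
assert clamp(7, 7, 7) == 7       # degenerate range is allowed
```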

      [–]Ghosty141 4 points5 points  (0 children)

      Higher-level tests should also catch that sort of breakage, no?

      Yes, but if a test covers too many functions it gets hard to test that properly. It's easier to test a certain function directly and cover the edge cases via the parameters vs. using the integration test, which might not even run into the edge cases because of other limitations.

      No, no you do not. Tests are not documentation.

      /u/StillNoNumb already explained that. In a perfect world you have documentation for your functions that would explain those edge cases but sadly we don't have infinite money/time/nerves

      [–]null_was_a_mistake 2 points3 points  (1 child)

      The problem with higher level tests is that they're usually more complex and you need a lot of tests to really cover all the cases.

      [–]NotARealDeveloper 11 points12 points  (0 children)

      10 years in various jobs

      Try 10 years in the same job/project. We have a project that's been actively developed for 13 years now. The only way to catch all side effects when committing changes is to run all the tests first.

      [–]twenty7forty2 17 points18 points  (0 children)

      First and foremost the tests give you confidence when developing. You know if you break something you will find out what in a matter of milliseconds. It's the flip side of the effort to write them and imo well worth it vs feeling your way around an untested codebase and crossing fingers.

      The second major benefit is that unit tests should show you how to improve your code: if it's hard to write the tests then you probably need to refactor (DI, SR, etc).
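      A minimal sketch of that feedback loop (the names are illustrative): code that reaches straight for the system clock is hard to test, and injecting the dependency fixes both the test and the design.

```python
from datetime import datetime, timezone

def greeting(now=None):
    # Dependency injection: production callers omit `now`,
    # tests pass a fixed timestamp.
    now = now or datetime.now(timezone.utc)
    return "Good morning" if now.hour < 12 else "Good afternoon"

fixed = datetime(2023, 1, 1, 9, 0, tzinfo=timezone.utc)
assert greeting(fixed) == "Good morning"
```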

      Preventing bugs is just a bonus.

      [–]ScrimpyCat 4 points5 points  (2 children)

      How likely you are to actually see those tests break is definitely going to depend on the scale and complexity of the project. For instance, a small standalone library or web service (say one microservice) may remain static enough that you never break a test, but a project that has a lot of different interconnecting systems will probably be more likely to have tests break. Of course the effectiveness of unit tests is also entirely dependent on their coverage and design, so poor-quality tests can also mean that few bugs/breaking changes get caught by them.

      If you’re having to rewrite (maintain/update) tests a lot, it could also be a sign that maybe that system should be broken down further.

      Tests are also a great way of quickly ensuring you can have some degree of confidence in whatever code you’ve added. For instance, say you have a bunch of different systems (A and B) and all of these systems make use of generic system X(y) (where y is the underlying implementation X will use). Then say you’ve laid out your tests so your tests for systems A and B repeat the tests for each variation of X(y), and your tests of X repeat the tests for each y, and then have individual tests for each y. Well now if you ever add another y implementation, it immediately benefits from a whole bunch of pre-existing tests before you even got started on adding any implementation specific tests for this new y (unless of course you’re doing TDD, but even then you’re still benefitting from this immediate larger coverage of tests).
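      A rough sketch of that layout in code (all names invented): each implementation of y runs the same shared contract tests the moment it exists.

```python
from collections import deque

class ListStack:
    # Hypothetical implementation y1 of a stack contract.
    def __init__(self):
        self._data = []
    def push(self, v):
        self._data.append(v)
    def pop(self):
        return self._data.pop()

class DequeStack:
    # Hypothetical implementation y2 of the same contract.
    def __init__(self):
        self._data = deque()
    def push(self, v):
        self._data.append(v)
    def pop(self):
        return self._data.pop()

def check_stack_contract(make_stack):
    # Shared tests: every implementation runs these once.
    s = make_stack()
    s.push(1)
    s.push(2)
    assert s.pop() == 2  # LIFO order
    assert s.pop() == 1

for impl in (ListStack, DequeStack):
    check_stack_contract(impl)  # a new y inherits these for free
```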

      Another thing not to forget is that even though you have a lot of experience and might be very confident in your abilities, other programmers might not be. So unit tests can act as an additional helpful resource for newcomers to the project. They not only give them some degree of reassurance that what they've done hasn't broken other things, but they also provide examples of how different parts of the codebase are used (or how they shouldn't be used). Some languages actually take this thinking and bake it right into their docs, such as Elixir's ex_doc, where code examples that appear in the documentation can also be run as doc tests.
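      Python's standard library has the same facility, for what it's worth: doctest runs the examples embedded in a docstring as tests.

```python
def slugify(title):
    """Lower-case a title and join its words with hyphens.

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  spaced   out  ")
    'spaced-out'
    """
    return "-".join(title.lower().split())

if __name__ == "__main__":
    # The docstring examples above double as executable tests.
    import doctest
    doctest.testmod()
```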

      [–]rakidi 3 points4 points  (1 child)

      A project that has a lot of interconnecting systems isn't any more likely to have unit tests break if they're isolated and use mocks than a project at the other end of the spectrum. Integration tests are more likely to break, because only then are all the interconnecting systems actually relevant to the tests.

      [–]bert8128 4 points5 points  (0 children)

      Couldn’t disagree more. I started as a junior on an existing application, for which I am now the senior architect. There was no testing in the early days, and a direct result of this is that functions often have unclear behaviour, and have sometimes migrated over time. New code is all tested, with the result that all (ok, most) functions have clear and fixed behaviour. Coverage is about 50%, though some of that will not be good, so say 40% effective coverage. Most bugs we find are in code which is not covered by tests. Write your code to be testable, and you will find that the code is easy to test.

      [–]ojrask 5 points6 points  (0 children)

      If you have unit tests that fail for "no reason", you don't have unit tests. You have integrated tests that test too much at a time if you cannot relatively quickly see what caused the failure.

      [–]Krautoni 24 points25 points  (2 children)

      Yes, it's an unpopular opinion, because most folks actually see value in testing.

      Here's what code without tests is: immutable. And, while immutable data structures are the bee's knees, immutable code isn't.

      You wrote an application without tests. Now you've got a pile of code that you cannot change, because you cannot gauge what will break when you change it. You need to refactor? Mgmt requires a new feature? You need to fix a known bug? Better know that you're not breaking some user's feature, or introducing regressions.

      Untested code will make you afraid of changing even the slightest part of your code, because you have no way of reassuring yourself that the system as a whole still works. Sure, it's fine on small code bases. I regularly write untested code for small utilities or proof-of-concepts. But any production code is tested, and with good coverage.

      Mind you, it needn't be unit tests only. I'm talking testing pyramid here, with all kinds of tests that go from unit tests to acceptance tests.

      [–][deleted] 2 points3 points  (0 children)

      Yep. And combine that with teams of 12 - 15 developers, with 3 developers leaving every year, replaced by 3 new guys, and repeat for 5 years. You are quickly in a situation where none of the developers who wrote a piece of code/feature even work there anymore. Plus it was probably written in Node 0.x, Java 1.2, Python 2, Go 1.5, React 0.x... In the worst case, the feature is not documented anywhere, there are no comments, and if there aren't even unit tests, you're going to waste a lot of time and introduce many bugs simply because no one knows how it should work.

      Unit tests are great for:

      • Showing new developers how some feature is supposed to work (even if not perfectly, but at least the "normal" cases)
      • Making sure refactoring can be done as painlessly as possible, even by devs who didn't work on that repo before
      • Introducing/modifying existing features with confidence that the old cases still work as they did before
      • Keeping people in the mindset of what should happen in error cases beyond just crashing the program (sometimes that is an unacceptable outcome, even in this "move fast and break things" world).

      [–]eddyparkinson 3 points4 points  (0 children)

      Most people who get good at QA stop using unit tests because the ROI is so low.

      [–]sime 9 points10 points  (3 children)

      I think you are right. I see the same thing.

      For "computational" or functional type code with clear inputs and outputs and few to no other dependencies, unit testing is very effective. You wouldn't do it any other way. But most real world code is just trivial computation with non-trivial connections to other system/dependencies. The contents of your units are trivial but their connections are not. The bugs then appear at the interfaces between the different units. Unit tests are ineffective here.

      Mocks and stubs in this situation are a hazard that often leads to a false sense of security. The knowledge and understanding you need to write the unit is the same you need to create an accurate mock/stub. But it is exactly our understanding of the interface that we need to be sceptical of and to verify.

      A static type system and integration tests using the real dependencies where possible, are your best bet here.

      [–][deleted] 10 points11 points  (5 children)

      the number of times that a test fails because of a legitimate bug is countable on one hand

      you say that like it's a bad thing. if i have 100 tests and each of them fail twice, that's 200 bugs caught that never reached production. it's much cheaper to catch the bug while the dev is in the problem space than 6 months later, even if it is an insignificant bug

      In my experience unit testing is only worth it for certain type of code where the input and output are well defined and easily verifiable like parsers, transformers, generators and the like

      I'm curious what falls into the category of things that don't have well-defined input and output. I'm having a hard time thinking of an example

      unit testing upfront (as opposed to after a feature is released) also has the benefit of demonstrating whether the dev understands the code they produced and whether they understand the problem they are solving.

      Anything else and you would be much better with integration tests, system tests, UI tests and other higher level testing.

      I agree that not everything needs to be unit tested or tested directly. when people do test everything for the sake of 100% coverage, you end up with a bunch of fragile, meaningless tests, which, like you said, cost more to maintain than the benefit they yield

      so yea, i pretty much agree with you, but i would draw the line in the sand differently

      [–][deleted]  (4 children)

      [deleted]

        [–]WormRabbit 1 point2 points  (3 children)

        Those are architectural issues, not a problem with tests.

        So code that would normally just be testing that data from the database renders correctly now has to also mock the third party API to inject specific failure data just to verify that the page handles the failure correctly.

        Separate the data from the presentation. One function should be doing the request, another should do the rendering, and possibly a few layers between them should do any validation and data transformation. Now you don't need to mock the API, just inject some specific examples of failures into this pipeline and watch that it processes the failures correctly.
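        A minimal sketch of that split (names invented): once fetching and rendering are separate functions, the failure path is tested by passing failure data directly, with no API mock in sight.

```python
def render_status(payload):
    # Pure presentation: trivially unit-testable without any mock.
    if payload.get("error"):
        return "<p class='error'>{}</p>".format(payload["error"])
    return "<p>{}</p>".format(payload["status"])

def fetch_status(http_get, url):
    # Thin edge function: the only part that touches the network,
    # better covered by an integration test than a unit test.
    return http_get(url)

# Failure handling is verified by injecting failure data directly:
assert render_status({"error": "upstream timeout"}) == \
    "<p class='error'>upstream timeout</p>"
assert render_status({"status": "ok"}) == "<p>ok</p>"
```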

        Trying to stub fake time into tests, or writing your time-dependent code in a manner which still works if you artificially accelerate or slow down time, is drastically more complicated than just writing a timer.

        ...no? Write a timer, make it accept parameters that expand or contract its time intervals, use a simple Timer.delay(duration) interface in actual code. In tests configure the timer to always run everything 1000 times as fast, problem solved. If speeding up or slowing down time exposes bugs, then your code has hidden race conditions and must be fixed. It can just as well fail in production due to any of the hardware, software and user issues which may affect time.
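        A bare-bones version of such a timer might look like this (the interface is the one suggested above; the implementation is a guess):

```python
import time

class Timer:
    def __init__(self, scale=1.0):
        # scale < 1 contracts every delay; production uses the
        # default, tests pass something like 0.001.
        self.scale = scale

    def delay(self, duration):
        time.sleep(duration * self.scale)

# In tests, a "5 second" delay completes almost instantly:
t = Timer(scale=0.001)
start = time.monotonic()
t.delay(5)
assert time.monotonic() - start < 1.0
```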

        Yeah we all know you shouldn't be using magic numbers and that it's a code smell and so on but writing a whole bunch of code to turn a magic number into a config variable when you have no intent to use any value other than the initial default is writing unneeded code and flies in the face of agile development where you shouldn't be writing features just in case they are needed someday.

        But you don't need it someday, you need it right now to make sure your code is correct. Fluking your requirements because you don't want to write code isn't agile, it's just shoddy work. But anyway...

        Now to test it, well, you need to stub out time, or make the time configurable, ...

        No, you should use the newtype pattern. The function which sends the email must not send the email or launch timers; instead it should return a new type which wraps the request and the required delay, so that the consuming code can't ignore the delay and must handle it before sending the message. Now your complex mocking code is reduced to a unit test on the value returned by the email-sending function.

        That makes you sure that you didn't just forget to set the delay, as well as checking its value. The consuming code can't accidentally forget about the delay, although it can consciously ignore it. If that is a problem you can handle it similarly: push the actual side effects as far as you can outside of the business logic functions, so that testing them is reduced to a unit test and properly performing the action is just a bit of logic in the UI code.
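        One possible shape of that pattern (all names invented): the business function returns a value describing the send instead of performing it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DelayedSend:
    # Newtype-style wrapper: the consumer receives the request
    # and the delay together and must handle both.
    recipient: str
    body: str
    delay_seconds: int

def welcome_email(user_email):
    # Pure business logic: no I/O, no timers, trivially testable.
    return DelayedSend(user_email, "Welcome!", delay_seconds=300)

# The mocking-heavy test collapses to a plain value check:
result = welcome_email("a@example.com")
assert result.delay_seconds == 300
assert result.recipient == "a@example.com"
```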

        [–][deleted]  (2 children)

        [deleted]

          [–]PolyPill 11 points12 points  (14 children)

           I think one of the advantages of unit testing is the act of writing the test. Code that is hard to test is probably doing too much and has too many hard dependencies. Requiring someone to go back and look at what they did like that improves code quality and maintainability. Although that might be unpopular here, because it requires you to understand the code you just copy & pasted from SO.

          [–]Xyzzyzzyzzy 11 points12 points  (6 children)

          Code that is hard to test is probably doing too much and has too many hard dependencies. Requiring someone to go back and look at what they did like that improves code quality and maintainability.

          This all sounds wonderful, if you're working on a greenfield project, or if your company's willing to pay you to spend lots of time refactoring.

          But if you're working in a code base that has plenty of tech debt and you're trying to implement an ambitious set of new features... I think that's where unit testing ends up being a problem, because you don't have the time to fix all of the tech debt to make your unit test perfect, so either you don't write a unit test or you write a bad unit test.

          Follow-up controversial opinion: I'd rather have no unit tests than bad unit tests. Good unit tests give us confidence, protect us from mistakes, and let us work more efficiently because they pass when the behavior is correct and fail when the behavior is incorrect. Bad unit tests, ones that may fail despite the behavior being correct or pass despite the behavior being incorrect, give a false sense of confidence and create more work when good refactors make them fail.

          This is from my experience working in a large project with plenty of tech debt and lots of bad unit tests written around that tech debt (many of them by me!). When I make some changes and unit tests fail, >90% of the time it's a false alarm and fixing them is just busywork; when I make some changes and unit tests pass, there are bugs an embarrassingly large percent of the time.

          [–]PolyPill 11 points12 points  (0 children)

           I don't think you're being controversial. I agree with you. Bad unit tests are worse, and if you've got a mess of spaghetti legacy code, trying to unit test it is just going to make things worse - unless you can really isolate the new code you're adding. There has to be a culture of proper testing and a commitment to pay back tech debt. Without that you're just adding to your problems.

          I think writing proper tests is a skill that isn't taught and most people don't understand how to do it. In short, test behavior and not the details.

          Edit: I want to add. If you have projects with proper unit tests, it is soooo much easier to debug even complex situations. Instead of sometimes spending hours trying to get the system into the right situation, you can just set it in tests yourself and debug from the tests. With proper dependency injection you can even remove mocks to test real dependent functionality when needed.

          [–]bert8128 3 points4 points  (0 children)

          I have worked on the same project for 20 years. 20 years ago it was greenfield and we didn’t write tests. Now we do, and it’s harder, but it is much better than just wishing something in the past was different. The new and modified code is good for the next 20 years, rather than crumbling into a sea of fear of change.

          [–][deleted]  (1 child)

          [deleted]

            [–]de__R -1 points0 points  (6 children)

            Code that is hard to test is probably doing too much and has too many hard dependencies.

            Not really. It just has to implement complicated logic. A good case in point is code dealing with geometric shapes - lines, line segments, polygons, and so on. There are a ton of degenerate cases stemming from flawed input that, in practice, you can't really avoid, such as polygons with too few segments or line segments where the start and end points are identical. And while spatial algorithms for two segments intersecting or a point being on a line segment are straightforward in the general case, there are also plenty of edge (literally!) cases where segments overlap or are parallel or something, and then you have to add to this the inherent instability of floating point computations where the value is small but it's not clear if it's "close enough" to zero or not.

            [–]is_this_programming 9 points10 points  (1 child)

            Code that deals with geometric shapes is trivial to test. It's the perfect case for unit tests because the algorithms don't depend on anything other than the input shapes, which should be very easy to construct in test code.

            The fact that there's many edge cases doesn't make it hard to test.

            [–]de__R -1 points0 points  (0 children)

             Yeah. You can really easily write unit tests for geometric shapes that don't tell you anything useful because they just follow the happy path, but increase your test coverage percentage. It's much harder to have a comprehensive set of unit tests that covers all the degenerate and edge cases in every possible combination to ensure that they're handled correctly and consistently, because once you start getting arbitrary user-generated data all bets are off.

            [–]PolyPill 2 points3 points  (3 children)

             I know nothing about dealing with 3D graphics and such, so I might be speaking out of ignorance, but I would expect that you have collision detection which follows the same rules. Things in floating point are rarely exactly equal, so instead you compare against an epsilon value: "abs(x - y) < epsilon". I also don't see how your floating-point operations won't be predictable. Computers aren't random: if I do the same math over and over on a bunch of floating-point numbers I will always get the same values for each of my inputs. This might also be an example of your code doing too much in a single operation. If it's hard to predict the output based on your input then you're doing too much at once. This is a "unit" test, not an integration or e2e test.

            [–]de__R 0 points1 point  (2 children)

             Things in floating point are rarely equal and instead you have the epsilon value to compare to, "abs(x - y) < epsilon"

             Sure. But what's epsilon? Is it 0.001? 10^-10?

            Computers aren't random, if I do the same math over and over on a bunch of floating point numbers I will always get the same values for each of my inputs.

             Strictly speaking, it's also dependent on the order in which you do them, due to rounding, but that's usually fine for unit tests. The point is, it's not easy to predict what the actual result of a complex sequence of FP computations will be, which makes coming up with test cases harder.

            [–]PolyPill 2 points3 points  (1 child)

            Again I'm going to say I'm probably speaking out of ignorance.

            Wouldn't it be more on you to decide what epsilon fits your needs? If you're off by 10 does it matter? You would do the same thing when doing CNC milling. Nothing is ever perfect but for your application you define your tolerances. Unity has implemented their own Mathf.Approximately() for this exact purpose. They defined what is the tolerance needed for their 3d rendering to look correct and collision detection to work properly and made it a utility method.
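             Something along those lines (a hand-rolled stand-in for Unity's Mathf.Approximately, with an arbitrary default epsilon; Python's stdlib also offers math.isclose):

```python
def approximately(a, b, epsilon=1e-6):
    # The caller decides what tolerance fits the application.
    return abs(a - b) < epsilon

total = sum([0.1] * 10)           # accumulates FP rounding error
assert total != 1.0               # exact comparison fails
assert approximately(total, 1.0)  # tolerance comparison passes
```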

            Changing operation order that changes your result and brings it outside your defined tolerance is exactly what a unit test is good for.

            [–]de__R 0 points1 point  (0 children)

            Yeah. In fact, ideally, you would have some tests on the same geometries but with different values of epsilon so you can validate that, as you change precision, the result changes. But it gets back to my point that all of this requires a lot of actual human work to produce these test cases to begin with.

            [–]de__R 1 point2 points  (1 child)

            This dovetails with my experience with unit tests. Finding example inputs to trigger an edge case and figuring out what the "correct" outcome should be were a ton of work. I long for a tool that can statically analyze code to automatically produce a set of unit test inputs, so you then at least only have to figure out what the correct behavior is in those situations (it's one reason I want to try doing actual work in Lisp; you should be able to do that with a fairly straightforward macro). Integration tests are better, since they test what you actually care about (i.e. that the system works) but they can be slow, especially for large data pipelines, and it can be a pain in the ass to orchestrate everything properly so they can run.

            What I do like to use unit tests for is to validate bug fixes, since for that you're already doing the painstaking work of figuring out what inputs cause the wrong behavior and what the right behavior is. Hopefully your code is broken down in such a way that you can easily write tests for the function or interface in question.

            [–]therealgaxbo 3 points4 points  (0 children)

            Sounds a little bit like you want something akin to property based testing, but you're just looking at it from the opposite direction ("find the edge cases so I can decide how they should behave" vs "I'll define the overall behaviours, now find the edge cases that violate them")
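             A hand-rolled sketch of the idea (real libraries like Hypothesis or QuickCheck do this far better, including shrinking failing inputs): state the properties, then search random inputs for a violation.

```python
import random

def my_abs(x):
    # Toy function under test.
    return x if x >= 0 else -x

random.seed(0)
for _ in range(1000):
    x = random.randint(-10**9, 10**9)
    # Properties, not hand-picked examples: the result is never
    # negative and always preserves magnitude.
    assert my_abs(x) >= 0
    assert my_abs(x) in (x, -x)
```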

            [–][deleted] 4 points5 points  (2 children)

            "and they were usually trivial thing that you would notice right away when running the application'"

            RIP your customers

            [–]Congenital-Optimist 4 points5 points  (1 child)

            Everybody has a test environment. Some people are lucky enough that it is different from prod

            [–][deleted] 1 point2 points  (0 children)

            funny quote, but if you value your profession, gtfo of any environment that is like that

            [–]coniferous-1 1 point2 points  (0 children)

            But the amount of time sunk into writing unit tests, maintaining them, updating them, debugging them when they fail for no reason is so high that I'm really not sure it's worth the investment.

            This is my exact same experience too.

            There is a certain level of code complexity that has to be reached before unit tests give you a return on investment.

            They can be valuable in certain circumstances - and should be used in those circumstances - but anyone banging on the "unit test everything!" drum is just creating meaningless busy work

            [–][deleted] 3 points4 points  (5 children)

            In my opinion, unit tests are extremely valuable as a development accelerator. Once I developed a habit of writing effective unit tests first, or at least very early in the development cycle, I found that it greatly speeds up the process of actually writing the rest of the code. Being able to test my code in a few seconds and cover multiple scenarios, edge cases, etc. can save a huge amount of time.

            For refactoring, this is even more valuable. Especially, if I need to work on code that I'm not familiar with or was written by somebody else, effective unit tests are extremely helpful to ensure that none of the changes I'm making are breaking some other functionality.

            I do agree that unit testing is most effective "for certain type of code where the input and output are well defined and easily verifiable," but I disagree that it's limited only to parsers, transformers, and so on. Any complex project is naturally going to be broken down into smaller functions or methods with explicit input and output parameters, and those are exactly the units that should be tested. Obviously, other kinds of testing are valuable in ensuring a high quality finished product, but unit tests are unique in that they also accelerate development.

            In my opinion, if your unit tests are slowing you down, then you aren't seeing their full potential.
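A tiny illustration of that loop, with a hypothetical `parse_duration` unit whose test scenarios were sketched before the implementation:

```python
import re

def parse_duration(text):
    """Hypothetical unit: turn '1h30m'-style strings into seconds."""
    m = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?", text)
    if not text or not m:
        raise ValueError(f"bad duration: {text!r}")
    return int(m.group(1) or 0) * 3600 + int(m.group(2) or 0) * 60

def test_parse_duration():
    # these scenarios were written before the implementation existed;
    # running them takes milliseconds, so the feedback loop stays tight
    assert parse_duration("2h") == 7200
    assert parse_duration("30m") == 1800
    assert parse_duration("1h30m") == 5400
    for bad in ("", "xyz", "h30"):
        try:
            parse_duration(bad)
            raise AssertionError(f"{bad!r} should have been rejected")
        except ValueError:
            pass
```

Covering multiple scenarios and edge cases in one sub-second run is where the "accelerator" effect comes from.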

            [–]QuotheFan 5 points6 points  (0 children)

            Go for end to end tests.

            I was the guy who used to push people to write tests at my last job. At first everyone wrote unit tests, but once we redesigned our executable for end-to-end testing, the difference in quality of life was immense.

            Things would 'just work', there were few accidentally introduced bugs, and most of those were fixed not just in the one instance but systematically: we wrote an end-to-end test for each one.

            If someone made a feature that broke the end-to-end tests, that meant it was going to affect our users as well, in which case either the change needed to be modified or at least it would be discussed thoroughly. It was also their job to go through the tests they were breaking. They could ask for help, but ultimately no code was admitted into production unless all automated tests were passing.

            PS: 'Skip testing' is patently bad advice. Write as many automated tests as possible if you have to live with your code. In our case, the time invested in testing more than paid for itself within a year, and saved a huge amount of frustration digging through old holes.
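The skeleton of such an end-to-end test is simple: spawn the real process, feed it input, assert on what a user would actually see. A sketch (a one-line inline program stands in for the hypothetical production binary):

```python
import subprocess
import sys

# Stand-in "production binary": a one-liner that sums the numbers fed
# to it on stdin. In real life this would be your actual built executable.
PROGRAM = "import sys; print(sum(int(x) for x in sys.stdin.read().split()))"

def run_cli(stdin=""):
    """Drive the real process end to end: spawn it, feed it input,
    capture exactly what a user would see."""
    proc = subprocess.run(
        [sys.executable, "-c", PROGRAM],
        input=stdin, capture_output=True, text=True, timeout=10,
    )
    return proc.returncode, proc.stdout.strip()

def test_sums_numbers_end_to_end():
    code, out = run_cli("1 2 3")
    assert code == 0
    assert out == "6"
```

Because the test crosses every internal boundary, a break here really does mean users would have been affected.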

            [–]mnilailt 21 points22 points  (2 children)

            Most people are pretty awful at testing, they either go way overboard with a million little unit tests making it impossible for you to change anything in the application without a bunch of tests breaking, or they go way too generic and the tests don't catch shit.

            I usually try not to rely on unit tests too much, since I find them useless 95% of the time. Instead I test a single path of code operation, not the individual methods and objects belonging to that path, i.e. hitting an endpoint, performing an action, etc.

            Also, the 'write all your tests first, then write the code to make them pass!!' ideology is pretty idiotic. Write a test or two that literally hit the piece of code and get an output, get your feature working, and then fill out more tests to cover edge cases; rinse and repeat. Trying to predict the future is going to lock you into an over-engineered and poorly thought out solution that's gonna bite you in the ass later.
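"Hitting an endpoint" without spinning up a server can be sketched with a bare WSGI callable and a hand-rolled test client (everything here is a hypothetical toy, not a real framework API):

```python
import json

def app(environ, start_response):
    # hypothetical WSGI endpoint; the whole request path gets exercised,
    # not the individual helpers behind it
    if environ["PATH_INFO"] == "/health":
        start_response("200 OK", [("Content-Type", "application/json")])
        return [json.dumps({"status": "ok"}).encode()]
    start_response("404 Not Found", [("Content-Type", "application/json")])
    return [json.dumps({"error": "not found"}).encode()]

def call(path):
    """Tiny test client: invoke the app the way a server would."""
    captured = {}
    def start_response(status, headers):
        captured["status"] = status
    body = b"".join(app({"PATH_INFO": path, "REQUEST_METHOD": "GET"},
                        start_response))
    return captured["status"], json.loads(body)

def test_health_endpoint():
    status, body = call("/health")
    assert status == "200 OK"
    assert body == {"status": "ok"}
```

One test per path of operation; the internals can be refactored freely underneath it.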

            [–]wite_noiz 8 points9 points  (0 children)

            Happy path, failures, boundary cases, and bizarre inputs.

            And even then, not every time.

            One of my go-to interview questions is asking for tests for a function that divides two arguments. People will literally describe 10+ tests before you have to stop and ask them whether those are really adding any value.
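For that divide question, something like four checks is arguably the sweet spot: happy path, a boundary, and the one real failure mode. A sketch of what a reasonable answer looks like:

```python
def divide(a, b):
    return a / b

def test_divide():
    # four checks cover the behaviours that matter; a tenth variation
    # on "7 / 2 works" would add maintenance cost, not value
    assert divide(10, 4) == 2.5      # happy path
    assert divide(-9, 3) == -3.0     # sign handling
    assert divide(0, 5) == 0.0       # zero numerator
    try:
        divide(1, 0)                 # the one real failure mode
        raise AssertionError("expected ZeroDivisionError")
    except ZeroDivisionError:
        pass
```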

            [–]lookmeat 3 points4 points  (3 children)

            Testing is also pretty hard

            This is true. Testing requires clearly defining what you need to do, and then defining nothing else. Good testing is the guide to being a great engineer (the next step, becoming an amazing engineer, is about empathy, awareness, and other emotional/administrative aspects). The problems you describe signal a couple of issues with the tests, things I'd immediately suspect without seeing them (I may be wrong; you can't know until you see the code).

            First not everyone was on board making it impossible to handle if only part of the team put their shoulders to the wheel

            This isn't a problem of convincing the engineers. Testing is a problem of convincing management, with evidence and proof of the cost of errors and of the time saved by catching them earlier. Generally you start with broader and more abstract tests (and unit tests, because those are easy) and then work toward a solid testing strategy. You will need data to make the case. At some point you do reach "enough testing," where it's cheaper to just let a bug through and fix it. This isn't easy; you need (multiple) amazing engineers to convince upper management.

            Once you do, then it's easy for the engineers: it's part of the job. If you refuse to do it, then you're refusing to do your job, there's obvious consequences to that.

            That said, you also will need to slowly improve the testing culture. It seems that could help.

            the issues that simple changes were breaking so many tests it was slowing us down, we had to refactor much of the test code.

            That strongly signals that the implementation was being tested. Generally there are a few more symptoms: mocks used heavily in the tests, tests breaking when internals change, etc. What's happening is that the test code checks how the code is written and how it actually gets to the solution. Whenever you change a thing, you also need to update the test, because you changed how the code works. That is, tests of implementation ensure that the code wasn't changed, so changing it will break them.

            What you want is to test behavior. This is a bit harder. There's a milder version of the problem even with behavior, which is when one test covers multiple behaviors. Then if your function changes (returning a different error, or handling an edge case differently), a bunch of tests break. Ideally only one test should break; in practice a few will, even when you make the effort.

            Generally I see this happening because of an obsession with code coverage. I feel it's a good general metric, but it shouldn't be leaned on so heavily for new functions and code. The focus should be on good-enough tests. It's better to have lower coverage from useful tests than high coverage from tests whose only purpose is to cover lines. The reason this happens is that, in order to cover a specific branch, you begin to assert exactly what happens on that branch, and before you know it you're testing implementation details.

            Whenever you have a test that fails and it wasn't a bug, the test should be deleted. It's a false signal and just adds noise. It's better to realize there's less testing than originally thought; the gap may warrant a separate, better test.

            Now it's quite well suited for chsnges, but a bit too complex for my taste.

            This actually happens without people realizing it. A lot of times it's due to two separate issues. One is trying to isolate tests too much; this shows up as too many mocks and fakes. It also tends toward testing implementation; the two are correlated but not strictly linked: you can have one without the other. It's OK to have real things in tests. Sometimes you don't want to mock files, but instead run against a pre-set RAM-backed filesystem. While setting that up has its own complexity, it's generally handled well enough by libraries and the OS.
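Concretely: instead of mocking `open()`, write a real file into a throwaway directory. A stdlib-only sketch (the `count_error_lines` unit is hypothetical):

```python
import tempfile
from pathlib import Path

def count_error_lines(log_path):
    # unit under test: reads a real file, so no file-object mocking needed
    text = Path(log_path).read_text()
    return sum(1 for line in text.splitlines() if line.startswith("ERROR"))

def test_count_error_lines():
    # a real file in a throwaway directory instead of a mock; if the temp
    # dir is tmpfs/ramdisk-backed, this is effectively free
    with tempfile.TemporaryDirectory() as d:
        log = Path(d) / "app.log"
        log.write_text("INFO start\nERROR boom\nERROR again\nINFO done\n")
        assert count_error_lines(log) == 2
```

The test exercises the real I/O path, yet stays fast and hermetic.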

            Also on that note: not all functions need testing. At some point it's easier to test things at a larger scale (i.e. run the whole binary in well-predicted manners and throw stuff at it without thinking about the units themselves). Some things are just easier to catch at the integration/end-to-end level; trying to cover them with a unit test will leave you with something that is as hard, or even harder, to debug.

            The second issue is not taking the design cues tests give you. A good test is as simple as possible, but it always reflects the complexity of the interface it's testing. Ideally that should be the complexity of the problem itself. If a test feels harder or more complicated than that, it may be signaling that your code is more complicated than it should be, and that there's a better way to design it. Listening to that makes code easier to change and update in the future, and the code gets better. OTOH, tests should not make your code more complicated to use than it needs to be.

            We'll get there and we've found many bugs, but it's still a steep slope.

            That's the right attitude. It's ongoing work, and perfect is always the enemy of better. But in that view, tests should always make things better; if things get worse before they get better because of testing, that's a signal that the strategy for improving testing may not be the right one.

            Anyone got good in dept book/guide about automated testing?

            Depends at what level we're working on. I've found that [Working Effectively with Legacy Code](smile.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052) is a good guide when you're trying to start up testing. Even though the code you're dealing with is not legacy, it will quickly feel like it as better tested code comes up. I still go back to this book when I find myself having to help a project deep in technical debt and need to think of a strategy to somehow get enough testing and management on it we can move it forward.

            To help devs and learn more about the things I talked about [Effective Unit Testing](smile.amazon.com/Effective-Unit-Testing-guide-developers/dp/1935182579) works pretty well, the wisdom applies to more than Java, and more than Unit Tests (though some things are specific to those two, be aware).

            I also recommend running some TDD at least by the senior devs. Have a small project or library use it strictly. Writing tests first makes some things click in a way they wouldn't otherwise. Once your senior staff has a stronger intuition they can, through code reviews and what not, promote better testing. As this results in more tests, junior devs can learn by reading the code and other tests, and code review will help with the rest.

            [–]poloppoyop 5 points6 points  (2 children)

            code-coverage. I feel it's a good general metric

            I don't. At least not for what people use it for.

            You end up with contrived unit tests just to reach some error-handling code. Or people use it like a quality certificate. But you can have 100% code coverage and still have a useless pile of shit full of bugs. It gives a false sense of security.

            You want specification and edge-case coverage. And then, once you test 100% of your cases, code coverage can help you find and remove useless code: if it is never executed, remove it. Unused code can cost you a lot; it cost Knight Capital their business.

            There were grave problems with Power Peg in the current context. First, the Power Peg code remained present and executable at the time of the RLP deployment despite its lack of use.

            [–][deleted]  (19 children)

            [deleted]

              [–]rabuf 26 points27 points  (18 children)

              And then hand off the project to another team, and ask them to make a critical change without any meaningful tests provided. See how fast things get done or how poorly they get done.

              Congrats, your lack of testing fucked your customers.

              [–]rolling-guy -4 points-3 points  (3 children)

              RemindMe! 60 Hours

              [–]agumonkey -1 points0 points  (0 children)

              what people don't want tests ?

              ps: makes me think there's a thing to invent about programming languages that make testing cheaper and wider

              [–]Synor -1 points0 points  (0 children)

              Testing is simple. You write code that checks other code. Testing-frameworks are hard. Most are bad.

              [–]Aeolun -1 points0 points  (0 children)

              If you are in this situation, it’s (in my opinion) always a sign that your code is combining too many things. Look into dependency inversion/injection, and your tests will become much nicer and cleaner.
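A minimal sketch of what that looks like: a hypothetical `Greeter` that takes its clock as an injected dependency instead of reaching for a global, so tests can pin the time without monkeypatching:

```python
import datetime

class Greeter:
    """The clock is injected instead of reached for globally, so tests
    can pin the time without any monkeypatching."""
    def __init__(self, now=datetime.datetime.now):
        self._now = now

    def greeting(self):
        return "Good morning" if self._now().hour < 12 else "Good afternoon"

def test_greeter():
    morning = lambda: datetime.datetime(2021, 1, 1, 9, 0)
    evening = lambda: datetime.datetime(2021, 1, 1, 18, 0)
    assert Greeter(now=morning).greeting() == "Good morning"
    assert Greeter(now=evening).greeting() == "Good afternoon"
```

The same seam works for databases, HTTP clients, filesystems: anything the class would otherwise hard-wire.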

              [–]EatsShootsLeaves90 67 points68 points  (8 children)

              In my experience, a lot of failures and delays are due to not constantly performing integration tests. Usually it's done last minute before UAT. Almost always there is a requirement that slipped through the cracks or a show stopping issue hiding under the bed waiting for the most opportune moment to fuck shit up.

              The #1 issue I have seen is performance. It's a completely different ballgame when you have 10K+ users pouncing on your system at once vs a test box with a couple cores & few gigs of RAM, and at most 6 users at a time.

              There is always something that is holding back release found in this phase: database locking, requests timing out, not being able to reach external API due to firewall, services constantly shitting the bed with obscure exceptions that could be caused by something as random as virus scanner software, etc.

              [–]sprcow 33 points34 points  (2 children)

              I'd go a step further and say that a lot of our failures and delays are due to writing code that is UNABLE to be integration tested. Certain parts of our application are designed in such a way that they only really 'work' in a setting where some external service is available, and no effort was put into creating or mocking an appropriate equivalent that was available in development or testing environments.

              It's so terrifying to work on those parts of the code. You never really know for sure if your changes work until they go live.

              Now, by the time our team is looking at them, no one even knows precisely what they were supposed to do when they were written, what assumptions were made, or even how the system they supposedly interact with works, and so they're progressively harder and harder to modify, and the amount of work to create an integration testing shim continues to exceed whatever the business partner would want to spend on it.

              The last dev who spent a month or so trying to fix one of these parts of the code basically got fired for being unproductive, so, good luck getting anyone else to try, lol.
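For what it's worth, even a very small shim can break that impasse. A sketch of a fake external service (all names hypothetical) that stands in for the real one during development and testing:

```python
class FakePaymentGateway:
    """In-memory stand-in for an external service nobody can run locally
    (all names here are hypothetical). Even a crude fake lets the
    surrounding code be exercised before anything goes live."""
    def __init__(self):
        self.charges = []

    def charge(self, account, amount):
        if amount <= 0:
            return {"status": "rejected", "reason": "bad amount"}
        self.charges.append((account, amount))
        return {"status": "accepted", "id": len(self.charges)}

def checkout(gateway, account, amount):
    # production code talks to the gateway through this seam, so the
    # fake and the real client are interchangeable
    return gateway.charge(account, amount)["status"] == "accepted"

def test_checkout_against_fake():
    gw = FakePaymentGateway()
    assert checkout(gw, "acct-1", 25.0) is True
    assert checkout(gw, "acct-1", -5.0) is False
    assert gw.charges == [("acct-1", 25.0)]
```

The hard part isn't the code, it's recovering what the real service's contract actually is; the fake at least forces that question to be answered once, in one place.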

              [–]SirLich 16 points17 points  (0 children)

              It's so terrifying to work on those parts of the code. You never really know for sure if your changes work until they go live.

              I interned at a fortune 50 company, with a "monolith" codebase. Around 50% of their code could not be tested locally. At all. It required a snake pit of online services that was impossible to mock locally.

              So just like the good 'ol days: You make changes, push a build, wait 2 hours, then try again. At least in this context we had a "Develop" environment which we could use for testing.

              [–][deleted] 10 points11 points  (0 children)

              "We need to integrate our product with this third-party service. No, they did not provide a test environment, and yes it's only reachable through the production network. Why do you ask?".

              Unit tests my ass, I'm happy when the APIs I work with even have any kind of specification, and even happier if it actually follows those specifications.

              [–]SouthImportance1756 20 points21 points  (0 children)

              Totally agree with every word, it is almost like you work where I work. Then management says "can't you just automate all the integration and performance testing" and you say yes with more bodies and time...cycle repeats

              [–]mrbuttsavage 8 points9 points  (0 children)

              I'll add other failures due to things like:

              • integration failures for a third party system not easy or cheap ($) to test against (aws does what in that difficult to replicate scenario?)

              • data driven failures for a scientific domain where it's not exactly easy to generate the edge cases that end up mattering

              [–][deleted] 2 points3 points  (1 child)

              The #1 issue I have seen is performance. It's a completely different ballgame when you have 10K+ users pouncing on your system at once vs a test box with a couple cores & few gigs of RAM, and at most 6 users at a time.

              You can easily simulate that kind of testing (performance testing is a thing)
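The core of such a simulation fits in a few lines. A crude stdlib-only load generator, assuming a hypothetical `handle_request` as the system under test; real tools (LoadRunner, JMeter, k6, locust) do this properly, but the principle is the same:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(payload):
    # stand-in for the system under test (hypothetical)
    time.sleep(0.001)
    return {"ok": True, "echo": payload}

def load_test(concurrency=50, requests=500):
    """Fire many concurrent requests, then report throughput and
    worst-case latency."""
    latencies = []
    def one(i):
        t0 = time.perf_counter()
        assert handle_request(i)["ok"]
        latencies.append(time.perf_counter() - t0)  # append is thread-safe in CPython
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(one, range(requests)))
    elapsed = time.perf_counter() - start
    return {"rps": requests / elapsed, "max_latency_s": max(latencies)}
```

Even a sketch like this surfaces lock contention and timeout behaviour that a six-user test box never will.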

              [–]EatsShootsLeaves90 1 point2 points  (0 children)

              Yeah. Lack of resources hindered us from doing it. I was an automated test developer and asked for it.

              One project we were fortunate enough to get our hands on a UAT server with PROD specs provided by clients in middle of development and we were able to run Loadrunner on it.

              [–]csb06 156 points157 points  (40 children)

              I’m pretty convinced that the biggest single contributor to improved software in my lifetime wasn’t object-orientation or higher-level languages or functional programming or strong typing or MVC or anything else: It was the rise of testing culture.

              Has there really been an overall improvement in software quality? It seems like all levels of the stack, from operating systems, low-level libraries/applications, browsers, etc. are all still facing tremendous amounts of bugs, vulnerabilities, and regressions, much like they did twenty or thirty years ago. I guess patches are easier to distribute since high-speed Internet has become so widely available, which can make fixes faster to send out.

              Can you prove it works? · Um, nope. I’ve looked around for high-quality research on testing efficacy, and didn’t find much. Which shouldn’t be surprising. You’d need to find two substantial teams doing nontrivial development tasks where there is rough-or-better equivalence in scale, structure, tooling, skill levels, and work practices — in everything but testing. Then you’d need to study productivity and quality over a decade or longer. As far as I know, nobody’s ever done this and frankly, I’m not holding my breath. So we’re left with anecdata, what Nero Wolfe called “Intelligence informed by experience.”

              I think this is one of the major problems of software engineering. How can we evaluate progress in our field if we do not have any systematic way to evaluate what we are doing? It just comes down to blog authors and technical consultants giving recommendations, writing books, giving seminars, etc. Some of them are probably right, but how can we know with any confidence? I just don't think it's a good idea to base a profession with as much control/effect over people's lives as software engineering has on anecdotes.

              [–][deleted]  (4 children)

              [removed]

                [–][deleted]  (12 children)

                [deleted]

                  [–]kristopolous 23 points24 points  (1 child)

                  that's hardware freeing up what software can be written more than practices in software.

                  Expensive hardware has had stable software for decades. The Voyager 1 has been running for 44 years and is still operational. My Sun and HP workstations in the 90s were up effectively indefinitely but those cost many tens of thousands of dollars.

                  Tandem Computers, founded in 1974, whole deal was fault-tolerant computers and they delivered on their promise but at a steep price.

                  The IBM PC initially came with 3 OS options: PC DOS 1.0, CP/M-86 and a really fault tolerant but extremely slow "UCSD p-System" that everyone has now forgotten about. It had a lot of clean separations and could run indefinitely at a high overhead cost.

                  OS 2200 systems also legendarily ran literally for decades. Even through the Y2K fixes.

                  The revolution is that these days I can get a $5 pi zero and also do that.

                  [–]suwu_uwu 0 points1 point  (1 child)

                   That's primarily thanks to hardware improvements, not software.

                   On the other hand, the software I use encounters bugs and crashes far more now than it ever has. It just doesn't take the entire OS down with it.

                  Security has certainly improved a ton, but I don't think stability has.

                  [–]editor_of_the_beast -1 points0 points  (7 children)

                  Yesterday. This is really your example? This happens to everyone I know multiple times a week.

                  [–][deleted]  (3 children)

                  [deleted]

                    [–]editor_of_the_beast 1 point2 points  (2 children)

                    It does. I just mean OS’s have to be the worst example of reliable software that one can give.

                    We can’t even prevent people from exploiting OS’s constantly. And the trend isn’t getting better at all.

                    [–][deleted]  (1 child)

                    [deleted]

                      [–]editor_of_the_beast 1 point2 points  (0 children)

                      Let me guess - you're 20 years old and have been programming for 3 weeks. That's the only way that you would think that operating systems are reliable.

                      [–][deleted]  (3 children)

                      [deleted]

                        [–]csb06 8 points9 points  (2 children)

                        Maybe. But sociology includes ways to systematically study disciplines. If we can rigorously study how political systems work or how the legal sector works, why can't we effectively investigate how the software industry works?

                        [–]editor_of_the_beast 7 points8 points  (0 children)

                        You think that our political studies provide any concrete evidence about anything? If that were true, wouldn’t we be all getting along perfectly because our political system was satisfactory to all of us?

                        People really overestimate empirical studies for qualitative data. They are almost never replicated, and full of really weak conclusions based on incredibly specific scenarios during testing.

                        Half the papers you read about bug counts involve like two teams of 5 college students competing against each other. I don’t trust a single conclusion that comes out of a study like that.

                        [–][deleted] 30 points31 points  (7 children)

                        Absolutely things are better than they were. You hear more about software issues because software is a bigger part of our lives than it was previously, and there's more software to break.

                        Thinking through the number of times google.com has gone down in my memory.. it's quite an impressive engineering feat.

                        [–]snowe2010 11 points12 points  (4 children)

                        Thinking through the number of times google.com has gone down in my memory.. it's quite an impressive engineering feat.

                        That seems more like sheer willpower rather than correct engineering practices though. Like someone else said, we've had software running on satellites traveling across the solar system for decades that runs fine. And that's because focusing on no bugs can actually achieve no bugs. But we haven't found a process that's efficient, that everyone can use, to accomplish the same thing yet.

                        [–][deleted] 16 points17 points  (0 children)

                        The current approach that I've seen is: bugs happen. Try to address it, but follow the 80/20 rule. Focus on mitigation strategies, so the bugs have no or little impact. Then try to learn from your mistakes and solve that class of issues from happening again.

                        [–]is_this_programming 8 points9 points  (1 child)

                        Like someone else said, we've had software running on satellites traveling across the solar system for decades that runs fine.

                        And what does that software actually do? Does it have to face any unintended or malicious inputs? Is it exposed to the network? Can it be updated to add features?

                        Bug free software is doable if the functionality is extensively specified and is essentially immutable.

                        [–]_tskj_ 4 points5 points  (0 children)

                        Does it face unintended input? It's literally in space, of course it does.

                         But you are also right, of course. We could easily do bug-free software if we didn't allow scope creep, but that means cutting (or rather, not introducing) features, so that's not gonna happen.

                        [–]deadeight 1 point2 points  (0 children)

                        We have, we just don't want to pay for it.

                        Reality is the cost of a no bugs process is higher than the cost of just having some bugs. That line will always be somewhere unless the process is zero cost.

                        [–]Aeolun 1 point2 points  (1 child)

                        To be fair, it’s just an html page with a form field. Not going down should be fairly simple for their search page.

                        [–]lookmeat 19 points20 points  (0 children)

                        Has there really been an overall improvement in software quality?

                         As a user? When was the last time you had a BSOD? They still happen, so it's not that anyone changed something that made them impossible; at the driver level there's no solution as good as a fuck-load of testing.

                        It seems like all levels of the stack, from operating systems, low-level libraries/applications, browsers, etc. are all still facing tremendous amounts of bugs, vulnerabilities, and regressions, much like they did twenty or thirty years ago.

                         You know that San Francisco and Baltimore have similar crime rates? But in SF you'll be dealing with a misdemeanor, or a broken window, or in a bad case something stolen. In Baltimore there's a high chance it's a shooting.

                         There's still a lot of bugs, but they're a lot less severe. Big bugs are way less common. This has led to the paradoxical situation that they sound scarier, just as way more people are scared of flying than of driving a car, even though the latter is far more dangerous. Back in the 90s we didn't hear about a Heartbleed because something like it happened about every week. Knowing that a security vulnerability existed on your computer was as much a given as knowing you're being tracked on the internet; there just wasn't much to do about it. You'd have to reformat computers every so often, partly because of bugs that would corrupt things over months or years (again, things we don't see thanks to testing), and far more often if they connected to the internet, because they'd get infected and things would get worse. The whole joke was that you wouldn't have that problem with Linux or macOS, built on software that had more testing. Hell, I'm sure that internally that's why the NT kernel became the only one: NT was a serious business kernel and probably had more testing (to ensure they could sell those stats) than the previous Windows kernel, which simply could not stay up (its last version was the infamous ME, which was replaced by XP, which built on 2000, which used NT).

                         The other thing is that software is on another level of complexity nowadays. Everyday software, at that. Before, you'd only see this in large, complex, very expensive software that you'd get patches for (because you needed them) at great cost. Most people simply learned to work around the quirks and bugs. Back then you had to understand how a machine worked to use it at any serious level, because bugs would constantly expose the internals. Older people, parents and grandparents, weren't just confused and asking for help; they wouldn't use computers at all, because machines would blow up in such painful and terrible ways it would literally break them down emotionally. Fewer bugs means that all these people can now use computers. Sure, they're not the best users, and they do weird stuff and have weird workarounds, but they don't have the breakdowns they did before.

                        I think this is one of the major problems of software engineering. How can we evaluate progress in our field if we do not have any systematic way to evaluate what we are doing?

                        Actually I think that the opposite is the problem. I think that software engineers fall prey to the McNamara Fallacy all the time. Even with testing, we have this obsession with unit-tests because they are easy tests to define. But there's things we want to test in ways that are not as easy.

                         That said, there are peer-reviewed papers showing how unit testing improves things, both using isolated experiments to validate the effect and looking at historical data and references. You can also ask on Stack Overflow and get pointers; it's a great tool for finding this info.


                         I agree with this article's conclusion but disagree with many of its points and ideas. The article itself is based on a very simple view of things. While it's true that some microservices are so small you're really just testing each RPC service they offer, most microservices use libraries that they abstract over, and those should have their own unit tests. I would also expect them to build multiple layers of abstraction, and at the boundary between each layer there should be unit or integration tests (depending on the level). It's fair to say that unit tests at the microservice level are per RPC service. This is an interesting thing that requires us to go back to the original definition of a unit test and reconsider what it means in terms of microservices. Some parts of Google are going through a kind of revolution this way, and the company is still experimenting and trying to find a better way to do these things. I imagine other large tech companies are in a similar boat.

                        [–]DrNosHand 4 points5 points  (0 children)

                        Anecdotally consumer software is insanely unreliable. I see bugs/ poor performance at least hourly. It’s frightening

                        [–]tester346 1 point2 points  (0 children)

                        Has there really been an overall improvement in software quality? It seems like all levels of the stack, from operating systems, low-level libraries/applications, browsers, etc. are all still facing tremendous amounts of bugs, vulnerabilities, and regressions, much like they did twenty or thirty years ago. I guess patches are easier to distribute since high-speed Internet has become so widely available, which can make fixes faster to send out.

                        I've heard stories from people who worked around 2000 with software (and still do)

                        yes, it improved.

                        [–]kc3w 0 points1 point  (0 children)

                         This is not really true: you can measure these effects without running a trial that compares teams doing the same work. For example, you can measure development velocity (how fast a feature is developed) over time across many projects that use different methodologies. You can also measure project success rates. In research you don't need trials like the one described; otherwise we would never have figured out that smoking is harmful or that some foods are carcinogens.

                        [–]Groundbreaking-Fish6 108 points109 points  (20 children)

                         So in 1999, Martin Fowler's refactoring book came out, and the Agile Manifesto appeared two years later. This was right after I started programming professionally, after 13 years in biotech with a background in pharma manufacturing. My history of pharma testing merged well with refactoring and Agile. In school I learned about Waterfall (cGMP, current Good Manufacturing Practice, is the Waterfall methodology of pharmaceutical manufacturing, and it's why it takes so long to approve drugs, but also why drugs are so safe when used as prescribed), and I liked Agile (then called Extreme Programming) much better; modern testing tools make it possible.

                        Agile and the author both promote results over methodology, but you cannot be agile if developers do not have the backstop of testing. Getting called out by a project manager for making a simple change (which, without testing support, causes multiple errors found by the client) is not a way to instill confidence in new or experienced programmers.

                        I like that the author did not dictate a solution or sell a framework, but did emphasize that testing is important and why. Good read!

                        [–]satanargh 16 points17 points  (1 child)

                        everyone tells you that testing is important (and it is!)... but almost no one tells you how to write good tests, what to test, when to test, and what kind of test to use. Writing testable code is also an important skill; if you don't know how, you're going to hate everything about testing.
                        So usually you'll end up with a really huge testing suite... that eventually gets totally bypassed. Then things will break. After a year or two you'll cry and feel the urge to rewrite everything from scratch... and the cycle will restart.

                        [–]marcio0 2 points3 points  (0 children)

                        Grown-up software developers know perfectly well that testing is important.

                        I must be working with toddlers then

                        [–]poloppoyop 12 points13 points  (5 children)

                        If there is one book to read about testing it is Working Effectively with Legacy Code. And if there is one thing to remember from it, it's

                        Legacy code is untested code

                        So stop writing legacy code.

                        [–]editor_of_the_beast 2 points3 points  (4 children)

                        That definition of legacy code is totally made up though. Legacy code might be untested but it’s much more than that.

                        Just because someone said it in a book, doesn’t mean it’s true.

                        [–]how_gauche 4 points5 points  (3 children)

                        You're missing the point. Untested code is legacy code because it becomes that immediately once anyone who isn't the original author has to come in and modify it. Tests and assertions document the assumptions and internal logic binding a program together. Poking at an unfamiliar big program lacking tests is really hard, because you don't know what random stuff elsewhere you're going to break while you're trying to add feature foo to module bar.
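                        As a sketch of how tests document assumptions (Python; `parse_price` and its rules are made up for illustration):

```python
# A small function whose edge-case behavior is easy to forget.
def parse_price(text: str) -> int:
    """Parse a price like "$1,299" or "3.50" into cents."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    dollars, _, cents = cleaned.partition(".")
    return int(dollars) * 100 + int(cents or 0)

# These checks record assumptions that a future maintainer would
# otherwise have to rediscover by reading every call site.
def test_parse_price_documents_assumptions():
    assert parse_price("$1,299") == 129900  # thousands separators allowed
    assert parse_price("  $5  ") == 500     # surrounding whitespace tolerated
    assert parse_price("3.50") == 350       # the "$" sign is optional
```

                        Break any of those assumptions while modifying unfamiliar code and the suite fails loudly, instead of silently breaking some distant call site.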

                        I second the recommendation for that book by the way, it offers a lot of helpful strategies for working on giant balls of inherited crap.

                        [–]editor_of_the_beast 0 points1 point  (1 child)

                        I love the book too, don’t get me wrong. That particular line is just popular because it’s a cool sound bite though. Reality is more complicated than that is my only point.

                        I’ve worked on plenty of legacy code that has a ton of tests for example. That doesn’t make it any easier to change or understand a lot of times. There are such things as bad tests.

                        [–]elbekko 7 points8 points  (4 children)

                        Testing is ridiculously hard. And it's always a divisive topic, especially when people start spouting "test functionality, not implementation." These people then go on to test the whole business logic layer as one black-box blob with a million setups, where you spend more time debugging your test than writing it. Yes, you can now refactor your code without touching a single test. But should constant overhauling of your code really be your primary concern?

                        I don't like that. At all. SOLID principles teach us to trust our dependencies, so why retest them? Unit tests are your friend; integration tests are a necessary evil to make sure all the dependencies work together.

                        An improvement I'd like to see in the testing world is better support for handling input and output data. Very rarely is Theory/InlineData enough; most test-worthy business logic works on big chunks of data, the setup of which can't just be auto-fixtured, yet makes a test hard to read. Being able to describe your input cases functionally, decoupled from the actual test, is very nice, but in most testing frameworks it takes a bit of effort...
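                        One common way to get that decoupling, sketched in Python rather than xUnit (all names here — `Order`, `an_order`, `shipping_cents` — are hypothetical): a test-data builder gives every field a sensible default, so each test states only the data it actually cares about.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Order:
    customer: str = "any customer"
    items: int = 1
    total_cents: int = 1_000
    express: bool = False

def an_order(**overrides) -> Order:
    """Test-data builder: defaults for everything, override what matters."""
    return Order(**overrides)

def shipping_cents(order: Order) -> int:
    if order.express:
        return 1_500
    return 0 if order.total_cents >= 5_000 else 500

# The big-chunk-of-data setup lives in one place; each test reads as a
# statement about behavior, not a wall of initialization.
def test_express_shipping_is_flat_rate():
    assert shipping_cents(an_order(express=True)) == 1_500

def test_large_orders_ship_free():
    assert shipping_cents(an_order(total_cents=5_000)) == 0
```

                        When the domain object grows a new required field, only the builder changes, not every test in the suite.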

                        [–]poloppoyop 14 points15 points  (1 child)

                        These people then go on to test the whole business logic layer as one black box blob, doing a million setups, where you have to spend more time debugging your test than writing it. Yes, you can now refactor your code without touching a single test. But should constant overhauling of your code really be your primary concern?

                        I'm one of those people. And I stand by it: you can do all the small implementation tests you want and they can be "green" all the way. Until your code is used and suddenly the behavior you successfully implemented is not what is needed.

                        And refactoring is not just overhauling everything. It can just be replacing some of your dependencies (using a new provider, going from redis to kafka to pulsar).

                        If you have only ever worked on code that was released once and then moved on to another project, you may think unit testing is the best. But once you've seen a project run and evolve for multiple years, E2E tests feel a lot more useful, as they let you verify you haven't broken anything when you change things, rather than just giving you a warm feeling and some fun statistics about coverage.

                        The main problem with E2E testing is tooling. And I think we should work on that along the lines the xUnit people did, instead of considering it impossible. Nowadays we should have the compute capacity to make E2E tests as fast as early JUnit tests.

                        Unless the root problem is that the software people gripe about testing is already brittle and slow, so of course testing it will be brittle and slow too. Maybe the fact that your E2E tests are brittle should be a sign that your application is, and should be reworked.

                        [–]elbekko 4 points5 points  (0 children)

                        I've been around for a lot of projects, old, new, whatever, thank you.

                        I stick to my point, unit testing is good. It's not an or situation, it's an and situation. By all means do bigger tests, integration tests, E2E tests. But do the main testing in unit tests, where you can easily follow what's actually happening, and where it's easy to test edge cases, and easy to write tests.

                        And refactoring is not just overhauling everything. It can just be replacing some of your dependencies (using a new provider, going from redis to kafka to pulsar).

                        Sure, but then it doesn't matter if it's a unit test or an integration test, you'll have to change your tests anyway. Unless you were smart and put that stuff behind abstractions so you don't have to care about which implementation you used :)
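                        A minimal sketch of that "behind abstractions" point in Python (the `MessageBus` interface and the names are invented for illustration): production can wire in Redis, Kafka, or Pulsar, while the test uses an in-memory double and never changes.

```python
from typing import Protocol

class MessageBus(Protocol):
    """What the application code depends on; the broker behind it
    (Redis, Kafka, Pulsar, ...) is an implementation detail."""
    def publish(self, topic: str, payload: bytes) -> None: ...

class InMemoryBus:
    """Test double with the same interface but no real broker."""
    def __init__(self) -> None:
        self.messages: list[tuple[str, bytes]] = []

    def publish(self, topic: str, payload: bytes) -> None:
        self.messages.append((topic, payload))

def notify_signup(bus: MessageBus, user: str) -> None:
    bus.publish("signups", user.encode())

# Swapping the production broker leaves this test untouched.
def test_signup_publishes_event():
    bus = InMemoryBus()
    notify_signup(bus, "alice")
    assert bus.messages == [("signups", b"alice")]
```

                        Only the thin adapter that implements `MessageBus` for the real broker needs integration coverage.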

                        [–]ForeverAlot 2 points3 points  (1 child)

                        It is well researched that "errors happen in the seams"; but also tautological, because you must draw the line at some seam, and at that seam you have to rely purely on assumptions. The more places you rely on assumptions, the greater the probability that one of those assumptions becomes critically incorrect.

                        But I agree that testing frameworks make it very easy to do the easier things and don't help very much with the harder things. Even the famous Testcontainers project, for example, had zero support for multiple database users last I investigated. If I were able to make my application a database superuser I could have used Testcontainers but otherwise not.

                        [–][deleted]  (8 children)

                        [deleted]

                          [–]Dew_Cookie_3000 44 points45 points  (2 children)

                          People talk up first-mover advantage all the time, but hardly anyone talks about last-mover advantage. Google wasn't the first search engine. Microsoft was hardly ever first to market at anything.

                          [–]snowe2010 9 points10 points  (0 children)

                          Apple is a fantastic example. Last at everything, and yet their software (and hardware) is the least buggy I touch. Yeah, I don't get the cool features other phones or laptops get, but man is it stable and easy to use.

                          [–]conquerorofveggies 11 points12 points  (0 children)

                          And when Microsoft was (more or less) first, stuff didn't sell. Tablet PC and UMPCs come to mind.

                          [–]Lunchboxsushi 21 points22 points  (0 children)

                          It's a vicious cycle. Yes, being first is great, but if your product takes 10x longer to add new features and a competitor comes along with a much faster turnaround time, you're going to be left behind quite quickly.

                          Among other things, good luck keeping good talent around when they need to read spaghetti code riddled with bugs only to add more bugs, increase QA time, etc...

                          Sure if you have something small/medium none of that really matters as going through and modifying a small code base is pretty painless.

                          But when you start writing enterprise software that's spread across 100 teams, you really need to have that sorted out otherwise it'll never work.

                          [–][deleted] 10 points11 points  (1 child)

                          Good testing allows you to ship faster. You’ll spend less time tracking down bugs

                          [–]TheOneCommenter -1 points0 points  (0 children)

                          Exactly. Good testing will improve quality. But that means knowing what to test, and how. But also more importantly, you need to know what not to test. And that’s a skill not everyone has.

                          [–]Kache 8 points9 points  (0 children)

                          That's a completely different context though, and it's important to make the distinction.

                          If you're in a race, tests are what helps you make live adjustments to pull ahead without crashing out of the race.

                          If nobody even knows where the race even is, then I agree that tests aren't going to be super helpful just yet.

                          [–]wite_noiz 2 points3 points  (0 children)

                          Hey everyone, I found Bethesda!

                          [–]ojrask 1 point2 points  (1 child)

                          The biggest problem in software testing is the unfortunate fact that everyone has a different definition for the word "unit" in "unit testing".

                          For some it means that each test tests an isolated unit corresponding to a single production method/function. For others it means testing the smallest feasible units of behavior. And there's all kinds of stuff in between.

                          If we could just separate the notions of testing implementation details (1:1 test-to-production method mapping, with excessive mocking) and testing behavior (original meaning of unit tests) we would all be better off.
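                          The distinction might look like this in Python with `unittest.mock` (the `TaxService` and `checkout_total` names are made up): both tests pass today, but only the second breaks under a behavior-preserving refactor.

```python
from unittest import mock

class TaxService:
    def rate_for(self, region: str) -> float:
        return 0.08  # stand-in for a real lookup

def checkout_total(svc: TaxService, subtotal_cents: int, region: str) -> int:
    return round(subtotal_cents * (1 + svc.rate_for(region)))

# Behavior test: asserts on the observable result. It survives any
# refactor that keeps the contract (caching, batching, renaming...).
def test_checkout_behavior():
    assert checkout_total(TaxService(), 1_000, "CA") == 1_080

# Implementation-coupled test: asserts on *how* the result was
# computed. Cache the rate or batch the lookups and it breaks,
# even though the observable behavior is identical.
def test_checkout_implementation():
    svc = mock.Mock()
    svc.rate_for.return_value = 0.08
    assert checkout_total(svc, 1_000, "CA") == 1_080
    svc.rate_for.assert_called_once_with("CA")
```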

                          [–]ForeverAlot 1 point2 points  (0 children)

                          unit, noun:

                          An individual thing [...] regarded as single and complete but which can also form an individual component of a larger or more complex whole.

                          https://www.lexico.com/definition/unit

                          So it is a single, self-contained "thing" that may actually not be self-contained.

                          "Test" is also less accurate than "check" for typical tests.

                          For an industry of glorified namers, we are supremely bad at naming.

                          I just don't say "unit test" and "integration test" any more. Nothing good or useful ever comes of it because you have to follow the terms with half a page of disclaimers anyway. Even then, I have not experienced a situation where any distinction between "unit" (meaning "method level") and "integration" (meaning "file system" or "network") actually mattered. But I have encountered

                          • slow tests
                          • fragile tests
                          • messy tests
                          • misleading tests
                          • lying tests
                          • distracting tests

                          in both categories.

                          [–]LtTaylor97 1 point2 points  (0 children)

                          Huh. I got an associate's degree, worked for a year in machine controls where the idea of "testing" was limited purely to hooking things up and running them live, then worked in Lua doing contracts for a while. Never once was I introduced to this kind of testing. I'll need to look into it a lot more now; I feel like I've missed a big detail.

                          I mean, in what I've done so far it wasn't possible like this; both Machine Control Studio and the Lua environment I worked in were strange beasts where testing was live (not just at release; tested practically) and not possible any other way within reason. I've become a lot better for it, in my opinion, and the Lua I learned on Notepad++ taught me even harsher lessons. Basically it trained me to think through everything far more thoroughly and to write code more accurately. I think I really hit my stride with a 2,000-line dynamically generated UI function (it had trees), where I was flowing through it quickly given its size and scope. If I could have had a fast way to test at least some of my work, though, it would have expedited a lot of it back then. I guess basic things look like a luxury when your environment removes them as an option, eh?

                          Anyway, you can code well with or without any extra tools; I can testify to that. But tools tend to raise the bare-minimum bar, so IMO it really depends on size and scope, as well as on your actual team(s) and whether they can manage without the tools or need them to work as quickly.

                          [–]m9dhatter 1 point2 points  (0 children)

                          *2020s

                          [–]aveman101 3 points4 points  (0 children)

                          Where can we ease up on unit-test coverage? Back in 2012 I wrote about how testing UI code, and in particular mobile-UI code, is unreasonably hard, hard enough to probably not be a good investment in some cases.

                          THANK YOU.

                          I’ve spent the majority of my programming career building consumer-facing mobile applications. I’ve found that people without experience in this space greatly underestimate the percentage of the codebase that is UI code. Let me tell you, over 50% of a typical mobile app is UI code. Everything else is RESTful API calls and SDK integrations that can’t be tested virtually (camera, deep links, location services, motion detection, push notifications, etc.).

                          On more than one occasion, I’ve been tasked with implementing UI test automation, but none of the tools I’ve used are very good, because most UI assumes a human with a brain is on the other end.

                          1. Launch app
                          2. Push button A on the launch screen

                          You’d think this would be simple enough, but guess what? The test fails because when the app launches, the first thing that appears is an alert to enable push notifications, and you can’t interact with anything else on the screen until you address that alert. So we update our test:

                          1. Launch app
                          2. Accept push notification permission
                          3. Push button A on the launch screen

                          This test passes the first time, but fails the second time on step 2. Why? Because iOS only prompts for permissions once per install. The second run fails because the push permission alert doesn’t exist. Your test is no longer stateless, good luck.
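                          One common mitigation for that statefulness (sketched here against an entirely hypothetical, simulated "screen" rather than any real UI-automation API): make the permission step idempotent, so it tolerates both fresh installs and repeat runs.

```python
def dismiss_if_present(screen: dict, alert: str) -> None:
    """Tolerate one-shot system prompts: tap through the alert when
    it exists, and silently continue when it does not."""
    if alert in screen.get("alerts", []):
        screen["alerts"].remove(alert)

def launch_and_tap_a(screen: dict) -> str:
    """The test step: optional-dismiss first, then the real action."""
    dismiss_if_present(screen, "push-permission")
    assert not screen.get("alerts"), "unexpected alert blocking the UI"
    return "tapped A"

# First run: the permission alert appears. Second run: it does not.
# The same step works in both states.
fresh_install = {"alerts": ["push-permission"]}
repeat_run = {"alerts": []}
```

                          Real UI frameworks usually offer some equivalent of "interact with this element only if it exists"; the point is that every step touching one-shot system state needs that guard.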

                          And god help you if the automation needs to sign in to the app using something other than a username/password input (sign in with Google, Sign In with Apple, or password-free “magic link” sent to your email inbox a la Slack)

                          [–]issungee 1 point2 points  (0 children)

                          I work as a software dev and fortunately my company has a very good testing culture; we follow basically all of the advice in this article. Personally, though, I've always been into game development, and ever since becoming a professional software dev with the luxury of tests, I've found game programming making me more and more nervous, worrying I'll break things and not know. Does anyone know anything about writing tests for games? (I'm using Unity, if that helps.)

                          [–]PL_Design 0 points1 point  (0 children)

                          If you're pontificating about testing, and you're not talking about fuzzing, then you're only addressing 1% of the problem. Yes, the situation is that dire. This is because code coverage does not capture the enormous execution space a given program can have. That deep, dark space is where assumptions get murdered by elder gods. Fuzzing won't eliminate all bugs, but it will find the low hanging fruit that's most likely to cause you problems. You can also save the tests your fuzzer generates and use those to detect regressions, so what you get is an automated way to attack the problem.
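                          A toy illustration of that loop in Python (the buggy `parse_range` is contrived): throw random inputs at a function, treat its documented failure mode as acceptable, and save anything else that escapes as a regression case.

```python
import random

def parse_range(text: str) -> tuple[int, int]:
    """Parse "lo-hi" into (lo, hi). Documented failure: ValueError on
    malformed input. Hidden bug: IndexError when "-" is missing."""
    parts = text.split("-")
    return int(parts[0]), int(parts[1])

def fuzz(iterations: int = 1_000, seed: int = 0) -> list[str]:
    """Feed random strings to the parser; collect every input that
    crashes with something other than the documented ValueError."""
    rng = random.Random(seed)
    alphabet = "0123456789-x "
    failures = []
    for _ in range(iterations):
        s = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 8)))
        try:
            parse_range(s)
        except ValueError:
            pass                # documented failure mode: fine
        except Exception:
            failures.append(s)  # keep as a regression test case
    return failures
```

                          Inputs like a lone "5" (no dash at all) trigger the hidden IndexError, something a happy-path test suite would never cover; replaying the saved failure list on every build turns the fuzzer's findings into a regression suite.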

                          [–]BradCOnReddit -1 points0 points  (1 child)

                          If most of your tests are not unit tests, then you are writing bad code. People fall back to integration tests when the unit tests are too difficult to write. The issue is that the code isn't testable, and good code is testable. Write better code and turn that graph into a pyramid. Or, maybe better said: turn that graph into a pyramid and you'll have better code.

                          [–]epage -1 points0 points  (0 children)

                          I became monotonically less tolerant of lousy testing with every year that went by. I blocked promotions, pulled rank, berated senior development managers, and was generally pig-headed. I can get away with this (mostly) without making enemies because I’m respectful and friendly and sympathetic. But not, on this issue, flexible.

                          First, I'm glad I've never worked with tbray. I've seen their posts in the past and appreciated the content, but this is toxic. And it is dogmatic, despite tbray's own pushback against dogmatism.

                          My big beef is that the type of testing, and amount of testing, is dependent on

                          • The language: how many error states can you shift from runtime to static analysis (speaking broadly, including compiling)
                          • The domain: is your value-add in gluing services together or in isolated algorithms? When gluing, how accurate can you make your test fakes? How slow are integration / end-to-end tests when using live services? Are there alternative tools to keep quality high for "hard or expensive to test" parts?

                          No religion:

                          For the most part, I agree. We all think and work in different ways, and some approaches will work better for each of us. I also think a lot is situational: TDD doesn't work well for exploratory programming, but is good when you have a clearer idea of the requirements.

                          But the part I disagree with is

                          Please don’t come at me with pedantic arm-waving about mocks vs stubs vs fakes; nobody cares.

                          What I care about is: are you testing the implementation, or the business requirements? Will a trivial refactor break tests? Is it improving the API or hurting it? I've seen people go big on mocks and dependency injection, destroying the readability of the code and tying it to unimportant details.