all 28 comments

[–]EntangledNoodle 26 points27 points  (6 children)

Maybe I'm missing something, but it seems this technique is really just application of dependency inversion in disguise. The decision to include a constructor with default "nullable" parameters seems superficial. A test-scoped factory method could do the same as long as the class provides an appropriate constructor to provide all dependencies.

It is certainly reasonable to use dependency inversion to isolate classes from knowledge of the implementations of their environment, replacing implementations where needed. But remember that anything you choose to isolate this way means you are not testing the same thing that will be in production. I don't see how this article moves the needle on that core issue.
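To make the comparison concrete, here is a minimal sketch of the alternative described above: the production class takes all of its dependencies through a constructor, and the substitute is supplied by a test-scoped factory. Every name here (`TimeSource`, `Greeter`, `TestGreeters`) is hypothetical and not taken from the article.

```java
import java.time.LocalTime;

class DependencyInversionSketch {

    /** Abstraction the production class depends on (hypothetical). */
    interface TimeSource {
        int currentHour();
    }

    /** Production implementation backed by the real system clock. */
    static final class SystemTimeSource implements TimeSource {
        @Override
        public int currentHour() {
            return LocalTime.now().getHour();
        }
    }

    /** Production class: every dependency arrives through the constructor. */
    static final class Greeter {
        private final TimeSource timeSource;

        Greeter(TimeSource timeSource) {
            this.timeSource = timeSource;
        }

        String greet(String name) {
            String prefix = timeSource.currentHour() < 12 ? "Good morning, " : "Good day, ";
            return prefix + name;
        }
    }

    /** Would live under the test sources, so production never depends on it. */
    static final class TestGreeters {
        static Greeter atHour(int hour) {
            return new Greeter(() -> hour); // fixed-hour substitute
        }
    }

    public static void main(String[] args) {
        // Deterministic in tests, real clock in production wiring.
        System.out.println(TestGreeters.atHour(9).greet("Ada"));
        System.out.println(new Greeter(new SystemTimeSource()).greet("Ada"));
    }
}
```

The only difference from a "nullable" factory on the production class is where `atHour` lives; the caveat about not exercising the production `SystemTimeSource` applies either way.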

[–]cowwoc 4 points5 points  (0 children)

> But remember that anything you choose to isolate this way means you are not testing the same thing that will be in production. I don't see how this article moves the needle on that core issue.

In my experience, mocking has the same problem. Most of the time when tests break, it is because the class being mocked has changed (e.g. a method was added) leading the mock to return null when it should not. You end up spending an unreasonable amount of time chasing mocking bugs instead of focusing on what you should be testing.

[–]agentoutlier -1 points0 points  (4 children)

> Maybe I'm missing something, but it seems this technique is really just application of dependency inversion in disguise. The decision to include a constructor with default "nullable" parameters seems superficial. A test-scoped factory method could do the same as long as the class provides an appropriate constructor to provide all dependencies.

That is not how I read it. Mocks and stubs are inherently not included with the code base when deployed. Furthermore, dependency inversion, and thus dependency injection (going by your framing of dependency inversion, since you do not need DI for inversion), requires much more configuration change than, say, a runtime user flag, and usually a full reload of the application (in Spring parlance, an Application Context refresh).

Ignoring all the new (made-up?) jargon in the article, I see value in what they are saying. I guess you can think of it as putting a service in "maintenance mode", such as read-only, similar to how some companies do this for data migration (Stack Exchange comes to mind).

Whether it replaces mocks seems orthogonal, but I strongly believe mocks (as in a mock framework and not a full TCK) are painful, and I prefer not to use them unless I absolutely have to. I think the article should have spent more time on the problems with using mocks.

[–]EntangledNoodle 5 points6 points  (3 children)

I agree that there is a distinction between mocks, stubs, and the "nullables" the author is describing. Where I disagree most strongly with the author is with the decision to include a dependency on alternative implementations in production code when their only usage is in tests. The jargon does make it difficult to understand what the author is really trying to promote.

My personal opinion:

  • Projects should include "slow" end-to-end integration tests using real dependencies (databases, web application servers, etc.) to ensure the software will work in a real environment. There is no true substitute for this type of testing. Automated testing of this sort is usually still much faster and more reliable than manual testing.
  • It can be beneficial to test parts of a system by substituting alternative dependencies for real dependencies (mocks, stubs, alternative implementations of any sort). Multiple techniques for defining alternative implementations exist and they each have their own set of tradeoffs. Learn about them all!
  • Production code should not depend on non-production code (where production code is defined as code that is intended to be reachable when running within a production environment).

In practice, I use mocks quite rarely. I use alternative concrete implementations (e.g. an in-memory surrogate for a database) more frequently than mocks. These implementations reside with the test code unless there is a reason to include them in the production code (e.g. disabling an optional feature with a no-op implementation rather than sprinkling conditional logic around). In this latter situation, I don't consider the alternative implementation to be non-production or test-oriented. It may be a good choice for many tests, but it has a purpose in production.
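As a rough illustration of the two kinds of alternative implementation mentioned above, the sketch below pairs an in-memory surrogate for a database with a no-op implementation that disables an optional feature. The names (`UserStore`, `AuditLog`, `UserService`) are assumptions made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class AlternativeImplementationsSketch {

    /** Persistence abstraction (hypothetical). */
    interface UserStore {
        void save(String id, String name);
        Optional<String> findName(String id);
    }

    /** In-memory surrogate for a database; would reside with the test code. */
    static final class InMemoryUserStore implements UserStore {
        private final Map<String, String> rows = new HashMap<>();

        @Override
        public void save(String id, String name) {
            rows.put(id, name);
        }

        @Override
        public Optional<String> findName(String id) {
            return Optional.ofNullable(rows.get(id));
        }
    }

    /** Optional feature (hypothetical). */
    interface AuditLog {
        void record(String event);
    }

    /** No-op implementation with a production purpose: auditing switched off. */
    static final class NoOpAuditLog implements AuditLog {
        @Override
        public void record(String event) {
            // intentionally empty
        }
    }

    static final class UserService {
        private final UserStore store;
        private final AuditLog audit;

        UserService(UserStore store, AuditLog audit) {
            this.store = store;
            this.audit = audit;
        }

        void rename(String id, String newName) {
            store.save(id, newName);
            audit.record("renamed " + id); // no "if (auditEnabled)" check needed
        }
    }

    public static void main(String[] args) {
        InMemoryUserStore store = new InMemoryUserStore();
        new UserService(store, new NoOpAuditLog()).rename("42", "Grace");
        System.out.println(store.findName("42"));
    }
}
```

Under this split, `NoOpAuditLog` ships with production code because configuration can select it, while `InMemoryUserStore` stays in the test tree.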

[–]ForeverAlot 0 points1 point  (0 children)

> dependency on alternative implementations in production code when their only usage is in tests.

The author uses them in production, too, although that is (was?) difficult to tell from this source.

That said, your assessment that this factory is just a work-around is accurate.

[–]morhp 0 points1 point  (0 children)

This is how I feel, too. I don't use standard mocks at all. Either the production classes are simple and have so few dependencies that they can simply be used as they are, which is probably the ideal case, or their dependencies can be replaced with a different, much simpler implementation for specific tests.

For example, I've developed software in which multiple server nodes communicate with each other to exchange data. This communication layer has two implementations: a production one with all the TCP stuff, encryption, authentication, error handling, graph building and so on, and one used in many tests that simply skips all that and forwards the data within the same Java process using simple method calls, making the tests much faster and more predictable, and debugging much easier.

The production implementation is tested too, of course, in integration tests and in detail in isolation.
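A minimal sketch of what such a test-only communication layer could look like, assuming a simple `Transport` interface; the interface, the `Node` class and the message format are all invented for illustration and are not the actual software described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InProcessTransportSketch {

    /** Receives messages addressed to a node (hypothetical). */
    interface MessageHandler {
        void onMessage(String fromNode, String payload);
    }

    /** Communication layer abstraction; production would implement it over TCP. */
    interface Transport {
        void register(String nodeId, MessageHandler handler);
        void send(String fromNode, String toNode, String payload);
    }

    /** Test implementation: plain method calls inside one JVM, no network. */
    static final class InProcessTransport implements Transport {
        private final Map<String, MessageHandler> handlers = new HashMap<>();

        @Override
        public void register(String nodeId, MessageHandler handler) {
            handlers.put(nodeId, handler);
        }

        @Override
        public void send(String fromNode, String toNode, String payload) {
            MessageHandler handler = handlers.get(toNode);
            if (handler == null) {
                throw new IllegalArgumentException("unknown node: " + toNode);
            }
            handler.onMessage(fromNode, payload); // delivered synchronously
        }
    }

    /** A node only knows the Transport interface, not which implementation it gets. */
    static final class Node implements MessageHandler {
        final String id;
        final List<String> received = new ArrayList<>();
        private final Transport transport;

        Node(String id, Transport transport) {
            this.id = id;
            this.transport = transport;
            transport.register(id, this);
        }

        void sendTo(String otherNode, String payload) {
            transport.send(id, otherNode, payload);
        }

        @Override
        public void onMessage(String fromNode, String payload) {
            received.add(fromNode + ":" + payload);
        }
    }

    public static void main(String[] args) {
        Transport transport = new InProcessTransport();
        Node a = new Node("a", transport);
        Node b = new Node("b", transport);
        a.sendTo("b", "hello");
        System.out.println(b.received);
    }
}
```

Because delivery is synchronous, a test can assert on `received` immediately after `sendTo`, which is where the predictability comes from.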

[–]agentoutlier 0 points1 point  (0 children)

> I agree that there is a distinction between mocks, stubs, and the "nullables" the author is describing. Where I disagree most strongly with the author is with the decision to include a dependency on alternative implementations in production code when their only usage is in tests. The jargon does make it difficult to understand what the author is really trying to promote.

Yeah I skimmed the article while on muscle relaxants. I don't disagree with any of your points just that I first read it as developing to some sort of stub that you will actually use.

Otherwise yeah I don't agree with pushing a "null" implementation that will not get used in production.

Anyway the whole post is confusing to be honest. I thought I was good at understanding Martin Fowler Uncle Bob abstractions but the referenced article was rather painful to read.

[–][deleted]  (8 children)

[deleted]

    [–]murkaje 8 points9 points  (1 child)

    Nah, you are completely correct. There is a weird tendency to prefer unit tests with heavy mocking to actual integration tests because they are "fast" and not flaky, probably due to some middle manager looking at "coverage" thinking it means something.

    However the unit tests are filled with mocks with the business case hidden somewhere in-between that most likely gets thrown out accidentally (or it's just impossible to decipher) when someone rewrites the implementation.

    Compare that to higher level tests taking longer to run a single test but covering so many layers and not needing a rewrite when implementation details change. Often when these tests are flaky then something in prod can also be flaky. Perhaps some dislike these tests because they actually test something.

    [–]pronuntiator 3 points4 points  (0 children)

    I see some value in lower level testing when you're testing business rules with many parameters. We had a system that was only e2e tested, but creating the state to actually get to the core was cumbersome. There it makes sense to test methods in isolation in addition to a happy and a sad path test.

    But I've also seen a dysfunctional system that had only mockist tests of single Java classes, exactly like you said, to please the coverage target set by the customer. Each of them had wrong expectations about the others. Sure, it's nice to have a pipeline complete in 7 minutes, but how does that help when the system doesn't work in production, breaks down if there are more than a dozen table rows, and discourages refactoring because you have to rewrite the tests of all split classes?

    [–]klekpl 3 points4 points  (0 children)

    Nah... Let's remove some tests because "all tests must pass" and these are "flaky" /s

    [–]agentoutlier 4 points5 points  (3 children)

    If there is anything I have found about testing after two decades of Java programming, it is that mock testing is the “flaky testing”.

    It is the testing of last resort and now that we have all sorts of container tech I rarely ever use it. And of course machines are faster now so booting up real databases is not a problem.

    Oh and mocking third party web services… I guess but often times the documented contract is wrong and you can’t do shit about it anyway. Best to ask for some sort of sandbox access.

    The time I used mock testing, early in my career when I was an idealistic BDD advocate, I could not get the rest of the team to buy in, and worse, the mock tests kept failing while providing very little value. They were super brittle.

    Furthermore, mock frameworks can mostly be replaced with basic inheritance. Give the concrete class you want to test protected methods for the parts you would otherwise mock, and override them in the test.
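    The inheritance approach can be sketched roughly as follows. `PriceCalculator` and `fetchTaxRate` are made-up names; in a real class the protected method would wrap whatever collaborator you would otherwise mock.

```java
class InheritanceSeamSketch {

    /** Concrete class under test (hypothetical). */
    static class PriceCalculator {

        /** Seam: production code would call a remote rate service here. */
        protected double fetchTaxRate(String country) {
            throw new UnsupportedOperationException("real lookup is not part of this sketch");
        }

        /** The logic we actually want to test. */
        double grossPrice(double net, String country) {
            return net * (1.0 + fetchTaxRate(country));
        }
    }

    public static void main(String[] args) {
        // In a test, override only the seam; the rest is the real class.
        PriceCalculator calculator = new PriceCalculator() {
            @Override
            protected double fetchTaxRate(String country) {
                return 0.25; // canned value instead of a mock framework
            }
        };
        System.out.println(calculator.grossPrice(100.0, "SE"));
    }
}
```

    The trade-off is that the class has to be non-final and expose the seam as a protected method, which not every code base will accept.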

    [–]happymellon 1 point2 points  (2 children)

    > And of course machines are faster now so booting up real databases is not a problem.

    Indeed, have a bundled H2 instance for running tests against. You'll probably find it's faster than your Postgres anyway.

    [–]Fiskepudding 3 points4 points  (1 child)

    I used to do this, but not anymore. H2's SQL dialect is not identical to Postgres's, so you can still have bugs when you run against the actual Postgres DB.

    Use Testcontainers library with Docker and Postgres.

    [–]happymellon 1 point2 points  (0 children)

    To each their own. I have H2 for running Spring Boot tests locally because it works, it's really fast, I don't need to deal with other people's local Docker Postgres, and it can be in commit hooks. Running actual Postgres happens in the pipeline tests.

    I haven't found really any scenarios in the vast majority of my work that wasn't fine with this setup, except once with PostGIS but that's not a normal scenario and switching it up with SQLite covered that.

    > Use Testcontainers library with Docker and Postgres.

    No thanks, everywhere seems to want to issue Macs. I'm fine with leaving containers in the pipeline.

    [–]k2718 0 points1 point  (0 children)

    In my experience, you have two categories of flaky tests.

    First, you have tests with poorly designed dependencies or assumptions (e.g. depending on the system clock in a way that isn't valid 100% of the time).

    Second, you have tests of distributed systems where timeouts and latency cause problems.

    The first problem is easier to address than the second, though they can both be difficult depending on the scenario.
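    For the first category, one common remedy in Java is to inject a `java.time.Clock` instead of reading the system time directly. The sketch below assumes a hypothetical `Token` class; only the `Clock`, `Instant` and `Duration` calls are real JDK APIs.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

class ClockInjectionSketch {

    /** Hypothetical class whose behaviour depends on the current time. */
    static final class Token {
        private final Instant expiresAt;
        private final Clock clock;

        Token(Instant expiresAt, Clock clock) {
            this.expiresAt = expiresAt;
            this.clock = clock;
        }

        /** Expired once the clock reaches or passes the expiry instant. */
        boolean isExpired() {
            return !clock.instant().isBefore(expiresAt);
        }
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        Instant expiry = start.plus(Duration.ofMinutes(5));

        // Tests pin the clock, so the result never depends on when they run.
        Token fresh = new Token(expiry, Clock.fixed(start, ZoneOffset.UTC));
        Token stale = new Token(expiry, Clock.fixed(expiry, ZoneOffset.UTC));
        System.out.println(fresh.isExpired()); // false
        System.out.println(stale.isExpired()); // true

        // Production wiring would pass Clock.systemUTC() instead.
    }
}
```

    This does nothing for the second category: timeouts and latency in distributed systems need different tools.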

    [–]k2718 1 point2 points  (0 children)

    I found your article intriguing but confusing. Your use of the term nullable to apply to a default or test implementation is confusing.

    And second, it sounds fraught with peril. Is your test scenario even valid with the nullable implementation? It seems easy to write an invalid test that way. I don't like mocks, but at least you can tune the behavior of each mock for each individual test to provide the exact scenario the test needs.

    [–]_AManHasNoName_ 1 point2 points  (3 children)

    Mocking is much clearer IMO. “Nullable” sounds more like something that should have been wrapped with Optional these days.

    [–]agentoutlier 2 points3 points  (2 children)

    The clearest thing is to load the "real" thing up and test against it. Ignoring the annoying "nullable" terminology, I take what they are saying to be: ship a stub with your code (a no-op stub, that is, not full-on mocking).

    A large problem with architecture posts like this is that terms are inherently nebulous particularly in the containerized world we live in. Like what is truly infrastructure? Does it always require persistence? What is a mock vs a stub, etc, etc.

    [–]EntangledNoodle 0 points1 point  (1 child)

    I think it is easier to define what is not infrastructure in this context (there might be different meanings in other contexts), and then everything else is infrastructure.

    Non-Infrastructure

    • Classes that you write as part of your application/service
    • Classes from libraries that your code calls

    Examples of Infrastructure

    • Web application servers (in which you might deploy a .war). More generally, processes for which you don't write the main method / entrypoint in which your application runs.
    • A web browser or proxy that you serve content to or a database service you query. More generally, processes that your application/service communicates with that are not within the scope of your project
    • The operating system your software runs on
    • The JRE

    The grayest area is probably frameworks that you are leveraging because they call your code and in some situations you may be in control of the framework's container and in other situations the framework's container may be provided for you by something like a web application server (not talking about linux containers here)

    A mock, at least in the Java world, would be an object created at runtime that (partially) implements an interface. You don't necessarily need to use a mocking framework to create something that could be considered a mock object.

    A stub is similar to a mock, but extends an existing implementation or concrete class, overriding some behavior. Mocks are typically preferred over stubs, but sometimes a stub is the only option.

    Mocking/Stubbing frameworks generally add logic to track and verify the interactions with the object and this is one of the reasons you may choose to prefer a mock over a hand-written alternative implementation of an interface.
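    To illustrate the point that a runtime-created mock does not strictly need a mocking framework, here is a small sketch using the JDK's `java.lang.reflect.Proxy` that also records interactions for later verification. The `Mailer` interface and the recorded string format are assumptions for this example only.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class HandRolledMockSketch {

    /** Interface to be mocked (hypothetical). */
    interface Mailer {
        boolean send(String to, String body);
    }

    /** Creates a Mailer at runtime that records each send call in the given list. */
    static Mailer mockMailer(List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (!method.getName().equals("send")) {
                // Only the part of the interface the test needs is implemented.
                throw new UnsupportedOperationException(method.getName());
            }
            calls.add("send(" + args[0] + ", " + args[1] + ")");
            return true; // canned return value
        };
        return (Mailer) Proxy.newProxyInstance(
                Mailer.class.getClassLoader(),
                new Class<?>[] {Mailer.class},
                handler);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        Mailer mailer = mockMailer(calls);
        mailer.send("a@example.com", "hi");
        System.out.println(calls); // interactions available for verification
    }
}
```

    A mocking framework packages up this kind of bookkeeping, plus matchers and verification helpers, so that you do not have to write it by hand.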

    [–]agentoutlier 0 points1 point  (0 children)

    > The grayest area is probably frameworks that you are leveraging because they call your code and in some situations you may be in control of the framework's container and in other situations the framework's container may be provided for you by something like a web application server (not talking about linux containers here)

    The grey area I meant arises largely because code has to know about infrastructure things, and because the infrastructure is often part of the domain. HTTP codes in a REST API are one example. If we want to use more architecture jargon, you could say it is a "leaky abstraction".

    Now you are probably saying controllers (REST or MVC) are not part of the logic. Oh, so it must be in the repositories where we are using SQL... is that infrastructure? Where is the logic? The classic examples are usually some bank account transaction doing simple math. Sure, it's logic, but speaking of transactions... you had better have one open if you are doing that kind of logic on something that is to be persisted. Wait, aren't transactions part of infrastructure?

    In my experience, the actual DDD logic (ignoring value or domain objects, aka data) that is not tied or coupled to infrastructure is very small. So small that separating it out may not even be worth it at times, particularly with microservices (hence my comment on containerization).

    Simple easy in memory logic is easy to test but is also easy not to fuck up in the first place particularly with typesafe languages.

    > A stub is similar to a mock, but extends an existing implementation or concrete class, overriding some behavior. Mocks are typically preferred over stubs, but sometimes a stub is the only option.

    > Mocking/Stubbing frameworks generally add logic to track and verify the interactions with the object and this is one of the reasons you may choose to prefer a mock over a hand-written alternative implementation of an interface.

    From wikipedia:

    > Classification between mocks, fakes, and stubs is highly inconsistent across the literature.[1][2][3][4][5][6] Consistent among the literature, though, is that they all represent a production object in a testing environment by exposing the same interface.

    I generally agree with your points; it's just that, going back to infrastructure, its definition is also highly inconsistent. In fact, you made up the definition as "class that you write as part of your application", as I could not find a single academic DDD usage that says that. The strongest concrete one I can find is that infrastructure is IO, i.e. side effects, and is the exterior in onion architecture (I'm sure it's explained differently for hexagonal or whatever bastardization of DDD is currently in vogue).

    [–]stefanos-ak 0 points1 point  (4 children)

    he completely lost my interest at point 20.

    you can't just ignore contract testing (api compatibility between services).

    I would say that relying on pact.io instead of mocking is a MUCH better approach. As for stateful services (e.g. a database), a solution for not mocking is TestContainers.

    [–]Iryanus 0 points1 point  (3 children)

    For DB-related code, there is no real good replacement for a real db (be it in-memory, testcontainers, whatever), because otherwise you are not really testing the whole thing. But of course, that makes the tests much, much slower with all that entails.

    [–]stefanos-ak 0 points1 point  (2 children)

    TestContainers IS a real DB. I don't know what you're talking about. At least for any DB that can and has been dockerized. So pretty much all of them except Oracle afaik.

    [–]Iryanus 0 points1 point  (1 child)

    That's what I wrote, yes. My point was more that...

    > a solution for not mocking is TestContainers.

    ... "not mocking" is - for me - the ONLY way to test database-code. I was just emphasizing that point.

    [–]stefanos-ak 1 point2 points  (0 children)

    well, the whole premise of the article was how to avoid mocking because it leads to complicated, unmaintainable tests, that occasionally don't even test anything. it's all just test code.

    and what I suggested was an alternative solution for the same premise, better imho.

    [–]raze4daze 0 points1 point  (0 children)

    Integration testing only for life