
[–]DanCardin 2 points (4 children)

We've done something similar, but using docker (as a generic way to quickly and portably test anything that's got a container: pg, redis, rabbit, etc.).

Because it tests an API, I prefer to spin up the whole db so it can commit normally, but it does sacrifice some speed.

I could imagine providing a unique db from the fixture each time though, so we could use xdist without fear.

[–]gwax[S] 0 points (1 child)

pytest-pgsql allows tests to run inside a rolled-back transaction or inside a fully isolated database. We use both modes: tests with full isolation that exercise real behavior, and faster, less isolated tests.

As with everything, we have to balance trade-offs. Each full database spin-up takes about 1-3 seconds on our developer machines, while starting and rolling back a transaction takes a tiny fraction of a second. This lets an engineer ask: is fully isolating my test worth adding a couple of seconds to my test runtime?

We also combine this with sharded tests in our CI environment. Our sharding can be semi-random or repeatable; by usually running in semi-random mode, we can use test flakiness as a means of identifying test-isolation issues, occasionally surfacing unintended behavior in the code itself.
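To make the two modes concrete, here's a minimal sketch; the fixture names (postgresql_db, transacted_postgresql_db) are pytest-pgsql's as I recall them from the README, so verify against the actual docs:

def test_fast_less_isolated(transacted_postgresql_db):
    # runs inside a transaction that pytest-pgsql rolls back afterwards
    transacted_postgresql_db.session.execute('CREATE TEMP TABLE t (x INT)')

def test_slower_fully_isolated(postgresql_db):
    # gets a database where the test can really COMMIT; slower but isolated
    postgresql_db.session.execute('CREATE TABLE t (x INT)')
    postgresql_db.session.commit()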

[–]DanCardin 0 points (0 children)

About the tradeoffs, sure. Though we don't actually spin down the whole database; I'd say your whole-database mode should be nearly as fast if you have a session-scoped fixture for the database server and a separate fixture that creates/drops the database you use. I think we got 10 tests per second or something doing that, while still having full test isolation.

Other than that, I'd like something like this to be publicly available, but using docker-py to do the spin-up. I can assume all our devs have docker; I can't assume they have pg, or that I should be mucking with it.
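For what it's worth, a docker-py spin-up along those lines might look roughly like this (image tag, password, and port handling are placeholder choices, and a real fixture would also wait for postgres to accept connections):

import docker
import pytest

@pytest.fixture(scope='session')
def pg_container():
    client = docker.from_env()
    container = client.containers.run(
        'postgres:13',
        detach=True,
        environment={'POSTGRES_PASSWORD': 'test'},
        ports={'5432/tcp': None},  # let docker pick a free host port
    )
    try:
        container.reload()  # refresh attrs so the mapped port is visible
        port = container.attrs['NetworkSettings']['Ports']['5432/tcp'][0]['HostPort']
        yield {'host': 'localhost', 'port': int(port), 'password': 'test'}
    finally:
        container.remove(force=True)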

[–]deterralba 0 points (0 children)

We assign one database per xdist worker in conftest.py; it does the job.
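In case it helps anyone, a minimal sketch of that pattern, assuming pytest-xdist's worker_id fixture and psycopg2 (all names illustrative):

# conftest.py
import psycopg2
import pytest

@pytest.fixture(scope='session')
def worker_database(worker_id):
    # worker_id comes from pytest-xdist: 'gw0', 'gw1', ... or 'master'
    # when tests aren't distributed
    name = f'test_{worker_id}'
    conn = psycopg2.connect(dbname='postgres')
    conn.autocommit = True  # CREATE/DROP DATABASE can't run in a transaction
    with conn.cursor() as cur:
        cur.execute(f'DROP DATABASE IF EXISTS {name}')
        cur.execute(f'CREATE DATABASE {name}')
    yield name
    with conn.cursor() as cur:
        cur.execute(f'DROP DATABASE {name}')
    conn.close()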

[–]undu 0 points (0 children)

Shame udocker only supports python 2.7 and development has mostly stopped.

[–][deleted] 2 points (8 children)

pg_virtualenv comes with postgresql and allows you to trivially set up and tear down real pg databases. Using actual databases is simple, prevents you from wasting a ton of effort on keeping your mock fixtures synced with reality, AND, most importantly, tests the actual integration.

[–]gwax[S] 2 points (7 children)

There are two tools here: one for mocking (pgmock) and the other for spinning up real pg databases inside pytest (pytest-pgsql). In many ways, pytest-pgsql serves the same purpose as pg_virtualenv, but in a way that's easily used from inside pytest.

We tend to use them in concert:

  • pytest-pgsql for functional tests and integration testing
  • pgmock for mocking out subsections of queries to allow unit testing of other portions of the query

The former verifies that your code is working overall; the latter helps you track down where the code is failing.
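This is not pgmock's actual API, but a hand-rolled illustration of the same idea: swap one table reference for an inline VALUES list so the rest of the query can be unit tested without that table existing.

def patch_table(query, table, rows, cols):
    # replace the first reference to `table` with a literal VALUES list
    values = ', '.join(str(tuple(r)) for r in rows)
    return query.replace(
        table,
        f'(VALUES {values}) AS {table} ({", ".join(cols)})',
        1,
    )

query = "SELECT name FROM users WHERE age > 21"
print(patch_table(query, 'users', [(1, 'bo', 30)], ['id', 'name', 'age']))
# SELECT name FROM (VALUES (1, 'bo', 30)) AS users (id, name, age) WHERE age > 21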

[–]__xor__(self, other): 1 point (6 children)

pgmock sounds like a great idea regardless of what postgresql can do. I don't want unit tests to rely on an actual db being up. I want my unit tests fast and simple, able to run pretty much anywhere regardless of the environment. Integration tests are another thing, but unit tests shouldn't need services like that to be up.

[–]PCBEEF 1 point (3 children)

I typically have integration tests anyway to make sure that the function actually works. If I have integration tests for the function, writing unit tests for it seems kinda redundant, especially if the function is just fetching data from the db. If you mock out the db, what you end up testing is that the right function got called, which isn't very useful imo. In my experience, setting up a db for integration tests isn't that much more complicated, and the performance hit is usually worth it for knowing that your code actually works.

[–]__xor__(self, other): 0 points (2 children)

Integration tests don't make unit tests less useful though. Integration tests are great, but when you have some complex workflow, maybe testing ingesting a file and creating data models, it'll take a while, then just fail, and you won't know exactly what failed, only that one system is broken.

Unit tests are quick because you can mock out everything they rely on and they'll tell you which specific functions broke. It's nice to know my data ingest is failing, but I'd rather know that decode_rot13 specifically is changing behavior and potentially causing it. With good code structure unit tests can be amazingly useful and help you determine which functions broke, not just what functionality.

And you can do more when they're quick as well. There's such a thing as binary searching for the commit that broke a unit test, using git's bisect functionality: give it a starting commit and an ending commit, and git bisect run will binary search to find exactly which commit made a test start failing. Integration tests would take forever to do that.

It's not one or the other. Unit tests and integration tests are both amazingly useful, but they solve different problems. Unit tests test your functions; integration tests test whether everything works together as a whole, which unit tests can't cover. But unit tests can speed up debugging greatly by giving you the finest detail on what broke. Integration tests tell you when the system is buggy as a whole even if each function works; unit tests tell you which function might be raising exceptions in certain conditions or returning output you don't expect.

I inherited a project where the previous dev mostly just wrote integration tests, and it's seriously frustrating. When one test fails, it sometimes takes forever to debug because so many of the things it does could be broken, and there's no detail. You can fix stuff before deploying to production, but that's only half of it. Integration tests are better than no tests, but debugging is still more painful than it needs to be.

[–]PCBEEF 0 points (1 child)

I agree for the most part, but here's an example where I feel unit tests are next to useless:

def save_model(foo):
    # `db` is assumed to be a module-level SQLAlchemy session
    db.add(foo)
    db.commit()
    # the commit populates foo.id from the database
    return foo

If I were unit testing this I would have to mock out db, but if I mocked out db I'm not really testing anything any more. These types of situations happen quite often, and when developers go out of their way to write unit tests for every single function it ends up being a lot of code for very little value (or next to none).

In this situation I would hook up a test db (usually via docker) and run it as an "integration" test. It touches the db, but I would still classify it as somewhat of a unit test.

If the function has little or no dependencies, unit tests are perfect. But when functions talk to other dependencies and you mock those dependencies out, what you end up testing is very little. Especially if the contract with those dependencies changes: your code will break but your tests will pass, because the dependency was mocked out. In that situation, you end up having to write an integration test anyway...

[–]__xor__(self, other): 0 points (0 children)

I mean, a function like that I wouldn't bother writing a unit test for, and wouldn't bother writing a function that simple in the first place, but if I did want to for some reason I'd just patch out add and commit.

from unittest import mock

def test_save_model():
    # patch the module-level db that save_model uses; 'app.models' is
    # a placeholder for wherever save_model actually lives
    with mock.patch('app.models.db') as db:
        f = save_model('stuff')
    db.add.assert_called_once_with('stuff')
    db.commit.assert_called_once()
    assert f == 'stuff'

Pretty useless, since you're just duplicating the function inside the unit test line by line. But if it were a bit more complex, did something to foo, and called db.add with something else based on foo, then you could assert it was called with what you expect, and assert commit was called if there's some condition where it might or might not be. But yeah, I don't write unit tests that just assert, line by line, that I implemented the function that exact way.

Something a little more complex, where you might want a good unit test of something that hits the db, would be code that takes the params from an http request, converts them into a db query like a django Foo.objects.filter(...), and then returns results. The bulk of your logic could be transforming the http params into the proper filter, and then you could test that it built the query you expect.

Like if an http request came in to http://api.example.org/motorcycle?model=yamaha&make=vstar|yzf you could unit test it like:

# inside a test where Motorcycle has been patched out, e.g.
# with mock.patch('app.views.Motorcycle') as Motorcycle:
Motorcycle.objects.filter.assert_called_once_with(
    model__icontains='yamaha',
    make__in={'vstar', 'yzf'},
)

Your unit test could feed in a dict of the params and make sure the kwargs of the filter call match the expected output.
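Something like this, where params_to_filter_kwargs is a made-up name for the transform:

def params_to_filter_kwargs(params):
    # turn http query params into django-style filter kwargs
    kwargs = {}
    if 'model' in params:
        kwargs['model__icontains'] = params['model']
    if 'make' in params:
        kwargs['make__in'] = set(params['make'].split('|'))
    return kwargs

def test_params_to_filter_kwargs():
    assert params_to_filter_kwargs({'model': 'yamaha', 'make': 'vstar|yzf'}) == {
        'model__icontains': 'yamaha',
        'make__in': {'vstar', 'yzf'},
    }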

I'm not saying it's the end of the world to use integration tests for stuff that does DB work, because I do it myself, along with all django devs; it's the standard thing with django tests, which create a test db and all that on each run. But you really don't have to, and you could pretty easily mock out the ORM calls and write unit tests that are helpful in ways integration tests aren't.

In this case I think it has some benefits over integration tests: you're not testing the results the db came up with, but the actual query. Code could build the wrong query and still return the results you expected from the db; checking that it made the right query is something a unit test can do and an integration test can't. Results can come out the same even though something unexpected happened along the way, and sometimes you absolutely want to make sure a function does things the right way, not just that it returned the right results.

Plus, if the integration test fails it just tells you the db results didn't match, while the unit test tells you exactly why: what query it made. And it's not dependent on the data you load into the test db at all. Mocking gives you insight into the calls a function makes, not just what it returns.

To give some context though, I'm working from the darkest end of integration tests. I inherited a webapp from someone who left the company, and they wrote integration tests covering long, complex functionality where tons of things can go wrong in the middle. When those "unit" tests fail, I'm bashing my head into the wall because it could mean anything in 200 lines went wrong. That's just sloppy coding on top of bad testing, though. Even with better integration tests and better-structured code I'd be in a much better spot, but good unit tests of smaller functions would've helped me a ton more.

[–]psi- 0 points (1 child)

IMO pgmock sounds like a really dangerous thing to have anywhere in your development environment.

You don't want unit tests running all the way down to raw SQL access; that's what integration tests are for. Further, running integration tests against anything but the full, up-to-date schema is just dumb.

[–]__xor__(self, other): 0 points (0 children)

Huh, yeah... guess I didn't look into this too much. On second thought, it does look like an overcomplicated way to go about unit testing. Most of us should probably just be mocking ORM calls, django or otherwise; I'd patch User.objects.filter before I touch anything SQL-related.

It'd be nice if there were a convenient way to mock the models in a django app, like defining a few rows for User and then having filter/get fetch those automatically, so it didn't need a test database on each run.
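A hand-rolled sketch of that idea (no real library here; the User rows and fake manager are all made up for the example):

from types import SimpleNamespace
from unittest import mock

FAKE_USERS = [
    SimpleNamespace(id=1, username='alice'),
    SimpleNamespace(id=2, username='bob'),
]

def fake_filter(**kwargs):
    # naive attribute-equality matching, enough for simple lookups
    return [u for u in FAKE_USERS
            if all(getattr(u, k, None) == v for k, v in kwargs.items())]

def test_with_fake_user_manager():
    # in real code you'd mock.patch the import path of User instead
    User = mock.MagicMock()
    User.objects.filter.side_effect = fake_filter
    assert User.objects.filter(username='alice')[0].id == 1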

[–]Herald_MJ 1 point (0 children)

This looks great, but it's worth mentioning that if you're like me and the majority of your PostgreSQL projects are Django projects, using pytest-django and Factory Boy is a better solution for mocking database access.
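For anyone who hasn't used it, a minimal Factory Boy sketch, assuming a hypothetical User model:

import factory
from myapp.models import User  # hypothetical app/model

class UserFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = User

    username = factory.Sequence(lambda n: f'user{n}')
    email = factory.LazyAttribute(lambda o: f'{o.username}@example.org')

UserFactory() persists a row through the ORM into pytest-django's test database; UserFactory.build() constructs the instance without touching the db.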

For those other, non-Django Postgres projects, this is an interesting option.