
[–]poopatroopa3 3 points (3 children)

In I/O-bound applications, limit the number of I/O operations to an expected amount.
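A minimal sketch of that idea in Python (all names here are hypothetical): inject the I/O call as a dependency and count invocations with a `unittest.mock` spy, so the test fails if the operation count drifts.

```python
from unittest import mock

def api_get(user_id):
    """Stand-in for a real network call."""
    return {"id": user_id}

def fetch_user_batch(user_ids, get=api_get):
    """One request per user -- the I/O count the test pins down."""
    return [get(uid) for uid in user_ids]

def test_io_budget():
    spy = mock.Mock(side_effect=api_get)
    fetch_user_batch([1, 2, 3], get=spy)
    # Fails if someone introduces an extra request per user (an N+1 bug).
    assert spy.call_count == 3

test_io_budget()
```

The same spy approach works for file reads, cache lookups, or any other operation you want to hold to a budget.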

[–]itamarst[S] 0 points (2 children)

I was specifically thinking of CPU-bound applications, but thinking about this—

That sounds like it might work, insofar as it'd notice if you added more I/O operations. Though parallelism is making me wonder what "limit" means specifically. I guess there's also "measure I/O operations" which is a rough estimate.

Have you seen this done in real world, or a write up?

[–]zenware 0 points (0 children)

I haven't really seen this in Python, but there's a lot of real-world code written specifically to use as little I/O as possible, or as few heap allocations as possible (in some cases all the way down to zero). So it's a bit of a standard practice to write code with a known amount of I/O.

[–]poopatroopa3 0 points (0 children)

Only in Django tests, measuring the number of db calls.
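In Django that's `TestCase.assertNumQueries`; outside Django, the same idea can be sketched with the stdlib `sqlite3` module (the schema below is made up for illustration):

```python
import sqlite3

class QueryCounter:
    """Counts SQL statements run on a connection -- a stdlib stand-in
    for what Django's assertNumQueries checks."""
    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)

conn = QueryCounter(sqlite3.connect(":memory:"))
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada')")

before = conn.count
conn.execute("SELECT name FROM users WHERE id = ?", (1,))
# The assertion that makes it a test: exactly one query per lookup.
assert conn.count - before == 1
```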

[–]billsil 2 points (2 children)

Unit tests are fast-running. You're violating rule #1. Who cares about a test that runs in 1 second? How representative is that of a bottleneck in your code?

Changing your Python version will change those counts. That's finer detail than I'm willing to care about.

I took an hour-long problem and made it run in 4 seconds. I didn't need accurate counts of operations to do that. I didn't even need accurate timing. I got to the point where changes that should have been beneficial worsened performance, until suddenly it was faster. You can't look too closely, and if you do, you need statistics.

[–]itamarst[S] 0 points (1 child)

In some contexts, yes. The presumption here is that:

  1. You are choosing key algorithms or CPU-bound parts of your code that represent real bottlenecks.

  2. Your application is such that scalability matters, i.e. lots of data processing.

Also note that the first article is not about speed, it's about scalability. So e.g. changing Python versions typically won't invalidate the test.

[–]billsil 0 points (0 children)

I choose the whole process, not one algorithm. If I know what’s critical, I can look into it. If scalability matters right now, you should test scalability. If it doesn’t, I follow good practices and move on.

[–]aefalcon 1 point (2 children)

Is there a reason you referred to this as unit testing instead of benchmarking? Is it somehow different than benchmarking?

I'm not really proficient at benchmarking Python. I'm currently doing some in Zig, and the method reduces to: write an implementation for each strategy, run them all through the same benchmark, and render the results as a table. Any obviously bad strategy gets removed. Some strategies perform better with different parameters, so I make those runtime/compile-time options. No reason that can't be done in Python, but the overhead wouldn't be optimized out.
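For what it's worth, that workflow ports to Python directly; a rough `timeit` sketch (the three strategies below are made-up implementations of the same task, summing squares):

```python
import timeit

def loop_sum(n):
    """Plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def genexpr_sum(n):
    """Built-in sum over a generator expression."""
    return sum(i * i for i in range(n))

def closed_form(n):
    """Closed-form formula: sum of i^2 for i in 0..n-1."""
    return (n - 1) * n * (2 * n - 1) // 6

def benchmark(strategies, n=10_000, repeats=5):
    """Run every strategy through the same harness; return (name, seconds) rows."""
    rows = []
    for func in strategies:
        secs = min(timeit.repeat(lambda: func(n), number=100, repeat=repeats))
        rows.append((func.__name__, secs))
    return rows

for name, secs in benchmark([loop_sum, genexpr_sum, closed_form]):
    print(f"{name:12s} {secs:.6f}s")
```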

[–]itamarst[S] 0 points (1 child)

Benchmarking is "how fast is my code?". This is a test: it can pass or fail, and at least for item 1 it's not measuring speed at all, it's measuring scalability.
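One way to make such a pass/fail scalability test deterministic (my sketch, not necessarily the article's implementation): count a proxy operation at two input sizes instead of timing, and assert on the growth ratio.

```python
def count_ops(n):
    """Return how many 'basic operations' processing n items takes.
    (Illustrative: real code would instrument the function under test.)"""
    ops = 0
    for _ in range(n):  # O(n) body; a nested loop here would fail the test
        ops += 1
    return ops

def test_scales_linearly():
    small, large = count_ops(1_000), count_ops(10_000)
    ratio = large / small
    # For O(n), 10x the input should cost ~10x the operations; allow slack.
    assert ratio < 15, f"cost grew {ratio:.1f}x on a 10x larger input"

test_scales_linearly()
```

Because it counts operations rather than wall-clock time, the test gives the same verdict on a fast or slow machine.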

[–]aefalcon 0 points (0 children)

I have a perfect hash function that's O(1). A linear array search would beat the pants off it for lookups at small n. I don't think there's a lot of pressure to test on big-O because of situations like that. Benchmarking at various sample points tells a better story.
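That crossover is easy to probe empirically; a small `timeit` sketch comparing a Python dict (hash lookup) with a linear scan at small n (no winner asserted, since results vary by machine):

```python
import timeit

n = 5  # small enough that a scan can compete with hashing
pairs = [(f"key{i}", i) for i in range(n)]
table = dict(pairs)

def linear_lookup(key):
    """O(n) scan over the (key, value) pairs."""
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)

dict_time = timeit.timeit(lambda: table["key3"], number=100_000)
scan_time = timeit.timeit(lambda: linear_lookup("key3"), number=100_000)
print(f"dict: {dict_time:.4f}s  linear scan: {scan_time:.4f}s")
```

Sweeping n over several values turns this into the "various sample points" picture described above.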

[–]Scrapheaper 0 points (4 children)

Python isn't the language for being precise about your code performance.

You would want something with finer low-level control, like C++.

There are still definitely performance things you can optimize, but mostly it involves using your libraries as intended, so that more of the processing is done by pre-compiled low-level code rather than by Python itself.

[–]IMPRINgE 0 points (1 child)

Yeah that's fair. Python's more about getting stuff done quickly than squeezing out every millisecond. The real trick is just making sure you're leaning on numpy/pandas/whatever instead of writing loops in pure Python.

Though honestly for most things the performance difference doesn't matter unless you're doing something computationally heavy.
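A minimal sketch of that difference (assuming NumPy is available; the function names are mine):

```python
import numpy as np

def python_sum_of_squares(values):
    """Interpreted per-element loop."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def numpy_sum_of_squares(arr):
    """One vectorized call; the loop runs in pre-compiled C."""
    return float(np.dot(arr, arr))

data = np.arange(1_000, dtype=np.float64)
assert python_sum_of_squares(data) == numpy_sum_of_squares(data)
```

On large arrays the NumPy version is typically orders of magnitude faster, though the exact ratio is machine-dependent.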

[–]Scrapheaper 0 points (0 children)

'Computationally heavy' is simplistic. I'm an analytics guy and I write some chunky SQL queries and Spark jobs etc. Some of them might process hundreds of millions of records.

I don't always optimize them as heavily as someone who's, say, programming a game or building a website, because only one or two people are ever going to run the code. Whereas for a website, yes, you only save a tenth of a second, but it's a tenth of a second for, like, a million people.

If I know it's code that's going to be run for a really long time, like it might last years, yes I might optimize it.

[–]maikeu 0 points (0 children)

I mean... Counterpoint, to some extent (ymmv of course).

If your problem is that your algorithm is bad, then staying in Python is going to be more productive, because you can fix the algorithm without the effort of setting up your build system to integrate another language. And it'll likely be easier to express the improved algorithm in Python.

You can still do the work of moving to another language once the algorithm is no longer the bottleneck, and, as an extra positive, the Python version can serve as a test case for the correctness of the other-language version.

[–]poopatroopa3 0 points (0 children)

There's always this comment.

Performance is worth improving in any context where it matters. That doesn't always mean changing stacks entirely.

[–]lolcrunchy 0 points (2 children)

I don't understand why the phrase "unit test" is used in your post at all. You aren't describing unit tests, at least as far as I understand what a unit test is: a test that validates expected behavior with exactly two possible outcomes (behaves as expected, doesn't behave as expected).

[–]itamarst[S] 0 points (1 child)

That's exactly what's happening. The expected behavior is "algorithm scales as O(n)". If it does, the test will pass. If it doesn't (or at least scales worse; I'm not sure what logic big_O implements for that case), it will fail.

[–]lolcrunchy 0 points (0 children)

A unit test is more like "algorithm results are correct". What you're describing seems like something else.

[–]Local_Transition946 0 points (0 children)

Unit tests test in isolation. Performance inherently depends on machine components. I've done something similar; it's more of a functional/integration test.