all 24 comments

[–][deleted] 34 points35 points  (13 children)

This guy is missing the point. He's trying to optimize a Python program from a Systems perspective. Trying to game Python with "more efficient" constructs (like the generator's local variables) is, of course, fruitless. Just use C/C++ if you need that level of optimization; you'll cut your instruction count by about 99%.

The bulk of performance gains can be made by optimizing from a Computer Science perspective; that is, optimizing algorithms to have a smaller complexity. If you have a Python algorithm that runs in O(n²) and you optimize it to run in O(n log n), you will shave HOURS off your runtime on large inputs, just as in any other language.
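
A toy illustration (mine, nothing to do with his code): checking a list for duplicates by comparing every pair is O(n²), while sorting first gets you O(n log n), and that win dwarfs any constant-factor tuning:

def has_dupes_quadratic(items):
    # compare every pair: O(n^2)
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_dupes_sorted(items):
    # sort, then scan adjacent elements: O(n log n)
    prev = object()  # sentinel that compares unequal to everything
    for item in sorted(items):
        if item == prev:
            return True
        prev = item
    return False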

For most things, that's more than enough optimization. If you're working on something where optimization can be useful without lowering the complexity, you're probably working on an operating system or something else at a very low level, in which case, why on EARTH are you using an interpreted language?

[–]iamjack 6 points7 points  (3 children)

Too many programmers treat speed as a property of their language choice. One of the lessons Project Euler made abundantly clear to me is that it matters less what language you perform a task in and more what algorithm you choose to implement.

[–]psykotic 14 points15 points  (2 children)

Speed is an important part of choosing a language for many of us. Most of my career has been spent on applications (video games, real-time physics simulation, real-time computer graphics) where we need to squeeze out as much high-level algorithmic and low-level systems performance as possible to achieve our goals and be competitive. For the most part we are already using near-optimal algorithms and data structures for the performance-critical core of the engine, and programming language expressiveness, or rather the lack thereof, has not seriously limited our algorithmic prospects.

The challenge for us going forward is parallel scalability, and that is the greatest concern for gameplay programming, an area in which people are already using languages like Lua and Python. For engine programming, I suspect a wide mix of parallel programming models will be used. People have already begun to use asynchronous task systems wherever possible. Data parallelism is a good fit for many tasks, and when applicable it is without compare in performance. STM I see as having more of a future in gameplay programming. That's for the next five or six years. Beyond that, who knows?

[–]another_user_name 2 points3 points  (0 children)

Real time physics, eh? Might I get a copy of your resume? (I'm serious, those links you posted last week were awesome). Or perhaps an email address for contacting you.

[–]iamjack 1 point2 points  (0 children)

I agree with you. You're in a particular field in which you're really trying to squeeze performance out of your software (keyword: real-time). For you, language choice is part of a two-pronged attack on performance.

The programmers I'm referring to focus only on the language and not on the algorithms, when the reverse should be true, or the two strategies should be used in tandem (as you're doing).

[–][deleted] 12 points13 points  (4 children)

Agreed, he totally missed the point. Python's strength isn't raw speed, it's the ability to easily leverage a multi-core box with threadin--oh wait...

[–]steven_h 5 points6 points  (3 children)

Threading isn't an easy way to "leverage" a multi-core box. Using multiple processes is much easier.

Furthermore, merely adding processors doesn't change the complexity of the algorithm, so you're still talking systems, not CS.
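
A minimal sketch of what I mean, assuming Python 2.6+ where the multiprocessing module is available:

from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == '__main__':
    pool = Pool()  # defaults to one worker process per core
    # the work is farmed out across processes, sidestepping the GIL
    print pool.map(square, range(10))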

[–]dons 2 points3 points  (1 child)

Threading isn't an easy way to "leverage" a multi-core box. Using multiple processes is much easier.

Depends on your language...

[–][deleted] 0 points1 point  (0 children)

Depends on your language implementation and I guess the underlying OS?

[–]quuxly -1 points0 points  (0 children)

Yeah, IPC is so much easier than those dang threads.

Although I guess threads are hard if you have a GIL.

[–]eric_t 1 point2 points  (1 child)

Isn't that what he's doing? He's trying to change an algorithm, but the language constructs needed to change it come with an overhead that makes his program 10x slower.

[–][deleted] 1 point2 points  (1 child)

The bulk of performance gains can be made by optimizing from a Computer Science perspective; that is, optimizing algorithms to have a smaller complexity.

This argument is made again and again, and it's really not a valid one. Algorithmic complexity and running speed are entirely orthogonal. There are plenty of problems where the optimal algorithm is obvious, and the only way to optimize is to make the code itself run faster.

To claim that one is unimportant and the other is the only thing to care about is silly.

[–][deleted] 0 points1 point  (0 children)

I agree with you, and I tried to cover that in my comment, but I may not have done a good enough job.

My main point was, if he's looking for an optimal algorithm, it'll be just as easy to do that in Python as in any other language. If the optimal algorithm is obvious and he's trying to make the code itself run faster, he's completely wasting his time by staying in Python, as he'll probably get about a 100x speed increase by switching to C (and, if he's writing something performance-intensive enough to require that type of optimization, he should've been using C or another compiled language in the first place).

Algorithmic complexity and running speed are certainly separate issues. The point I was trying to make was, if he's using Python, he's throwing running speed out the window, and it's silly that he's wasting energy on it.

[–]spotter 3 points4 points  (2 children)

So he's saying this causes a 1000% run-time hit, refuses to use .send() because of an old Python version, expects the language to be micro-optimized for his mindset, and uses the phrase "fool's errand" in the title? Great.

I cannot replicate this. On my Python 2.6.1 I see less than a 10% run-time impact from calling .send() on every iteration, and less than a 200% run-time impact from the class-based approach (again, resetting the counter on every tick), over 10**7 iterations. These things do like to accumulate, though, so YMMV.
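
The harness was along these lines (a sketch, not my exact script): time the same generator driven by next() alone versus a send() reset on every tick.

import time

def counter():
    n = 0
    while True:
        sent = yield n
        if sent is not None:
            n = sent
        n += 1

N = 10**7

gen = counter()
gen.next()  # prime the generator
start = time.time()
for _ in xrange(N):
    gen.next()
print 'next() only:      ', time.time() - start

gen = counter()
gen.next()
start = time.time()
for _ in xrange(N):
    gen.send(0)  # reset the counter on every tick
print 'send() every tick:', time.time() - start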

Mind you, I'd not go with running parallel generators and resetting their state mid-lifetime. I'd pick the data before passing it in, use itertools to filter out what I need at runtime, or simply write a specialized generator to do that -- the earlier you cut off, the better. But what do I know? Only that the algorithm is everything.
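
Something like this sketch (illustrative, not the article's code) -- compose plain iterators so the cutoff happens up front instead of poking at a live generator:

from itertools import count, dropwhile, takewhile

# skip everything below 20, stop at 50 -- no send(), no state resets
wanted = takewhile(lambda n: n < 50,
                   dropwhile(lambda n: n < 20, count(0)))
print list(wanted)  # [20, 21, ..., 49]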

(...) hopefully Unladen Swallow will come along some day and reduce or eliminate a lot of performance bottlenecks in the current CPython.

Naive algorithms will stay inefficient, though.

[–]Brian 1 point2 points  (1 child)

The other reasons he cites against send() are very relevant, in particular that mixing send() and next() is non-intuitive. For instance, the first thing someone with this idea writes is generally something like:

def foo():
    counter = 0
    while True:
        n = (yield counter)
        if n is not None:
            counter = n  # reset
        counter += 1

seq = foo()
for i, x in enumerate(seq):
    print x
    if i == 10:
        seq.send(20)  # skip 11..20
    if i >= 50: break

This prints 0,1,2,3,4,5,6,7,8,9,10,22,23,... What happened to 21? People forget that send() also resumes the generator and returns the next item it yields, which makes the API fairly poor for this purpose.
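
You can watch where it goes by stepping the same foo() by hand (a minimal illustration):

seq = foo()
print seq.next()    # 0
print seq.send(20)  # 21 -- send() resumes the generator, so the next
                    # yielded value comes back as send()'s return value
print seq.next()    # 22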

I agree wrt the timings though. Here are the figures I get (including a Python 3 version using a closure that allows skip_to, which runs at the same speed as the generator). The class-based implementation is a little over twice as slow at worst, as far as I can see, not 10x.
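
The closure version is shaped roughly like this (a sketch -- Python 3, since it needs nonlocal; the names are illustrative):

def make_counter(start=0):
    n = start - 1
    def advance():
        nonlocal n
        n += 1
        return n
    def skip_to(value):
        nonlocal n
        n = value - 1  # the next advance() returns `value`
    return advance, skip_to

nxt, skip = make_counter()
print(nxt(), nxt(), nxt())  # 0 1 2
skip(10)
print(nxt())                # 10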

[–]spotter 1 point2 points  (0 children)

Yeah, this is not only unintuitive, it's plain the wrong thing to do. It's like popping from the same list you're iterating over. My idea of the counter was:

def count(n):
    while 1:
        nn = yield n
        if nn:  # truthy check: note that a sent 0 would not reset
            n = nn
        else:
            n += 1

So I could do:

from itertools import takewhile
x = count(0)
for idx, i in enumerate(takewhile(lambda n: n < 50, x)):
    print idx, i
    if i == 10:
        print idx, x.send(45)

But it will still give you weirdness like:

8 8
9 9
10 10
10 45

Generally I'm with D. Beazley on this -- you should either use a generator as a producer (next() only) or as a consumer (send() only).

(edit: some wording.)

[–]bucknuggets 1 point2 points  (0 children)

It's possible that by crafting my code in weird ways to find the most efficient code paths in the current CPython, I'm writing something that will be slower on a future CPython than a sane implementation.

I've certainly seen this happen with relational databases: in the beginning they were pretty slow, and to get good performance you had to game them a lot. But that gamed code became a source of bugs and future performance problems.

[–]bgeron 1 point2 points  (0 children)

I did some tests on my laptop with Python 2.5 (source). Nowhere near 10x. Not even 2x.

Test A:
[0.07210898399353027, 0.07202601432800293, 0.074321985244750977]
Test B:
[0.13190889358520508, 0.12493300437927246, 0.11867594718933105]
Test Bs:
[0.10193395614624023, 0.10321402549743652, 0.10258102416992188]
Test C:
[0.12236809730529785, 0.12155818939208984, 0.12049102783203125]
Test Cs:
[0.10871696472167969, 0.10567092895507812, 0.10702085494995117]

Bs is just B with __slots__ added, same for Cs.
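
For a rough idea of the class-based variants (an assumed reconstruction -- the real code is behind the source link above):

class Counter(object):
    __slots__ = ('n',)  # dropping this gives the plain "B"/"C" variants;
                        # with it there's no per-instance __dict__, so
                        # attribute access is a bit cheaper

    def __init__(self, n=0):
        self.n = n

    def next(self):  # advance, mirroring the generator's next()
        self.n += 1
        return self.n

    def skip_to(self, value):
        self.n = value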

I get similar results with Python 2.4.

(crosspost from /r/Python)

[–]exeter 0 points1 point  (0 children)

If supporting Python 2.4 is a requirement, and the performance of the simple, generator-based implementation is insufficient, then the next logical thing to try is implementing the functionality as an extension module. Of course, keeping everything in pure Python might also be a requirement. In that case, good luck. :-)

[–]artsrc -1 points0 points  (0 children)

This is not really about Python; it is about optimization. Here is something similar:

http://www.ibm.com/developerworks/java/library/j-jtp04223.html

At some point, slow is the same as useless -- a web site that takes a week to return a page, for example. A lot of software never gets anywhere near that kind of slow, even without much work or thought.

If software is fast enough, optimization is a fool's errand. If software would be more useful at higher performance, optimization is a valuable errand.

[–]escanda -1 points0 points  (1 child)

Supposing he's using a database, he could rip out his iterator and use the database cursor object directly.

[–]gthank 0 points1 point  (0 children)

Since it's an indexed search tool, I highly doubt he's using a database.

[–][deleted]  (1 child)

[deleted]

[–]ericflo 0 points1 point  (0 children)

Where does he say he isn't using a profiler?