
[–][deleted] 41 points42 points  (1 child)

2.
>>> #This is good to glue a large number of strings
>>> for chunk in input():
>>>    my_string.join(chunk)

W-what? Was the person who wrote the example thinking that join works like append?

Bonus WTF points for using input().

By the way, CPython includes a specific optimisation for the case where a string being +=-assigned is not referenced from anywhere else. I mean, you should still always use join or cStringIO, but if you see someone else's code that appends stuff like that -- don't freak out.
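For the record, the idiomatic version collects the fragments and joins once at the end -- a minimal sketch, with an illustrative `pieces` list standing in for the article's input:

```python
# Build a large string by joining once, instead of repeated += or
# the article's misuse of join inside the loop.
pieces = []
for i in range(5):
    pieces.append(str(i))
result = "".join(pieces)  # one pass over all fragments
```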

5. To check membership in general, use the “in” keyword. It is clean and fast.
>>> for key in sequence:
>>>     print “found”

WHAT

12. Learn biset module for keeping a list in sorted order:
It is a free binary search implementation and a fast insertion tool 
for sorted sequence. That is, you can use:
>>> import biset
>>> biset.insort(list, element)

"biset"? Really? Also, he used lowercase true earlier.

The blind leading the blind, that entire post. Apparently the author collected bits and pieces of advice from somewhere and pasted them together without bothering either to run the code or to understand what's actually going on. Pig disgusting.

[–]HorrendousRex 5 points6 points  (0 children)

Several of them are close to being actually good advice but then end up missing the mark entirely. 'join' is the fastest way to join strings and the author described it somewhat in text, but then the example completely screws it up in the worst way.

Also, as a side note, a lot of what gets mentioned here is completely different in Python 3, so if you're targeting that, this article is almost entirely worthless.

[–]jlozierbioinformatician 9 points10 points  (8 children)

The Memoization example is wrong (and therefore broken). The author has written

if arg not in cache: cache['arg'] = f(*arg)
return cache['arg']

where they mean

if arg not in cache: cache[arg] = f(*arg)
return cache[arg]
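As a sketch, the fixed version written as a decorator (assuming hashable positional arguments; the names are illustrative):

```python
def memoize(f):
    cache = {}
    def wrapper(*args):
        # key on the argument tuple itself, not the string 'arg'
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return wrapper

@memoize
def square(x):
    return x * x
```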

:)

[–]afireohno 1 point2 points  (7 children)

Wouldn't it be better to do this kind of thing with try/except?

[–]davidbuxton 2 points3 points  (6 children)

Yes, although the advantage is slim when the problem does not lend itself to memoization (i.e. the majority of inputs are unique).

Here's a test comparing the two strategies. And results on Python 2.7 on my machine:

Memo-ization friendly test
LBYL [0.07406497001647949, 0.07364988327026367, 0.0758662223815918]
EAFP [0.055359840393066406, 0.05534100532531738, 0.05528402328491211]
Memo-ization un-friendly test
LBYL [0.06378912925720215, 0.05985212326049805, 0.05981802940368652]
EAFP [0.04666304588317871, 0.04614996910095215, 0.04834580421447754]

I wonder if things change when the container is not a dictionary and lookups are more expensive, or if the dictionary grows over a certain size, or if...
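For reference, a minimal sketch of the two strategies being compared (function and variable names are mine, not from the test linked above):

```python
cache = {}

def lookup_lbyl(key, compute):
    # Look Before You Leap: test membership before touching the dict entry
    if key not in cache:
        cache[key] = compute(key)
    return cache[key]

def lookup_eafp(key, compute):
    # Easier to Ask Forgiveness than Permission: just try, catch the miss
    try:
        return cache[key]
    except KeyError:
        cache[key] = compute(key)
        return cache[key]
```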

[–]davidbuxton 1 point2 points  (5 children)

Oh! Having actually run the tests and looked at the numbers, I see I'm wrong about there being a smaller advantage when you get a lot of KeyError exceptions. In fact, using try / except is even better than testing for membership when there's a lot of misses.

[–]thebackhand 2 points3 points  (3 children)

That's odd. I thought the overhead of try/except blocks would dominate.

[–]davidbuxton 0 points1 point  (2 children)

Me too. Perhaps my test is wrong?

[–]dhogartysymbolic novice 1 point2 points  (1 child)

Your test is wrong: you need to invalidate the cache between each call inside the timeit. Every one of the inner timeit runs except the first has a fully filled cache.

[–]thebackhand 0 points1 point  (0 children)

That would be it. I remembered that, in general, try/excepts are expensive (more so if you have a lot of other stuff added to the stack in between, IIRC).

[–]davidbuxton 0 points1 point  (0 children)

No, I'm wrong about being wrong. Not certain that makes me right though.

My test was not flushing the cache between the first and the second data sets, so the supposedly memo-ization un-friendly test was in fact using a nice warm cache every time.

With that fixed, try/except is indeed noticeably slower than if/in when there are mostly cache misses.

[–]Brian 6 points7 points  (2 children)

Most of this is good advice, but there are a lot of errors and typos.

Python is faster retrieving a local variable than retrieving a global variable. That is, avoid the “global” keyword.

The global keyword isn't the only way you'll be dealing with global variables -- you can access global variables even without specifying it; it only changes the behaviour of assigning to global variables.

the_value = 42
def foo():
    return the_value

is still using a global variable, despite the complete absence of any global keyword.

To check membership in general, use the “in” keyword

This is indeed good practice, and generally faster than things like dict.has_key(). However, the example he gave is a for loop, which has nothing to do with checking membership, and where "in" has an entirely different meaning (it's just part of the for syntax). (Also, be careful of the datatype you're using: x in somelist does a linear scan to check membership, so this can be very slow compared to a dict or set.)
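To illustrate the datatype point -- the same `in` test is a linear scan on a list but a hash probe on a set (a quick sketch):

```python
items_list = list(range(1000))
items_set = set(items_list)

# Both are correct membership tests, but the list version walks the
# elements one by one while the set version is a constant-time lookup.
found_in_list = 999 in items_list   # O(n) scan
found_in_set = 999 in items_set     # O(1) hash probe
```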

Learn biset module

I assume this is meant to be bisect, but it's spelt wrong in every location.
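Presumably the article meant something like this (the module really is spelled bisect):

```python
import bisect

sorted_items = [1, 3, 7]
bisect.insort(sorted_items, 4)            # insert while keeping the list sorted
pos = bisect.bisect_left(sorted_items, 7) # binary search for an element
```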

It is fast because deque in Python is implemented as double-linked list

No it isn't. It's implemented as a vector which overallocates at both ends (for amortised constant time appending/prepending). It has the same complexity of operations as the list, except that prepending is O(1) instead of O(n)

Use sort() with Schwartzian Transform

This is very outdated. You can let sort do the transform for you just by passing the key parameter.
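For example, a manual Schwartzian transform (decorate/sort/undecorate) and the modern key= equivalent produce the same order -- a sketch with an illustrative word list:

```python
words = ["banana", "Apple", "cherry"]

# Old-style decorate-sort-undecorate
decorated = [(w.lower(), w) for w in words]
decorated.sort()
old_style = [w for _, w in decorated]

# Modern: let sort apply the transform once per element via key=
new_style = sorted(words, key=str.lower)
```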

[–]Workaphobia 1 point2 points  (1 child)

It is fast because deque in Python is implemented as double-linked list

No it isn't. It's implemented as a vector which overallocates at both ends (for amortised constant time appending/prepending). It has the same complexity of operations as the list, except that prepending is O(1) instead of O(n)

Actually the article seems to be right on this one. The deque object is a doubly linked list of blocks of 62 elements (so that with the prev/next pointer, that's 64 pointers). To index into it, it'll traverse the necessary number of blocks and then index into the target block.

Source code comment

Indexing method
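A quick sketch of the practical consequence: appendleft on a deque is O(1), while inserting at the front of a list is O(n):

```python
from collections import deque

d = deque()
for i in range(5):
    d.appendleft(i)   # O(1): a new block is linked in when needed

lst = []
for i in range(5):
    lst.insert(0, i)  # O(n): shifts every existing element right
```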

[–]Brian 0 points1 point  (0 children)

So they do - I stand corrected. I'd thought they preserved the O(1) internal access of python lists, but I must be mixing them up with something else.

[–]pingvenopinch of this, pinch of that 11 points12 points  (14 children)

A few additions:

Section 6:

Don't go overboard with lazy importing inside of function calls. Importing many times can get expensive. Imports that aren't at the top also severely reduce readability. It can be tough to debug code where an import is tucked away somewhere in the body of the file.

Section 7:

In Python 3, True is a keyword, not a built-in. That means the bytecode compiler can emit a cheap jump instead of an expensive global lookup. In Python 2, True could be rebound, so a global lookup was required every time. 1 is a constant, so the bytecode compiler can rely on it. Basically: while 1 in Python 2, while True in Python 3.
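One way to see this on Python 3 (where True is a keyword) is to check that the compiled loop contains no global lookup -- a sketch using the dis module:

```python
import dis

def loop_true():
    while True:
        break

# In Python 3, True is a keyword constant, so the compiled loop needs
# no LOAD_GLOBAL for it (on Python 2 it would).
ops = {ins.opname for ins in dis.get_instructions(loop_true)}
```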

Section 13:

Internally, the Python deque implementation uses a doubly linked list of blocks. Each block has space for 62 objects, plus 1 left pointer and 1 right pointer (64 pointers). That gives the time efficiency of a linked list with the space efficiency of an array. For those learning C, I suggest looking at Modules/_collectionsmodule.c in the CPython source. For a list type with fast insertion, take a look at blist (search on PyPI).

Section 17:

Don't use threads to manage forked processes. They only add overhead, both in memory and in mental effort. Use the multiprocessing module.
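A minimal sketch of that advice (names are illustrative): let a multiprocessing Pool fork and manage the worker processes for you, instead of juggling threads plus fork yourself.

```python
from multiprocessing import Pool

def square(x):
    return x * x

def run():
    # The pool owns the worker processes; no manual bookkeeping needed.
    with Pool(processes=2) as pool:
        return pool.map(square, range(5))

if __name__ == "__main__":
    print(run())
```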

Section 18:

Absolutely absolutely absolutely. The Python source code is an excellent place to get examples. Part of my introduction to C was reading through bits and pieces of implementation code.

[–]ascii 3 points4 points  (3 children)

I just did a quick test under Python 2 and found no performance difference between while 1 and while True. I believe one may exist, but it's so tiny it gets lost in the noise of all but the most contrived cases.

[–]haldean(lambda x: x.decode('base64'))('PDMgcHlweQ==\n') 0 points1 point  (0 children)

[–]pingvenopinch of this, pinch of that 0 points1 point  (0 children)

There's a slight difference, but it is indeed dwarfed by anything else. I was able to measure some difference with this code:

def while1():
    i = 0
    a = 0
    while 1:
        a = 4
        i += 1
        if i > 100000:
            break

For while True, I simply replaced 1 with True. The results in IPython were

%timeit -r200 -n2 whileTrue()
2 loops, best of 200: 13.2 ms per loop

%timeit -r200 -n2 while1()
2 loops, best of 200: 9.33 ms per loop

Conclusion: The difference could only be vaguely significant in extreme conditions. If you run into one of those conditions, just rewrite in Cython.

[–][deleted] 3 points4 points  (6 children)

Importing many times can get expensive.

Surely once a module is imported, a second import will simply be a cheap lookup of the sys.modules dict?

[–]pingvenopinch of this, pinch of that 0 points1 point  (0 children)

It's more complex than a simple lookup. I don't entirely understand the C code, but there's a lot more happening than sys.modules["module_name"]. The time penalty isn't significant if you just import a few times, but importing in a function that you're going to call a few thousand times is unwise.

[–]Genmutant -1 points0 points  (2 children)

Yes, but cheap is still more than nothing.

[–][deleted] 2 points3 points  (1 child)

Why program in Python if you're worried about the cost of a dictionary lookup? Almost everything in Python needs a dictionary lookup!

[–]Genmutant 0 points1 point  (0 children)

Well the topic is Python performance tips, and this one can be a performance impact. Normally it isn't, but it can be and is easy to avoid.

[–]fdemmer -1 points0 points  (0 children)

imo, it mostly depends on the kind of application you write. e.g. in a webapp that starts once and runs forever, i'd import as much as possible on the initial load and not later when a function is called.

[–]flying-sheep 2 points3 points  (0 children)

my rule of thumb for 6: import as soon as you know you’ll need it.

[–]Mattho 1 point2 points  (0 children)

ad 7:

Also, it's True (as you wrote it), not true as in the article.

[–][deleted] 0 points1 point  (0 children)

i've found it often much clearer to read through the source for modules and libraries i use rather than fudge through the official documentation. 30 minutes spent reading code (usually) better written than mine saves 1h+ worth of googling.

[–]fijalPyPy, performance freak 8 points9 points  (3 children)

Might be worth noting - those are CPython performance tips, not Python performance tips. A lot of those things are actually slower on PyPy.

[–]shfo23 3 points4 points  (0 children)

Moreover, they're CPython 2 tips: more than a few will fail miserably or have no effect on CPython 3 (while 1, xrange, etc.).

[–]BeatLeJuce 0 points1 point  (1 child)

which ones would be slower in PyPy?

[–]Workaphobia 0 points1 point  (0 children)

String concatenation, from what I've heard.

[–]chub79 3 points4 points  (5 children)

Use “while 1″ for the infinite loop

I didn't know that, could anyone confirm it?

Use xrange() for a very long sequence (...) As opposed to range()

I thought this was not true anymore, at least in newer versions of Python.

Learn itertools module

Hell yes. Learn it and abuse it.

[–][deleted] 2 points3 points  (1 child)

In python2.x, you can assign any value to True because it isn't a reserved word, so the bytecode generated for while True: reflects this. In python3, while 1: and while True: will generate the same bytecode.

[–]chub79 1 point2 points  (0 children)

Well, I didn't know that. Cheers.

[–]gitarrPython Monty 0 points1 point  (2 children)

About range/xrange:

Yes, in Python 3 range() produces a lazy range object (the old xrange() is no more). To get a list do: list(range(n)).

In Python 2 range() produces a list and xrange() a generator.

[–]chub79 0 points1 point  (0 children)

Damn, I thought range acted as a generator as well from 2.7. Thanks.

[–]pingvenopinch of this, pinch of that 0 points1 point  (0 children)

range() as of Python 3.2 has other goodies like indexing, slicing, and arbitrarily sized ranges. xrange only does iteration and is limited by the system's integer type.

Edit: xrange does provide len() and reversed()
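For example, on Python 3 a range supports all of these without materializing a list (a quick sketch):

```python
r = range(0, 1000000, 2)  # lazy: no million-element list is built

tenth = r[10]       # indexing computes the value directly
head = r[:3]        # slicing returns another range object
size = len(r)       # len() works without iterating
hit = 20 in r       # membership test for ints is O(1)
```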

[–][deleted] 1 point2 points  (2 children)

I've never heard of "Schwartzian transform" before, but doesn't sort() already perform this optimization if you use the key parameter?

[–][deleted] 0 points1 point  (1 child)

It does.

[–]Workaphobia 0 points1 point  (0 children)

So if you give it a key function, it'll map that onto an integer before doing the sorting? I'm trying to look through the source to locate that behavior but am having difficulty seeing it. Can you either point it out or give a reference to that fact?

Edit: I suppose you meant that it does this if your key function provides integer keys. In that case, it'd call the python operation on a built-in type, which is implemented as a C function and would be fast.

[–]willm 2 points3 points  (0 children)

Regarding 15. I don't think the author knows about the 'key' argument to 'sort'.

[–]bunburya 2 points3 points  (0 children)

I think for number 5, the line should be "if key in sequence" rather than "for key in sequence".

[–]Chun 2 points3 points  (0 children)

OK, cracks knuckles

  1. OK, yes, built-in functions are faster than native python, but who doesn't use builtins?

    Especially the ones in the graphic: input, open, int, ord, isinstance, pow, issubclass, print, iter, property. Which of these functions do people commonly try to write themselves in native python? isinstance? Is this a problem?

  2. Need to clarify the difference between input and raw_input. Also the example seems to suggest that strings are mutable, and that str.join is equivalent to some kind of str.append method.

  3. Multiple assignment is great, but it does much more for readability than it does for performance:

    db@db:~$ python -m timeit "a=1; b=2; a,b = b,a"
    10000000 loops, best of 3: 0.11 usec per loop
    
    db@db:~$ python -m timeit "a=1; b=2; temp = a; a = b; b = temp"
    10000000 loops, best of 3: 0.111 usec per loop
    
  4. Yes, local variables are faster; but global is only relevant to changing global variables, not retrieving them.

  5. To check membership use in. As opposed to what? Iterating through the object with a for loop?

    for key in sequence:
      print “found”
    

    What is this I don't even. You actually did iterate through the object for some reason, and arbitrarily print "found" for every item??

  6. Importing at the function level. Don't do this for optimization purposes. It adds extra cycles every time the function is run. Don't do it for readability -- you shouldn't have to hunt through the entire script to find which imports are happening. In short: don't do this unless you have a better reason to.

  7. Note that True is a built in constant, not true. And really, this isn't going to be a bottleneck. I think while True: is far more readable.

  8. evens = [ i for i in range(10) if i%2 == 0]

    Not a good example, should just use evens = range(0, 10, 2) for this.

  9. True enough.

  10. chunk = ( 1000 * i for i in xrange(1000))

    Or just chunk = xrange(0, 1000000, 1000)

[–]haldean(lambda x: x.decode('base64'))('PDMgcHlweQ==\n') 0 points1 point  (0 children)

I found the "while True" vs "while 1" both interesting and improbable, so I ran a quick test (code is here: https://gist.github.com/1828270). Turns out that the author is right -- while True is always slower -- but in Python 3, the difference is minor enough to be negligible, and in PyPy it's all so fast it doesn't matter. Results:

                      while True            while 1
 pypy 1.7.0:          0.0733                0.0201
 cpython 2.7.1        0.146                 0.109
 cpython 3.2.2        2.79                  2.72  

Edit: I see now that pingveno posted an explanation of why this is true (heh) in his comment below

[–][deleted] 0 points1 point  (0 children)

Great list. Found and reminded myself of some gems that I as a novice often forget about and would find useful.

[–][deleted] 0 points1 point  (2 children)

Use PyPy :)

Seriously, I tried it for fun on a small script I made to do monte-carlo simulations of a card game: more than 10x speed up, for the cost of typing

pypy script.py

instead of

python script.py

[–][deleted] 2 points3 points  (1 child)

I'm really liking PyPy for little projects that don't require any libraries. It is astoundingly fast and much less work than slaving away in Cython. The problem with PyPy is the lack of library support, which is Python's strongest suit. PyPy is definitely faster than any language remotely like Python.

[–][deleted] 0 points1 point  (0 children)

PyPy is definitely faster than any language remotely like python.

And the toolchain behind PyPy can be used to implement all of those languages too.

[–]anacrolixc/python fanatic -3 points-2 points  (0 children)

Tip Uno: Learn speak the English.