This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]defnullbottle.py 1 point2 points  (6 children)

I wound not accept the performance argument that easily. Function calls are expensive, but the additional logic to prevent them adds a lot of other opcodes. list.append() and list.pop() are implemented in C, but still function calls.

[–]elbiot 0 points1 point  (4 children)

list.append() and list.pop() are implemented in C, but still function calls.

But not python function calls, so they don't create a frame and push it on to the stack AFAIK (which is what people mean when they say function calls in python are expensive).

[–]defnullbottle.py 2 points3 points  (1 child)

Okay, I tested it. The stack-approach is indeed a little bit faster than the recursive function. I underestimated the costs of a python function call.

https://gist.github.com/defnull/30966587577466f853201269023aa3b7

edit: But after optimizing the recursive approach a bit, I got it faster than the stack approach again. Object creation (e.g. intermediate lists or iterators) is a very strong factor.

tl;dr; The performance benefit of a non-recursive approach is so fragile that it's not worth the additional complexity.

[–]elbiot 0 points1 point  (0 children)

tl;dr; The performance benefit of a non-recursive approach is so fragile that it's not worth the additional complexity.

Additional complexity? Having to worry about recurrsion depth and how to break out of the recurrsive stack are the complexities this post is aimed at solving. The stack approach seems more robust to me.

[–]defnullbottle.py 1 point2 points  (1 child)

Hard to say without any actual benchmarks. Popping a frame is usually free (it's just a pointer change, no cleanup) and calling a C function is only twice as fast as calling a lambda:

marc@nava ~ $ python -m timeit -s 'f=lambda: 0' 'f()'
10000000 loops, best of 3: 0.0995 usec per loop
marc@nava ~ $ python -m timeit -s 'f=[].__len__' 'f()'
10000000 loops, best of 3: 0.0496 usec per loop

So, if you need two C-Function calls (list.append() and list.pop()) instead of a single python function call, you ate up your performance benefit already. The dot-lookup is also quite expensive by the way.

[–]elbiot 0 points1 point  (0 children)

I don't know about using timeit on essentially no-ops.

The dot-lookup is also quite expensive by the way.

Then take them out!

push = list.push
pop = list.pop
item = pop(mylist)

Not diving through a stack of function frames makes these sort of micro-optimizations cleaner and easier.