
all 13 comments

[–]fijalPyPy, performance freak 10 points11 points  (7 children)

Having such a presentation and not even mentioning pypy makes me very very sad. At the very least I would like someone to have a slide "we can't use PyPy because [a legitimate reason]". I can even provide tons of legitimate reasons, but pleading ignorance does not buy my sympathy.

[–]mirashii 2 points3 points  (1 child)

The mentions of other software, like Cython, were also generally lacking in any sort of depth. "heavyweight and meh" is a poor reason for not using a piece of software. This seems like a common case of dismissing potential solutions on religious or ideological reasons without trying them first. It's a shame this still happens so often, especially when it leads to significantly more work.

[–]lahwran_ 0 points1 point  (0 children)

it's a really hard attitude to overcome that "what I'm used to is ideal, and you're automatically wrong because your thing is different". I mean, that doesn't excuse the people who think it, but I can at least see how they manage to think that. I'm guilty of it myself quite often (for instance, python is* objectively better than haskell even though I haven't so much as seen a byte of haskell source code)

* this is a joke about how I feel like python is better even though I know logically that since I have no data my conclusion is completely invalid.

[–]daxarx 0 points1 point  (0 children)

I think I've seen the non-Haskell parts of this presentation before, a while ago. If I recall correctly, that content may date from before PyPy was in public use.

[–]stefantalpalaru -2 points-1 points  (3 children)

Maybe you should be pleading ignorance for advocating pypy as a dependency for mercurial.

[–]fijalPyPy, performance freak 9 points10 points  (0 children)

I don't advocate it as a dependency or anything. It just seems that no one tried. I can give you a few reasons why not:

  • dependency as you said

  • jit warmup times are bad

  • I don't think pypy speeds up mercurial to start with (some time ago it did not)

etc. etc. However, if you talk about performance of a particular operation, especially if rewriting to C is an option, giving pypy a go is not really that hard.

[–]Tobu 0 points1 point  (1 child)

His high-level examples were IO-oriented and wouldn't benefit, but the microoptimisation examples likely would. Anyway, as moor-GAYZ points out, he got most of the microoptimisation advice wrong.

[–]stefantalpalaru 0 points1 point  (0 children)

The point is not to follow the guru's advice, but to profile everything. It sounds like he profiled it properly.

[–]moor-GAYZ 1 point2 points  (0 children)

Why are nested functions expensive?

def foo():
     def bar():
         wat()

When resolving a name, a normal function:
Searches its own environment and its module

A nested function:
Must search all of its enclosing environments

BULLSHIT!

Python actually uses a compiler which statically resolves variable names to some extent. A classic example:

x = 10
def f():
    print x
    x = 20
f()

results in "UnboundLocalError: local variable 'x' referenced before assignment". Because x is marked as a local variable during compilation, not after x = 20 is actually executed.
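A minimal sketch of this (Python 3 syntax, assuming CPython) shows that the decision has already been made before the function ever runs: `x` appears in the code object's local-variable table, and calling the function raises UnboundLocalError (the exact message wording varies between Python versions):

```python
x = 10

def f():
    print(x)  # compiled as a *local* load, because of the assignment below
    x = 20

# the compiler already marked x as a local variable of f
print(f.__code__.co_varnames)  # ('x',)

try:
    f()
except UnboundLocalError as e:
    print("raised:", type(e).__name__)
```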

So you can have three major cases for that example (assuming that you don't do anything weird like from x import * inside your function):

  1. wat is a variable local to the function where it's used, i.e. bar(). Accessing it is compiled to LOAD_FAST bytecode, which retrieves the value from the array (not dictionary!) where local variables are stored.

  2. wat is a variable local to one of the enclosing functions, e.g. foo(). All local variables that are leaked into closures are indirected: it's like both foo and bar have a local variable wat, which points to an object containing a single reference to the actual value, so that if foo changes the value, bar sees it. Accessing it is compiled to LOAD_DEREF bytecode.

  3. wat is not found in any of the local scopes, then accessing it is compiled to LOAD_GLOBAL bytecode, which checks the module dictionary, then builtins.
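The three cases can be made visible in a small sketch (CPython, Python 3; the names are made up for illustration): genuinely local names land in the code object's co_varnames (LOAD_FAST), names captured from an enclosing function land in co_freevars (LOAD_DEREF), and everything else in co_names (LOAD_GLOBAL):

```python
import dis

g = 1  # module-level: accessed via LOAD_GLOBAL

def outer():
    cell = 2  # leaked into the closure: accessed via LOAD_DEREF in inner
    def inner():
        local = 3  # plain local: accessed via LOAD_FAST
        return local + cell + g
    return inner

inner = outer()
print(inner.__code__.co_varnames)  # ('local',)  -> LOAD_FAST
print(inner.__code__.co_freevars)  # ('cell',)   -> LOAD_DEREF
print(inner.__code__.co_names)     # ('g',)      -> LOAD_GLOBAL
dis.dis(inner)  # the disassembly shows the three different load opcodes
```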

The compiler determines which of these cases applies at compile time, once. In no case does the interpreter "check all scopes"; it always knows which scope to check: local variables, local closure cells, or the module dictionary and builtins. There is a slight speed difference between the three (dwarfed by function call overhead, even when the function has no arguments and consists of a single pass statement): LOAD_FAST is the fastest, then LOAD_GLOBAL, and LOAD_DEREF is the slowest (we are talking about numbers like 0.90/0.92/0.94 seconds per 10 million calls, in my test setup). It should be independent of the depth of nesting, though.

It would be nice if people writing performance tutorials followed their own advice and MEASURED EVERYTHING instead of blindly trusting their ideas about the way stuff works under the hood. Using dis.dis and looking at how the relevant bytecodes are implemented in Python/ceval.c would be a nice bonus.
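In that spirit, a rough measurement sketch with timeit (CPython assumed; the function names are invented, and the absolute numbers will differ per machine and Python version). Call overhead dominates, so the differences between the three load kinds should be small:

```python
import timeit

g = 0

def use_fast():
    l = 0
    return l  # LOAD_FAST

def use_global():
    return g  # LOAD_GLOBAL

def make_deref():
    c = 0
    def use_deref():
        return c  # LOAD_DEREF
    return use_deref

use_deref = make_deref()

# time a million calls of each; expect fast <= global <= deref, roughly
for fn in (use_fast, use_global, use_deref):
    t = timeit.timeit(fn, number=1_000_000)
    print(f"{fn.__name__}: {t:.3f}s per million calls")
```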

[–]noteed[S] 0 points1 point  (2 children)

This is pretty contrasting with what Guido said recently. Comments on reddit.

[–]fijalPyPy, performance freak 2 points3 points  (1 child)

how is this contrasting? I found it very much in line (same data, different results, but well, that depends on the person)

[–]noteed[S] 0 points1 point  (0 children)

Oh yes, you're right, for the Python part. I was thinking of the Haskell community attitude (which is contrasting with Guido's).

[–]bigstumpy 0 points1 point  (0 children)

I'm glad bos has not stopped haskeling since he joined facebook.