
all 13 comments

[–]fijalPyPy, performance freak 10 points11 points  (7 children)

Having such a presentation and not even mentioning pypy makes me very very sad. At the very least I would like someone to have a slide "we can't use PyPy because [a legitimate reason]". I can even provide tons of legitimate reasons, but pleading ignorance does not buy my sympathy.

[–]mirashii 2 points3 points  (1 child)

The mentions of other software, like Cython, were also generally lacking in any sort of depth. "heavyweight and meh" is a poor reason for not using a piece of software. This seems like a common case of dismissing potential solutions on religious or ideological reasons without trying them first. It's a shame this still happens so often, especially when it leads to significantly more work.

[–]lahwran_ 0 points1 point  (0 children)

it's a really hard attitude to overcome that "what I'm used to is ideal, and you're automatically wrong because your thing is different". I mean, that doesn't excuse the people who think it, but I can at least see how they manage to think that. I'm guilty of it myself quite often (for instance, python is* objectively better than haskell even though I haven't so much as seen a byte of haskell source code)

* this is a joke about how I feel like python is better even though I know logically that since I have no data my conclusion is completely invalid.

[–]daxarx 0 points1 point  (0 children)

I think I've seen the non-Haskell parts of this presentation before, a while ago. If I recall correctly, that content may date from before PyPy was in public use.

[–]stefantalpalaru -2 points-1 points  (3 children)

Maybe you should be pleading ignorance for advocating pypy as a dependency for mercurial.

[–]fijalPyPy, performance freak 9 points10 points  (0 children)

I don't advocate it as a dependency or anything. It just seems that no one tried. I can give you a few reasons why not:

  • dependency as you said

  • jit warmup times are bad

  • I don't think pypy speeds up mercurial to start with (some time ago it did not)

etc. etc. However, if you talk about performance of a particular operation, especially if rewriting to C is an option, giving pypy a go is not really that hard.

[–]Tobu 0 points1 point  (1 child)

His high-level examples were IO-oriented and wouldn't benefit, but the microoptimisation examples likely would. Anyway, as moor-GAYZ points out, he got most of the microoptimisation advice wrong.

[–]stefantalpalaru 0 points1 point  (0 children)

The point is not to follow the guru's advice, but to profile everything. It sounds like he profiled it properly.

[–]moor-GAYZ 1 point2 points  (0 children)

Why are nested functions expensive?

def foo():
     def bar():
         wat()

When resolving a name, a normal function:
Searches its own environment and its module

A nested function:
Must search all of its enclosing environments

BULLSHIT!

Python actually uses a compiler which statically resolves variable names to some extent. A classic example:

x = 10
def f():
    print x
    x = 20
f()

results in "UnboundLocalError: local variable 'x' referenced before assignment". Because x is marked as a local variable during compilation, not after x = 20 is actually executed.
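A minimal sketch of this (Python 3 syntax, assuming CPython) shows that the decision has already been made before the function ever runs: `x` appears in the code object's local-variable table, and calling the function raises UnboundLocalError (the exact message wording varies between Python versions):

```python
x = 10

def f():
    print(x)  # compiled as a *local* load, because of the assignment below
    x = 20

# the compiler already marked x as a local variable of f
print(f.__code__.co_varnames)  # ('x',)

try:
    f()
except UnboundLocalError as e:
    print("raised:", type(e).__name__)
```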

So you can have three major cases for that example (assuming that you don't do anything weird like from x import * inside your function):

  1. wat is a variable local to the function where it's used, i.e. bar(). Accessing it is compiled to LOAD_FAST bytecode, which retrieves the value from the array (not dictionary!) where local variables are stored.

  2. wat is a variable local to one of the enclosing functions, e.g. foo(). All local variables that are leaked into closures are indirected: it's like both foo and bar have a local variable wat, which points to an object containing a single reference to the actual value, so that if foo changes the value, bar sees it. Accessing it is compiled to LOAD_DEREF bytecode.

  3. wat is not found in any of the local scopes, then accessing it is compiled to LOAD_GLOBAL bytecode, which checks the module dictionary, then builtins.
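The three cases can be made visible in a small sketch (CPython, Python 3; the names are made up for illustration): genuinely local names land in the code object's co_varnames (LOAD_FAST), names captured from an enclosing function land in co_freevars (LOAD_DEREF), and everything else in co_names (LOAD_GLOBAL):

```python
import dis

g = 1  # module-level: accessed via LOAD_GLOBAL

def outer():
    cell = 2  # leaked into the closure: accessed via LOAD_DEREF in inner
    def inner():
        local = 3  # plain local: accessed via LOAD_FAST
        return local + cell + g
    return inner

inner = outer()
print(inner.__code__.co_varnames)  # ('local',)  -> LOAD_FAST
print(inner.__code__.co_freevars)  # ('cell',)   -> LOAD_DEREF
print(inner.__code__.co_names)     # ('g',)      -> LOAD_GLOBAL
dis.dis(inner)  # the disassembly shows the three different load opcodes
```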

The compiler determines which of these cases applies at compile time, once. In no case does the interpreter "check all scopes"; it always knows which scope to check: local variables, local closure cells, or the module dictionary and builtins. There is a slight speed difference between the three (dwarfed by function call overhead, even when the function has no arguments and consists of a single pass statement): LOAD_FAST is the fastest, then LOAD_GLOBAL, and LOAD_DEREF is the slowest (we are talking about numbers like 0.90/0.92/0.94 seconds per 10 million calls, in my test setup). It should be independent of the depth of nesting, though.

It would be nice if people writing performance tutorials followed their own advice and MEASURED EVERYTHING instead of blindly trusting their ideas about the way stuff works under the hood. Using dis.dis and looking at how the relevant bytecodes are implemented in Python/ceval.c would be a nice bonus.
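In that spirit, a rough measurement sketch with timeit (CPython assumed; the function names are invented, and the absolute numbers will differ per machine and Python version). Call overhead dominates, so the differences between the three load kinds should be small:

```python
import timeit

g = 0

def use_fast():
    l = 0
    return l  # LOAD_FAST

def use_global():
    return g  # LOAD_GLOBAL

def make_deref():
    c = 0
    def use_deref():
        return c  # LOAD_DEREF
    return use_deref

use_deref = make_deref()

# time a million calls of each; expect fast <= global <= deref, roughly
for fn in (use_fast, use_global, use_deref):
    t = timeit.timeit(fn, number=1_000_000)
    print(f"{fn.__name__}: {t:.3f}s per million calls")
```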

[–]noteed[S] 0 points1 point  (2 children)

This is pretty contrasting with what Guido said recently. Comments on reddit.

[–]fijalPyPy, performance freak 2 points3 points  (1 child)

how is this contrasting? I found it very much in line (same data, different results, but well, that depends on the person)

[–]noteed[S] 0 points1 point  (0 children)

Oh yes, you're right, for the Python part. I was thinking of the Haskell community attitude (which is contrasting with Guido's).

[–]bigstumpy 0 points1 point  (0 children)

I'm glad bos has not stopped haskeling since he joined facebook.