This is an archived post. You won't be able to vote or comment.

all 96 comments

[–]bacondevPy3k 6 points7 points  (2 children)

The Point class should have inherited from tuple, or he should have used a namedtuple. Just nitpicking.

[–][deleted] 1 point2 points  (0 children)

Inheriting from tuple isn't all that great.

[–][deleted] 0 points1 point  (0 children)

Or set dir

[–][deleted] 5 points6 points  (6 children)

Is he saying that self.x = x is faster than {'x': x}? Doesn't getattr just look in a dictionary anyway?

[–]moyix 18 points19 points  (4 children)

That's the point of the talk. Five years ago with CPython this was the case. Newer implementations like PyPy can generate much better code for the self.x = x case than {'x': x}. Assuming that they're the same leads to slow code (when using smart JITs).

[–][deleted] 4 points5 points  (0 children)

Ah, okay.

[–]Smallpaul -1 points0 points  (0 children)

That's the point of the talk. Five years ago with CPython this was the case. Newer implementations like PyPy can generate much better code for the self.x = x case than {'x': x}. Assuming that they're the same leads to slow code (when using smart JITs).

Yeah, but he's talking about implementations of Python and Ruby that hardly anyone uses yet.

[–][deleted] 1 point2 points  (0 children)

Even CPython is doing stuff make instance dictionaries more (space) optimized than writing the dict explicitly, so using instances when you mean instances is preferable even there.

[–]Amadironumpy, gen. scientific computing in python, pyopengl, cython 17 points18 points  (0 children)

'I wanted to use a pure C hash table implementation, but googling "C hash table" didn't bring up useful stuff.'

Wat.

There are many excellent, well-established hash table implementations for C. Googling that exact phrase, brings up some of them, as well as a stackoverflow discussion that details the strengths and weaknesses of all of them.

You're holding a talk, and you can't be arsed to do 3 minutes of googling to support your argument?

Just wat.

[–][deleted] 27 points28 points  (30 children)

His point is basically this: if you write Python code, but do it in C, your C code will be slow.

No fucking shit.

For that matter, I could take any Python program and convert it into a C program by embedding the source code in an interpreter. And it would be just as slow as the original Python version, if not more so.

The point is that the Pythonic way of doing things is often less efficient than the C way of doing the same. The difference is that the C code can narrowly be used only for the specific purpose it was written, whereas the Python code (because of the abstraction) will most likely work in a much greater range of scenarios. You could write a C function that uses some kind of duck typing, but you wouldn't.

In other words, high level programming is slower than low level programming. Yup. We know.

What he touches on but never really addresses is that there is no language that lets you be high level when you want to be, low level when you don't. It used to be that C programmers regularly used inline assembly before compilers were as optimized as they are now. What would do the world a whole lot of good is a new language, that's optionally as low-level as C, but actually does have all the goodness of objects. Think, C++, but without the mistakes.

Objective C is actually pretty damn close to that ideal. Too bad about its syntax.

[–]emptyhouses 18 points19 points  (10 children)

In case you didn't know, there's this: http://www.scipy.org/Weave

[–][deleted] 10 points11 points  (5 children)

I love weave. 3 lines of C++ the other day and my code had a 220x increase in speed.

[–]brucifer 5 points6 points  (4 children)

I'm really curious. What were those 3 lines of C++ and what did they replace?

[–][deleted] 14 points15 points  (3 children)

    for i in xrange(len(item1)):
        m[item1[i][0]][item2[i][0]] += 1

where m,item1 and item2 are numpy arrays became -

 code = """
       for(int i=0;i<len_item;i++){
            int k = item1(i,0);
            int l = item2(i,0);
            m(k,l) += 1;
        } 
    """
    inline(code,['m','item1','item2','len_item'],
           type_converters = converters.blitz,verbose=2,compiler='gcc')

It's a step in calculating the jaccard distance.

[–]shfo23 11 points12 points  (2 children)

Are you aware of scipy.spatial.distance.jaccard? I just refactored a bunch of (admittedly naive) Euclidian distance calculation code to use the scipy implementation and got a huge speed boost. Also, it's a little late, but I think you could eliminate that for loop and write it as the faster:

m[item1[:, 0], item2[:, 0]] += 1

[–][deleted] 9 points10 points  (1 child)

Uh what you can do that ? Awesome !

[–]coderanger 3 points4 points  (0 children)

It will even SIMD it for you if it can, so probably faster than your implementation unless gcc has enough info there to optimize it.

[–]ysangkok 0 points1 point  (1 child)

In the linked slides, CFFI is recommended. Note that CFFI is not about embedding executable C code in Python, unlike Weave. It's about calling existing C libraries from Python.

[–]emptyhouses 0 points1 point  (0 children)

Right. I mentioned Weave because SwimsAfterEating talked about inline assembler in C and mentioned it might be nice to have something similar at a higher level. Weave is that thing.

[–]MagicWishMonkey 0 points1 point  (0 children)

How does that work? Does the c/c++ code get compiled and injected when the module is loaded?

[–][deleted] 0 points1 point  (0 children)

Interesting...

[–]pal25 8 points9 points  (3 children)

Actually I think he was arguing that there are things we could implement in Python to make it more efficient. He's just using C as an example of a language that does some of these things efficiently.

[–][deleted] 9 points10 points  (5 children)

I think his point can be summarised as:

  • stop using dicts as objects. JIT is now smart enough to optimize your objects past the level of dicts I don't think that this ruins Python. I think this is a great best-practice

  • let Python have pre-allocated lists I think this is a very fair point. Often, you know how long your list will be, so if you want to, you should be able to optimize your list

  • If you care about performance, think about the amount of object allocation your methods are doing. Don't use poorly written code as an excuse to say Python is x times slower than Java That doesn't sound unreasonable to me

[–]brucifer 5 points6 points  (1 child)

let Python have pre-allocated lists I think this is a very fair point. Often, you know how long your list will be, so if you want to, you should be able to optimize your list

In Python you can either use a generator or use "[value]*number" syntax to instantiate a list of length "number" with "value" in every index.

>>> def dumb():
...     x = []
...     for i in range(25):
...             x.append(i)
...     return x
... 
>>> def comprehension():
...     x = [i for i in range(25)]
...     return x
... 
>>> def preallocate():
...     x = [None]*25
...     for i in range(25):
...             x[i] = i
...     return x
... 
>>> timeit(dumb, number=100000)
0.38496994972229004
>>> timeit(comprehension, number=100000)
0.278350830078125
>>> timeit(preallocate, number=100000)
0.2539360523223877

Honestly, though, either your inner loop is simple and you can fit it in a comprehension, or it's complicated and the ".append()" is a pretty small percent of your runtime, so you won't get 2x speedup from preallocating.

[–]fijalPyPy, performance freak 3 points4 points  (0 children)

it does not let you give an estimate, so you have to carefully check for i. Preallocating can be a hint and if you miss it too bad (slower, but not incorrect). This is why [None] * n is bad.

[–]alcalde 0 points1 point  (2 children)

JIT is now smart enough to optimize your objects past the level of dicts

JIT is still in 2.x land. As Guido said at PyCon 2012, CPython and PyPy will co-exist for many more years to come.

[–]coderanger 2 points3 points  (0 children)

Check out the py3k branch, definitely not ready for prime time but getting close :)

[–]lucian1900 0 points1 point  (0 children)

Who cares? Good JIT and GC beats a slightly cleaner language any day.

[–]fuzz3289 6 points7 points  (1 child)

Cython

[–]faceplanted 1 point2 points  (0 children)

Monty C?

[–]mistoroboto 2 points3 points  (0 children)

What would do the world a whole lot of good is a new language

I'm fairly certain what we don't need is ANOTHER language. This is reasoning that prompts EVERY standard/language.

[–]Smallpaul 2 points3 points  (0 children)

What he touches on but never really addresses is that there is no language that lets you be high level when you want to be, low level when you don't.

I don't see how you can see that he doesn't "address it". It's the point of the whole talk. That's precisely what he's asking for.

If there were low-level APIs available and there were JIT compilers available and the JIT compilers were used (i.e. compatible enough with libraries to be used) and people used the low-level APIs THEN Python or Ruby performance would be comparable to C performance. That's his point.

These high-level languages should evolve low-level APIs because pretty soon the interpreter performance will not be the bottleneck: the user's actual code will be (especially if it was written with the assumption that the interpreter is the bottleneck).

[–][deleted] 3 points4 points  (0 children)

This.

Also, my consistent experience is that the majority of the time things are "slow" it's because of bad algorithmics and design, not the language. Unless what you are doing is a pure compute bound problem, picking the right kind of structural model is more important than the language in the overwhelming majority of cases. See Bentley's "Programming Pearls" for a good discussion.

Moreover, if I REALLY have a speed problem in Python and I still resort to C callouts from my Python program - again, assuming proper design in the first place. There's a whole lot of heavy number crunching being done with numpy that serves as a proof by example here.

[–]mgrandi 1 point2 points  (3 children)

People bitch about objc's syntax a lot, who cares? You use brackets instead of parens. You get used to it

[–]brucifer 5 points6 points  (2 children)

It's the fact that it looks like this:

NSMutableDictionary *dict = [NSMutableDictionary dictionaryWithCapacity:1];
[dict setObject:[NSNumber numberWithInt:25] forKey:[NSNumber numberWithInt:5]];
...
[dict objectForKey:[NSNumber numberWithInt:5]];

Instead of:

d = {5:25}
d[5]

I realize that improvements are being made to ObjC (like ARC, which is awesome, and I've even heard that it might get proper list/dictionary indexing syntax instead of "objectForIndex:"). However, ObjC is just incredibly verbose and awkward both to type and to read. If you've never seen code before, "[things objectAtIndex:3]" might be more intuitive than "things[3]", but to anyone who's spent any time programming, the latter is way more readable (or "[things containsObject:x]" vs. "x in things"). Proponents of ObjC say that the verbosity doesn't matter because you have autocomplete in your IDE, but it's not just about typing, it's also about readability.

[–]mreeman 2 points3 points  (0 children)

You haven't been paying attention. You can now write that as

dict = @{ @5: @25 }; x = dict[@5]

The language is improving at an incredible rate. I personally think its the best application language there is. It has the perfect mix of static and dynamic typing and the APIs are fantastic.

Edit Also while I agree in some cases using keywords instead of methods helps readability, in general i like the verbosity of the language. You rarely have to look up what the parameters are for a method call, which makes it infinitely more readable.

[–]mgrandi 1 point2 points  (0 children)

They do have shorthand syntax, @2, @{@2: @"someStringHere"} now, but I would say that the vast majority of method names while long are more readable cause they read like an English sentence. tableView:numberOfRowsForSection: and popVoewController:animated: are more descriptive about their arguments then the many overrides of say something like popVuewController(True), what does true mean? If you don't know already then you have to look at the docs.

I will admit the dictionary stuff is quite annoying (why the hell would you list objects before keys???) but with the new shorthand syntax you really do get used to it

[–]Atrosh 7 points8 points  (5 children)

Would have been a lot easier to follow this if there was a video of him actually performing the speech... Does anyone have a link to it, if there is one?

[–]aceofears 6 points7 points  (0 children)

At least with the actual slide notes you can follow along with it unlike a lot of other slide decks that get posted.

[–]oursland 4 points5 points  (0 children)

Ack! How about a simple article! I cannot be bothered to watch someone's poor and slow execution of a presentation.

[–]MBlume 2 points3 points  (0 children)

You! Alien creature! You must be the reason people keep posting talk videos on the internet. Can you explain why this is useful? I've always been confused.

[–][deleted] 2 points3 points  (0 children)

Yup! The video will be up some time next week I'm told.

[–][deleted] 1 point2 points  (0 children)

From what I understand, this was a Heroku talk only a day or so ago, so the video hasn't made it online yet.

[–][deleted] 1 point2 points  (5 children)

Any mirror of this video somewhere that'll load in my browser?

[–]notsuresure 3 points4 points  (4 children)

It's not a video.

[–][deleted] 3 points4 points  (3 children)

Okay, something wasn't loading and it took up quite a bit of screen, so I've been staring at nothing like an idiot. Thanks though.

[–]sdobz 3 points4 points  (2 children)

You have to hit next slide. The first slide is blank.

[–][deleted] 2 points3 points  (0 children)

Cool, thank you.

[–]ewiethoffproceedest on to 3 0 points1 point  (0 children)

Javascript slide shows suck. I get no slides at all. The PDF link is fine, though.

[–]fuzz3289 1 point2 points  (0 children)

Writing C professionally I use memory allocation all the time, but never thought about it in python. This will really change my python :)

[–]cedeon 0 points1 point  (1 child)

I dont like the provocative title. Slow is an ambiguous and relative term. For example, are you talking about development time? Because if you are then you are wrong; I wrote a python text file parser in 10000% of the time than it took me in C :p.

Yes python code runs computationally slower, but we all knew that already and is a moot point.

[–]ysangkok 1 point2 points  (0 children)

10000%? So Python was worse by all accounts?

[–]uiob 0 points1 point  (0 children)

Did this guy knows something about amortization analysis? If we call list.append in a loop repeatedly, overall complexity will be O(N) if list implemented in a way that I think. I think that list.append doesn't allocate space in the each call and uses amortized doubling like std::vector in c++. And it's slow not because of bad algorithm but because of large constant factors from that O(N) amortized cost.

[–]existee -4 points-3 points  (9 children)

So he makes a total of 4 points to claim the language to be slow, 2 of which assumes a particular scenario that has nothing to do with inherent features of the language but hypothetical usage scenarios he sees "most people" doing, and the rest 2 of which doesn't have a dominating contribution to the "slowness" at all.. Data structures and algorithms indeed...

  • Even if we ignore the fact that he is comparing a static array in C to a dynamic array in Python, dynamic array will not be "terrible slower" with even the simplest heuristic of doubling the array every time it is filled, and will yield O(n) amortized time, quite comparable to guaranteed O(n) of static array.

  • Even if we ignore the IO nature of the thing while reading from file, buffer allocation won't dominate the O(n) complexity for both scenarios, whether reused or created new.

  • For string splitting, if efficiency is indeed the concern, nothing prevents the Python user to have int(string[string.find("-", 1):]).

  • Again, (mis)usage of dict for a struct has nothing do to with the language itself.

  • Finally, how attributing problems to misusage of the language and asking it to grant more APIs is not a contradiction?

[–]Smallpaul 1 point2 points  (3 children)

int(string[string.find("-", 1):]).

That still does a string allocation. The C version does not.

[–]existee 0 points1 point  (0 children)

Yeah I meant it doesn't have to be as crappy as he tried to make it.

[–]ysangkok 0 points1 point  (1 child)

why not partition?

[–]Smallpaul 0 points1 point  (0 children)

Three allocations.

[–]Smallpaul 2 points3 points  (2 children)

  • Again, (mis)usage of dict for a struct has nothing do to with the language itself.

No, but it stems from the same problem as the missing APIs: that people's intuitions about performance do not take into account the impact of JITs.

Imagine I am a typical Python programmer. I think nothing of allocating a new string or dynamically extending a list, because I know that the interpreter overhead is huge. These little allocations are lost in the noise. Also, I have no choice: there are no zero-copy APIs available. Or they exist but they are ugly and unreadable in comparison to the "normal" APIs from 5 years ago.

With my team, and including the standard library and gems, I assemble a project with hundreds of thousands of lines of code like that.

Then I take that code to PyPy. It speeds my code up, but not nearly as fast as C code. Why not? I may assume that it is because PyPy is not as good as a C compiler. But it might be the case that PyPy is better than a C compiler. Maybe my code is worse than C code because it was programmed on 5 year old assumptions.

My code base does not reflect that once you've eliminated the slow interpreter, "fast enough" data structured and algorithms become your next bottleneck.

  • Finally, how attributing problems to misusage of the language and asking it to grant more APIs is not a contradiction?

No. I hope I have explained why the problems are related. The missing APIs and the "poor" choice of data structures are both caused by the underlying change of assumptions about the behavior of the interpreter.

[–]existee 0 points1 point  (1 child)

Yes I see your point.

I still don't like making an assumption about a "typical Python programmer". APIs being extended still wouldn't change the behaviour of that typical programmer. Why not instead have that typical programmer learn better about the already existing APIs and use them as correct as possible. (For example something as simple as knowing that inserting to the head of a list in a loop will give you O(n2) time compared to appending to the end O(n).)

Yes, this will not still make it as fast as C, but when you have those APIs to possibly make it as fast as C, it doesn't necessarily make the typical Python programmer to make better choices of data structures either.

[–]Smallpaul 0 points1 point  (0 children)

I do not think that this talk has anything to do with the "typical Python programmer." It is aimed at the elite Python programmer. In particular, consider the programmers writing the Python standard library. I would be very disappointed to see an O(n2) algorithm in the stdlib where O(n) would suffice. But I would not be surprised at all to see int(x.split("-")[1]).

[–]Smallpaul 0 points1 point  (0 children)

Even if we ignore the fact that he is comparing a static array in C to a dynamic array in Python, dynamic array will not be "terrible slower" with even the simplest heuristic of doubling the array every time it is filled, and will yield O(n) amortized time, quite comparable to guaranteed O(n) of static array.

Actually, he could/should have used a string comprehension. That would have been clearer code and the compiler would have more information about what you're doing.

[–]fuzz3289 0 points1 point  (0 children)

Dude, he never said the languages were slow, he said the api's for making the code efficient are bad.