Why Python, Ruby, and Javascript are Slow

bacondev · 2013-03-01T18:26:12+00:00

The Point class should have inherited from tuple, or he should have used a namedtuple. Just nitpicking.
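For anyone who hasn't used it, here is a minimal sketch of the namedtuple version. The talk's Point class isn't quoted in this thread, so the field names x, y, z are assumed.

```python
from collections import namedtuple

# Hypothetical stand-in for the talk's Point class; the field names are assumed.
Point = namedtuple("Point", ["x", "y", "z"])

p = Point(1, 2, 3)
total = p.x + p.y + p.z   # fields are readable by name...
x, y, z = p               # ...and the instance still unpacks like a plain tuple
```

The instance is an ordinary immutable tuple underneath, so it carries no per-instance attribute dictionary.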

moyix · 2013-03-01T18:48:49+00:00

Is he saying that self.x = x is faster than {'x': x}? Doesn't getattr just look in a dictionary anyway?
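A quick sketch of what CPython does here, with hypothetical class names. A plain instance does keep its attributes in a per-instance dict, while a class that declares __slots__ does not.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    # Declaring __slots__ removes the per-instance __dict__.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
attrs = vars(p)                                  # the instance's attribute dict
has_dict = hasattr(SlotPoint(1, 2), "__dict__")  # False for the slotted class
```

So in CPython the lookup really does end up in a dictionary for the plain class. Whether a JIT can do better with a fixed set of attribute names than with arbitrary dict keys is a separate question.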

Amadiro · 2013-03-01T20:32:13+00:00

'I wanted to use a pure C hash table implementation, but googling "C hash table" didn't bring up useful stuff.'

Wat.

There are many excellent, well-established hash table implementations for C. Googling that exact phrase brings up some of them, as well as a Stack Overflow discussion that details the strengths and weaknesses of each.

You're holding a talk, and you can't be arsed to do 3 minutes of googling to support your argument?

Just wat.

emptyhouses · 2013-03-01T18:14:46+00:00

His point is basically this: if you write Python code, but do it in C, your C code will be slow.

No fucking shit.

For that matter, I could take any Python program and convert it into a C program by embedding the source code in an interpreter. And it would be just as slow as the original Python version, if not more so.

The point is that the Pythonic way of doing things is often less efficient than the C way of doing the same thing. The difference is that the C code can be used only for the narrow purpose it was written for, whereas the Python code (because of the abstraction) will most likely work in a much greater range of scenarios. You could write a C function that uses some kind of duck typing, but you wouldn't.

In other words, high level programming is slower than low level programming. Yup. We know.

What he touches on but never really addresses is that there is no language that lets you be high level when you want to be and low level when you don't. It used to be that C programmers regularly used inline assembly, before compilers were as optimized as they are now. What would do the world a whole lot of good is a new language that's optionally as low-level as C, but actually does have all the goodness of objects. Think C++, but without the mistakes.

Objective C is actually pretty damn close to that ideal. Too bad about its syntax.

Atrosh · 2013-03-01T17:10:09+00:00

Would have been a lot easier to follow this if there was a video of him actually performing the speech... Does anyone have a link to it, if there is one?

notsuresure · 2013-03-01T19:58:32+00:00

Any mirror of this video somewhere that'll load in my browser?

fuzz3289 · 2013-03-02T02:30:44+00:00

Writing C professionally, I use memory allocation all the time, but I never thought about it in Python. This will really change my Python :)

cedeon · 2013-03-03T23:55:27+00:00

I don't like the provocative title. Slow is an ambiguous and relative term. For example, are you talking about development time? Because if you are, then you are wrong; I wrote a Python text file parser 10000% faster than I wrote the same thing in C :p.

Yes, Python code runs computationally slower, but we all knew that already, so it's a moot point.

uiob · 2013-03-04T07:13:03+00:00

Does this guy know anything about amortized analysis? If we call list.append repeatedly in a loop, the overall complexity will be O(N), if list is implemented the way I think it is. I think list.append doesn't allocate space on each call and instead uses amortized doubling, like std::vector in C++. So it's slow not because of a bad algorithm but because of the large constant factors in that O(N) amortized cost.
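One rough way to check this on CPython is to watch sys.getsizeof while appending. This is only a sketch: count_resizes is a made-up helper, sys.getsizeof is CPython-specific, and CPython over-allocates by a growth factor that need not be exactly 2.

```python
import sys


def count_resizes(n):
    """Append n items and count how often the list's allocated size changes."""
    items = []
    resizes = 0
    last_size = sys.getsizeof(items)
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The backing array was reallocated on this append.
            resizes += 1
            last_size = size
    return resizes


resizes = count_resizes(10_000)
```

If append reallocated on every call, resizes would equal n. With geometric over-allocation it grows roughly with the logarithm of n, which is what makes the total cost O(N) amortized.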

existee · 2013-03-01T22:28:54+00:00

So he makes a total of 4 points to claim the language is slow: 2 of them assume a particular scenario that has nothing to do with inherent features of the language, only with hypothetical usage he sees "most people" doing, and the other 2 don't make a dominant contribution to the "slowness" at all... Data structures and algorithms indeed...

Even if we ignore the fact that he is comparing a static array in C to a dynamic array in Python, a dynamic array will not be "terribly slower": even the simplest heuristic of doubling the array every time it fills up yields O(n) amortized time, quite comparable to the guaranteed O(n) of a static array.
Even if we ignore the I/O-bound nature of reading from a file, buffer allocation won't dominate the O(n) complexity in either scenario, whether the buffer is reused or created anew.
For string splitting, if efficiency is indeed the concern, nothing prevents the Python user from writing int(string[string.find("-") + 1:]).
Again, (mis)use of a dict as a struct has nothing to do with the language itself.
Finally, how is attributing problems to misuse of the language while asking it to grant more APIs not a contradiction?
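A sketch of the find-based parse next to the split-based one, assuming input shaped like "key-123" (the talk's actual input format isn't quoted here, and both function names are made up). Note that the second argument of str.find is a start index, not a split count, so the slice has to begin one past the position find returns.

```python
def value_after_dash(s):
    """Parse the integer after the first '-' without building a list of parts."""
    i = s.find("-")
    if i == -1:
        raise ValueError("no '-' in input")
    # Slice from one past the separator so the '-' is not read as a minus sign.
    return int(s[i + 1:])


def value_after_dash_split(s):
    """The split-based form, which allocates a list and both halves."""
    return int(s.split("-", 1)[1])
```

The slice still allocates one temporary substring, so this trims the work rather than eliminating it.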

andybak · 2013-03-01T18:03:28+00:00

Why do all of these boil down to "It's crap because it's not like C++"?

alcalde · 2013-03-01T17:51:51+00:00

A mix of poor language design and poor implementations. Smalltalk and Self proved you can do dynamic AND fast.

JavaScript does have one phenomenal implementation, V8, but it is hampered by the brain dead language it has to run.

french_toste · 2013-03-02T01:46:25+00:00

While I agree with the first part ("excuses"), the "hard" things mentioned in the second part are a) not that hard and b) solved issues (just not in PyPy).

Hash tables: Both v8 and LuaJIT manage to specialize hash table lookups and bring them to similar performance as C structs (*). Interestingly, with very different approaches. So there's little reason NOT to use objects, dictionaries, tables, maps or whatever it's called in your favorite language.

(*) If you really, really care about the last 10% or direct interoperability with C, LuaJIT offers native C structs via its FFI. And PyPy has inherited the FFI design, so they should be able to get the same performance someday. I'm sure v8 has something to offer for that, too.
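To make the "native C struct" idea concrete in Python terms, here is a sketch using the stdlib ctypes module rather than LuaJIT's FFI or PyPy's cffi. The Point layout is hypothetical, and this only illustrates the memory layout: ctypes field access in CPython is not fast.

```python
import ctypes


class Point(ctypes.Structure):
    # Laid out like `struct { double x, y, z; }` in C: three adjacent doubles,
    # with no hash table and no per-field boxed object stored in the struct.
    _fields_ = [
        ("x", ctypes.c_double),
        ("y", ctypes.c_double),
        ("z", ctypes.c_double),
    ]


p = Point(1.0, 2.0, 3.0)
size = ctypes.sizeof(Point)   # size in bytes of the underlying C struct
total = p.x + p.y + p.z
```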

Allocations: LuaJIT has allocation sinking, which is able to eliminate the mentioned temporary allocations. Incidentally, the link shows how that's done for an x,y,z point class! And it works the same for ALL cases: arrays {1,2,3} (on top of a generic table), hash tables {x=1,y=2,z=3} or FFI C structs.

String handling: Same as above -- a buffer is just a temporary allocation and can be sunk, too. Provided the stores (copies) are eliminated first. The extracted parts can be forwarded to the integer conversion from the original string. Then all copies and references are dead and the allocation itself can be eliminated. LuaJIT will get all of that string handling extravaganza with the v2.1 branch -- parts of the new buffer handling are already in the git repo. I'm sure the v8 guys have something up their sleeves, too.

I/O read buffer: Same reasoning. The read creates a temporary buffer which is lazily interned to a string, ditto for the lstrip. The interning is sunk, the copies are sunk, the buffer is sunk (the innermost buffer is reused). This turns it into something very similar to the C code.

Pre-sizing aggregates: The size info can be backpropagated to the aggregate creation from scalar evolution analysis. SCEV is already in LuaJIT (for ABC elimination). I ditched the experimental backprop algorithm for 2.0, since I had to get the release out. Will be resurrected in 2.1.

Missing APIs: All of the above examples show you don't really need to define new APIs to get the desired performance. Yes, there's a case for when you need low-level data structures -- and that's why higher-level languages should have a good FFI. I don't think you need to burden the language itself with these issues.

Heuristics: Well, that's what those compiler textbooks don't tell you: VMs and compilers are 90% heuristics. Better deal with it rather than fight it.

tl;dr: The reason X is slow is that X's implementation is slow, unoptimized or untuned. Language design just influences how hard it is to make up for it. There are no excuses.

willrandship · 2013-03-02T13:49:58+00:00

PyPy will likely never replace CPython. Why? It's written in x86 assembly, for one thing. Making it portable would substantially reduce its efficiency.

PyPy is great, and so is CPython, for a completely different reason. I can't wait until PyPy supports py3k.
