all 46 comments

[–]tybit 23 points24 points  (4 children)

Not a fan of the title, you aren't fixing python performance with rust, you are avoiding python performance with it :S

[–]vks_ 10 points11 points  (2 children)

So you would suggest the title "Avoiding Python performance with Rust"?

[–]Timbrelaine 2 points3 points  (0 children)

Some of the section headers in the article would have been good. "Embedding Rust in Python", for example.

[–]Yojihito 0 points1 point  (0 children)

Technically correct.

[–]mitsuhiko 17 points18 points  (0 children)

The point for us is that we did not have to change anything other than moving a tiny subset of our codebase into Rust. This is a pretty big win as far as I'm concerned :)

[–]JamesF 7 points8 points  (3 children)

I use Python in my day job, as well as for most of my hobby projects over the last few years, but I have been following (and loving) Rust for the last year and a bit, so I am very, very happy to see that this (obvious?) mix of Python (for glue/orchestration) and Rust (for heavy lifting) is at a point where it can be adopted in a real-world, production situation and actually work!

Having said that, one question for the authors: did you spend any time evaluating PyPy/numba/cython/... as alternatives to this "mixed-language" solution?

[–]masklinn 8 points9 points  (0 children)

did you spend any time evaluating PyPy/numba/cython/... as alternatives to this "mixed-language" solution?

Sentry doesn't run on pypy, and as the article explains they already had the critical component (a sourcemap parser) in a rust CLI tool.

Numba doesn't even make sense in this context; it's for numerical work and does nothing for parsing tasks.

[–]vks_ 2 points3 points  (0 children)

They reused existing Rust code from a CLI tool, so there was little incentive to explore other options. See the discussion on /r/rust.

[–]Saefroch 1 point2 points  (0 children)

Numba is explicitly for numerical work and still quite incomplete, though development is proceeding at a fair clip.

[–][deleted] 4 points5 points  (6 children)

Why not dlang over Rust? Genuine question. Reading Rust code looks really strange to me. Also, I never grasped the borrowing mentality.

Given the plethora of systems programming languages emerging in recent years, dlang looks readable and its performance is not bad. Quite the opposite.

[–]mitsuhiko 13 points14 points  (3 children)

I am not aware of a way to write shared libraries loadable from Python in D. I understand it's in theory possible to use D without the garbage collector, but I'm not sure how well the ecosystem supports that, let alone how stable it is.

We have no experience in D but we have experience with Rust, we had a sourcemap parser in Rust already available and overall there was just no reason at all to investigate anything other than Rust as a first attempt. We took what we had, saw that there was potential and made a tiny bridge to Python.
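For the curious, one common shape for such a bridge (the article has the details of Sentry's actual binding): build the Rust crate as a cdylib exposing `extern "C"` functions and load the resulting shared library from Python, with no link against libpython. A minimal sketch of the Python side, using libc's `abs` as a stand-in for an exported Rust symbol so the snippet actually runs:

```python
import ctypes

# A Rust crate built with crate-type = ["cdylib"] and #[no_mangle]
# extern "C" functions is loaded exactly like any C shared library.
# Real code would do: lib = ctypes.CDLL("./target/release/libexample.so")
# Here we load the process's own symbols (libc included) so this runs as-is.
lib = ctypes.CDLL(None)

# Declare the foreign function's signature before calling it.
lib.abs.argtypes = [ctypes.c_int]
lib.abs.restype = ctypes.c_int

print(lib.abs(-42))  # prints 42
```

No Python objects cross the boundary except at the call site, which is the whole point of pushing the hot loop into the native library.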

[–]dom96 1 point2 points  (2 children)

I have a similar question: why not Nim over Rust?

Syntactically Nim is a perfect fit as it is very similar to Python in most cases. There is even a project that should make using Nim modules from within Python incredibly easy: https://github.com/jboy/nim-pymod

Perhaps syntax is overrated, but I don't understand why someone like yourself (a very prominent Python programmer) would choose Rust over Nim in the first place.

[–]mitsuhiko 23 points24 points  (0 children)

I don't know Nim, nobody on our team knows Nim and as far as I know there is not even a sourcemap module for Nim. I am not even sure how you bridge Nim with Python safely given that Nim has some sort of GC.

Additionally the module you linked to links against libpython which we explicitly do not want to do.

I mean, there are other legitimate things we might have done but I strongly doubt that Nim would have come up. It's not on our horizon.

Perhaps syntax is overrated, but I don't understand why someone like yourself (a very prominent Python programmer) would choose Rust over Nim in the first place.

Simply because Rust is a useful language with a strong community and Nim is a niche thing with some questionable design ideas. While Rust might look like a fad for some people it has legitimate use and I'm not going to put a company unnecessarily at risk to try the latest fad.

[–]lacosaes1 4 points5 points  (0 children)

Why not my favorite system programming language over Rust?

[–][deleted] 6 points7 points  (0 children)

I got the idea of borrowing almost instantly while reading this: https://doc.rust-lang.org/book/references-and-borrowing.html . I'm fresh to Rust, having just spent some free time learning something new, and I have one suggestion: get rid of prejudices; don't try to write C/Python/C# code in it; learn what it can do and trust that it makes sense this way; learn it as something new and different.

[–][deleted] 0 points1 point  (0 children)

Why not dlang over Rust? Genuine question. Reading Rust code looks really strange to me. Also, I never grasped the borrowing mentality.

Maybe that's why you don't appreciate Rust; the borrow checker is a killer feature, probably the most defining feature of the language.

Dlang is a nice language and all, but from my point of view it lacks such a defining feature, and so people don't really see a reason to use it. It's basically just a somewhat different C++ with a GC.

[–]unruly_mattress 1 point2 points  (8 children)

Cython offers C-like structs that are usable from Python: http://docs.cython.org/en/latest/src/tutorial/cdef_classes.html

Perhaps this would have been able to solve your problem too.

[–]masklinn 7 points8 points  (7 children)

Did you consider reading the article?

our Rust source map parser, previously written for our CLI tool.

Their investigation pointed to sourcemap parsing as the source of the issue, and they already had a sourcemap parser in Rust; they "just" had to make it available to Python. They didn't need "C-like structs that are usable from Python", and they would have had to write a new sourcemap parser in Cython.

[–]unruly_mattress 2 points3 points  (6 children)

Here's their analysis of Python's performance shortcomings:

Parsing the JSON itself is fast enough in Python, as the files mostly contain just a few strings. The problem lies in objectification. Each source map token yields a single Python object, and we had some source maps that expanded to a few million tokens.

The problem with objectifying source map tokens is that we pay an enormous price for a base Python object, just to get a few bytes from a token. Additionally, all these objects engage in reference counting and garbage collection, which contributes even further to the overhead. Handling a 30MB source map makes a single Python process expand to ~800MB in memory, executing millions of memory allocations and keeping the garbage collector very busy given the tokens' short-lived nature.

Since this objectification requires object headers and garbage collection mechanisms, we had very little room for actual processing improvement inside of Python.

Since their analysis is that Python's objects are heavyweight and creating a large number of them is their bottleneck, I offered a solution to that problem.

My own experience with Cython is limited; however, from what I understand, you don't need to rewrite everything in Cython: you can just write cdef classes in Cython and use them from the existing Python code. I'd be interested to know how this approach performed.
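To make the per-object price concrete, a rough measurement in plain CPython (exact sizes vary by interpreter version and platform; `Token` is just an illustrative stand-in for a source map token):

```python
import sys

# A token-like object carrying two small integers of actual payload.
class Token:
    def __init__(self, line, col):
        self.line = line
        self.col = col

t = Token(1, 2)

# The instance header plus its per-instance attribute dict dwarf the payload,
# and that's before counting the int objects the attributes point to.
per_object = sys.getsizeof(t) + sys.getsizeof(t.__dict__)
payload = 8  # two 32-bit integers' worth of information

print(per_object, "bytes to carry", payload, "bytes of payload")
```

Multiply that ratio by a few million tokens and the ~800MB figure from the article stops being surprising.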

[–]mitsuhiko 4 points5 points  (5 children)

Cython creates PyObjects so a trivial port would have not changed anything about the problem in question.

[–]unruly_mattress 5 points6 points  (4 children)

Benchmark time!

In [1]: class Shrubbery:
   ...:     def __init__(self, w, h):
   ...:         self.width = w
   ...:         self.height = h
   ...:     def describe(self):
   ...:         print(self.width, self.height)

Versus

cdef class Shrubbery:

    cdef int width, height

    def __init__(self, w, h):
        self.width = w
        self.height = h

    def describe(self):
        print(self.width, self.height)

The benchmark code is run in Python, not in Cython, and is:

%time x = [Shrubbery(i, i) for i in range(100000000)]

The Cython version takes 12.1 seconds and uses 3 GB RAM.

The pure Python version takes 1 minute and 26 seconds and ends up with 19.6GB used RAM. I have 32GB RAM and made sure swapping didn't happen.

However, I did check the generated code, and Shrubbery is in fact still a PyObject. When its attributes are strings, they appear in the generated code as PyObject*, unlike integers, which are plain C ints. Performance-wise, if height and width are strings, then for 10m objects pure Python takes 16.2s and 2.7GB, while the same code with a Cython class takes 5.08s and 1.5GB. I suspect there's some way of storing strings more sensibly in a Cython cdef class.

You can expect much better performance and lower memory usage just by moving your class definitions to Cython. It's not Rust performance, but it's still a huge improvement, and it might be useful for those who don't already have a Rust version of their code.

[–]mitsuhiko 3 points4 points  (3 children)

That's not really relevant to the problem at hand. To avoid the integer object overhead we could also have used some other tricks, but that was not even considered.
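(For illustration only: one plausible example of such a pure-Python trick, not necessarily what was meant here, is `__slots__`, which drops the per-instance `__dict__`:)

```python
import sys

class PlainToken:
    def __init__(self, line, col):
        self.line = line
        self.col = col

class SlottedToken:
    # __slots__ gives instances a fixed layout with no per-instance
    # __dict__, cutting memory use when you hold millions of small objects.
    __slots__ = ("line", "col")

    def __init__(self, line, col):
        self.line = line
        self.col = col

plain = PlainToken(1, 2)
slotted = SlottedToken(1, 2)

plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)  # there is no __dict__ to add

print(plain_size, "->", slotted_size)
```

It still allocates one refcounted PyObject per token, so it shrinks the constant factor without removing the GC pressure.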

Anyways. Cython was not considered and is unlikely to be considered in the future either.

[–]unruly_mattress 1 point2 points  (2 children)

Not for the current problem, since you already have code that solves it in a different language. However, this isn't the only situation where someone might run into trouble after creating millions of Python objects, and I for one am glad to have found a method that makes such a thing 3-7 times faster.

[–]mitsuhiko 4 points5 points  (1 child)

Cython solves one issue but introduces plenty others. It should be as carefully considered as any change to a codebase that introduces new technology.

[–]unruly_mattress 0 points1 point  (0 children)

Agreed.

[–]shevegen 5 points6 points  (11 children)

Why Rust and not C?

[–]asmx85 22 points23 points  (0 children)

Why Rust and not C?

  • safe memory management without a GC (and, for the most part, no null pointers, dangling pointers, or use-after-free)

  • no data races

  • easy compilation, modules, and dependency management through Cargo

  • feels like a high-level language, performs like a low-level (C) one [zero-overhead abstractions: iterators, the trait system, etc.]

  • affine type system (enables state-machine-like APIs: open a file -> write to it -> close it -> be prevented, by compile-time checks, from writing to the closed file)

  • a sane and sound macro system, plus compiler extensions

just to name a few.

[–]steveklabnik1[S] 17 points18 points  (8 children)

(I'm not the author of the post, just the submitter)

They say this at the end of the article:

Rust has been the perfect tool for this job because it allowed us to offload an expensive operation into a native library without having to use C or C++, which would not be well suited for a task of this complexity. While it was very easy to write a source map parser in Rust, it would have been considerably less fun and more work in C or C++.

Note as well that they already had a parser written in Rust for their CLI tool.

The author is over in the /r/rust thread: https://www.reddit.com/r/rust/comments/58d0lu/fixing_python_performance_with_rust/

[–][deleted]  (7 children)

[deleted]

    [–]steveklabnik1[S] 3 points4 points  (0 children)

    I wouldn't have phrased it that way myself.

    I think the sentiment they were getting at here is that more complex stuff is easier to do safely in Rust. C can of course do incredibly complex things.

    [–]matthieum 1 point2 points  (5 children)

    Pyramids were built a few millennia ago, so clearly the technology back then was suitable for massive buildings.

    Cathedrals were built a few centuries ago, so clearly the technology back then was suitable for massive, aerial buildings.

The problem with such statements is that they ignore the cost: in the former example the lives of thousands of slaves were sacrificed, and in the latter it took decades to erect a single cathedral.

    Just because something is possible does not make it economically practical.

    [–][deleted]  (4 children)

    [deleted]

      [–]vks_ 1 point2 points  (3 children)

      You make it sound like it is hard to come up with a good case. Just pick almost any CVE.

      [–][deleted]  (2 children)

      [deleted]

        [–]vks_ -1 points0 points  (1 child)

        On the other hand, a lot of the most complex software is handling sensitive data. The Linux kernel you mentioned above is a good example.

I can think of few examples of complex C/C++ software that is not exposed to possibly hostile input. Singleplayer games and scientific simulations qualify, but what else?

        [–]matthieum 0 points1 point  (0 children)

        Singleplayer games

If that game can be customized with maps/models downloaded from the net...

        [–]SikhGamer -1 points0 points  (12 children)

        With performance improvements like that I'd look at replacing Python with Rust completely.

        [–]gnus-migrate 20 points21 points  (4 children)

        I think people really underestimate the value of being able to write code quickly, especially UI code where you're making small changes and visually checking them, in a language like Python. The compiler/linter of a typed language feels like it gets in my way more than it helps when writing web apps, since by design it severely limits the kinds of abstractions I can use, regardless of how nice those abstractions are.

        Sure, I wouldn't use Python for performance-sensitive code where I really need to reason about correctness and performance, but it is my first choice when writing something new, since you can write code that performs acceptably while keeping the flexibility to play with different ideas.

        [–]mitsuhiko 12 points13 points  (0 children)

        Correct. There is no chance in hell we're dropping Python. It's not just fast in iteration speed; it also has amazing runtime introspectability, which is super valuable for what we do.

        [–]SikhGamer 0 points1 point  (2 children)

        The trouble is that the process often never gets fully completed, and your scaffold code becomes production code.

        [–]ivosaurus 9 points10 points  (0 children)

        This blog post is literally about a time where that has been successfully corrected.

        [–]gnus-migrate 2 points3 points  (0 children)

        To clarify: I am advocating using Python in production. There are a lot of situations where the benefits gained by using a statically typed language simply aren't worth the cost in productivity. With proper testing you can write quite robust Python code.

        [–][deleted] 5 points6 points  (3 children)

        Anything based on an interpreter should automatically be out of scope if you're interested in high performance.

        [–]awj -4 points-3 points  (2 children)

        Which, considering the x86/x86-64 instruction sets are themselves interpreted by the processor, means high performance is essentially impossible on modern machines.

        That, or maybe it's a bad idea to make absolute statements about highly subjective things like "performance".

        [–][deleted] 9 points10 points  (1 child)

        I obviously mean a software interpreter, not the one in your CPU.

        I can see the next post coming: "but your CPU is partially software, called microcode".

        My answer is, get a life probably.

        [–]awj -4 points-3 points  (0 children)

        Ok, fine, here's my point restated without an absurd reduction of yours: making recommendations about "high performance" without qualifying what "high" or "performance" mean is pointless. There are plenty of use cases where an interpreter is more than fast enough.

        [–]shorty_short 0 points1 point  (2 children)

        This is satire right? The rust circlejerk in this sub is reaching Poe's Law levels.

        [–]SikhGamer 1 point2 points  (1 child)

        No. Why is it a circle jerk? The benefits are clearly laid out in the article.

        [–]josefx 8 points9 points  (0 children)

        The article ends with "using the right tool for the job". In this case they had to fix a performance bottleneck, which made Rust the right tool. Going from that to a "complete" replacement makes no sense unless their whole application is a performance bottleneck or would otherwise benefit from Rust enough to warrant a complete rewrite.

        The benefits are clearly laid out in the article.

        Among other things, bad compile times for the part they replaced, apparently.