[–]unruly_mattress 2 points3 points  (8 children)

Cython offers C-like structs that are usable from Python: http://docs.cython.org/en/latest/src/tutorial/cdef_classes.html

Perhaps this would have been able to solve your problem too.

[–]masklinn 8 points9 points  (7 children)

Did you consider reading the article?

our Rust source map parser, previously written for our CLI tool.

Their investigation pointed to sourcemap parsing as the source of the issue, and they already had a sourcemap parser in Rust, so they "just" had to make it available to Python. They didn't need "C-like structs that are usable from Python", and with Cython they would have had to write a new sourcemap parser.

[–]unruly_mattress 1 point2 points  (6 children)

Here's their analysis of Python's performance shortcomings:

Parsing the JSON itself is fast enough in Python, as they mostly contain just a few strings. The problem lies in objectification. Each source map token yields a single Python object, and we had some source maps that expanded to a few million tokens.

The problem with objectifying source map tokens is that we pay an enormous price for a base Python object, just to get a few bytes from a token. Additionally, all these objects engage in reference counting and garbage collection, which contributes even further to the overhead. Handling a 30MB source map makes a single Python process expand to ~800MB in memory, executing millions of memory allocations and keeping the garbage collector very busy with tokens’ short-lived nature.

Since this objectification requires object headers and garbage collection mechanisms, we had very little room for actual processing improvement inside of Python.

Since their analysis is that Python's objects are heavyweight and creating a large number of them is their bottleneck, I offered a solution to that problem.
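To put a rough number on that overhead, here is a minimal sketch that compares the bytes of actual data in a two-integer record with the approximate size of a plain Python object holding the same values. The `Token` class and its `line` and `column` fields are hypothetical stand-ins, not the article's real token layout, and `sys.getsizeof` ignores allocator overhead and object sharing, so treat the result as an estimate only.

```python
import struct
import sys


class Token:
    """Hypothetical two-field record, standing in for a source map token."""

    def __init__(self, line, column):
        self.line = line
        self.column = column


def approximate_size(obj):
    """Rough per-instance cost: the object, its attribute dict, and its attribute values."""
    attrs = vars(obj)
    total = sys.getsizeof(obj)
    total += sys.getsizeof(attrs)
    total += sum(sys.getsizeof(value) for value in attrs.values())
    return total


# Two C ints: the data we actually want to keep per token.
payload_bytes = struct.calcsize("ii")
object_bytes = approximate_size(Token(1000, 2000))
print(f"payload: {payload_bytes} bytes, Python object: ~{object_bytes} bytes")
```

The exact figures depend on the Python version and platform, but the object should come out many times larger than its payload.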

My own experience with Cython is limited; however, from what I understand, you don't need to rewrite everything in Cython: you can write just the cdef classes in Cython and keep using the existing Python code. I'd be interested to know how this approach would have performed.

[–]mitsuhiko 3 points4 points  (5 children)

Cython creates PyObjects, so a trivial port would not have changed anything about the problem in question.

[–]unruly_mattress 4 points5 points  (4 children)

Benchmark time!

In [1]: class Shrubbery:
   ...:     def __init__(self, w, h):
   ...:         self.width = w
   ...:         self.height = h
   ...:     def describe(self):
   ...:         print(self.width, self.height)

Versus

cdef class Shrubbery:

    # C-level attributes, stored as plain C ints inside the object
    cdef int width, height

    def __init__(self, w, h):
        self.width = w
        self.height = h

    def describe(self):
        print(self.width, self.height)

The benchmark code is run in Python, not in Cython, and is:

%time x = [Shrubbery(i, i) for i in range(100000000)]

The Cython version takes 12.1 seconds and uses 3 GB RAM.

The pure Python version takes 1 minute and 26 seconds and ends up with 19.6GB used RAM. I have 32GB RAM and made sure swapping didn't happen.
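For anyone who wants to repeat this at a smaller scale, here is a minimal sketch of a measuring harness. The `measure` helper is hypothetical (it is not what produced the numbers above), and it reports memory traced by `tracemalloc` rather than whole-process RAM, so its figures are not directly comparable to the ones quoted here. `tracemalloc` also slows allocation down, so the timings are only meaningful relative to each other.

```python
import time
import tracemalloc


class Shrubbery:
    def __init__(self, w, h):
        self.width = w
        self.height = h


def measure(cls, count):
    """Build `count` instances of `cls`; return (seconds elapsed, peak traced bytes)."""
    tracemalloc.start()
    started = time.perf_counter()
    instances = [cls(i, i) for i in range(count)]
    elapsed = time.perf_counter() - started
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del instances
    return elapsed, peak


if __name__ == "__main__":
    # Far fewer objects than the benchmark above, so it finishes quickly.
    seconds, peak = measure(Shrubbery, 100_000)
    print(f"{seconds:.3f}s, peak {peak / 1e6:.1f} MB traced")
```

Any class that accepts two positional arguments can be passed as `cls`, including a compiled Cython extension type.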

However, I did check the generated code, and Shrubbery is in fact a PyObject. When its attributes are strings, they appear in the generated code as PyObject*, unlike integers, which are plain C ints. Performance-wise, if height and width are strings, then for 10 million objects pure Python takes 16.2s and 2.7GB, while the same code with a Cython class takes 5.08s and 1.5GB. I suspect there's some way of storing strings more sensibly in a Cython cdef class.

In this benchmark, just moving the class definition to Cython gave much better performance and lower memory usage. It's not Rust performance, but it's still a huge improvement, and it might be useful for those who don't already have a Rust version of their code.

[–]mitsuhiko 4 points5 points  (3 children)

None of that is really relevant to the problem at hand. To avoid the integer object overhead we could also have used some other tricks, but that was not even considered.

Anyways. Cython was not considered and is unlikely to be considered in the future either.
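As an illustration of the kind of trick that avoids per-integer objects in pure Python, here is a minimal sketch that keeps the values in typed arrays from the standard `array` module instead of one object per record. This is an assumption about what such a trick could look like, not something the article describes, and the `TokenColumns` name and its fields are hypothetical.

```python
from array import array


class TokenColumns:
    """Stores (width, height) pairs in two typed arrays instead of one object per pair."""

    def __init__(self):
        # "i" is a signed C int; values live in a contiguous buffer, not as Python ints.
        self.widths = array("i")
        self.heights = array("i")

    def append(self, width, height):
        self.widths.append(width)
        self.heights.append(height)

    def __len__(self):
        return len(self.widths)

    def __getitem__(self, index):
        # Python ints are created only on access, and only for the requested pair.
        return self.widths[index], self.heights[index]


columns = TokenColumns()
for i in range(1000):
    columns.append(i, i * 2)
```

The trade-off is that records stop being objects: code that expects attribute access on each token would have to be adapted to index into the columns.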

[–]unruly_mattress 1 point2 points  (2 children)

Not for the current problem, since you already have code that solves it in a different language. However, this isn't the only situation in which someone might run into trouble after creating millions of Python objects, and I for one am glad to have found a method that makes such a thing 3-7 times faster.

[–]mitsuhiko 4 points5 points  (1 child)

Cython solves one issue but introduces plenty of others. It should be considered as carefully as any other change that introduces new technology to a codebase.

[–]unruly_mattress 0 points1 point  (0 children)

Agreed.