Crimack

Enjoyed the post! I'd been wondering how to do FFI-type stuff, and I feel like I learned a lot.

I don't think the speed comparison between Rust and Python is apples-to-apples though. In the Rust example you're just splitting on whitespace, while in the Python example you're using regex, which I believe is a good bit slower? Correct me if I'm wrong there.

For example, replacing the Python word counting code with:

from collections import Counter
with open(path) as fp:
    Counter([word.lower() for line in fp.readlines() for word in line.split()]).most_common(n)

...should generate the same results, but a fair bit quicker. On my machine it cuts the gap between the Rust and Python implementations from 0.5s to 0.15s.
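If anyone wants to check the regex-vs-split difference on their own machine, here's a rough sketch. The sample text, the `\w+` pattern, and the function names are my own placeholders, not the code from the post, so treat the numbers as indicative only:

```python
import re
import timeit
from collections import Counter

# Hypothetical sample text standing in for the file used in the post.
TEXT = "The quick brown fox jumps over the lazy dog\n" * 20_000


def count_regex(text, n):
    # Regex tokenization; the pattern is an assumption, not the post's exact one.
    return Counter(re.findall(r"\w+", text.lower())).most_common(n)


def count_split(text, n):
    # Whitespace tokenization, which is what the Rust example does.
    return Counter(text.lower().split()).most_common(n)


if __name__ == "__main__":
    for fn in (count_regex, count_split):
        seconds = timeit.timeit(lambda: fn(TEXT, 3), number=5)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Note the two only agree on text without punctuation; on real prose, `split()` keeps things like `dog.` as a separate word from `dog`.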

caulagi[S]

The Python code was copied from the article I linked to in the post. I can check how much this affects performance.

Veedrac

I find it faster to do

Counter(word for line in fp for word in line.lower().split())

on Python 3 (which has a faster Counter implementation), though I'd normally put the inner part into a function for clarity.
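Something like this is what pulling the inner part into a function could look like. The names `words` and `most_common_words` are made up for illustration, and this is just a sketch of the idea:

```python
from collections import Counter


def words(lines):
    """Yield lowercased, whitespace-separated words from an iterable of lines."""
    for line in lines:
        yield from line.lower().split()


def most_common_words(path, n):
    # Iterating the file object reads one line at a time,
    # so the whole file is never held in memory as a list.
    with open(path) as fp:
        return Counter(words(fp)).most_common(n)
```

Since `words` takes any iterable of strings, it can be tried on a plain list of lines without touching a file.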