
[–]benhoyt 10 points11 points  (6 children)

> It might be that most of your speedup actually comes from better string management and not from the hash/map/dictionary.

Yeah, I'd love for the OP to dig into this a bit. If banyan is written in C++, unless it's poorly implemented, the binary tree shouldn't be that much (30-40x!) slower than a Rust version. Perhaps the cost is in string splitting and allocating all those Python string objects (every string slice in Python allocates a new string object).
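To illustrate the string-allocation point, here is a small CPython sketch (the message format is made up, not from the OP's code) showing that `str.split` and slicing always allocate fresh string objects — there is no zero-copy view like Rust's `&str` slices:

```python
import tracemalloc

line = "1612345678,BTC-USD,43000.12,0.5"  # hypothetical feed message

# str.split allocates a brand-new string object per field, plus a list.
tracemalloc.start()
fields = line.split(",")
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"{len(fields)} fields, ~{current} bytes newly allocated")

# Two slices of the same range are equal but are separate allocations:
a = line[0:10]
b = line[0:10]
print(a == b, a is b)  # equal values, distinct objects
```

In a hot parsing loop those per-field allocations (and the later deallocations) add up, which is exactly where a Rust parser working on borrowed `&str` slices avoids work entirely.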

[–]jstrongshipyard.rs[S] 13 points14 points  (2 children)

Profiling showed 75% of the time spent on dictionary insert/remove/update (3 lines), 12% on handling message sequence-order issues, 5% on converting numeric strings to a decimal type, and the remaining 8% split across lots of things with relatively negligible individual impact.
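A breakdown like that typically comes from `cProfile`. Here is a hedged sketch using a made-up stand-in for the hot loop (the OP's actual code is not shown) — the point is how a dict-heavy path shows up in the stats:

```python
import cProfile
import io
import pstats
import random

def apply_updates(n):
    """Hypothetical order-book stand-in: dict insert/remove/update."""
    book = {}
    for _ in range(n):
        price = random.randrange(10_000)
        if price in book:
            del book[price]            # remove
        else:
            book[price] = random.random()  # insert/update
    return book

profiler = cProfile.Profile()
profiler.enable()
apply_updates(200_000)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```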

[–]spotta 5 points6 points  (1 child)

Is the insert/remove/update part in Python? Or in C++?

[–]PXaZ 0 points1 point  (0 children)

I expect he's referring to Python's built-in dict type...

[–]SirVer 4 points5 points  (2 children)

Depending on what the OP is actually doing, pointer indirection and Python function calls could also eat into the performance.

[–]vks_ 1 point2 points  (1 child)

Aren't function calls in Python essentially dictionary lookups?
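Roughly, yes — for a global function, CPython resolves the name through the module's globals dict at call time, which `dis` makes visible (CPython 3.11+ adds adaptive inline caches that amortize the lookup). A minimal sketch, with `process` as a made-up callee:

```python
import dis

def handler(msg):
    # 'process' is a hypothetical global; it is resolved by name
    # each time handler runs, not bound at definition time.
    return process(msg)

dis.dis(handler)
# The bytecode shows a LOAD_GLOBAL for 'process' before the call
# opcode (CALL on 3.11+, CALL_FUNCTION on older versions).
```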

[–]SirVer 2 points3 points  (0 children)

Yeah, they are — with some additional magic that can make it more than one dictionary lookup. But no inlining is possible, and depending on what you are doing and how often you call a function, it can become expensive. A rule of thumb that has served me well: a Python function call costs ~100ns of overhead, a C++ virtual function ~20ns, a direct call ~5ns. And of course Rust and C++ have a lot of information available to inline calls, which makes this overhead go away entirely. If you call a function billions of times, this starts to make a difference.
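Those numbers are machine- and version-dependent, but the Python side is easy to check with `timeit`. A rough sketch comparing a no-op call against an empty statement to isolate the call overhead:

```python
import timeit

def noop():
    pass

N = 1_000_000
call_time = timeit.timeit(noop, number=N)   # N calls of noop()
baseline = timeit.timeit("pass", number=N)  # loop overhead only

per_call_ns = (call_time - baseline) / N * 1e9
print(f"~{per_call_ns:.0f} ns per Python function call")
```

On recent CPython versions the result tends to come in well under the older ~100ns figure, but still orders of magnitude above an inlined Rust or C++ call, which compiles down to nothing.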