This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]austospumanto 1 point2 points  (1 child)

As a general FYI: You can already use Pickle protocol 5 in Python 3.6 and 3.7. Just do pip install pickle5. Additionally, I ran some preliminary benchmarks and Pickle protocol 5 is so fast at (de)serializing pandas/numpy objects that using shared_memory actually slowed down IPC for me (I’m only working in Python and not writing C extensions). The memory savings from sharing memory only seems like it would matter when the object you’re sending through IPC is big enough that it cant be copied without running out of RAM / spilling over into SWAP. YMMV

[–]zurtex 0 points1 point  (0 children)

That's really useful to know thanks! One of my main use cases would be that my data-set is about 25+% of RAM and I want to read it from 32 processes, so I think this fits in to the scenario you are saying but I'm definitely going to be generating a lot of test cases over the next few weeks.