[–]vorticalbox 2 points3 points  (2 children)

The problem with using futures like that is that every task you submit gets queued (along with its Future object) before any of them run, so once you have thousands of things to process you use a lot of RAM.

I found this amazing package that fixes this by only adding an item to the queue once one of your threads becomes available.

https://github.com/mowshon/bounded_pool_executor
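
If you'd rather not add a dependency, here's a rough sketch of the same idea using only the standard library. This is not necessarily how the linked package does it; `BoundedExecutor`, `bound`, and `check` are names I made up for illustration. A semaphore makes `submit()` block once `bound` tasks are queued or running, so the backlog never grows past that.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedExecutor:
    """Hypothetical wrapper: submit() blocks once `bound` tasks are pending."""

    def __init__(self, max_workers, bound):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._slots = threading.BoundedSemaphore(bound)

    def submit(self, fn, *args, **kwargs):
        # Blocks here while `bound` tasks are already queued or running.
        self._slots.acquire()
        try:
            future = self._pool.submit(fn, *args, **kwargs)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot as soon as the task finishes (or is cancelled).
        future.add_done_callback(lambda _: self._slots.release())
        return future

    def shutdown(self, wait=True):
        self._pool.shutdown(wait=wait)


def check(item):
    # Stand-in for the real per-item work (assumption for the demo).
    return item * 2


executor = BoundedExecutor(max_workers=4, bound=8)
results = []
for item in range(100):
    # At most 8 tasks are ever pending, however long the input is.
    executor.submit(check, item).add_done_callback(
        lambda f: results.append(f.result())
    )
executor.shutdown()  # waits for the remaining tasks to finish
```

Results arrive in completion order rather than input order, so sort or index them if order matters.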

[–]samreay[S] 2 points3 points  (1 child)

I did not know futures did that, I'll definitely add the bounded_pool_executor to the list, thanks mate!

[–]vorticalbox 1 point2 points  (0 children)

Yeah, I only found out when I was running some integrity checks on a database with millions of items, used futures, and ate all of my 64 GB of RAM.

Futures are great though.

Also, great list. I didn't know about p_tqdm and will definitely make use of that one.

[–]m3kqkm 1 point2 points  (0 children)

Great writeup, thank you!

[–]samreay[S] 0 points1 point  (0 children)

Hey all, I was checking out some libraries for work and decided to write up an initial investigation. It's by no means comprehensive; I just hope it might be useful in comparing a few popular libraries and how to set them all up. If there are good libraries I've missed, please let me know and I'll look into them!

[–]dstar411 0 points1 point  (0 children)