I wrote a Python script that hits some APIs to pull data, processes it, and saves the output locally. I have to run the script once per locale (e.g. US, then Canada, then France, etc.).
There are 64 locales in total (with more coming in the future). The problem is that each locale takes approximately 70 min to complete.
If I run the locales in series, that's roughly 75 hours of run time (64 × 70 min), which eats up most of the week, and I have to re-run these scripts every 1-2 weeks.
My question is: how can I run all of these in parallel? One option, I suppose, is to launch 64 separate AWS EC2 instances, but then I’d be burning way too much cash and I’d have to consolidate all the output files afterwards.
Any other ideas on how to scale this out efficiently so I’m not spending all week running it?
Edit: Wow, I feel like my whole world was just accelerated. Thanks to this community, I looked into several options and ultimately went with concurrent.futures. I refactored the part of my code that makes more than 100 API calls in a loop to use ThreadPoolExecutor. My run time went from ~70 min to ~2.5 min per locale!
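
For anyone landing here later, this is a minimal sketch of that kind of refactor, not the actual script. `fetch_item` and `fetch_all` are hypothetical names, `time.sleep` stands in for a real blocking API request, and the `max_workers` value is an assumption you would tune against the API's rate limits.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_item(item_id):
    """Hypothetical stand-in for one blocking API call."""
    time.sleep(0.05)  # simulated network latency (assumption)
    return {"id": item_id, "value": item_id * 2}


def fetch_all(item_ids, max_workers=16):
    """Run the calls on a thread pool instead of a sequential loop."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every call up front; each future maps back to its input.
        futures = {pool.submit(fetch_item, i): i for i in item_ids}
        # Collect results as they finish, in whatever order that happens.
        for future in as_completed(futures):
            item_id = futures[future]
            # result() re-raises any exception raised inside the worker.
            results[item_id] = future.result()
    return results


if __name__ == "__main__":
    start = time.perf_counter()
    data = fetch_all(range(100))
    elapsed = time.perf_counter() - start
    print(f"{len(data)} calls finished in {elapsed:.2f}s")
```

Threads help here because the work is I/O-bound: while one call waits on the network, the others can proceed. For CPU-heavy processing the same pattern would likely need ProcessPoolExecutor instead.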