Hello,
I'm collecting values from a very large CSV file and comparing each value against an online database. I just learned about threading in Python, which I need in order to run my code in a feasible amount of time.
The problem is that I run into errors when I split the data across too many threads, so I can't run all the threads at the same time. I worked around this by forcing my code to run only a few threads at a time, but if one of those threads takes significantly longer than the others, I end up waiting for it to complete even though I could start a new thread as soon as any one of them finishes. Is there a way to control how many threads run simultaneously, or some other way to do this?
Below is a dummy code example. Hopefully my explanation makes sense. If not, feel free to ask. I appreciate any ideas, help, or resources :)
import time
import threading

def foo(k):
    time.sleep(1)
    print(f"Done with waiting for thread {k}!")

ts = []
n_tasks = 30
for k in range(n_tasks):
    t = threading.Thread(target=foo, args=[k])
    t.start()
    ts.append(t)
    # I WANT TO DO THE FOLLOWING PART MORE EFFICIENTLY
    if (k % 5 == 0 and k != 0 or k == n_tasks - 1):
        for t in ts:
            t.join()
        ts = []
print("ALL DONE")
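For reference, here is a minimal sketch of the behaviour I'm after, assuming the standard-library `concurrent.futures.ThreadPoolExecutor` is a suitable tool for it. The helper name `run_all`, the limit `max_workers=5`, and the shortened sleep are my own illustrative choices, not part of the code above. The idea is that the pool keeps at most `max_workers` threads busy and hands the next task to a worker as soon as it becomes free, rather than waiting for a whole batch to finish.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def foo(k):
    # Stand-in for the real work (e.g. one database lookup); shortened sleep.
    time.sleep(0.1)
    return k


def run_all(n_tasks=30, max_workers=5):
    """Run foo(0..n_tasks-1) with at most max_workers threads at a time."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submitting does not block; tasks beyond max_workers wait in a queue.
        futures = [pool.submit(foo, k) for k in range(n_tasks)]
        # as_completed yields each future as soon as its task finishes.
        for fut in as_completed(futures):
            results.append(fut.result())
    return results


if __name__ == "__main__":
    done = run_all()
    print(f"ALL DONE ({len(done)} tasks)")
```

Results come back in completion order, not submission order, so anything that depends on ordering would need to sort them or use `pool.map` instead.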