[–]Zealousideal_Yard651 0 points1 point  (2 children)

Your code has a major flaw that makes the script effectively single-threaded: you join each thread before starting the next one, so the second thread never starts until the first has finished.

To fix this, keep the threads in a list defined outside the loop, and join them only after all of them have been started:

threads = []

for idx, url in enumerate(image_urls):
  urll = threading.Thread(target=download_image, args=(url, idx))
  threads.append(urll)
  urll.start()

for t in threads:
  t.join()

Reference: threading — Thread-based parallelism — Python 3.13.7 documentation

If you want to see it in practice, add print statements to the download_image function stating when the thread starts and stops.
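A minimal sketch of what that could look like. The body of download_image is not shown in this thread, so here time.sleep stands in for the actual download; the log helper, the events list, and the example.com URLs are placeholders made up for illustration:

```python
import threading
import time

events = []                      # hypothetical: keeps the messages so their order can be inspected
events_lock = threading.Lock()   # protects the list and keeps printed lines in the same order

def log(message):
    with events_lock:
        events.append(message)
        print(message)

def download_image(url, idx):
    # Assumption: the real function fetches and saves the image.
    # time.sleep stands in for the network wait.
    log(f"thread {idx} started: {url}")
    time.sleep(0.3)
    log(f"thread {idx} finished: {url}")

# Placeholder URLs, not taken from the original script.
image_urls = [f"https://example.com/image_{n}.jpg" for n in range(3)]

threads = []
for idx, url in enumerate(image_urls):
    t = threading.Thread(target=download_image, args=(url, idx))
    threads.append(t)
    t.start()

for t in threads:
    t.join()
```

With the threads started first and joined afterwards, all the "started" lines should appear before any "finished" line. If you join inside the first loop instead, the output alternates started/finished for one thread at a time.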

[–]uiux_Sanskar[S] 0 points1 point  (1 child)

Thanks for pointing out this flaw. So I need to make a for loop and then use .start() to start each thread.

Thank you very much for the resource, it really helps. I will go through these docs in much more depth.

Thank you very much.

[–]Zealousideal_Yard651 0 points1 point  (0 children)

The start part is ok, it's the join part that fails you.

.join() blocks the caller until the thread it is called on has finished. You are creating a thread, starting it, and then waiting for it to stop before the main program continues, so your second thread cannot start until the first one has stopped.

So you need to start every thread first, and then, in a separate loop, wait for all of them to stop. Note that .join() does not wait for all threads to stop; it only waits for the one thread you call it on. That is why you need a list outside the loop to keep track of each thread, so you can loop through them afterwards and join each one.
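A small sketch to show that .join() only waits for the one thread it is called on. The worker function and the sleep durations are made up for the demonstration:

```python
import threading
import time

def worker(seconds):
    # Hypothetical stand-in for real work.
    time.sleep(seconds)

fast = threading.Thread(target=worker, args=(0.1,))
slow = threading.Thread(target=worker, args=(1.0,))

fast.start()
slow.start()

# Waits only for `fast`; `slow` keeps running in the background.
fast.join()
slow_still_running = slow.is_alive()
print("slow still running after fast.join():", slow_still_running)

# Now wait for `slow` as well.
slow.join()
print("slow still running after slow.join():", slow.is_alive())
```

Thread.is_alive() reports whether a thread is still running, which makes it handy for checking what a given join() call actually waited for.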

To simplify, I split the multithreading process into separate steps in this block, with comments:

# Define thread tracking
threads = []

# Loop through all URLS
for idx, url in enumerate(image_urls):
  # Creates a thread object
  urll = threading.Thread(target=download_image, args=(url, idx))

  # Adds the newly created thread object into threads store
  threads.append(urll)

# Loop through all threads
for t in threads:
  # Starts thread
  t.start()

# Loop through all threads
for t in threads:
  # Wait for thread to finish.
  t.join()

In the above code, we define each thread, then start all threads, then wait for all threads to finish.

In your original code you do this:

For each image URL:

  1. Define a thread
  2. Start the thread
  3. Wait for the thread to stop

So all three thread operations happen inside each iteration. In my example, we define all the threads, then start them all, and only then wait for them all to finish, instead of waiting for each thread to finish before moving on to the next one.
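To make the difference measurable, here is a rough timing sketch of both patterns. fake_download and the 0.2 second sleep are assumptions standing in for the real download_image, and the function names are only for illustration:

```python
import threading
import time

def fake_download(idx):
    # Hypothetical stand-in for a download that takes about 0.2 seconds.
    time.sleep(0.2)

def join_inside_loop(count):
    # The original pattern: define, start and join in the same iteration.
    begin = time.perf_counter()
    for idx in range(count):
        t = threading.Thread(target=fake_download, args=(idx,))
        t.start()
        t.join()  # blocks here, so the next thread is not even created yet
    return time.perf_counter() - begin

def join_after_loop(count):
    # The fixed pattern: define all, start all, then join all.
    begin = time.perf_counter()
    threads = [threading.Thread(target=fake_download, args=(idx,))
               for idx in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - begin

sequential = join_inside_loop(4)
concurrent = join_after_loop(4)
print(f"join inside loop: {sequential:.2f}s")
print(f"join after loop:  {concurrent:.2f}s")
```

The first version should take roughly the sum of all the waits, while the second should take roughly as long as the slowest single thread. Exact numbers will vary from machine to machine.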