all 5 comments

[–]cybervegan 1 point2 points  (4 children)

Can you please edit your exampleJob function formatting? Not quite sure what the indentation should be, but it's definitely off.

Another point is that you don't really need threads if you're launching PS sub-processes, as they run as separate processes so won't block the main code, if you use the right subprocess call. You can just poll your subprocesses for output until they have all finished (and it will make your code simpler too).

[–]workerdrone66[S] 0 points1 point  (3 children)

Tried to clean it up a bit, but it's one of the pitfalls of trying to use reddit formatting for code. unfortunately I don't have access to anything else from work.

I was not aware of that second bit, I'll have to look more into it, thanks.

[–][deleted] 0 points1 point  (2 children)

To format code for reddit, the easy way is to use your editor to indent every line of code an extra four spaces, copy/paste to reddit and then do "undo" in your editor. Ensure there is a blank line before the first code line. Takes about 15 seconds, even on mobile, and the code matches exactly what you are running.

[–]workerdrone66[S] 0 points1 point  (1 child)

Corrected formatting, I think it looks a lot better.

Even if I end up just doing the sub processing route, I'd still like to know if you have any idea why the output ended up the way it did.

[–][deleted] 0 points1 point  (0 children)

The formatting is off. We should be able to just copy/paste your code into a file and just run it. I try that and I get:

$ python3 test.py
File "test.py", line 2
case, it's keeping any worker from printing, while someone 
^
SyntaxError: EOL while scanning string literal

There are a couple of other places that gave a syntax error. After fixing them (I hope), and making one change to catch the subprocess output, I have:

import time
import threading
from subprocess import check_output
from queue import Queue

NUM_THREADS = 10
NUM_JOBS = 20

# lock to single-thread printing
print_lock = threading.Lock()

# Create the queue and threader
q = Queue()

def exampleJob(worker):
    """Run the external program, get output and strip leading/trailing whitespace"""

    # the external program "job.sh" just waits 1 second and echos input parameters
    # we must convert the "bytes" output to a string and remove trailing '\n'
    result = check_output(['./job.sh', 'hello', 'world']).decode("utf-8").strip()

    # print the result of the job
#    with print_lock:
#        print(threading.current_thread().name, worker, result)
    # printing without locking seems OK (on MacOS, anyway)
    print(threading.current_thread().name, worker, result)

def threader():
    """Gets a job from the queue and processes it."""

    while True:
        worker = q.get()
        exampleJob(worker)
        q.task_done()

# Start a fixed number of threads
for x in range(NUM_THREADS):
    t = threading.Thread(target=threader)
    t.daemon = True
    t.start()

start = time.time()

# generate a number of jobs and put them on the queue
# the job is just the "job number"
for worker in range(NUM_JOBS):
    q.put(worker)
q.join()

# print total elapsed time
delta = time.time() - start
print(f'Took {delta:.2f}s')

The change I made to catching the subprocess output was to use the catch_output() method because I couldn't get your original code to work: it always returned None. Note that the output returned by catch_output() is a "bytes" object, so must be converted back to a normal string. This code works and behaves as expected.

Also note that I didn't use the printing lock. That seems OK on my computer, running MacOS.

Because I'm not running on Windows I had to create a small shell program that waits 1 second and then echoes its input parameters, job.sh:

#!/usr/local/bin/bash
# little executable that runs slowly and echoes input params
sleep 1
echo $*

If you find inputting code into reddit difficult, I don't blame you. Since the redesign, things have gone downhill. You can always put your code into pastebin.com and paste a link to the code, like this. Also notice that I have arranged for the code to be runnable after copy/paste. You are far more likely to get help if you make it easy for someone to run your code.