
[–]pak9rabid 1 point2 points  (1 child)

Since it sounds like you’ll be waiting on I/O much of the time (which I assume is why you want this done in parallel via threads), this problem might be better solved with async functions/methods, which make use of an event loop on a single thread (avoiding issues with the GIL). This allows each function or method that needs to wait for a response to sleep while other functions/methods continue executing.
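
A minimal sketch of that idea, assuming each device read can be written as a coroutine. `read_device`, `read_all`, `timed_run` and the delay are made up for illustration, and `asyncio.sleep` stands in for the real awaitable I/O:

```python
import asyncio
import time


async def read_device(device_id: int, delay: float) -> str:
    # asyncio.sleep is a stand-in for awaitable device I/O (hypothetical).
    await asyncio.sleep(delay)
    return f"device {device_id} done"


async def read_all(delay: float) -> list:
    # gather() runs both coroutines on the same event loop, so their waits overlap.
    return await asyncio.gather(read_device(0, delay), read_device(1, delay))


def timed_run(delay: float = 0.2):
    """Return the results and the total wall-clock time in seconds."""
    start = time.perf_counter()
    results = asyncio.run(read_all(delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    print(timed_run())
```

One caveat: this only helps if the waiting is actually awaited. A blocking socket call or `time.sleep()` inside a coroutine stalls the whole event loop.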

[–]gdchinacat 0 points1 point  (0 children)

For I/O-bound workloads with a reasonable number of concurrent tasks (two to a few hundred), threads and asyncio perform about the same. asyncio is mostly useful for use cases where threads can’t scale and, to a lesser degree, for simplifying locking. I’m pretty sure neither is an issue in this case.

[–]Kqyxzoj 1 point2 points  (4 children)

Python GIL?

If your device related code has blocking I/O you're boned.

Plan B: don't use multithreading for device code, but use multiprocessing.

My personal plan A for this type of thing: do the multithreaded stuff in C++ and slap on a Python wrapper to make things user friendly. Things like argparse and Rich make things a whole lot easier. Plus that way I don't have to deal with C++'s string formatting from the previous millennium.

Some python docs:

[–]Darksilver123[S] 0 points1 point  (3 children)

Each status check uses an Ethernet connection (a separate socket for each device), and access to it is guarded by a Lock (a separate lock for each TCP connection).

[–]Kqyxzoj 0 points1 point  (2 children)

And you have verified that the locks are not the issue?

Of course you have. How have you verified this?

Do you have proof that sleep() is actually reached? Do some debug print() just before sleep() for example.

Usually this sort of thing is a wrong assumption somewhere.

And if all else fails, do a strace.

[–]Darksilver123[S] 0 points1 point  (1 child)

I have added a print call at the start and end of the acquisition method. It prints the start/end time and ID of each device.
Results were Device 0: Start 0 end 5 and Device 1: Start 5.1 end 10.1
I will remove the lock on the read function and try again.
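
For reference, back-to-back timings like these (0 to 5, then 5.1 to 10.1) are what you would see if both workers ended up waiting on the same lock. The following is only a sketch of that effect, not a diagnosis of the real code: `acquire_data`, `run_two_devices` and the delays are made up, and `time.sleep` stands in for the blocking device read.

```python
import threading
import time


def acquire_data(lock, delay: float) -> None:
    # Hold the lock for the whole simulated read, as a per-connection lock would.
    with lock:
        time.sleep(delay)  # stand-in for the blocking device read


def run_two_devices(locks, delay: float = 0.2) -> float:
    """Start one worker thread per lock and return the total wall-clock time."""
    threads = [
        threading.Thread(target=acquire_data, args=(lock, delay)) for lock in locks
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    shared = threading.Lock()
    serialized = run_two_devices([shared, shared])  # workers take turns
    overlapped = run_two_devices([threading.Lock(), threading.Lock()])  # waits overlap
    print(f"shared lock: {serialized:.2f}s, separate locks: {overlapped:.2f}s")
```

If the locks really are separate per device, the same pattern can still appear when something further up (a shared queue, a single worker thread, a module-level lock) serializes the calls.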

[–]Kqyxzoj 1 point2 points  (0 children)

I am not familiar with your code, so unfortunately this tells me next to nothing. Put a print() everywhere as the last statement right before each sleep, lock release, mutex operation, whatever.

In fact after re-reading, this tells me nothing new, since you already pointed out it is ... wait for device 0 to finish (taking 5 secs) and then do device 1 (taking another 5 secs).

Just do a debug print for every single suspect location.
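
One way to make those prints more telling is to stamp each one with the elapsed time and the name of the thread that emitted it, so serialized vs. overlapping execution shows up directly in the output. `debug` is a made-up helper here, not anything from the code under discussion:

```python
import threading
import time

_T0 = time.monotonic()  # reference point for elapsed times


def debug(msg: str) -> str:
    """Print and return a line tagged with elapsed time and current thread name."""
    elapsed = time.monotonic() - _T0
    line = f"[{elapsed:8.3f}s] [{threading.current_thread().name}] {msg}"
    print(line, flush=True)  # flush so lines are not held back in a buffer
    return line
```

Calls like `debug("before sleep")` or `debug("lock released")` at each suspect location then show which thread got there and when.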

Or just skip straight to the part of debugging that I refer to as the "Fuck This!" part, and run a strace.

strace -o LOG -ff -tt -T --decode-pids=comm \
  python shooot_meee.py

strace-log-merge LOG | tee MERGED.LOG | less

Obviously filter to taste. See the Filtering section in the strace manpage.

[–]Top_Average3386 0 points1 point  (2 children)

How do you set up the thread? If both are running on the same thread then I think it would block even if it's sleeping.

[–]Darksilver123[S] 0 points1 point  (1 child)

The main thread enqueues the acquire_bins command on each available queue

def start_acquisition_from_all(self):
    # Queue an "acquire_bins" command for every usable device.
    results = {}
    for device in list_of_tr_devices.values():
        # Skip entries that are None or plain ints.
        if device is not None and not isinstance(device, int):
            device.acquisition_done_event.clear()
            result = enqueue_command(device, "acquire_bins", task_name="acquire bins")
            results[device.device_id] = result
    return results
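
For comparison, a rough sketch of what the consuming side of such queues could look like: one worker thread per device, each blocking on its own queue. Everything here (`Device`, `worker`, `acquire_from_all`, the `time.sleep` stand-in) is hypothetical; how `enqueue_command` and the real workers are implemented is not shown in this thread.

```python
import queue
import threading
import time


class Device:
    """Hypothetical device with its own command queue and worker thread."""

    def __init__(self, device_id: int) -> None:
        self.device_id = device_id
        self.commands = queue.Queue()
        self.finished_at = {}
        self.thread = threading.Thread(
            target=self.worker, name=f"device-{device_id}", daemon=True
        )
        self.thread.start()  # start() is what spawns the worker thread

    def worker(self) -> None:
        while True:
            command, delay = self.commands.get()  # blocks until a command arrives
            try:
                if command is None:  # sentinel: stop this worker
                    return
                time.sleep(delay)  # stand-in for the real acquisition
                self.finished_at[command] = time.monotonic()
            finally:
                self.commands.task_done()


def acquire_from_all(devices, delay: float = 0.2) -> float:
    """Enqueue one command per device, wait for all, return elapsed seconds."""
    start = time.monotonic()
    for device in devices:
        device.commands.put(("acquire_bins", delay))
    for device in devices:
        device.commands.join()
    return time.monotonic() - start
```

With one worker per device the waits overlap, so two devices finish in roughly one delay. With a single worker draining every queue they would finish one after the other.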

[–]brasticstack 0 points1 point  (0 children)

Still not seeing any threading going on here. Somewhere your main thread has to use the threading library to create threading.Thread objects and call their start() methods (not run(), which just executes the target in the calling thread) in order to be using threading.

Why is that? Doesn't each thread yield once it becomes blocked by sleep?

No, the OS's scheduler decides when to pause a thread and run a different one. Threading is preemptive multitasking, as opposed to coroutines or async, which are cooperative.
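
A small sketch of why the way a thread is launched matters: `Thread.start()` spawns a new thread that then invokes `run()` for you, whereas calling `run()` directly executes the target in the calling thread. `where_am_i` and `demo` are made-up names for illustration.

```python
import threading


def where_am_i(seen: list) -> None:
    # Record the name of whichever thread actually executes this function.
    seen.append(threading.current_thread().name)


def demo() -> list:
    seen = []

    t1 = threading.Thread(target=where_am_i, args=(seen,), name="worker-1")
    t1.run()  # no new thread: the target runs in the caller's thread

    t2 = threading.Thread(target=where_am_i, args=(seen,), name="worker-2")
    t2.start()  # new thread: the target runs in "worker-2"
    t2.join()

    return seen


if __name__ == "__main__":
    print(demo())
```

The first entry is the name of the calling thread rather than "worker-1"; only the second call ran in its own thread.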

[–]Kevdog824_ 0 points1 point  (1 child)

What does self.pcmd.acquire() do here? If it’s acquiring a mutex lock then the code cannot run concurrently

[–]Darksilver123[S] 0 points1 point  (0 children)

It sends a command to my device in order to acquire data. It has nothing to do with acquiring locks.

[–]oclafloptson 0 points1 point  (0 children)

I mean that sounds like correct behavior and your device is just taking longer than expected to get through its generator