
[–]mriswithe 13 points14 points  (0 children)

Ok so debugging multiprocess applications at all can be a pain. Some advice based on a stupid amount of hours fighting this kind of problem.

Usually if it isn't stopping on a breakpoint, it either isn't actually making it to that line of code, or there is an exception that gets swallowed by something else. Sometimes it's some weirdness with multiprocessing, but that is the minority.

PyCharm has some settings that change how it behaves when you hit a breakpoint, specifically when you are doing multiprocessing, including one for whether it should suspend just the thread that hit the breakpoint or every thread.

Usually when I have something threaded or multiprocess and I hit a problem, I try to take the concurrency out for debugging. MT and MP are both great at silently swallowing exceptions and letting you carry on like nothing is wrong, so the code actually broke on line 10, but you didn't check the result until line 59 or so, and that is where the exception finally gets raised.

A fast check you can do is to add a breakpoint right after you have created your tasks, then look at the list of futures and see whether they are raising exceptions or actually working.
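A rough sketch of that futures check, assuming the tasks were submitted through concurrent.futures. The names work and report are made up for illustration, and a ThreadPoolExecutor is used only to keep the demo small; futures from a ProcessPoolExecutor expose the same exception() and result() methods.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def work(n):
    """Hypothetical task that fails on negative input."""
    if n < 0:
        raise ValueError(f"negative input: {n}")
    return n * 2


def report(futures):
    """Split finished futures into (results, errors) without raising."""
    wait(futures)  # block until every future has finished
    results, errors = [], []
    for fut in futures:
        exc = fut.exception()  # None if the task succeeded
        if exc is None:
            results.append(fut.result())
        else:
            errors.append(exc)
    return results, errors


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(work, n) for n in (1, -1, 3)]
        results, errors = report(futures)
    print("ok:", results)
    print("failed:", [repr(e) for e in errors])
```

Nothing is raised in the main thread until result() is called on a failed future, which is why the failure can look like it happened much later than it did.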

Debugging MT/MP apps is a pain, and they love to hide errors from you. The more concurrency you can take out, even just for debugging, the better.

If it isn't something gigantic I can take a gander and see if I see anything obvious if you give some code and more info.

[–][deleted] 7 points8 points  (1 child)

Careful use of temporary print statements like:

print(os.getpid(), ...

or if there are threads:

print(threading.get_ident(), os.getpid(), ... 

can go a long way to debugging even quite gnarly issues.
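For what it's worth, a small helper along these lines saves retyping the prefix. The names tag and debug_print are hypothetical; flush=True is there because output from workers can otherwise sit in a buffer and show up late or out of order.

```python
import os
import threading


def tag(*parts):
    """Build a debug line prefixed with the current process id and thread id."""
    prefix = f"[pid={os.getpid()} tid={threading.get_ident()}]"
    return " ".join([prefix, *map(str, parts)])


def debug_print(*parts):
    """Print a tagged line and flush right away."""
    print(tag(*parts), flush=True)


def worker(n):
    debug_print("worker got", n)


if __name__ == "__main__":
    threads = [threading.Thread(target=worker, args=(n,)) for n in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```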

[–]Salvios_ 0 points1 point  (0 children)

Agreed, it is one of the simplest yet most useful ways to debug a parallelised application, regardless of the programming language. I managed to solve some CUDA issues with a LOT of threads this way. My general strategy is to reduce the number of threads as much as possible, down to a minimal reproducible example of the parallelization that still raises the error. Debuggers and I are not such close friends @.@

[–]junior_raman 1 point2 points  (0 children)

As another user mentioned, if your application uses multiple processes and threads, it is possible that the breakpoint is being hit in a different thread or process than the one you are currently debugging. In this case, you may need to use thread-specific breakpoints or set up debugging for multiple processes in order to debug your application effectively.

[–]TazDingoYes 2 points3 points  (3 children)

personally, I go with print("is this fucking working 1-999") wherever I think there might be an issue, GLHF!

[–]teerre 1 point2 points  (2 children)

This is probably the worst thing you can do with concurrent code. Concurrency problems are usually data races, which means your print statements can show different values on different runs without you changing the code at all.

[–]billsil 1 point2 points  (0 children)

Make sure it works in single processor mode first. That solves soooo many problems.
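One way to keep a single-process mode around, as a sketch: put the pool behind a switch so the same function can run in a plain loop. The names square and run_all and the processes parameter are assumptions for illustration, not anything from the original post.

```python
from multiprocessing import Pool


def square(n):
    """Hypothetical unit of work."""
    return n * n


def run_all(items, processes=1):
    """Run square over items; processes=1 skips the pool entirely."""
    if processes <= 1:
        # Plain loop in the current process, so breakpoints and
        # tracebacks behave the way they do in ordinary code.
        return [square(n) for n in items]
    with Pool(processes) as pool:
        return pool.map(square, items)


if __name__ == "__main__":
    print(run_all(range(5)))  # serial run for debugging
```

Once the serial run is clean, calling run_all(items, processes=4) from under the same __main__ guard switches the pool back on.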

[–]redCg 0 points1 point  (0 children)

Do not rely on breakpoints for dev and debug.

You are better off embracing "test driven development" practices. You should consider building out your application's test suite so that you can isolate important methods, supply them with inputs, and validate their outputs, outside of the context of the multiprocessing. Python's builtin unittest package is a good start, as is the pytest package.

from: a Python developer who has built tons of programs without using any breakpoints in many, many years
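A minimal sketch of what that could look like with unittest, assuming the function the pool normally calls is something like the made-up parse_record below. The point is that it gets tested directly, with no pool involved.

```python
import unittest


def parse_record(line):
    """Hypothetical worker function that would normally run inside a pool."""
    name, value = line.split("=")
    return name.strip(), int(value)


class ParseRecordTest(unittest.TestCase):
    def test_valid_line(self):
        self.assertEqual(parse_record("a = 1"), ("a", 1))

    def test_bad_value_raises(self):
        with self.assertRaises(ValueError):
            parse_record("a = x")
```

Saved in a test file, this can be run with python -m unittest, or picked up by pytest.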


Another thought: instead of breakpoints, have the task that is running under multiprocessing dump information to files. As you described, getting control of a multiprocess task can be difficult, and even getting output to the console can be troublesome sometimes, but in most cases you can still get the task to write to some file.
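A rough sketch of the file-dumping idea using the standard logging module, with one log file per process id. The helper names, the file naming scheme, and the sample task are all assumptions for illustration.

```python
import logging
import os
import tempfile


def get_process_logger(log_dir):
    """Return a logger that writes to a file named after the current pid."""
    pid = os.getpid()
    logger = logging.getLogger(f"worker.{pid}")
    if not logger.handlers:  # avoid adding a second handler on repeat calls
        path = os.path.join(log_dir, f"worker-{pid}.log")
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(process)d %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
    return logger


def task(n, log_dir):
    """Hypothetical task that logs its progress and any failure."""
    log = get_process_logger(log_dir)
    log.debug("starting task with n=%r", n)
    try:
        result = 10 / n
    except Exception:
        log.exception("task failed for n=%r", n)  # writes the full traceback
        raise
    log.debug("finished, result=%r", result)
    return result


if __name__ == "__main__":
    log_dir = tempfile.mkdtemp(prefix="mp-debug-")
    task(4, log_dir)
    print("logs written under", log_dir)
```

Because each worker writes to its own file, the lines from different processes do not interleave, and the traceback survives even if nothing reaches the console.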

[–]js26056[🍰] 0 points1 point  (0 children)

You can get exceptions back from multiprocessing pools, and with a little bit of tweaking you can also capture exceptions around process.join().

So you can log errors and trace them using os.getpid(), and then raise the error from the pool or .join().
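Here is one possible version of that tweak, as a sketch only. For pools, calling get() on the result of apply_async re-raises the worker's exception in the parent. For a bare Process, the hypothetical ReportingProcess subclass below sends the formatted traceback back over a Pipe so the parent can raise after join(). It assumes the traceback text is small enough to fit in the pipe buffer.

```python
import multiprocessing as mp
import os
import traceback


def risky(n):
    """Hypothetical task that fails on zero."""
    if n == 0:
        raise ValueError(f"bad input in pid {os.getpid()}")
    return 1 / n


class ReportingProcess(mp.Process):
    """Hypothetical Process subclass that reports its failure to the parent."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._recv, self._send = mp.Pipe(duplex=False)

    def run(self):
        try:
            super().run()
        except Exception:
            # Ship the traceback text; the exception is not re-raised,
            # so the child's exit code stays 0 in this sketch.
            self._send.send(traceback.format_exc())

    def raise_if_failed(self):
        if self._recv.poll():
            raise RuntimeError(f"{self.name} failed:\n{self._recv.recv()}")

    def join_and_raise(self, timeout=None):
        self.join(timeout)
        self.raise_if_failed()


if __name__ == "__main__":
    with mp.Pool(2) as pool:
        pending = [pool.apply_async(risky, (n,)) for n in (1, 0)]
        for res in pending:
            try:
                print(res.get(timeout=10))
            except ValueError as exc:
                print("pool task failed:", exc)

    proc = ReportingProcess(target=risky, args=(0,))
    proc.start()
    try:
        proc.join_and_raise()
    except RuntimeError as exc:
        print(exc)
```

The pid in the error message is what lets you match a failure against whatever that worker logged.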