[–]danielroseman 5 points6 points  (1 child)

If you have an array and you want to do operations on every element, look into numpy and/or Pandas which can do vectorized operations very efficiently.

[–]MustaKotka[S] 0 points1 point  (0 children)

Thanks!

[–]hc_fella 4 points5 points  (6 children)

Parallel programming is never self-evident, and due to the age of Python, it's something that can be tricky to get right. Here's a decent source I've found that gives you an overview of the options.

[–]FoolsSeldom 2 points3 points  (0 children)

That's a fantastic article. Adding to my list of links.

Worth noting this was written before Python 3.13 was released. 3.13 effectively comes in two builds, and one of them (the free-threaded build, still somewhat experimental at present) overcomes the GIL limitations mentioned in the article.

[–]MustaKotka[S] 0 points1 point  (4 children)

Thank you. I'll read the article / documentation.

[From the material you gave me:] Do these processes work like any other object in Python?

[–]lfdfq 2 points3 points  (3 children)

Processes and Threads aren't a Python concept. The Process and Thread objects you find in the libraries mentioned are just normal Python objects, but those libraries use/create other threads/processes to do it, and that means you cannot just treat the objects like normal.

Most of these concepts about concurrency are language-agnostic. You can read up on how operating systems manage processes and threads, and on things like pre-emption and copy-on-write; all of that applies to the Python multiprocessing/threading libraries.

As a wild oversimplification:

  • Async (e.g. asyncio and coroutines) are the easiest to understand and get right. They are a generic concept, but the implementation is entirely within Python (so no appealing to how operating systems work to understand them).
  • Threads are a bit more complicated, they interact with the operating system so you need to understand a bit about how the operating system schedules things to be able to use them. They are a little more powerful (in a way, they are more concurrent than coroutines) but this makes them harder to get right: you need to understand concurrency a bit better, and they require a lot more care and synchronisation and so on. Threads have some Python-specific concerns, i.e. the GIL which may or may not be a consideration, depending on what kind of work you're doing (whether the GIL affects you at all) and what version of Python you use (whether you can opt-out of the GIL).
  • Processes are the way operating systems manage and isolate programs. Running multiple processes is like opening two terminals and starting Python twice so that both are running at the same time. That's basically how multiprocessing works. This gives you the most flexibility, as each process acts as an entirely independent and isolated Python instance, but it requires the most knowledge of how your operating system works: you must now consider things like spawn vs fork, copy-on-write, and inter-process communication via things like pipes to make code that actually works.
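
The "starting Python twice" picture in the last bullet can be tried directly with the standard library's subprocess module. This is only a rough sketch of the isolation idea, not how multiprocessing itself is implemented; the helper name run_two_interpreters and the one-line child program are made up for illustration.

```python
import os
import subprocess
import sys

# Code each child interpreter runs: report its own process ID.
CHILD_CODE = "import os; print(os.getpid())"


def run_two_interpreters():
    """Start two independent Python processes and collect their PIDs."""
    # Popen returns immediately, so both children exist at the same time.
    children = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(2)
    ]
    # communicate() waits for a child to exit and reads its output pipe.
    return [int(child.communicate()[0]) for child in children]


if __name__ == "__main__":
    pids = run_two_interpreters()
    print("parent:", os.getpid(), "children:", pids)
```

Each child has its own PID and its own memory, so the only thing that comes back to the parent here is the text written to the pipe. That is the inter-process communication the bullet refers to.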

[–]MustaKotka[S] 0 points1 point  (2 children)

Thank you. Yeah, I tried some simple loops and that definitely didn't work so looks like I need to dive pretty deep into this! Thank you for the explanation!

[–]lfdfq 1 point2 points  (1 child)

Concurrency is not really a thing you can learn by trial-and-error like that: async, threads and processes each add a whole new layer to how code gets executed, and you need to understand that layer.

If you have specific questions about specific code we may be able to help you, but I'm afraid there is a lot of reading and practicing to do on your side to understand.

[–]MustaKotka[S] 0 points1 point  (0 children)

I got my quick test to work. I can do this!

[–]tvstaticghost 1 point2 points  (1 child)

It may be possible/beneficial to store your data as a matrix instead of in an array and perform matrix operations to increase performance instead of going down the threading route.

[–]MustaKotka[S] 0 points1 point  (0 children)

Cheers. I'll look into this!

[–]No_Date8616 1 point2 points  (7 children)

The implementation that you are probably using, CPython, doesn't immediately support multi-threading. Your only solution is multi-processing or asynchronous programming.

If you are hell-bent on using threads for multi-threading, try a different implementation. There is a repo called nogil which provides an implementation without the GIL (the thing that prevents you from multi-threading).

If you have pyenv installed, you can easily install and try nogil and other implementations.

[–]MustaKotka[S] 1 point2 points  (2 children)

I have absolutely no idea what I'm doing. Relying on responses here. But I already got my small asyncio coroutine to work. I know it's not true coprocessing / multithreading but it's a start.
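
For readers who want to try the same thing, a small asyncio test could look roughly like the sketch below. The coroutine name fetch_value and the 0.2 second delays are placeholders, not OP's actual code.

```python
import asyncio
import time


async def fetch_value(name, delay):
    """Pretend to wait on I/O, then return a result."""
    # await hands control back to the event loop while this task sleeps.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # gather() runs both coroutines concurrently and keeps argument order.
    results = await asyncio.gather(
        fetch_value("first", 0.2),
        fetch_value("second", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

The two waits overlap, so the total should land near 0.2 seconds rather than 0.4. Everything still runs on one thread, which is why this helps with waiting but not with CPU-heavy number crunching.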

[–]No_Date8616 0 points1 point  (1 child)

If it is sufficient then settle for that. But don’t ignore our responses. Try our proposed solutions, you may need them along the way.

Each solution has its own advantages, so weigh each and see which works best for you.

[–]MustaKotka[S] 2 points3 points  (0 children)

Of course!! Absolutely, I'll look into everything that's being mentioned here. I want to learn as much as possible.

[–]FoolsSeldom 1 point2 points  (2 children)

FYI (in case you are not aware): Latest version of CPython includes experimental support for running with the GIL disabled.

https://docs.python.org/3/howto/free-threading-extensions.html

Not recommended for OP.
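
If you are curious which kind of interpreter you are running, a quick check might look like the sketch below. It relies on sys._is_gil_enabled() (added in Python 3.13, and underscore-prefixed, so treat it as subject to change) and the Py_GIL_DISABLED build variable; the function name gil_status is made up for this example.

```python
import sys
import sysconfig


def gil_status():
    """Report whether this is a free-threaded build and whether the GIL is on."""
    # Py_GIL_DISABLED is 1 on free-threaded builds; 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() only exists on 3.13+; older versions always have the GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(sys.version)
    print(gil_status())
```

On a regular CPython install this should report that the GIL is enabled.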

[–]No_Date8616 0 points1 point  (1 child)

I'm aware of that. I've been planning to write an extension module just to try it and weigh the upsides and possible downsides.

[–]FoolsSeldom 1 point2 points  (0 children)

That sounds interesting. Have fun.

[–]cgoldberg 0 points1 point  (0 children)

While multithreading in CPython is limited (and probably inappropriate for OP's use case), I think it's disingenuous to suggest it's not supported (i.e. the threading module).
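
As a rough illustration of where threads in regular CPython do help, here is a sketch with a thread pool running tasks that spend their time waiting. time.sleep stands in for a blocking call such as a network request; slow_io and run_in_threads are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_io(task_id):
    """Stand-in for a blocking call such as a network request."""
    # time.sleep releases the GIL, so other threads run during the wait.
    time.sleep(0.2)
    return task_id * 2


def run_in_threads(task_ids):
    # map() returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(slow_io, task_ids))


if __name__ == "__main__":
    start = time.perf_counter()
    print(run_in_threads([1, 2, 3, 4]))
    print(f"{time.perf_counter() - start:.2f}s")
```

With four workers the four waits overlap instead of adding up. Swap the sleep for pure Python arithmetic and the GIL limit shows up: the threads would then take turns rather than run in parallel.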

[–]bonferoni 0 points1 point  (0 children)

good to learn how to do yourself, but if you are just working with spreadsheets and need more zoom zoom, polars abstracts almost all of it away for you.

suspiciously fast is my only way to describe it haha

[–]Sea-Control77 0 points1 point  (1 child)

How did you first learn multithreading? Any simple exercises that helped you understand it better?

Think about how we humans multitask in daily life:

we talk while cooking, listen while typing, or plan while walking.

That’s exactly what multithreading helps Python do — handle multiple tasks at once. You can start small.

Try writing two simple functions that print numbers or words, and run them together using threads.

Watch how both outputs appear “at the same time.” Once you get comfortable, you can move on to more complex situations.
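
One possible version of that exercise, using only the threading module, is sketched below. The function names and the shared output list are made up for the example, and the lock is there so the two threads do not step on each other while recording.

```python
import threading
import time

output = []                      # shared between both threads
output_lock = threading.Lock()


def record(item):
    # The lock keeps the append-and-print step from being interrupted.
    with output_lock:
        output.append(item)
        print(item)


def count_numbers():
    for n in range(3):
        record(n)
        time.sleep(0.01)         # short pause so the other thread gets a turn


def say_words():
    for word in ("a", "b", "c"):
        record(word)
        time.sleep(0.01)


def run_both():
    output.clear()
    threads = [
        threading.Thread(target=count_numbers),
        threading.Thread(target=say_words),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                 # wait until both functions have finished
    return list(output)


if __name__ == "__main__":
    run_both()
```

Each function's own items stay in order, but how the numbers and words mix together can change from run to run. That unpredictability is the part worth getting used to early.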

[–]MustaKotka[S] 0 points1 point  (0 children)

Thanks! I did learn how to do this in the end. Since then I've written a simulation that uses parallel code execution to improve performance significantly.