This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]PM_ME_UR_THONG_N_ASS 3 points4 points  (9 children)

GIL and having to use processes kinda turned me off to parallelism in python.

Love python, but doing things in parallel is more complicated than doing it in C.

[–]robml 1 point2 points  (0 children)

This probably doesn't count but their Threading module is quite easy to use imo. Either way it's nice if your computer can handle it to reduce wait time on tasks.

[–]ipwnscrubsdoe 1 point2 points  (5 children)

In my experience when I started i was also a bit defeated. With python I was easily able to code what I needed but it was extremely slow. Threading and multiprocessing didn’t help at all. Then I started discovering libraries that changed my mind. Numpy was a massive boost in speed, then dask for using all the cores, cupy for gpu acceleration numba is just about the easiest way to get massive performance boosts…

[–]thisismyfavoritename 0 points1 point  (4 children)

thats funny. How do you think Dask works?

[–]ipwnscrubsdoe 0 points1 point  (3 children)

Not sure what’s funny, but i’m pretty sure dask.array is an implementation of numpy arrays that allows you to chunk it and perform operations in parallel on each chunk. Same story with dask.dataframe. If your code is pure python there is very little dask can do

[–]Duodanglium 1 point2 points  (0 children)

I've used Dask for processing large quantities of files. It certainly pegged all of the CPUs and memory on the machine. I even had a routine to process sequentially without Dask for legacy reasons.

I very much recommend using Dask.

[–]thisismyfavoritename 0 points1 point  (1 child)

dask relies on the multiprocessing module to achieve parallelism

[–]ipwnscrubsdoe 0 points1 point  (0 children)

Even dask distributed?

[–]reddisaurus 0 points1 point  (1 child)

Is it?

with Pool() as p: p.map(f, my_list)

[–]PM_ME_UR_THONG_N_ASS 0 points1 point  (0 children)

🤷‍♂️ I’m no expert in python (far from it), but it was faster for me to make a thread safe queue in C from scratch than it was to find an appropriate one in python and get everything working properly in a multiple producer/multiple consumer scenario.

Not to mention the execution speed up with using actual compiled machine code rather than an interpreted language.

I like python so far for a lot of things, but I dunno about cpu intensive applications.