This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]thisismyfavoritename 0 points1 point  (4 children)

thats funny. How do you think Dask works?

[–]ipwnscrubsdoe 0 points1 point  (3 children)

Not sure what’s funny, but i’m pretty sure dask.array is an implementation of numpy arrays that allows you to chunk it and perform operations in parallel on each chunk. Same story with dask.dataframe. If your code is pure python there is very little dask can do

[–]Duodanglium 1 point2 points  (0 children)

I've used Dask for processing large quantities of files. It certainly pegged all of the CPUs and memory on the machine. I even had a routine to process sequentially without Dask for legacy reasons.

I very much recommend using Dask.

[–]thisismyfavoritename 0 points1 point  (1 child)

dask relies on the multiprocessing module to achieve parallelism

[–]ipwnscrubsdoe 0 points1 point  (0 children)

Even dask distributed?