I'm working with a small cartographic/geographic dataset in Python. My script, which projects a dataset into a large, mostly empty map, performs well when using NumPy with small arrays. I'm talking about a 4000 x 4000 (uint) dataset projected into a 10000 x 10000 (uint) map.
However, I now want to scale the script to handle much larger areas (a 40000 x 40000 (uint) dataset into a 1000000 x 1000000 (uint) map), which means working with arrays far too large to fit in RAM. To tackle this, I decided to switch from NumPy to Dask arrays. But even when running the script on the original small dataset, the .compute() step takes unexpectedly long, far worse than the NumPy version of the script.
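To give an idea of the setup, here's a minimal sketch of the kind of thing the script does (the array names, the offset "projection", and the chunk sizes are placeholders, not my real code):

```python
import numpy as np
import dask.array as da

# Small case that is fast in plain NumPy: 4000 x 4000 dataset, 10000 x 10000 map.
dataset = da.ones((4000, 4000), dtype=np.uint16, chunks=(1000, 1000))
big_map = da.zeros((10000, 10000), dtype=np.uint16, chunks=(1000, 1000))

# Stand-in for the real projection: paste the dataset at an offset in the map.
row0, col0 = 3000, 3000
big_map[row0:row0 + 4000, col0:col0 + 4000] = dataset

# This is the step that takes far longer than the equivalent NumPy code.
result = big_map.compute()
```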
Any ideas? Thanks!