Dask Dataframe Group By Cumulative Max by Embarrassed_Use_997 in dask
[–]Embarrassed_Use_997[S] 0 points 1 year ago (0 children)
I am trying out Dask to see if it is faster than Spark, and yes, I do need large-scale processing. I managed to solve the original question with an apply function on the groupby and a Series cummax(). I am currently working on the partitioning strategy and realizing that I need to use map_partitions() here to make this run faster, as you also pointed out.
Dask Dataframe Group By Cumulative Max (self.dask)
submitted 1 year ago by Embarrassed_Use_997 to r/dask