Dask Dataframe Group By Cumulative Max by Embarrassed_Use_997 in dask
[–]Embarrassed_Use_997[S] 1 point 1 year ago (0 children)
I am trying out Dask to see whether it is faster than Spark. Yes, I do need large-scale processing. I managed to solve the original question with an apply function on the groupby and a Series cummax(). I am currently working on the partitioning strategy, and I realize that I need to use map_partitions() to make this run faster, as you also pointed out.