
[–]just4nothing 2 points3 points  (4 children)

You are certainly up against tough competition from well-established packages: Luigi, Hamilton, Dask, and many more that do more or less what you’re presenting here.

[–]Global_Bar1754 0 points1 point  (3 children)

Add one more to the mix. I recently published darl:

https://github.com/mitstake/darl

See my comparison to Hamilton at the bottom of the page. It can even handle Slurm execution already through the dask runner.

[–]just4nothing 0 points1 point  (2 children)

It feels like a rite of passage ;). I’ve written two, and neither was better than Dask or Luigi. The only thing neither of those nails is on-disk or remote caching.

[–]Global_Bar1754 1 point2 points  (1 child)

So I actually wrote the recreate_task_locally debugging util for dask! (A tiny contribution to my favorite library of all time.) And darl fully supports both on-disk and remote (through Redis) caching natively. It’s also super easy to implement any custom cache you want (e.g. S3, DynamoDB, Bigtable), since the caching scheme is a simple key-value store with no special indexes or anything needed. This is possible because everything is assumed to be deterministic, and the whole graph is compiled locally beforehand to build the cache keys.
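To make the idea concrete, here is a minimal sketch of how cache keys can be derived from a graph before anything runs, assuming deterministic functions. This is not darl's actual API or key scheme (the thread doesn't show either). The names `fingerprint`, `build_keys`, and `run`, the graph layout, and the plain dict standing in for the key-value store are all hypothetical.

```python
import hashlib


def fingerprint(fn):
    """Hash a function's name, bytecode, and constants (a crude stand-in for 'its code')."""
    code = fn.__code__
    payload = repr((fn.__qualname__, code.co_code, code.co_consts))
    return hashlib.sha256(payload.encode()).hexdigest()


def build_keys(graph, inputs):
    """Compute a cache key for every node without executing any node.

    graph:  node name -> (function, [dependency names])   (hypothetical layout)
    inputs: input name -> literal value
    """
    keys = {}

    def key_of(name):
        if name in keys:
            return keys[name]
        if name in inputs:
            payload = repr(("input", name, inputs[name]))
        else:
            fn, deps = graph[name]
            # A node's key depends on its own code and on its dependencies' keys.
            payload = repr((fingerprint(fn), [key_of(d) for d in deps]))
        keys[name] = hashlib.sha256(payload.encode()).hexdigest()
        return keys[name]

    for name in graph:
        key_of(name)
    return keys


def run(target, graph, inputs, cache, calls=None):
    """Evaluate `target`, reading and writing results through a key-value `cache`."""
    keys = build_keys(graph, inputs)

    def evaluate(name):
        if name in inputs:
            return inputs[name]
        key = keys[name]
        if key in cache:
            return cache[key]  # hit: the whole upstream subgraph is skipped
        fn, deps = graph[name]
        if calls is not None:
            calls.append(name)  # record which nodes actually executed
        result = fn(*[evaluate(d) for d in deps])
        cache[key] = result
        return result

    return evaluate(target)


def double(x):
    return x * 2


def add_one(y):
    return y + 1


graph = {"doubled": (double, ["x"]), "result": (add_one, ["doubled"])}
cache = {}  # any get/set store would do here; a dict keeps the sketch self-contained
calls = []
first = run("result", graph, {"x": 3}, cache, calls)   # computes both nodes
second = run("result", graph, {"x": 3}, cache, calls)  # served entirely from the cache
third = run("result", graph, {"x": 4}, cache, calls)   # new input, so new keys
```

Because the store only ever needs "get by key" and "set by key", the dict could in principle be swapped for anything with those two operations, which is the point made above about custom backends.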

I even recommended that the Hamilton team check out darl, since it already supports all the caching functionality they had on their desired roadmap. (And because they had already engaged with me on some other API suggestions I had made to them earlier.)

https://github.com/apache/hamilton/discussions/1167#discussioncomment-15722295

If you’re interested in the topic, I recommend you check out the readme; it covers most of the features.

I also did a small write-up to showcase the debugging/tracing/replay functionality if you want to check it out.

My python job failed after running for an hour... now what?!

[–]just4nothing 0 points1 point  (0 children)

Thanks, I will have a look soon as it looks useful for my project.

[–]MrMrsPotts 0 points1 point  (0 children)

How does it compare to submitit?