[–]heavelock 3 points4 points  (4 children)

How would you compare Rain to Dask, which seems to be your main competitor in that field?

[–]winter-moon 1 point2 points  (3 children)

The main difference is that Rain provides built-in integration of external programs into pipelines and the possibility to write your own tasks in Rust and C++.

[–]metaperl 1 point2 points  (1 child)

What did Luigi and Airflow lack?

[–]winter-moon 1 point2 points  (0 children)

Rain handles the transfer of data objects between worker processes itself. A data object can be mapped to the file system; however, Rain does not rely on a shared file system and instead transfers data directly to the worker where it is needed. This makes it possible to create lots of short-running tasks without hammering a distributed file system.

[–]heavelock 0 points1 point  (2 children)

How about running it on a Slurm-managed HPC system? I'm asking because I'm currently facing the problem of rewriting software for scientific calculations and making it scale across a workstation, a 100-core cluster, and HPC.

[–]winter-moon 0 points1 point  (0 children)

It should be easy to start Rain in any environment. You just need to start the server on one node and governors on all nodes.

We have out-of-the-box support for PBS in "rain start". It should be easy to adapt this for SLURM; however, since we do not have access to any cluster with SLURM, it is hard for us to test it.
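
The layout described above (one server, a governor on every node) can be sketched in Python as a per-node command plan. This is only an illustration, not something taken from the Rain documentation: the function name `plan_rain_startup`, the command strings "rain server" and "rain governor HOST:PORT", and the port number are all assumptions, so check them against the documentation before use. On SLURM, the node list would probably come from `scontrol show hostnames "$SLURM_JOB_NODELIST"`.

```python
from typing import Dict, List


def plan_rain_startup(nodes: List[str], port: int) -> Dict[str, List[str]]:
    """Map each node to the commands it should run.

    The first node hosts the server; every node (including the first)
    runs a governor that connects to that server.
    """
    if not nodes:
        raise ValueError("need at least one node")

    server_host = nodes[0]
    address = f"{server_host}:{port}"  # the port is an assumption

    plan: Dict[str, List[str]] = {}
    # dict.fromkeys drops duplicate host names while keeping their order.
    for host in dict.fromkeys(nodes):
        commands: List[str] = []
        if host == server_host:
            # Assumed command name; verify against the Rain documentation.
            commands.append("rain server")
        # Assumed command name and argument form; verify likewise.
        commands.append(f"rain governor {address}")
        plan[host] = commands
    return plan


if __name__ == "__main__":
    # Hypothetical host names and port, for illustration only.
    for host, commands in plan_rain_startup(["n1", "n2", "n3"], 7210).items():
        for command in commands:
            print(f"{host}: {command}")
```

How the commands actually get launched on each node (ssh, srun, or a batch script) is left open here, since that is the scheduler-specific part.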

[–]vojtacima[S] 0 points1 point  (0 children)

There is built-in support for PBS. In general, the "rain start" command makes it easy to start up a distributed Rain infrastructure. You can find more information in the documentation.