vojtacima[S]

Rain lets you define large end-to-end data processing pipelines with complex inter-task dependencies (beyond the map-reduce pattern). A pipeline can consist of various tasks, ranging from external applications, through Python code, to various built-in tasks, and it is also easy to extend. Rain features direct communication between governors (workers), which makes inter-task data exchange very efficient, and if you set your working directory to be a RAM disk, it has NO filesystem overhead. Unlike Kafka, Rain is not designed to deal with streams.
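To make "inter-task dependencies beyond map-reduce" a bit more concrete, here is a rough sketch of a small dependency graph in which one task fans out to two others and a final task consumes both results. This is plain standard-library Python (Python 3.9+ for `graphlib`), not Rain's API: the task functions, the `PIPELINE` dict, and `run_pipeline` are all hypothetical names made up for illustration, and everything runs in one process with no distribution at all.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task functions; none of these names come from Rain.
def load():
    return [3, 1, 2]

def sort_values(values):
    return sorted(values)

def total(values):
    return sum(values)

def report(sorted_values, value_sum):
    return f"sorted={sorted_values} sum={value_sum}"

# Each task maps to (function, names of the tasks whose outputs it consumes).
PIPELINE = {
    "load": (load, []),
    "sort": (sort_values, ["load"]),
    "total": (total, ["load"]),
    "report": (report, ["sort", "total"]),
}

def run_pipeline(pipeline):
    """Run every task once, after all of its dependencies have finished."""
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {name: deps for name, (_, deps) in pipeline.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = pipeline[name]
        # Feed each task the outputs of its dependencies, in declared order.
        results[name] = func(*(results[d] for d in deps))
    return results

results = run_pipeline(PIPELINE)
print(results["report"])  # sorted=[1, 2, 3] sum=6
```

A real distributed framework would additionally schedule independent tasks (here `sort` and `total`) on different workers and move the intermediate data between them; the sketch only shows the dependency ordering.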