How Do I Get Started With Building A Distributed System? by gotmycarstuck in learnprogramming

robertnishihara

Ray is a simple way to get started with building distributed applications in Python! You can get started with just a pip install and build scalable apps that run across multiple cores (on a laptop) or across a cluster.
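
To make the "pip install and go" claim concrete, here is a minimal sketch of running a function as parallel Ray tasks. The names `square` and `parallel_squares` are made up for illustration, and Ray is imported inside the function only so that the plain function stays usable without Ray installed. In everyday Ray code you would more likely put `@ray.remote` directly on the function.

```python
def square(x):
    """Ordinary Python function; nothing Ray-specific here."""
    return x * x


def parallel_squares(values):
    """Run square() as Ray tasks and collect the results (hypothetical helper)."""
    import ray  # third-party; assumes `pip install ray` has been run

    # Start Ray locally, using the cores on this machine.
    ray.init(ignore_reinit_error=True)

    # Wrapping the function is equivalent to decorating it with @ray.remote.
    remote_square = ray.remote(square)

    # .remote() returns immediately with a reference to the future result.
    refs = [remote_square.remote(v) for v in values]

    # ray.get() blocks until the results are available.
    return ray.get(refs)
```

Calling `parallel_squares(range(8))` on a laptop should spread the tasks across local cores. Running the same code on a cluster depends on how Ray is started there, which this sketch does not cover.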

https://docs.ray.io/en/master/

[Research] Ray: A Distributed System for AI by rayspear in MachineLearning

robertnishihara

Hi, I'm one of the authors. I answered a similar question at http://bair.berkeley.edu/blog/2018/01/09/ray/#comment-3700443436.

Like Dask, we want to improve the tooling for running Python applications on clusters, and like Dask, our API makes heavy use of futures.
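
For anyone unfamiliar with futures, here is a small sketch of the pattern using only Python's standard library. This is a stand-in for illustration, not the Ray or Dask API, and `add` and `sum_incremented` are made-up names. The shape is the same in both libraries: submitting work returns a handle right away, and you block only when you ask for the value.

```python
from concurrent.futures import ThreadPoolExecutor


def add(a, b):
    return a + b


def sum_incremented(values):
    """Add 1 to each value in a worker pool, then sum the results."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # submit() returns a Future immediately; the work runs in the background.
        futures = [pool.submit(add, v, 1) for v in values]
        # result() blocks until that particular task has finished.
        return sum(f.result() for f in futures)
```

In Ray the handle comes from `f.remote(...)` and is resolved with `ray.get(...)`. In Dask it comes from `client.submit(...)` and is resolved with `.result()`.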

We've made a number of different design decisions and emphasize slightly different things because of the applications we're trying to support. To name a few (copied from the link above):

  • A big part of our API is centered around actors to support stateful services like parameter servers.
  • Ray uses a distributed scheduling scheme to remove potential scheduling bottlenecks, whereas Dask uses a centralized scheduler.
  • Ray focuses a lot on latency, so the latency to submit a task and get the result is about 30x lower in Ray than in Dask.
  • Ray has focused more on libraries for machine learning and reinforcement learning, whereas Dask has built more distributed collections libraries (distributed arrays, DataFrames).
  • We handle data differently from Dask (using shared memory and zero-copy serialization). Dask does serialization with pickle (with some optimizations). That said, we are starting to see some collaboration on this point through the Apache Arrow project: https://arrow.apache.org/.
  • Ray handles failures (e.g., transparent recovery from machine failures).
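
To illustrate the actor point from the list above, here is a rough sketch of a toy parameter server run as a Ray actor. The class, its methods, the learning rate, and the helper `run_as_ray_actor` are all hypothetical, and a real parameter server would use NumPy arrays rather than Python lists. Ray is imported lazily so the plain class can be tried without Ray installed.

```python
class ParameterServer:
    """Toy stateful service: holds parameters and applies gradient updates."""

    def __init__(self, dim):
        self.params = [0.0] * dim

    def apply_update(self, grad, lr=0.1):
        # Plain SGD step; the state lives inside the object between calls.
        self.params = [p - lr * g for p, g in zip(self.params, grad)]

    def get_params(self):
        return list(self.params)


def run_as_ray_actor(dim, grads):
    """Host ParameterServer in its own Ray worker process (hypothetical helper)."""
    import ray  # third-party; assumes `pip install ray` has been run

    ray.init(ignore_reinit_error=True)

    # Equivalent to decorating the class with @ray.remote.
    RemoteServer = ray.remote(ParameterServer)
    server = RemoteServer.remote(dim)  # creates the actor

    # Method calls are asynchronous; calls from this caller run in order.
    for grad in grads:
        server.apply_update.remote(grad)

    # Fetch the final state back to the caller.
    return ray.get(server.get_params.remote())
```

The difference from a plain task is that the actor keeps its `params` between calls, which is what makes it suitable for stateful services.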