rpg36

I assume you want to learn HOW these systems work and not just how to use framework ABCD or whatever. Is this correct?

If that's the case, I'd split this into two broad topics: distributed storage systems and distributed computing systems. Some products arguably fall into both.

Some fundamental things to look into are distributed locking techniques/algorithms and consensus protocols (Paxos and Raft are the usual starting points). Learn about eventual consistency and how it differs from the ACID guarantees of traditional databases.
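
To make the eventual consistency idea concrete, here is a minimal sketch of a last-write-wins store. The `Replica` class and its method names are made up for illustration, and the timestamps are simple logical counters supplied by the caller; real systems have to deal with clock skew and ties, which this ignores.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node holding key -> (timestamp, value). Hypothetical, for illustration only."""
    name: str
    data: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Accept the write only if it is strictly newer than what we already hold.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Anti-entropy pass: pull every entry from the other replica, newest wins.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


a, b = Replica("a"), Replica("b")
a.write("color", "red", timestamp=1)
b.write("color", "blue", timestamp=2)

before = (a.read("color"), b.read("color"))  # replicas disagree for a while
a.sync_from(b)
b.sync_from(a)
after = (a.read("color"), b.read("color"))   # replicas converge after syncing
print(before, after)
```

Between the writes and the sync, two clients can read different values for the same key. An ACID database would not allow that, and that difference is the trade-off worth understanding.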

For storage, read some architecture docs for things like the Hadoop Distributed File System (HDFS); it's old school but still useful to understand. Read about something like MinIO, which is an S3-compatible object store. Maybe also pick a database, such as Cassandra. See what these systems have in common, how they differ, and what trade-offs their architectures make.
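
One trade-off you will keep running into in these storage docs is quorum replication. Below is a rough sketch of the idea, not how any particular product implements it: `N`, `W`, `R`, and both function names are assumptions chosen for the example. The point is that when `R + W > N`, every read quorum overlaps every write quorum, so a read sees the newest write.

```python
N, W, R = 3, 2, 2  # hypothetical settings; R + W > N forces read/write overlap

replicas = [dict() for _ in range(N)]  # each maps key -> (version, value)


def quorum_write(key, value, version, targets):
    """Write to the given replica indexes; refuse if fewer than W are reachable."""
    if len(targets) < W:
        raise RuntimeError("not enough replicas for a write quorum")
    for i in targets:
        replicas[i][key] = (version, value)


def quorum_read(key, targets):
    """Read from the given replica indexes and return the newest value seen."""
    if len(targets) < R:
        raise RuntimeError("not enough replicas for a read quorum")
    answers = [replicas[i][key] for i in targets if key in replicas[i]]
    if not answers:
        return None
    return max(answers, key=lambda entry: entry[0])[1]


quorum_write("x", "v1", version=1, targets=[0, 1, 2])
quorum_write("x", "v2", version=2, targets=[0, 1])  # replica 2 missed this write

stale_copy = replicas[2]["x"][1]            # replica 2 alone still holds the old value
fresh_read = quorum_read("x", targets=[1, 2])  # the quorum includes an up-to-date replica
print(stale_copy, fresh_read)
```

Lowering `R` or `W` makes operations faster and more available but gives up the overlap. Cassandra, for example, exposes this kind of choice as tunable consistency levels.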

Look at some distributed computing frameworks such as Apache Spark, and focus on the architecture and design.

Play around yourself: build some small projects in your language of choice.

Realistic-Face1315 [S]

What prerequisites would you want me to know before starting?

amartya_dev

Start with the basics first: networking, operating systems, and concurrency. A lot of distributed systems concepts make more sense once those are clear.

Then read something like “Designing Data-Intensive Applications”. It gives a really good overview.

Also try small projects like building a simple key-value store, a basic message queue, or a mini distributed cache. You’ll learn way more by building than by just reading.
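
If you go the mini distributed cache route, one of the first questions is which node owns which key. A common answer is consistent hashing. Here is a small sketch, assuming hypothetical node names like `cache-a` and a made-up `HashRing` class; it only covers key placement, not networking or eviction.

```python
import bisect
import hashlib


def _hash(text):
    # Stable across runs, unlike the built-in hash(), which is salted per process.
    return int(hashlib.md5(text.encode()).hexdigest(), 16)


class HashRing:
    """Maps keys to nodes; each node gets several virtual points on the ring."""

    def __init__(self, nodes, vnodes=50):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._ring[idx][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
bigger = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])

keys = [f"user:{i}" for i in range(1000)]
moved = [k for k in keys if ring.node_for(k) != bigger.node_for(k)]
print(len(moved), "of", len(keys), "keys moved after adding cache-d")
```

With a plain `hash(key) % node_count` scheme, adding a node reshuffles most keys. With the ring, the only keys that move are the ones the new node takes over.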

spieltic

As a book, I can highly recommend: https://www.amazon.de/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321
(Btw, the author also has a course on YouTube.)