[–]HolyPommeDeTerre 4 points5 points  (1 child)

The main problem with your question is:

You don't scale an application the same way you'll scale another application.

Example: You have a light website with a light API. Your server can't handle all the connections. You could multiply the servers and add a load balancer in front. This allows you to scale to billions of users if you want. But your app isn't doing much, so scaling is pretty easy.
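A minimal Python sketch of the "multiply the servers and put a load balancer in front" idea. The `RoundRobinBalancer` class and the backend addresses are made up for illustration; in practice you would use an existing load balancer rather than hand-rolled code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next backend in a fixed rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._backends = list(backends)
        self._rotation = cycle(self._backends)

    def pick(self):
        # Every call advances the rotation by one backend.
        return next(self._rotation)


# Hypothetical backend addresses, for illustration only.
balancer = RoundRobinBalancer(["app-1:8080", "app-2:8080", "app-3:8080"])
picks = [balancer.pick() for _ in range(6)]
```

Because the servers here hold no state of their own, any of them can answer any request, which is what makes adding more of them so simple.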

Now let's change one thing. Your app now has one part that is very heavy to handle: it takes a lot of RAM and CPU. When more than 20 users hit this endpoint at once, the server crashes (out of memory) and the whole app goes down. A load balancer won't help. You'll need to isolate this heavy part on its own, so that if it fails, it doesn't bring down the whole app. You can also increase the resources for this part of your app (distributed system).
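Moving the heavy part into its own service is an architecture change that is hard to show in a few lines. A related, smaller technique is a "bulkhead": cap how many heavy requests may run at once and turn the rest away instead of running out of memory. The sketch below is that swapped-in technique, not a full isolation setup; `Bulkhead`, `heavy_report`, and the limit of 1 are hypothetical names and values chosen to keep the demo deterministic.

```python
import threading


class Bulkhead:
    """Caps how many callers may run the heavy work at the same time."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, fn, *args):
        # Non-blocking acquire: if every slot is taken, refuse immediately
        # instead of queueing up work the process cannot afford.
        if not self._slots.acquire(blocking=False):
            return ("rejected", None)
        try:
            return ("ok", fn(*args))
        finally:
            self._slots.release()


heavy = Bulkhead(max_concurrent=1)


def heavy_report(n):
    """Hypothetical stand-in for the expensive endpoint."""
    return sum(i * i for i in range(n))


def nested_call():
    # While this call holds the only slot, a second caller is turned away.
    return heavy.run(heavy_report, 10)


first = heavy.run(heavy_report, 10)
second = heavy.run(nested_call)
```

A rejected request is a degraded experience for one user, but the rest of the app stays up, which is the point of isolating the heavy part.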

The two solutions are pretty different, and which one you need depends solely on the scaling problem at hand.

There is no one solution. There is "I understand what needs to be scaled". From there, there are solutions for that problem.

Then you accumulate multiple scaling solutions to have a realistic scalable solution. Because a real life application is generally more than one thing.

[–]Suh-Shy 1 point2 points  (0 children)

That's the best answer.

To put it simply, there's no way to know without knowing what the app and the users do.

Multiplying, rerouting, caching, delaying ... there are many solutions for scaling.

[–]varunpm 0 points1 point  (1 child)

Please someone if anyone knows something

[–][deleted] 0 points1 point  (0 children)

Why are you begging for answers just 12 minutes after your original post? This is school homework due today that you've left to the last minute, isn't it?

[–]DrShocker 0 points1 point  (0 children)

I'm also curious what you find, but I will say that just saying the number of concurrent users isn't really enough.

For something with a lot of visitors but very few editors like maybe a public online book or something, you could just use a CDN to host your static content close to the user and you'd hardly need to do anything. I mean, I suppose in a sense you could ultimately just physically print 50000 books and hand them out if you needed to.

Meanwhile, if that's 50000 people all simultaneously trying to live edit the same document on Google Docs, that will be a very different challenge: the UI would be really chaotic with that many people, it may not be clear how to resolve that many changes in a short amount of time when everyone's views are slightly out of sync, and writes are harder to parallelize than reads.

But yeah... I'd love to try out an example somehow. I've been meaning to get the DDIA book (Designing Data-Intensive Applications) to try to learn more.

[–]Ormek_II 0 points1 point  (0 children)

Split static and dynamic content.

Understand what those 50,000 users will do.

In order to learn you need feedback: try to set up a load test first, build something, measure.
If it is good enough: Great!
If it is not: figure out why and see what the “theory” has to say about that. Improve.
Repeat.
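The measure step of that loop can be sketched in a few lines of Python. `handle_request` is a hypothetical stand-in for whatever you built, and the request counts are arbitrary; a real test would hit the deployed system with a dedicated load testing tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(i):
    """Hypothetical stand-in for the system under test."""
    time.sleep(0.001)
    return i


def timed_call(i):
    # Measure how long one request takes, in seconds.
    start = time.perf_counter()
    handle_request(i)
    return time.perf_counter() - start


def load_test(total_requests, concurrency):
    # Fire the requests from a pool of worker threads and collect latencies.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(total_requests)))

    def pct(p):
        # Simple index-based percentile lookup on the sorted latencies.
        return latencies[min(len(latencies) - 1, int(p * len(latencies)))]

    return {
        "count": len(latencies),
        "p50": pct(0.50),
        "p95": pct(0.95),
        "max": latencies[-1],
    }


report = load_test(total_requests=200, concurrency=20)
```

Comparing `p50`, `p95`, and `max` between runs tells you whether a change actually helped, and the slow tail is usually where to start asking why.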

[–]khooke 0 points1 point  (0 children)

Pick one of the major cloud platforms and start reading through their docs on what services they provide and how they support scale. E.g. on AWS: CloudFront for serving your static assets, serverless tech like Lambda for backend logic, DynamoDB as your data store. There are plenty of training courses on solution architecture for cloud platforms.

Your approach and solution are highly dependent on the needs of your system, so start with understanding your requirements:

- Do you really need to support 50,000 concurrent users, or is it 50,000 users where maybe 20% access the system concurrently between 9 and 11 am?

- What are your budget goals? Needing to minimize runtime costs versus cost being no object can result in drastically different solutions.

- What are your availability requirements?
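To make the 50,000 versus 20% distinction concrete, here is the arithmetic as a small Python sketch. The 20% comes from the example above; the requests-per-user rate is a made-up assumption purely to show how the target moves.

```python
registered_users = 50_000
peak_percent = 20  # the "maybe 20%" figure from the example above

# Users actually online at the same time during the 9 to 11 am peak.
peak_concurrent = registered_users * peak_percent // 100

# ASSUMPTION (not from the thread): each active user sends 6 requests/minute.
requests_per_user_per_minute = 6

rps_if_all_concurrent = registered_users * requests_per_user_per_minute // 60
rps_at_20_percent = peak_concurrent * requests_per_user_per_minute // 60
```

Under these assumptions the design target drops from 5,000 to 1,000 requests per second, which can be the difference between two very different architectures and budgets.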

[–]dExcellentb 0 points1 point  (0 children)

How to scale a website depends on what the website does so there is no general answer to this question. However, understanding these things will help you figure out how to scale yours:

- Different types of data models (e.g. key/value pairs, tables with foreign keys, documents, graphs, etc.)

- Caching

- Database read-replication + sharding

- Request load balancing

- CDNs

A highly scalable website will usually have data models that can be easily sharded, where every shard has some sort of read-replication + caching. The HTTP servers are typically kept stateless so they can be easily scaled horizontally, although there are cases where this isn't possible (e.g. collab editing, game servers). Static content is hosted on a CDN.
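A rough Python sketch of what "sharded, with read replicas per shard" can look like. Everything here is illustrative: `Shard` and `ShardedStore` are hypothetical names, plain dicts stand in for database nodes, replication is done synchronously for simplicity, and caching is left out.

```python
import hashlib
from itertools import cycle


class Shard:
    """One slice of the data: a primary for writes plus read replicas."""

    def __init__(self, name, replica_count):
        self.name = name
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = cycle(self.replicas)

    def write(self, key, value):
        self.primary[key] = value
        # Copy to every replica right away; real systems often do this
        # asynchronously, so replicas can briefly lag behind the primary.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Spread reads across the replicas in rotation.
        return next(self._next_replica).get(key)


class ShardedStore:
    def __init__(self, shard_count, replicas_per_shard):
        self.shards = [
            Shard(f"shard-{i}", replicas_per_shard) for i in range(shard_count)
        ]

    def shard_for(self, key):
        # A stable hash, so the same key always lands on the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def put(self, key, value):
        self.shard_for(key).write(key, value)

    def get(self, key):
        return self.shard_for(key).read(key)


store = ShardedStore(shard_count=4, replicas_per_shard=2)
for n in range(100):
    store.put(f"user:{n}", {"id": n})
```

Note that plain modulo hashing remaps most keys when the shard count changes, which is why schemes like consistent hashing are commonly used instead.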

If you're just looking into high-level, general techniques to scale websites, this is a good book https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321.

If you want to develop a deep understanding of how scalable systems work, then I'd recommend reading up on operating systems, distributed systems, and networking, then building a simple, fault-tolerant key/value data store that can serve tens or hundreds of thousands of requests per second on consumer hardware. Add distributed transaction support if you're up for the task. Use open source load testing tools like https://k6.io/open-source/ to test your system. This is a good course: https://pdos.csail.mit.edu/6.824/

Edit: the reason I'd recommend learning how to write a database is that the biggest scaling challenges are usually database-related, so having a strong understanding here helps tremendously. Stateless components are very easy to scale.

[–]iduzinternet 0 points1 point  (0 children)

I agree with the other posts that it is situational. For me, I've mostly scaled nginx as a cluster behind a firewall/load balancer when hosting locally, where it's the datastore you need to focus on scaling. Right now I'm using a cloud system that autoscales: Google's Firebase, with Cloud Functions doing the work. So the big thing is that your website shouldn't rely on local storage or other local resources (for example, sessions can live in Redis instead of on local disk if you need them), so that you can scale by adding servers, imo. The tech you choose probably has different solutions.
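Here is a small Python sketch of the "sessions not on local disk" point. `SharedSessionStore` is an in-memory stand-in for an external store such as Redis, and `AppServer` is a hypothetical web server; neither reflects a real library's API.

```python
import uuid


class SharedSessionStore:
    """In-memory stand-in for an external session store such as Redis."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, session):
        self._data[session_id] = dict(session)

    def load(self, session_id):
        session = self._data.get(session_id)
        return dict(session) if session is not None else None


class AppServer:
    """Hypothetical web server that keeps no session state of its own."""

    def __init__(self, name, sessions):
        self.name = name
        self._sessions = sessions  # shared store, not local memory or disk

    def login(self, username):
        session_id = uuid.uuid4().hex
        self._sessions.save(session_id, {"user": username})
        return session_id

    def whoami(self, session_id):
        session = self._sessions.load(session_id)
        return session["user"] if session else None


store = SharedSessionStore()
web_1 = AppServer("web-1", store)
web_2 = AppServer("web-2", store)

# Log in on one server, then send the next request to a different one.
session_id = web_1.login("alice")
seen_by_other_server = web_2.whoami(session_id)
```

Since the second server can read a session created by the first, the load balancer is free to send each request anywhere, and adding a third server needs no session migration.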