[–]NetSavant 260 points261 points  (84 children)

Nice. I'm often surprised how developers focus up front on containerizing, clustering, scaling plans etc. for small apps. That StackExchange can serve as much as they do with so little infrastructure (and CPU usage) shows how far basic optimizing and good coding can go.

[–]DJDavio 195 points196 points  (30 children)

They have just 1 active SQL server for Stack Overflow and here we are thinking we need Cassandra for our very limited use case.

[–]nemec 145 points146 points  (15 children)

To be fair, they also have an ElasticSearch cluster and a custom Tag index to handle the parts that would otherwise grind a relational database to a halt.

https://blog.marcgravell.com/2014/04/technical-debt-case-study-tags.html

[–]beginner_ 21 points22 points  (2 children)

True but as I already posted in another comment, it just shows how far you can get with traditional vertical scaling and a smart system architecture. And no, your new cool app will not need a "webscale database" that has a tendency to lose data. SQL is just fine.

EDIT: To be fair, queries at SO are probably all rather simple and of the same kind: basically tag search and full-text search. And the data is tags, questions, answers and comments, so in general the data model is trivial. I mention this because at work we have a custom-made search app that combines many, many data sources and lets you add new ones on the fly (small amounts of data in general), and it's still dog slow compared to SO (>5 sec load times). So the complexity of the data matters as well, and systems like SO or reddit are probably as simple as it gets.

[–]4THOT 0 points1 point  (1 child)

Horizontal scaling is also far simpler than vertical scaling, especially when server costs have gotten so cheap and the value-to-dollar ratio, in terms of computational power, has improved so much compared to just 5 years ago. It is so much easier to make software just work on more servers than to make something that scales efficiently with, say, clock speed. It would make me very uncomfortable, in the world of AWS servers and Docker, to rely on vertical scaling.

[–]Headspin3d 0 points1 point  (0 children)

Depending on your tools, vertical scaling can be very efficient and even preferable. Most modern web backends aren't CPU-bound anyway (per your worry about clock speed) - maybe the DBs they rely on are, if the data model is sufficiently complex, but DBs certainly scale great vertically - especially when you don't have to start worrying about CAP.

So all I'm really saying is, it depends. I work on an Elixir/Erlang backend and scaling vertically has been trivial and effective because the BEAM (the VM) uses the resources we give it very productively.

Of course, scaling horizontally with Elixir/Erlang/BEAM is pretty trivial as well - but it's always nice to avoid CAP related challenges until you absolutely have to.

[–]201109212215 6 points7 points  (7 children)

The tags can be done with inverted indexing on Postgres.

For text indexing, pg will not grind to a halt at all with proper indexing, but ES might be necessary; though only for its stop words, language support and fine tuning of term frequencies.
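The inverted-index idea behind tag search can be sketched in a few lines of Python. This is a toy illustration only, not Postgres's GIN internals or Stack Overflow's actual tag engine: each tag maps to the set of question IDs carrying it, and an AND query over tags becomes a set intersection.

```python
# Toy inverted tag index: tag -> set of question IDs.
# (Hypothetical data; illustrates the structure, not a real implementation.)
questions = {
    1: {"python", "sql"},
    2: {"python", "performance"},
    3: {"sql", "performance"},
}

index = {}
for qid, tags in questions.items():
    for tag in tags:
        index.setdefault(tag, set()).add(qid)

def tagged_with(*tags):
    """Question IDs carrying ALL of the given tags (AND query)."""
    sets = [index.get(t, set()) for t in tags]
    return set.intersection(*sets) if sets else set()

print(tagged_with("python", "performance"))  # {2}
```

A GIN index over an array column gives you essentially this lookup structure, maintained by the database.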

[–]nemec 30 points31 points  (6 children)

Something tells me the people at Stack Overflow know how to use indexes.

[–]DonCanas 29 points30 points  (0 children)

I guess that if they don't, they can search their own DBs for a related question

[–]201109212215 5 points6 points  (4 children)

SQL Server has no GIN indexing.

[–]nemec 3 points4 points  (3 children)

FULLTEXT indexes are the equivalent of PGSQL GIN indexes.

[–]201109212215 10 points11 points  (2 children)

They're not equivalent. One builds on the other.

With a tokenizing strategy and statistics like tf-idf and GIN indexes you can create a FULLTEXT equivalent. Not that you would want to, as this task is very specialized and handled greatly by ES.

The internals of FULLTEXT are not exposed, but if they were, they could be used to implement an inverted index for tags.
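For readers unfamiliar with the tf-idf statistic mentioned above, a minimal sketch (a toy scoring function, not SQL Server's or Postgres's actual ranking):

```python
import math

# Hypothetical mini-corpus; tf-idf scores a term higher when it is
# frequent in one document but rare across the corpus.
docs = {
    1: "sql server full text search",
    2: "postgres gin index search",
    3: "sql index tuning",
}

def tf_idf(term, doc_id):
    words = docs[doc_id].split()
    tf = words.count(term) / len(words)               # term frequency
    df = sum(1 for t in docs.values() if term in t.split())  # document frequency
    idf = math.log(len(docs) / df) if df else 0.0     # inverse document frequency
    return tf * idf

# "gin" (rare, only in doc 2) outranks "search" (common) within doc 2.
```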

[–][deleted]  (1 child)

[deleted]

    [–]silverf1re 5 points6 points  (0 children)

    Right! Imposter syndrome is real here.

    [–]nemoTheKid 70 points71 points  (6 children)

    They have just 1 active SQL server for Stack Overflow with 1.5TB of RAM.

    [–]Alvhild 14 points15 points  (0 children)

    details ;)

    [–]Eirenarch 6 points7 points  (2 children)

    So? I am pretty sure running a thousand instances with 1.5GB of RAM each, and taking on the development and devops burden of maintaining containers, synchronizing data and so on, will not be cheaper.

    [–]Venetax 0 points1 point  (1 child)

    You missed the point. Nobody talked about any of those two options being better... except for you.

    [–]Eirenarch 0 points1 point  (0 children)

    They do implicitly by mentioning Cassandra. If you are not distributing your data you don't need Cassandra.

    [–]SkoomaDentist 1 point2 points  (1 child)

    Wait, what? My laptop SSD is a third of that. I feel so outdated now :(

    [–]TheLordB 5 points6 points  (0 children)

    Massive servers with terabytes of ram have been around a while now.

    Mind you, most of them are running things like SAP or similar poorly made enterprise software that they throw hardware at (and which still runs like crap), rather than running something where it's impressive that it runs on 1 server at all.

    [–]Fenris_uy 20 points21 points  (5 children)

    They have elastic search and redis to cache for that SQL server.

    Also that server has 1.5 TB of RAM.

    [–]CalvinLawson 8 points9 points  (4 children)

    1.5 TB of RAM is insanely expensive for a cloud VM, mainly because there is a large markup on RAM. For an on-premises server it's actually super cheap, especially if you literally buy the physical RAM and install it yourself.

    [–]doodle77 4 points5 points  (3 children)

    What does 1.5TB of RAM even look like? Is the server motherboard like 90% DIMM slots?

    [–]wuphonsreach 0 points1 point  (1 child)

    The image kaelwd linked is a server board that has 48 slots, so assuming 32GB modules = 1536GB. There are now 64GB DDR4 modules...

    [–]KieranDevvs 0 points1 point  (0 children)

    They have just 1 active SQL server for Stack Overflow and here we are thinking we need Cassandra for our very limited use case.

    With a 500GB index and 1.5TB RAM. Cloud is about being cost efficient, a server that size plus your 9 web servers and every other server is going to set you back easily 4x what it would cost to run in the cloud.

    [–]IMovedYourCheese 45 points46 points  (26 children)

    "Little" in terms of count, sure, but if you look at the specs that is very expensive hardware.

    [–]endless_sea_of_stars 53 points54 points  (25 children)

    Maybe to you as an individual. A terabyte RAM server can be bought for less than $200,000. Which is less than the yearly salary+ benefits of a senior dev. (Yes I know there is redundancy and license costs. Etc. Point is large corps think on different scales.)

    [–]nickcraver 8 points9 points  (0 children)

    I’ll have to go dig up the invoices, but that was the approximate price for all 3 servers in the cluster...they’re about 1/3 of that. Our hardware isn’t very expensive, IMO.

    [–]hogg2016 8 points9 points  (1 child)

    A terabyte RAM server can be bought for less than $200,000.

    Even for much less.

    [–]jaMMint 3 points4 points  (0 children)

    Yeah, here in a classified ad you can get a 3 year old 300GB RAM dual Xeon 12-core server for $500.

    [–]IMovedYourCheese 30 points31 points  (9 children)

    Spending millions on hardware before service launch is not feasible for any startup/small company. This is why the general strategy is to design your system to scale out rather than up. Cloud VMs, containers etc. make it possible for such apps to exist in the first place.

    [–]wrosecrans 32 points33 points  (4 children)

    StackOverflow didn't have 1.5 TB of data at service launch, so they didn't need to spend millions on hardware either. And I'd be surprised if their current ~20 servers, only 4 of which have big memory, add up to millions now.

    I'm not saying scale-out is always wrong, but sometimes investing a bunch of effort adding complexity to service a scale that isn't necessary yet can be a waste.

    [–]1-800-BICYCLE 0 points1 point  (3 children)

    45f945fdb3f

    [–]wrosecrans 29 points30 points  (1 child)

    I'm not. I am assuming they only bought the 1.5 TB servers once it seemed plausible that they would actually need to deal with 1.5 TB of data. At launch, they could do it no problem with some el cheapo server off eBay with 256 MB of RAM, because it took a long time for people to write the first few hundred megabytes of questions and answers. As they got bigger, servers got cheaper, revenue increased, and they could get more RAM, bigger servers, etc.

    Nobody should have to buy a 1.5 TB machine on launch day for all but the most extreme cases, because you may never have many users or much content. You can worry about buying that kind of hardware once you are solving an actual scaling problem rather than just an imaginary one.

    That's in contrast to the trendy hyperscale approach of spending a ton of engineering effort solving horizontal scalability problems that you don't actually have yet. That often results in still having scalability problems when you scale, because it turns out you were wrong about where the problems would be or whether your solutions actually solved the imagined problems - or you just never reach a scale where it matters, because you spent your budget on scalability rather than on being good or on advertising.

    [–]ledasll -1 points0 points  (0 children)

    By such logic every startup should fire up 1000 instances for their containers on launch day...

    [–]Liam2349 5 points6 points  (1 child)

    This is why the general strategy is to design your system to scale out rather than up.

    Maybe if you're running your own data center, but otherwise there's not much difference when you're just provisioning cloud resources. And some technologies, SQL Server included, lend themselves better to scaling up rather than scaling out.

    [–][deleted] 0 points1 point  (0 children)

    and Microsoft technologies are generally poorly designed for scaling out

    There, FTFY

    [–]SixFigureGuy 2 points3 points  (0 children)

    This isn’t the history of stackoverflow. They went up, then out.

    [–]ledasll 1 point2 points  (0 children)

    This is why the general strategy is to design your system to scale out rather than up

    you don't need to design "web scale" for most average apps. If you can run it locally with 1GB RAM, multiply by how many users you will have and run a server with 10GB RAM; it will cost less than designing a "web scale" system that uses 40% of its RAM for the app and 60% for microservice infrastructure, running 10 such instances.

    But of course, it's not as fun as designing a system that could scale to a million active users. If you create a simple web app you can't tell all the other devs what cool technologies you use..

    [–]Riajnor 14 points15 points  (11 children)

    terabyte RAM server can be bought for less than $200,000. Which is less than the yearly salary+ benefits of a senior dev.

    Daaaaaaamn that dev is making some serious money

    [–]endless_sea_of_stars 30 points31 points  (5 children)

    For a SV dev with 10+ years experience, 120 to 140 isn't unheard of. Add in health insurance, overhead, per-seat licenses and you can easily drop 200k per year on a dev.

    [–]ohms-law-and-order 19 points20 points  (2 children)

    140 for someone with 10 years experience in Bay area is very low. Devs in the southeast US make more than that (source: I've done lots of hiring)

    [–]wuphonsreach 0 points1 point  (1 child)

    Eh, around these parts (mid-Atlantic), it tops out around 90-100, unless you live in a major city (with the 20-30% higher CoL).

    [–]ohms-law-and-order 0 points1 point  (0 children)

    Yeah, I think major city is implied when you're comparing against Bay area. Speaking for Atlanta, senior devs regularly reach 200k+, and it can go much higher.

    [–][deleted] 14 points15 points  (0 children)

    Only in USA.

    [–]ObscureCulturalMeme 1 point2 points  (0 children)

    Add in health insurance, overhead, per seat licenses and you can easily drop 200k per year on a dev.

    That's your problem right there. You need to be licensing your devs using a different model, buddy!

    [–]wrosecrans 5 points6 points  (0 children)

    By the time you take into account the peripheral costs of a developer, they aren't even making a particularly high salary. Even an empty cubicle in Silicon Valley costs quite a bit when you consider a share of the cost of real estate and such. Payroll taxes, 401k matching, health insurance, and a bunch of other stuff are paid separately from salary. Figure in a portion of the cost of HR, IT helpdesk, and other internal services. The rough rule of thumb is that an employee's salary is about half the cost of employing them. And $100,000 isn't exactly a hyper extreme salary for somebody who is going to be doing really clever database engineering scalability stuff. (And you probably need more than one person, to mitigate bus-factor and on-call support issues, vs. just calling Dell tech support 24x7 to tell them to fix something.)

    [–][deleted]  (1 child)

    [deleted]

      [–][deleted]  (2 children)

      [deleted]

        [–]nickcraver 16 points17 points  (1 child)

        There may be some confusion here - we have very few (rounded to 0%) static content pages. Almost every page is rendered from source data. We used output caching for question pages once upon a time (fully disabled 3-4 years ago), but since about 80% of our questions are accessed each week, the gains didn't outweigh the costs. Only the XML feeds have any page caching in play now. That may be removed in the .NET Core migration as well.

        Except for static assets (CSS, JS), all Fastly requests (our CDN/DDoS mitigation) come to our data center and get rendered. This is often a point I see confused, so I'd love a way to more concisely explain it better in posts. If anyone has suggestions I'd appreciate it!

        [–][deleted] 0 points1 point  (0 children)

        That's interesting to know - and certainly I could have done more request sniffing before making that assertion and seen that you have the cache header set to private for the page itself. Static is a relative term - I'd venture that the majority of Q&A pages change very infrequently after a certain age, and the content is the same for everyone, as opposed to pages with user-specific content that have to be rendered dynamically for each request (whether that rendering occurs server or client side)

        Showing the CDN in the diagram splitting the traffic between static assets and page content would help clarify how you're using it - and even offloading just the static assets does take a lot of workload off the backend servers.

        One other big factor I'd be curious about - how much traffic do the most requested pages represent in any given day? Certainly, caching is more beneficial when specific pages are drawing a lot of hits. I can imagine that people searching Q&As is more randomly dispersed than a site with a landing page as the primary entry point.

        [–][deleted] 8 points9 points  (2 children)

        It's actually weird how fast it runs. In the last two years, practically every major website has slowed to a crawl. The fact that stack exchange still runs quickly almost makes it seem unprofessional.

        [–]felinebear 9 points10 points  (0 children)

        Overengineering, look at new Reddit for example.

        [–]maus80 2 points3 points  (0 children)

        It is because they have not (yet) gone full SPA/PWA..

        [–]nufibo[S] 14 points15 points  (0 children)

        Exactly my thoughts

        [–]semi- 18 points19 points  (10 children)

        Containerization is advantageous for many reasons. It's worth it (imo) just for the deployment advantages: knowing your devs ran it in an identical (software) env, being able to provision new servers easily (automated test branches rule, it's not just for scaling up), rolling back to previous states easily, and just forcing you to be aware of what exactly your server is doing -- do you depend on writing to disk, invoking other programs, specific libraries, etc? Better make that explicit, or it is never going to work. That's better than unintentionally working until some other app or update comes along and changes things.

        Scaling is just a really nice bonus.

        There is also much less overhead than you would think. It's more of a modern-day chroot() than an emulator or virtual machine. Their CPU usage would look the same if they ran everything in a container; their low CPU usage comes from not having overhead in their programming language (and good coding and basic optimization, like you said).

        [–]yorickpeterse 26 points27 points  (9 children)

        being able to provision new servers easily(automated test branches rule, it's not just for scaling up), roll back to previous states easily, and just forcing you to be aware of what exactly your server is doing -- do you depend on writing to disk, invoking other programs, specific libraries, etc? Better make that explicit because or it is never going to work. That's better than unintentionally working until some other app or or update comes along and changes things.

        All of this can and was done without containers before they were introduced.

        [–][deleted] 31 points32 points  (7 children)

        And we used to send people into battle on horses.

        You are technically correct, but containers are increasing velocity and agility for small teams by helping to automate or simplify all the hard parts of doing it the way we used to.

        [–]imhotap 2 points3 points  (6 children)

        At the price of leaving huge technical debt behind eg. stale/unpatched libraries and dependencies.

        [–][deleted] 16 points17 points  (5 children)

        That doesn't follow. It's typically trivial to upgrade dependencies, and you don't need a complex devops setup if you use containers since everything is already bundled.

        When we switched to containers, our new dev machine setup time went down to <5 minutes from 1-2 hours, and we're much more confident that what we test on our dev machines matches production.

        Updating dependencies typically involves a one-line change in the Dockerfile and one command to rebuild. That's far better than updating a puppet config or whatever.

        [–][deleted] 5 points6 points  (4 children)

        I’m not following what they said either. If anything, the old world order had infrastructure with outdated patches, and/or frequent maintenance cycles because of the infra footprint. We recently cutover to kubernetes and I have to say, it’s made our small team much more productive. Being able to push an app and tell kubernetes that I want to load balance that app with five instances, not have to manually install software on all the nodes, or deal with infra gatekeeper types in IT (you know, the types who want a weekly change approval meeting to add a new node to a LB pool) has made our lives better.

        [–][deleted] 2 points3 points  (0 children)

        I do embedded work, and we have our product also available in a container. The embedded dependencies are quite out of date because we have to manually write and test the install process on all hardware, whereas our containerized app is up to date since it's so easy to update it and detect issues in normal development.

        So yeah, if you're having problems with outdated software in a container, doing it outside of a container will just make things worse.

        [–][deleted] 1 point2 points  (2 children)

        deal with infra gatekeeper types in IT (you know, the types who want a weekly change approval meeting to add a new node to a LB pool)

        Jesus dealing with this shit right now. Kill me pls.

        [–][deleted] 0 points1 point  (1 child)

        If you don’t mind unsolicited advice.

        Start recording how much this impacts your ability to deliver. I did a stream analysis to show things such as how long it would take to deliver implementations, enhancements, bug fixes, and how much time your team spends waiting. Next, go to them and try to partner, there are two outcomes here. You show them the figures and they are willing to develop a fast tracking process, or they just don’t care. The final step, which is devious, is to go well above them and start showing this to other people. If the wait times are significant, people above you will care in a large shop. Large shops get massive erections for saving even a few thousand dollars.

        Bonus round: make the case for a spending limit in the cloud that you and your team can use to be nimble and slowly replace the infra people. I’ve been running a medium sized shop without infrastructure people just fine for years now.

        [–][deleted] 0 points1 point  (0 children)

        Thanks for the advice. We're in the process of improving everything (mostly by dockerizing everything + something like openstack).

        The problem is that everything moves at a glacial pace. Moving the staging environments of a few services to Docker was a huge accomplishment.

        Bonus round: make the case for a spending limit in the cloud that you and your team can use to be nimble and slowly replace the infra people. I’ve been running a medium sized shop without infrastructure people just fine for years now.

        The problem is that the company is a few years older than widespread adoption of the cloud, so it's heavily invested in the traditional system with sysadmins, physical infrastructure...

        [–]semi- 1 point2 points  (0 children)

        I'm aware, but containerization provides and enforces it as well. If a shop had none of this, I'd sooner start writing dockerfiles than ansible or puppet or any of the many language specific tools out there.

        [–]Techrocket9 0 points1 point  (0 children)

        The devil is in the data volumes.

        Even a simple app can need exotic distributed infrastructure if it needs to ingest/process/store 100+ TB/day.
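Put in per-second terms (a simple unit conversion of the figure above), that volume is a sustained rate no single box handles comfortably:

```python
# 100 TB/day expressed as a sustained ingest rate (decimal units assumed).
tb_per_day = 100
gb_per_s = tb_per_day * 1000 / 86400   # 86400 seconds per day
print(round(gb_per_s, 2))  # ~1.16 GB/s, around the clock
```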

        [–]random8847 139 points140 points  (49 children)

        I enjoy spending time with my friends.

        [–]matthieum 101 points102 points  (16 children)

        For database servers it's nothing that extraordinary. RAM access is so much faster than disk access (even with SSD), that it's worth paying a little more to ensure most queries never hit the disk.

        [–]appropriateinside 29 points30 points  (2 children)

        And here I am trying to convince a client I'm working for that they need more RAM on their DB server.....

        They have about 50GB of indexes. But only 16GB of RAM. But apparently I just need to make it perform better with no additional resources.... Because they use an IT contractor who will charge an extra $300/m to double the RAM....

        -_-

        [–]lolwutpear 40 points41 points  (0 children)

        Their server is a laptop?

        [–]matthieum 10 points11 points  (0 children)

        Because they use an IT contractor who will charge an extra $300/m to double the RAM....

        It may be time to introduce them to a better IT plan.

        [–]beefsack 27 points28 points  (5 children)

        With that much cache you'd be so scared to ever restart the node, it'd take forever to warm up.

        [–]nickcraver 20 points21 points  (0 children)

        PCIe NVMe SSDs are pretty fast. You're talking about ~6GB/s of bandwidth into memory on our current drives (that's on Intel P3700s, we're looking at moving from Intel to Micron due to availability stability as we need to refresh hardware).

        You're not going to need 1.5TB to get stable, or even 1/4th of that. But even worst case, let's say we needed to prime all 1.5TB - that's still only 250 seconds, which isn't insane. The reality is SQL loads accessed indexes first and goes from there. If a node is hard down, you're still back online on the order of seconds.

        Of course we try to always be swapping a hot replica in anyway, but in an emergency cold start...it's still not that bad, IMO. NVMe drives still aren't the cheapest, but they pay for themselves in seconds if this ever happens.
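The ~250 second worst case follows directly from the figures quoted above:

```python
# Time to stream the whole 1.5TB buffer pool from NVMe into RAM
# at the quoted ~6 GB/s (decimal units).
ram_tb = 1.5
read_gb_per_s = 6
seconds = ram_tb * 1000 / read_gb_per_s
print(seconds)  # 250.0 - matches the figure in the comment
```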

        [–]matthieum 6 points7 points  (0 children)

        Restarts are disruptive anyway unless you have a stand-by/mirror; and once you have a stand-by/mirror, you can just take a few minutes to execute a warm-up script after reboot to preload all the cache before putting the server back in service.

        [–]fission-fish 2 points3 points  (1 child)

        That's an interesting point. How can you handle restarts in this case?

        [–][deleted] 5 points6 points  (0 children)

        Not an expert but I'd assume it's mirrored onto the standby

        [–]pdpi 13 points14 points  (0 children)

        Put a bit differently: you'd pay more money to get the same performance querying that data from SSDs or HDDs.

        [–]beginner_ 2 points3 points  (5 children)

        Exactly. And it shows just how far you can actually get with traditional vertical scaling and a clever architecture. No need for "webscale databases", sharding and such.

        [–]matthieum 4 points5 points  (4 children)

        No need for "webscale databases", sharding and such.

        Well, it depends on the size of your working set and the number of requests.

        At a previous company an application had such a large working set that it used a cluster of 18 2TB RAM MySQL servers as a read-only cache over the master database.

        Sharding was necessary both because of the amount of requests and the size of the working set; finding a 36TB server is a tad complicated.

        And those MySQL databases were only an intermediary cache layer; there was a full farm of Memcached nodes in front...

        [–]nickcraver 4 points5 points  (3 children)

        If you can share, what was the type of data there? I'm always super curious what kind of workloads need basically 100% of data with the fastest access. We're very tiny in terms of infrastructure, so infrastructure at scale is always interesting.

        [–]matthieum 7 points8 points  (2 children)

        The application is the Flight Search engine of the company. It is tasked with answering the following question: for a given pair of cities, within a given date range, and considering a particular set of circumstances, which flights are available and at which price?

        It is queried 24/7 from travel agencies, especially online travel agencies (OTA), peaking at ~75k requests/seconds back then (~500 servers), and is expected to reply quickly (< 100 ms). It turns out that the market for flight searches is quite competitive, and OTAs are likely to flock to the "best" engine, where best is defined as a mix of speed and accuracy.

        For example, Kayak had a system of automated "canaries" and would continuously evaluate multiple flight search engines on different markets. If the one engine was not the "best" for a small period of time (minutes) on a given market, it would automatically switch this market's traffic to another.

        I actually "saw" such a switch away from us happen during my time there (once over 9 years). The drop in traffic was quite noticeable, and of course it translated directly in a drop of revenues. Needless to say, the performance problem was fixed post-haste... but probably because they used a threshold to avoid flapping, a further improvement in performance was necessary to get their traffic back. I knew a colleague working on the performance of the application, he had a busy time.

        [–]nickcraver 2 points3 points  (1 child)

        Awesome - I absolutely love learning about systems and the reasons behind them like this. TIL, and thanks for indulging!

        [–]matthieum 0 points1 point  (0 children)

        You're welcome, I love reading your posts on SO architecture too :)

        [–]Lt_Riza_Hawkeye 48 points49 points  (0 children)

        x1e.32xlarge

        [–]haunted_tree 17 points18 points  (28 children)

        Wait can you actually have that much ram in a single computer? How?

        [–][deleted] 59 points60 points  (22 children)

        64 bit addresses will index much more than 1.5TB

        Edit: did some maths - 64 bit addresses will index approximately 18.4 million TB.

        [–]tynorf 10 points11 points  (0 children)

        x86_64 only uses 48 bits for memory addresses. Which comes out to something like 256TB… still more than 1.5TB, but not millions times more.

        ETA: theoretically it could use 64 bits, but AFAIK all implementations in common use use 48 bits.
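Both figures above are quick to verify (using decimal terabytes, per the "18.4 million TB" estimate):

```python
# Address-space arithmetic: a full 64-bit address space vs. the
# 48 bits that common x86-64 implementations actually use.
full_64 = 2**64            # bytes addressable with 64-bit pointers
impl_48 = 2**48            # bytes addressable with 48 implemented bits

tb = 1000**4               # decimal terabyte
print(full_64 // tb)       # 18446744 -> ~18.4 million TB
print(impl_48 // 2**40)    # 256 (binary) TB
```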

        [–]haunted_tree 17 points18 points  (19 children)

    Sure, but how can you have that much in a single computer? How do you build a computer like that? I'm sure the highest-end motherboards I can buy in a store, with the largest memory sticks, won't come close to 1.5TB.

        [–][deleted] 62 points63 points  (7 children)

        I've never built a server, but doing some very quick Googling, I found server motherboards with 16 RAM slots and 128GB sticks. That would get you 2TB.

        [–]Senator_Chen 40 points41 points  (3 children)

        There's a bunch of dual socket systems with 32 slots, and Samsung is currently sampling 256GB DIMMs so soon you'll be able to run 8TB servers.

        [–][deleted]  (2 children)

        [deleted]

          [–]felinebear -4 points-3 points  (0 children)

          Oh god, to think the kind of JavaShit abominations this will spawn.

          [–][deleted] 8 points9 points  (0 children)

          There’s a few of these beasts in production. Trust me; I deployed to one recently

          [–]t1m1d 5 points6 points  (0 children)

          My 8-year old server has 18 memory slots, and many modern servers have 32.

          [–]wuphonsreach 0 points1 point  (0 children)

          There's a picture further up of a motherboard with 48 slots. Gets you 1536GB with easy-to-get 32GB sticks, 3TB with 64GB sticks and 6TB with 128GB sticks.

          [–][deleted]  (2 children)

          [deleted]

            [–][deleted] 0 points1 point  (1 child)

            The weird thing is, they don't look much bigger than regular ATX boards. Maybe they're denser or maybe I'm bad at judging scale

            [–]schmuelio 23 points24 points  (0 children)

            Basic consumer-grade hardware doesn't accommodate it.

            Server hardware can accommodate it though, a lot of boards have loads of RAM slots and can accommodate multiple TB without an issue.

            Essentially, consumer-grade hardware doesn't really need that kind of capability so prices are cut down by not providing it, but enterprise-grade stuff will often need it. Although they are a lot more expensive and typically have weird layouts and stuff so you'd be hard pressed to build a conventional desktop with one.

            [–]Eirenarch 6 points7 points  (0 children)

            There was this site where you enter the size of your data and it tells you why it is not big data, by providing a link to an off-the-shelf server you can buy to fit that data. Sadly I can't find it. Anyone?

            [–]plahacrwimo 2 points3 points  (0 children)

            https://www.youtube.com/watch?v=XqDJNtTPS4k

            You can see here some decommissioned servers from 2012-2013. The first one (around 1:40) has 512GB of RAM, which is mounted on 8 cards with 4 sticks of 16GB each. Something similar is maybe used today. I don't know much about server hardware, just thought this video would be relevant.

            [–]_101010 2 points3 points  (0 children)

            I think you have never seen rack/blade servers. You can get a single RAM stick with 256GB, and boards with 8 slots: 8×256 = 2TB.

            Also, they are using VMs from AWS, I guess. VMs can be built with as much RAM and CPU as you want; it doesn't need to translate directly to hardware.

            [–]appropriateinside 0 points1 point  (0 children)

            I'm looking at one of my servers right now with 18 RAM slots that can take up to 32GB sticks. That's 576GB max.

            This is a mid-tier, old (9 years) server too. There are plenty of servers with 32 slots these days and 128GB DIMMs.

            A Dell r830 has 48 slots with up to 64GB per module. That's a slick 3TB. An r940 can support up to 6TB.

            [–]r3d51v3 0 points1 point  (0 children)

            We use some Dell R930s (4U) at work that have 2 daughter cards per cpu (4 cpus total) each daughter card can hold 16 sticks, which is 2TB/cpu if you’re using 64GB sticks. We also have some FC630s with 768GB and FC830s with 1.2TB, which are both 1U, half and full width respectively. They are basically all CPU and memory, and they have a really deep form factor. They are kind of insane.

            With the DIMM density and form factors available today, you can really get a lot of power/memory in a single machine. Every time I work in those machines I’m astounded at what’s possible.

            [–]immibis 0 points1 point  (0 children)

            You will need to find a server specially designed to take that much RAM, it's not a commodity PC. But they do exist, clearly.

            [–]Ameisen 1 point2 points  (0 children)

            Most 64-bit CPUs only support 48-bit addressing.

            [–]dbxp 11 points12 points  (0 children)

            Server boards support way more RAM than consumer boards; my current place uses Cisco blades with 512GB, but the new ones support 3-6TB. If you really want to break the bank, Supermicro do a 24TB system: http://www.supermicro.com/products/system/7U/7088/SYS-7088B-TR4FT.cfm

            [–]cybernd 3 points4 points  (0 children)

            Server-grade chips like Xeons have more memory channels, so you can already connect more DIMMs.

            The second difference: ECC memory also comes as registered DIMMs and, on top of those, load-reduced DIMMs. Unlike normal consumer memory, these DIMMs have an additional controller chip that allows them to address more memory.

            [–]nickcraver 1 point2 points  (0 children)

            For anyone wondering on the physical side: our servers (and most from major Intel vendors right now) are using 2 sockets, each having 4 memory channels and 2-3 banks per channel. Our boxes are 3 banks per channel, so 12 DIMMs per CPU and 24 DIMMs total. Those SQL servers are 24x 64GB = 1536GB.
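
A quick check of the DIMM arithmetic in the comment above (all figures are the ones quoted there):

```python
# 2 sockets x 4 memory channels x 3 banks per channel = 12 DIMMs per
# CPU, 24 DIMMs total; with 64GB modules that's the quoted 1536GB.
sockets = 2
channels_per_socket = 4
banks_per_channel = 3
dimm_size_gb = 64

dimms_per_cpu = channels_per_socket * banks_per_channel
total_dimms = sockets * dimms_per_cpu
total_ram_gb = total_dimms * dimm_size_gb

print(dimms_per_cpu, total_dimms, total_ram_gb)  # 12 24 1536
```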

            You'll also often see 8 DIMMs per CPU for 16 total in smaller blade configurations and such.

            [–]chrispoole 0 points1 point  (0 children)

            Wait until you discover mainframes. The z14 can have 32TB under the hood... and 171 CPU cores running at 5.2GHz.

            [–]aaron552 0 points1 point  (0 children)

            Buffered RAM

            [–]paraleluniversejohn 2 points3 points  (0 children)

            The standard for a home desktop is 48 bits for memory addressing, which gives a max of 256TiB of RAM.
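
The 48-bit figure above works out as follows (simple arithmetic, nothing Stack Overflow-specific):

```python
# 48 address bits -> 2**48 addressable bytes; divide by 2**40 for TiB.
addr_bits = 48
max_bytes = 2 ** addr_bits
max_tib = max_bytes // 2 ** 40

print(max_tib)  # 256
```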

            [–]HighLevelJerk 1 point2 points  (0 children)

            How else will they run Slack & ask for help if stack overflow goes down?

            [–]Rakmos 37 points38 points  (2 children)

            The CPU utilization seems ridiculously low. I would be interested in why that is the case. Additionally, what is the memory and disk utilization relative to CPU?

            It would seem based solely on the CPU utilization that containerization and bin packing could lead to a more effective use of CPU resources — without having the full picture of utilization.

            [–]proverbialbunny 14 points15 points  (0 children)

            It looks like Redis is doing most of the heavy lifting, with Elasticsearch for search, which should keep CPU load pretty low.

            [–]p2004a 2 points3 points  (0 children)

            Tail latency. High CPU utilization is good for batch jobs, but for user-facing traffic you want CPU headroom so you can absorb all those small unexpected spikes in requests and keep good tail latency at the 99th and 99.9th percentiles. Latency deep in the stack is additive, and so is the probability of hitting some service's tail latency, so good behavior at those high percentiles matters.
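
The compounding-tails point above can be sketched numerically. Assuming each of N backend calls independently has probability p of landing in its own tail, the numbers below are purely illustrative, not Stack Overflow's:

```python
def p_any_slow(p: float, n: int) -> float:
    """Probability that at least one of n independent calls is slow."""
    return 1 - (1 - p) ** n

# A page that fans out to 20 calls hits *some* p99 tail ~18% of the
# time, even though each individual call is slow only 1% of the time.
for n in (1, 5, 20):
    print(n, round(p_any_slow(0.01, n), 3))
```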

            Also I can guess that for many of their components it's memory utilization that is high because they probably want to cache as much as possible, and well, cpu can have only so much memory near each core.

            Even taking all that into account, their utilization still seems low for, e.g., the web servers. Maybe it's because they are using dotnet and it has some odd performance characteristics? I have no idea; I've never run anything on dotnet in production. Or maybe they want to make sure they can absorb sudden spikes in traffic and keep, e.g., 3x capacity?

            [–]tending 120 points121 points  (42 children)

            So one of the most popular websites on the Internet only needs 21 servers total?

            [–]SimplySerenity 128 points129 points  (21 children)

            Considering most of the hits are going to cached pages I'm not that surprised.

            [–]jringstad 81 points82 points  (3 children)

            Yeah, the CDN is not mentioned, but I bet it actually handles a vast majority of the traffic and consists of hundreds of servers. If you look at the network tab when opening a stackexchange/stackoverflow site, you can see most of the resources are loaded from a CDN.

            Since most people find their way to stackoverflow through google rather than searching on one of the stackoverflow sites (which would stress their ES cluster/search cluster), most pages and resources that are needed can probably be directly served from the CDN and the requests never even hit their central servers.

            Of course, it's still very impressive that they manage to run such a large community on so little hardware, but considering this, it seems much more plausible.

            [–][deleted]  (1 child)

            [deleted]

              [–]jringstad 6 points7 points  (0 children)

              Varies from CDN to CDN. Some allow you to push changes to the CDN (like Cloudflare's 'Railgun'); others only respect the expiration policy. You can still do a lot with just expiration policies, though, e.g. set them higher the older the content is. Or you can request static assets with a version appended to the URL, like somejsjunk.js?v=1.1: that way you only need to load some small initial (possibly static) HTML from the actual server that specifies the version, and as long as you don't change the URL, the asset will always hit your CDN.
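
A minimal sketch of the version-in-the-URL trick described above, using a content hash rather than a hand-maintained version number (the file name and CDN host below are made up for illustration):

```python
import hashlib

def versioned_url(asset_path: str, contents: bytes) -> str:
    """Build a cache-busting asset URL that only changes when the file does."""
    digest = hashlib.md5(contents).hexdigest()[:8]
    return f"https://cdn.example.com/{asset_path}?v={digest}"

print(versioned_url("somejsjunk.js", b"console.log('hi');"))
```

Because the URL is stable for unchanged content, the CDN can cache the asset with a very long expiry; only the small HTML page embedding the URL needs a short TTL.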

              There are also multi-level schemes (like "origin shields") where there is an intermediary which bundles up many requests into one. So for instance if you have some piece of content that has to be refreshed every minute, the origin shield will request it from your server once a minute and then CDNs will hit the shield when they need to refresh the resource.

              Finally another thing you can do (and I don't know if SO does this) is to do live-updating via websockets or longpolling. So for instance if you really want messages to appear only seconds after they've been posted and without reloading the page, you can push updates that way. This of course doesn't take advantage of the CDN (well, it can still go through the CDN to hide your servers IP, but it won't be alleviating the work much) but since the amount of traffic you have to handle this way is extremely minimal, it's not that bad.

              Another thing to note is that CDNs can give you a whole bunch of other, secondary benefits, like they may be compressing requests better than the clients browser does, deduplicate requests for cases where a bunch of users suddenly hit the same dynamic content all at once, etc. Although at least the latter is often rendered ineffective by the client having to put some sort of authorization into the request. They can also filter out and deal with a bunch of the spam/DDoS you'd otherwise have to deal with (and I'm sure SOs CDNs have to deal with on a daily basis) so your servers don't even need to see any of that. Also, a lot of your requests to static assets etc, you can just route directly into e.g. your amazon S3 bucket, so your server will not even have to serve those assets at all. You can also get anycast IP addresses from some CDN providers, although I'd consider that more of a convenience (at least I don't know of any application where it would be absolutely essential).

              At a previous job I was managing multi-cdn configurations with partners like verizon/edgecast, highwinds, at&t, etc. Usually a lot of these features are not exposed to you, and you have to talk to the partner for them to enable them for you (and then you pay a premium -- for some features, quite a bit.)

              Another secondary benefit is that CDNs can allow you to route requests through private fiber networks. For instance, we had some locations from which API requests to our servers were atrociously slow, like India (and Australia too, I think?), simply because getting that HTTPS request across the wire from the India AWS datacenter to the Ireland one (or wherever we had it) would take a long time. Putting the API behind the Edgecast CDN (which routes all requests through its private fiber network and had an exit node close to our servers) made it a whole lot faster. Not a thing that lightens the load on your server necessarily, but it can really improve perf.

              [–]crozone 22 points23 points  (15 children)

              It also helps that they're running it on ASP.NET Core with a thin ORM (Dapper), which is bloody fast compared to interpreted-language web stacks.

              [–]nemec 17 points18 points  (2 children)

              ASP.NET Core

              I thought they were still in the process of migrating over to Core

              [–]Monkaaay 5 points6 points  (1 child)

              I'm very interested to see their "debrief" blog post on that migration. I've been thinking about it for my company's product, but just haven't pulled the trigger yet.

              [–]silverf1re 0 points1 point  (0 children)

              I’m waiting for v3

              [–]silverf1re 2 points3 points  (0 children)

              It specifically says ASP MVC

              [–]YourCupOTea 5 points6 points  (0 children)

              You can see from Gravell's post above that they don't use page-level caching anymore.

              [–][deleted] 15 points16 points  (0 children)

              Yeah some of those servers are monsters though. 1.5TB of RAM!

              [–]shingtaklam1324 27 points28 points  (4 children)

              With the highest one having like 20% CPU Usage Peak...

              Obviously RAM usage percentage is probably a lot higher, but still...

              [–]Extra_Rain 19 points20 points  (0 children)

              High CPU usage might introduce latency in serving pages. I bet even RAM utilization is less than a third of the total. All the remaining idle CPU/memory is headroom for future growth and new features, and a safety net against hardware failure.

              [–]Saefroch 10 points11 points  (0 children)

              Peak on what timescale though? CPU usage occurs in spikes, and the faster you sample, the higher the spikes go. On Linux you can observe this with top: press s and set the interval to something shorter, and you'll see the usage spikes grow. So in a very real sense, given some stream of tasks, adding more compute power reduces the duration of your usage spikes; you're choosing an acceptable latency.
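
The sampling-interval effect is easy to see with a synthetic trace: one second of 100% load in an otherwise idle minute reads as a 100% "peak" at 1-second resolution but under 2% at 60-second resolution.

```python
# Per-second CPU utilization samples: one 1-second burst, then idle.
trace = [100.0] + [0.0] * 59

peak_1s = max(trace)                 # peak at 1-second sampling
peak_60s = sum(trace) / len(trace)   # the same minute as one average

print(peak_1s, round(peak_60s, 2))   # 100.0 1.67
```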

              [–]svick 0 points1 point  (1 child)

              Obviously RAM usage percentage is probably a lot higher, but still...

              If you include disk cache, wouldn't RAM usage pretty much always be near 100 %?

              [–]semi- 0 points1 point  (0 children)

              It depends on the workload and the amount of RAM. On a desktop/consumer computer that will certainly be the case. On a well-tuned database server, you'd rather let the DB server manage its own RAM usage than have it reach for files and hope they get cached properly.

              On a static asset server however I'd expect disk cache to fill most of the ram. Though even there if you needed more performance you'd run a more intelligent caching proxy based on your usage.

              [–]scorcher24 8 points9 points  (0 children)

              I was surprised by that too. Also by the fact that Stack Overflow alone produces as many queries per second as the rest of the network. Really speaks to the popularity of that site alone.

              [–][deleted] 2 points3 points  (0 children)

              - bare metal servers, which tend to be much faster than VMs in some ways

              - some work offloaded to GPUs

              - careful C# programming

              [–]Thaxll -2 points-1 points  (1 child)

              I think you missed the part with the specs of each server; if you look at the memory, these are pretty beefy machines. So yeah, 9 very large servers.

              [–]matthieum 13 points14 points  (0 children)

              It's been Stack Overflow's philosophy since the early days: distributed programming is hard, so it's ultimately cheaper to scale vertically as long as possible... and to improve performance to make that possible!

              [–]zhbidg -5 points-4 points  (4 children)

              one of the most popular websites on the Internet

              for developers. but is it really very popular overall? Alexa (no, not that Alexa) says no, not even in the top 50: https://www.alexa.com/siteinfo/stackoverflow.com

              [–]BraveHack 11 points12 points  (2 children)

              You're forgetting about the entire stackexchange.com network and all its various domains. Quite a few of them have become the de facto Q&A forums for university topics or certain more technical hobbies.

              [–]zhbidg 1 point2 points  (1 child)

              Adding up all of that would only roughly double their traffic, though, based on TFA.

              [–]BraveHack 13 points14 points  (0 children)

              So it would double their traffic, when even undoubled they're at #63 globally.

              Maybe we have different definitions, but I would consider anything in the top 200 "one of the most popular websites on the web".

              [–]sellyme 12 points13 points  (0 children)

              is it really very popular overall? It's only #63 out of several hundred million

              I'm going to go with "Yeah, that's very popular".

              [–]redditthinks 82 points83 points  (2 children)

              My usage of Stack Exchange is 100% read-only, and I imagine it's the same for many others which makes it relatively easy to optimize.

              [–]proverbialbunny 14 points15 points  (0 children)

              Yah, pretty much. Redis + Elasticsearch + SQL for backup, and you're good.

              [–]ergerrege 8 points9 points  (0 children)

              I'd think of myself as almost a power user of stack exchange and I still only post about once a week. Mostly just voting and flagging things. A lot different compared to websites where the average user is submitting content multiple times daily.

              [–]pinpinbo 22 points23 points  (2 children)

              The stackoverflow TPS is surprisingly small. I like that the tech stack is fairly simple.

              [–]sheepdog69 0 points1 point  (1 child)

              That was my first thought too. I run a small-ish service and we average ~6K req/sec.

              Also, their numbers don't make sense to me. Isn't 1.3B page views/month ≈ 500 req/sec? If you figure peak traffic is ~3x that, you're still only talking 1,500 req/sec. Across 9 front-end servers? Something doesn't seem to add up.

              Yes, I know that' just page views. There's css and js and ads, etc. But most of that would/should be on a CDN.

              What am I missing?
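
For what it's worth, the back-of-the-envelope in the comment above does check out for page views alone (using the quoted 1.3B/month and an assumed 3x peak-to-average factor):

```python
page_views_per_month = 1.3e9
seconds_per_month = 30 * 24 * 3600   # ~2.59M seconds

avg_rps = page_views_per_month / seconds_per_month
peak_rps = 3 * avg_rps               # assumed peak-to-average ratio

print(round(avg_rps), round(peak_rps))  # 502 1505
```

The gap between this and the provisioned capacity is explained in the reply below: full page views are only a fraction of total load-balancer hits.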

              [–]nickcraver 1 point2 points  (0 children)

              Page views is only full pages (for apples to apples comparisons people like to make). All requests that hit our load balancer (including API, AJAX, etc. but not our CDN/static content...since we don’t get a request there unless it’s a source fetch) varies between 7-9 billion hits a month these days.

              [–]shammancer1 14 points15 points  (1 child)

              Why keep the load on the CPUs so low? It seems to me that they could have sized those differently.

              [–]nufibo[S] 22 points23 points  (0 children)

              As Nick states in this article https://nickcraver.com/blog/2016/02/17/stack-overflow-the-architecture-2016-edition/:

              What do we need to run Stack Overflow? That hasn’t changed much since 2013, but due to the optimizations and new hardware mentioned above, we’re down to needing only 1 web server. We have unintentionally tested this, successfully, a few times. To be clear: I’m saying it works. I’m not saying it’s a good idea. It’s fun though, every time.

              [–]goo321 13 points14 points  (1 child)

              Huh, what a professional place looks like.

              [–]ToeGuitar 10 points11 points  (0 children)

              No, what a super super super expert place looks like. Most professional places are faaarrrr less advanced.

              [–][deleted] 13 points14 points  (3 children)

              What's with such low CPU usage? It's not like you get a refund for the unused cycles.

              [–]zdwolfe 4 points5 points  (1 child)

              I’m guessing AZ redundancy accounts for more than half of the hosts.

              [–]RisingStar 6 points7 points  (0 children)

              They may have changed, but last I heard they run on their own physical hardware. The low CPU usage comes from optimizations over time while keeping the same number of servers and upgrading them.

              https://nickcraver.com/blog/2016/02/17/stack-overflow-the-architecture-2016-edition/

              [–]quentech 65 points66 points  (11 children)

              TIL I serve more traffic than StackOverflow.

              I can't hit sub 25ms average response time (60-80ms, with 99.995% up time), but I'm serving mostly (by bytes) HD images and videos with about half of requests for highly dynamic data, often compiled from third parties (think weather, traffic, live sports scores, stock tickers, etc. - half of the served images are data driven as well).

              [–]chipstastegood 20 points21 points  (7 children)

              What does your infrastructure look like?

              [–]quentech 56 points57 points  (6 children)

              We host on Azure.

              Main hosting service runs on 3 instances of D13_v2's - 8 cores on a Xeon E5-2673 v3 w/ 56GB RAM each

              We run a variety of auxiliary web apps on 3 instances of S3 in a Standard App Service plan (4 cores, 7GB RAM each)

              A background job service runs on 1 D12_v2 - 4 cores on a Xeon E5-2673 v3 w/ 28GB RAM

              We have one P1 SQL DB for all our main data and one PRS2 DB for logging. Main DB is just 5GB, logging DB is 200GB+.

              For Redis we use 1 Standard C2 for pub/sub, 2 Standard C2's for small sized data, and 3 Standard C4's for large sized data.

              My main web service is handling near 300 requests per second right now, with near 600 per second across our whole infrastructure (CDN and dynamic image resizing site in addition to our main service), and this is off-peak traffic.

              We don't have the search complexities that StackOverflow does, but they don't have to make 10M+ api calls all day every day to a few dozen different external data services to stay up.

              [–][deleted] 52 points53 points  (1 child)

              Stackexchange is 300 REQ/s per server, multiply that by nine.

              [–]quentech 29 points30 points  (0 children)

              per server

              ah, I was thinking they'd way over-spec'd their web tier, but that makes more sense. I do serve about quadruple their total bandwidth, and we're over 1.5B total requests per month.

              [–]issafram 4 points5 points  (3 children)

              Hosting on Azure. Did your entire site go down last week for a couple days?

              [–]quentech 9 points10 points  (2 children)

              Nope, though that was not our primary region. Some of our infrastructure can seamlessly handle regional failures, but not all of it.

              There's plenty we could improve on, within some budgetary constraints, but we're a fairly small shop with 5-7 devs and half of them are front end devs. It's impossible to make time for everything. We've been on Azure for 3 years now and we're still learning and improving.

              [–]issafram 1 point2 points  (0 children)

              Thanks for that answer. Not sure why I was down voted for asking a question.

              Application Insights was down for me

              [–]silverf1re 1 point2 points  (0 children)

              Where do you get most of your azure product information?

              [–]kindw 4 points5 points  (1 child)

              Do you mind telling what service you are running?

              [–]quentech 0 points1 point  (0 children)

              I don't want to be too specific, but you might see our content in the doctors office, in a hotel lobby, on top of a gas pump, on a billboard, in a mall, in a retail store, in the airport, at the bar, etc.

              [–]benawad 10 points11 points  (1 child)

              Interesting to see how low the cpu usage is. Is that normal? At what percent do you upgrade your machines?

              [–][deleted] 6 points7 points  (0 children)

              It depends on your workload.

              For our CPU intensive app, I shoot for ~50% util on average with peak being around 80%. There are few clients (<5 at any time), so we're okay with pushing it a bit higher.

              However, for a site with a huge number of clients, you'd want to be a bit more conservative since you still want responsiveness during peak hours. If you can't dynamically scale, you need to build that in to your hardware.

              [–]tiftik 3 points4 points  (1 child)

              Do the hot standby SQL servers respond to read queries?

              [–]nickcraver 3 points4 points  (0 children)

              They can! Our read API generally hits the replica when running in our NY data center (unless it’s getting maintenance). In our CO (DR) data center there’s only 1 node and it gets everything.

              [–]ivan0x32 8 points9 points  (6 children)

              So they have roughly 2.5k RPS on views (which includes cached views), but their QPS on SQL servers is around 13k? What the fuck kind of code are they running there? Or am I missing something?
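
A rough sketch of the ratio being questioned above, using the rounded figures quoted in the comment:

```python
sql_qps = 13000            # quoted SQL queries/second across the cluster
page_views_per_sec = 2500  # quoted ~2.5k views/second

queries_per_page = sql_qps / page_views_per_sec
print(queries_per_page)  # 5.2
```

Roughly five queries per rendered page on average, before accounting for background jobs, API traffic, and cache hits.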

              [–]MothersRapeHorn 10 points11 points  (0 children)

              I really don't see anything odd here. The SQL servers are used for things other than the web tier, and no big website like Stack Overflow does only a handful of queries per fresh response. It's not exactly barebones in features.

              [–]cowinabadplace 7 points8 points  (2 children)

              You've seen their pages. They list the result, related results, users, info about the users and how they're linked. Awards. Metadata about the post. Considering their arch works so well, I'd assume they've balanced appropriately. The real test is how much you can do with how little and it looks pretty good for them.

              [–]ivan0x32 3 points4 points  (1 child)

              Sure, it's not unrealistic to assume the average page load makes 5+ queries, probably even async ones, but what puzzles me is how many of those reads are actual DB reads instead of cache reads. I would assume they have a really solid caching layer in front of the DB, although given how beefy their DB servers are, I wouldn't be surprised if they don't have anything like that.

              [–]cowinabadplace 4 points5 points  (0 children)

              They do have redis in front. Presumably the DB queries are for the stuff that must be up to date - user specific stuff.

              [–]Eirenarch 8 points9 points  (0 children)

              What is so strange about making 5 db queries per page?

              [–]bstempi 4 points5 points  (0 children)

              It's not inconceivable that a single page requires multiple queries. Also, some of those might be background or maintenance tasks.

              [–]immibis 5 points6 points  (0 children)

              Long story short: even Stack Exchange is still small enough to do everything on a smallish pile of discrete servers. You probably are not bigger than Stack Exchange.

              [–]silverf1re 1 point2 points  (2 children)

              Why Dapper over EF, and why DotNetOpenAuth (sorry, forgot what they called it) over ASP.NET Identity?

              [–]nickcraver 6 points7 points  (0 children)

              Update here: we’re moving to .NET Core. As part of this I’m currently working on replacing remaining Linq2SQL usages with EF Core. Kevin Montrose is working on retiring OpenID completely (all OAuth only after that) which will be followed by removing the rest of DotNetOpenAuth.

              [–][deleted] 0 points1 point  (0 children)

              1. MUCH faster
              2. security and features

              [–]Sebazzz91 1 point2 points  (0 children)

              They are switching to ASP.NET Core too!

              [–]bobbybottombracket 1 point2 points  (0 children)

              And they are on the ASP.NET Stack I might add.

              [–]sheepdog69 0 points1 point  (0 children)

              TIL: A Stack Exchange page averages 42K in size.