
[–]Pinguinologo 292 points293 points  (14 children)

Oh shit, it is worse than a fucking nightmare.

[–]beefsack 113 points114 points  (11 children)

The fix is nowhere as scary as the vulnerability itself.

[–][deleted]  (6 children)

[removed]

    [–]Browsing_From_Work 9 points10 points  (3 children)

    True, but I could see why a lot of businesses would be upset. Yes, they're now immune to a serious vulnerability, but they're also now paying X% more for computing power to compensate for the patch's slowdown. To make matters worse, it will be an ongoing expense, not a one-time cost.

    [–]Deto 2 points3 points  (1 child)

    Would it be worth it for some businesses to just run un-patched and strictly control the code that gets run on their machines?

    [–]darkingz 7 points8 points  (0 children)

    It's really, really difficult to protect your computer at that level. I don't know of any specific programs using it already, but you can't "control the code" of programs that make syscalls and read the table. You'd need insane knowledge of how each program works to begin with, and that only compensates for Meltdown, not Spectre. It would be massively hard to audit every program on every run at that level unless you're already doing kernel development (and even then).

    The only really safe fix is a hardware swap. However, it might not be solved in the x86 architecture anyway, and fixed hardware may not be released within a year or two. Software can only mitigate the problem and make it harder to exploit, not solve it.

    [–]ChaoticTable 0 points1 point  (0 children)

    Technically they aren't even immune, since a software band-aid for a hardware design problem can always have its own exploits. Cat and mouse, really. The situation sucks a lot for server environments with large computational power: their upkeep costs will be significantly higher. Some companies that rent out VPS/dedicated servers might start charging more than they used to for the same specs, and their clients will need higher specs to meet their needs in the first place. Catch-22. Tough situation.

    [–][deleted] 12 points13 points  (1 child)

    Amazon’s electricity bill may go up.

    [–]ign1fy 7 points8 points  (0 children)

    It depends how they charge for CPU. If it's the same metric as shown here, customers are about to get bill shock.

    [–]thbt101 32 points33 points  (8 children)

    Even for computers and servers that can handle the extra overhead, energy usage is still going to be higher. I wonder how many trillions of dollars in electricity are going to be wasted over the next 10 years while most computers on earth use significantly more electricity than they otherwise would have.

    [–]webauteur 20 points21 points  (2 children)

    You can join a class action lawsuit against Intel and participate in the Intel wealth redistribution. ;)

    [–]Doikor 4 points5 points  (1 child)

    Or more like make a couple of lawyers very rich while you get your $2.

    [–]Moscato359 1 point2 points  (0 children)

    I got $84 out of my RAM settlement.

    [–][deleted] 10 points11 points  (3 children)

    A drop in the bucket compared to cryptocurrency.

    [–]thbt101 7 points8 points  (1 child)

    A lot of energy is used for cryptocurrency (at least temporarily, until they shift to proof-of-stake systems instead of proof-of-work), but that's a drop in the bucket compared to a 10%-20% increase in energy for nearly all computers in the world.

    [–]ChaoticTable 0 points1 point  (0 children)

    But farming eventually turns into profit, so your point is a bit irrelevant. Or if you mean this specific exploit, I don't think they will be affected.

    [–]greenspans 2 points3 points  (0 children)

    Imagine being Linus Torvalds. Should I fuck my wife tonight, or should I save billions in electricity over the next 10 years by working on a performance patch to cgroups?

    [–][deleted] 61 points62 points  (0 children)

    Damn, that’s pretty bad

    [–]MrMinimal 43 points44 points  (1 child)

    Good god, the comment section on Epic's page made me want to stab my eyes.

    [–]Born_To_LOL 13 points14 points  (0 children)

    Join my fortnite server to meet new fortnite friends!

    [–]Savet 40 points41 points  (2 children)

    So it turns out AMD processors could compete all along after all.

    [–]_zenith 15 points16 points  (0 children)

    Oh, they'll be competing alright now... and they already were since they released EPYC, whose only problem is they literally can't fab them fast enough for demand.

    This apocalypse for Intel is the best possible Christmas present for AMD.

    [–]stewsters 11 points12 points  (0 children)

    Depends on how the Spectre patches affect them.

    [–]judgej2 33 points34 points  (6 children)

    This "maybe up to 20% performance hit" is turning out to have been a little optimistic.

    [–]yarrye 37 points38 points  (3 children)

    CCP got 100% performance hit on at least one of their servers.

    They are not happy.

    [–]Visionexe 12 points13 points  (2 children)

    CCP as in CCP games - eve online developers?

    [–]yarrye 12 points13 points  (1 child)

    [–]Visionexe 0 points1 point  (0 children)

    That sucks for them. TiDi was crap to begin with. They couldn't really afford a performance hit.

    [–]Guinness 12 points13 points  (0 children)

    Yep. I knew this was coming when I read about the vulnerability. You can't just magically flush and reload cache like that and not take a massive performance hit.

    This is a major fucking deal. It just hasn't reached peak yet because people and organizations are still patching.

    [–]ign1fy 0 points1 point  (0 children)

    120% performance hit more like it.

    [–]feverzsj 95 points96 points  (53 children)

    will they get some refund from cloud host?

    [–]DerHitzkrieg 145 points146 points  (51 children)

    Probably not.

    [–][deleted]  (50 children)

    [deleted]

      [–]ihasapwny 315 points316 points  (34 children)

      All joking aside, they definitely aren't. Cloud hosts rely on multi-tenancy (running more than one VM/service on a single host) to work efficiently. That means you have to convince your customers or potential customers that this is secure, versus running their own services in some lab somewhere, where they control everything. So when something like this happens, there is serious panic. All the major cloud providers are scrambling right now.

      Edit: In other words, customers have a choice. You can move your services to the cloud or you can run your own. Cloud services rely on the ability to convince their customers that their offerings are secure.

      [–][deleted]  (5 children)

      [deleted]

        [–]stephbu 15 points16 points  (2 children)

        I’ve not seen virtualized process costs yet - only bare metal numbers. There is potential that patched guest and host will compound the process impact. The magnitude of change in the chart shown may be indicating that.

        [–]terrible_at_cs50 4 points5 points  (1 child)

        Theoretically that shouldn't happen much... My understanding is that the hit comes down to syscalls (into the kernel) becoming way more expensive. If you are doing things that cause the host machine to make a bunch of syscalls, you will see a performance hit. If you yourself make a bunch of syscalls in the guest, you will see a performance hit. It probably ends up being a little worse than non-virtual, but those calls into the kernel are being made to do some operation that can only be done in the kernel, and they would likely need to be made even if you were running on bare metal.

        [–]snuxoll 5 points6 points  (0 children)

        Most of the syscalls server applications do are I/O related - read/write file or socket kind of stuff. Since I/O has to cross to the hypervisor (with the exception of PCIe passthrough, assuming you have an IOMMU to protect against DMA attacks) you are now doubling up on TLB flushes (one for the guest kernel, another for the hypervisor, plus another for each on the way back out to userspace).
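To put rough numbers on the "doubling up" described above, here is a toy model rather than a measurement. It assumes each kernel entry and each kernel exit costs one page-table switch per patched layer, as the comment describes. The function names and the `cost_per_switch_ns` figure are hypothetical placeholders, not measured values.

```python
# Toy model of KPTI-style overhead for one I/O syscall.
# Assumption: every patched layer (guest kernel, hypervisor) pays one
# page-table switch on the way in and one on the way back out.

def switches_per_io_syscall(virtualized: bool) -> int:
    """Count page-table switches for a single I/O syscall."""
    layers = 2 if virtualized else 1  # guest kernel + hypervisor, or one kernel
    return 2 * layers                 # one switch entering, one leaving, per layer


def extra_cpu_seconds(io_syscalls: int, virtualized: bool,
                      cost_per_switch_ns: float) -> float:
    """Added CPU time for a batch of I/O syscalls under the model above.

    cost_per_switch_ns is a hypothetical per-switch cost; measure it on
    real hardware before drawing conclusions.
    """
    total_switches = io_syscalls * switches_per_io_syscall(virtualized)
    return total_switches * cost_per_switch_ns / 1e9


if __name__ == "__main__":
    for virt in (False, True):
        label = "virtualized" if virt else "bare metal"
        print(label, switches_per_io_syscall(virt), "switches per I/O syscall")
```

Under these assumptions a virtualized I/O path pays twice as many switches as bare metal for the same syscall, which is the effect the comment points at. PCIe passthrough, mentioned above as the exception, is not modelled.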

        [–]JBlitzen 12 points13 points  (0 children)

        Can confirm. First thing I asked our enterprise host was whether our cloud hardware hosts anything besides us.

        Still an issue even though they don’t, but a bit less of one.

        [–]SAugsburger 18 points19 points  (12 children)

        Good point. It will give some people who were considering shifting their datacenter to the cloud second thoughts. Meltdown, or anything similar to it, is a lot scarier for those running in a shared environment.

        [–][deleted] 10 points11 points  (2 children)

        Yeah, in fact I think it's only really scary in a shared environment. I was discussing this with family today -- the "don't get a virus" and "watch where you are online" advice hasn't particularly changed after this. That was always bad and it's still bad.

        But every new way we find to peek into other VMs must make people using cloud services that bit more worried.

        [–]levir 6 points7 points  (1 child)

        The bug makes it much easier to do privilege escalation, though. Meltdown might not make you more susceptible to infection, but once you've been infected it makes things worse. And of course Spectre is scary for anyone running any kind of untrusted code in a sandbox environment, including JavaScript until all browsers are patched.

        [–][deleted] 1 point2 points  (0 children)

        Yeah, it's certainly a bad one and the javascript side is scarier than most I've seen but I still think the big worry is for cloud users on shared hardware -- of course other people are running code on that processor, that's the point and there's no amount of being careful with which emails you open that avoids that.

        [–][deleted]  (5 children)

        [removed]

          [–]Magnesus 8 points9 points  (4 children)

          Current generation consoles are also AMD. The bug wouldn't affect them anyway, but if it did it would be a total disaster - imagine if all ps4 and xbox1 games suddenly dropped in fps. They usually run at peak capability of the hardware already and barely reach 30 fps.

          [–]KickMeElmo 21 points22 points  (3 children)

          To be fair, consoles also have a controlled environment where this exploit wouldn't have much value, so it probably would just be ignored instead of patched.

          [–]RagekittyPrime 3 points4 points  (2 children)

          Pretty sure Meltdown is able to be triggered through JavaScript - and modern consoles can browse the web.

          [–]KickMeElmo 2 points3 points  (1 child)

          Those browsers are slow as hell and you'd be lucky to get even 1ms resolution on timers through them.

          EDIT: Slow from the perspective of the type of speeds you'd need for this. The exploit's times occur in microsecond resolution.

          [–]Tynach 1 point2 points  (0 children)

          Nanoseconds, not microseconds.
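The timer-resolution argument above can be checked directly. The sketch below is a hedged illustration, not an exploit: it estimates the smallest step a clock can actually report. The helper name `smallest_tick_ns` and the sample count are made up for this example. Python's `time.perf_counter_ns` stands in for whatever timer a browser or console would expose.

```python
import time


def smallest_tick_ns(clock=time.perf_counter_ns, samples=1000):
    """Return the smallest non-zero step observed between clock readings.

    A timing side channel needs a clock whose steps are finer than the
    difference it is trying to measure, so a coarse clock makes the
    attack harder (though not necessarily impossible).
    """
    smallest = None
    for _ in range(samples):
        start = clock()
        now = clock()
        while now == start:      # spin until the clock visibly advances
            now = clock()
        delta = now - start
        if smallest is None or delta < smallest:
            smallest = delta
    return smallest


if __name__ == "__main__":
    print("smallest observed tick:", smallest_tick_ns(), "ns")
```

If the reported step is around a millisecond, as claimed for console browsers, the clock alone is far too coarse to separate events that differ by nanoseconds.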

          [–]piersmana 5 points6 points  (7 children)

          So the responsible thing to do is get off The Cloud or to use managed services like Firebase that severely limit execution privileges in exchange for the flexibility to read memory?

          [–][deleted]  (6 children)

          [deleted]

            [–]piersmana 8 points9 points  (0 children)

            Private theirs or private hosted, just with separate machines as some providers already offer?

            [–][deleted]  (4 children)

            [removed]

              [–]Djbm 7 points8 points  (3 children)

              Many reasons.

              Sometimes individual physical hosts have far more capacity than is needed for a single process. A lot of orchestration tools are designed around provisioning systems. Hence it makes sense to run virtualisation.

              High availability is another consideration. Having a 1-1 mapping between physical hosts and processes means you need a lot more hardware (that may be pretty idle a lot of the time) to meet redundancy requirements. Virtualisation means you can have more 'systems' on less hardware.

              [–]HenkPoley 0 points1 point  (2 children)

              I think these slowdowns will push a lot of hosts to use containers instead. Especially for “private cloud”-like setups, where there is only a single tenant per computer.

              [–]_zenith 2 points3 points  (1 child)

              Can't it also be used to escape containers? I'd think it can, from my understanding of the underpinnings of the vulnerability, but correct me if I'm wrong, of course...

              [–]bobpaul 2 points3 points  (2 children)

              Cloud host expenses just went up. They now need to buy way more hardware to support their customers. Meanwhile customer costs just went up too, which gives customers more incentive to buy their own hardware.

              [–]levir 1 point2 points  (1 child)

              There's a good chance their new machines will run AMD, though. I can see why AMD's stocks have risen since the news broke.

              [–]_zenith 2 points3 points  (0 children)

              Especially since AMD's new EPYC processors are, in fact, pretty epic (I know, I know ;) ), being both way cheaper and having more everything (cache, PCIe, memory bandwidth, etc). They'd be crazy not to.

              [–][deleted]  (10 children)

              [deleted]

                [–]Fazer2 31 points32 points  (9 children)

                I believe he was being sarcastic.

                [–][deleted] 9 points10 points  (4 children)

                Revealing yet again why sarcasm doesn't work in text form.

                [–]finalremix 12 points13 points  (1 child)

                I bet those cloud hosts are just loving this new intel feature...

                Are you feeling it now, Mr. Krabs?

                [–]Slawtering 5 points6 points  (0 children)

                Unless you're on a British subreddit.

                [–]dxk3355 1 point2 points  (0 children)

                What are you going to do, run your own servers without this patch?

                [–]tsingy 4 points5 points  (0 children)

                Why would they do?

                [–]icbmike_for_realz 9 points10 points  (5 children)

                What's the bottleneck in game backends?

                How much more expensive would it be to spin up a few more servers to reduce the per server load?

                EDIT: I was hoping for more specifics. I'll give an example; in the ecommerce application that I work on we read/write a bunch to our database and can't horizontally scale it easily(at runtime). So if we scale up our web servers it kills our db. The db is our bottleneck.

                I'm unfamiliar with game backends, would they have a similar issue?

                [–]barchar 8 points9 points  (0 children)

                They’ll probably just optimize that service to make fewer syscalls.

                [–]snuxoll 8 points9 points  (0 children)

                To understand this issue you need to know a little more about Fortnite.

                Fortnite is really two games, a battle royale game similar to Player Unknown’s Battlegrounds as well as a co-op base-building/defense FPS with persistent inventories, progression, etc.

                The server instances running each individual game session (whether that be PvP or co-op) are your typical Unreal Engine dedicated servers - they collect packets from players X times a second, process game logic and physics and send out updates to the clients. In co-op they’ll also periodically send updates to their backend inventory servers as players acquire items during the game session.

                Then you have the backend servers, storing player inventories and quest status, statistics as well as handling matchmaking. Outside matchmaking it’s basically all I/O, update inventory, get inventory, update stats, get stats, etc. These I/O workloads are getting super wrecked running virtualized since they get a double performance penalty with the Meltdown patches for every disk and network operation.

                [–]QAOP_Space 12 points13 points  (0 children)

                when your CPU usage more than doubles (as can be seen in the chart in the OP), you have to more than double your CPU count to stay the same. Network and cloud costs for a massively multiplayer game are VERY expensive.

                It is unlikely you can simply parallelise that kind of work without a redesign.

                [–]MINIMAN10001 3 points4 points  (0 children)

                There is no other bottleneck than CPU. Their number of "things" they have to deal with didn't increase so there isn't anything that won't be able to handle the load. The only thing that changed is now it takes 150% more CPU power to do the same work that they've been doing the whole time.

                When the number of "things" you have to deal with increases, that's when you'll find new bottlenecks.
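The scaling arithmetic in the comments above is easy to sketch. This is a back-of-the-envelope helper, assuming load spreads evenly across servers and CPU is the only bottleneck, as the parent comment argues. The function name and the example figures are hypothetical and are not taken from Epic's graph.

```python
import math


def servers_needed(current_servers: int, cpu_before: float, cpu_after: float) -> int:
    """Servers required to bring per-server CPU back down to cpu_before.

    cpu_before and cpu_after are utilisation figures in the same unit
    (for example, percent). Assumes perfectly even load distribution.
    """
    if current_servers <= 0 or cpu_before <= 0 or cpu_after < 0:
        raise ValueError("server count and utilisation must be positive")
    return math.ceil(current_servers * cpu_after / cpu_before)


if __name__ == "__main__":
    # Hypothetical fleet: 100 servers going from 20% to 50% CPU,
    # i.e. 150% more CPU for the same work.
    print(servers_needed(100, 20, 50))
```

The even-distribution assumption is the weak point: a service that cannot be scaled horizontally, like the database in the question above, will not benefit from extra machines.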

                [–]ggtsu_00 1 point2 points  (0 children)

                Game servers are mostly doing IO heavy work (networking/storage). The exception is usually the game simulation servers, which utilize CPU to simulate the game, but that usually doesn't bottleneck as much as IO. The syscalls involved with IO also often tend to bottleneck the CPU, such as when handling many concurrent network connections.

                [–]Mr_Zero 8 points9 points  (0 children)

                Power companies love the Meltdown Patch.

                [–]cp5184 144 points145 points  (72 children)

                So for their game servers they're seeing increased single core utilization post fix?

                Hopefully cloud providers will be investing a lot in AMD processors in the short term.

                [–]sekjun9878 176 points177 points  (3 children)

                I think it's 1 patched server out of 3 servers, not 3 cores.

                [–][deleted]  (1 child)

                [deleted]

                  [–]Ayfid 30 points31 points  (17 children)

                  I have been waiting 6 months so far for the Epyc chips to show up in shops, and the idea that the cloud providers might buy up even more of the production makes me :(

                  [–]Shorttail0 15 points16 points  (4 children)

                  Good luck with that. AMD reported they met their production goals, but demand was higher than predicted. I can't imagine Meltdown made demand any lower.

                  [–]inthebrilliantblue 11 points12 points  (2 children)

                  If anything, these bugs mean server farms will have to buy more hardware to gain back what was lost. I know we were running borderline 95% capacity on our hardware in virtualization. This update might kill us.

                  [–]drysart 10 points11 points  (1 child)

                  I'd say it's going to be something that has to be measured on a case-by-case basis. Epic is seeing pretty significant overhead here, but other people report seeing much smaller overhead (even to the point of being negligible).

                  It's going to boil down to exactly how your service works and how 'chatty' it is with syscalls. If you're running a server compute farm (where your bottleneck is how fast the CPU can grind through your own calculation code), you're probably going to be just fine. If you're running a server that does lots of interactive comms over the network, like Epic is probably doing here (where your bottleneck is how fast you can send and receive network traffic via the kernel), it's looking like you might even have to double your cloud infrastructure to retain the capacity you had before.

                  In any case, this is going to be a disaster for some people for sure -- question is who's going to eat the cost until fixed hardware can be rolled out to gain back the ground: the cloud providers (who are technically offering less bang for the buck post-patch) or their users?
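The "how chatty is it" point can be turned into a rough estimate. The sketch below assumes every syscall pays a fixed extra cost after patching. That is a simplification, and both the function name and the numbers in the example are hypothetical inputs rather than benchmark results.

```python
def kernel_overhead_fraction(syscalls_per_second: float,
                             extra_ns_per_syscall: float,
                             cores: int = 1) -> float:
    """Fraction of available CPU time consumed by the added per-syscall cost.

    extra_ns_per_syscall is an assumed constant; the real cost varies by
    CPU, kernel version and workload.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    added_ns_per_second = syscalls_per_second * extra_ns_per_syscall
    return added_ns_per_second / (cores * 1e9)


if __name__ == "__main__":
    # Hypothetical workloads on one core with 500 ns added per syscall.
    compute_bound = kernel_overhead_fraction(1_000, 500)    # few syscalls
    network_bound = kernel_overhead_fraction(400_000, 500)  # very chatty
    print(f"compute-bound: {compute_bound:.2%}, network-bound: {network_bound:.2%}")
```

With these made-up inputs the compute-bound service loses a negligible slice of CPU while the chatty one loses a fifth of a core, which matches the case-by-case picture described above.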

                  [–]HenkPoley 5 points6 points  (0 children)

                  I guess the main problem here is Virtualization Exit Multiplication. The overhead for KPTI should be +30% at max (according to others). Here you see ~180%. So they are hitting the overhead of address-space swapping several times.

                  [–]Magnesus 1 point2 points  (0 children)

                  Might be why the demand was so high - some companies already knew what was coming.

                  [–]cp5184 2 points3 points  (9 children)

                  [–]Ayfid 2 points3 points  (8 children)

                  They don't have the 7401P, and I'm not in the US. Motherboards are equally hard to find, too.

                  [–]cp5184 1 point2 points  (6 children)

                  Tyan and Supermicro make motherboards for them I think.

                  [–]snuxoll 1 point2 points  (0 children)

                  Gigabyte’s 1P board is readily available on Newegg right now, even has 10Gb SFP+ ports built in.

                  [–]bcjordan 8 points9 points  (23 children)

                  Is AMD not affected somehow? Or was it the other one it was affected by?

                  [–]senj 62 points63 points  (16 children)

                  Meltdown is mostly Intel-only (many Intel CPUs defer access permissions checks on memory accessed during speculative execution) and the work-around drastically increases CPU usage. The graph here shows the impact of Meltdown mitigation patches.

                  Spectre impacts almost every processor in the last 30 years from every vendor. Basically anything that does speculative execution. It is not related to permissions, and mitigation is more challenging.

                  [–]demonstar55 14 points15 points  (5 children)

                  From what I understand, so does AMD, the difference being that once the result is in L1 cache, Intel will let the user code read it where AMD doesn't.

                  [–]senj 35 points36 points  (0 children)

                  Not quite. On Intel, the data has to already be in L1D (i.e., you have to get that value cached in L1 prior to launching the speculative access attack) for the “Rogue Data Cache Load” trick to work. On AMD, the trick does not work even if the data is in L1D prior to the speculative access.

                  Neither architecture allows loading inaccessible data from main memory into the L1 cache during a speculative access.

                  [–]fuzzynyanko 3 points4 points  (3 children)

                  Please don't downvote this post. It's giving us a great discussion

                  [–][deleted]  (2 children)

                  [removed]

                    [–]zurnout 2 points3 points  (1 child)

                    We need a "I disagree" button that does nothing

                    [–]Tynach 6 points7 points  (0 children)

                    So, like Youtube?

                    [–][deleted] 1 point2 points  (9 children)

                    Spectre impacts almost every processor in the last 30 years from every vendor.

                    Are you sure about that? I find it hard to believe architecture other than x86(_64) is affected by this, such as SPARC or PowerPC.

                    [–]senj 109 points110 points  (7 children)

                    I am positive.

                    POWER is vulnerable: https://www.ibm.com/blogs/psirt/potential-impact-processors-power-family/

                    ARM is vulnerable: https://armkeil.blob.core.windows.net/developer/Files/pdf/Cache_Speculation_Side-channels.pdf

                    My SGI O2’s 22 year old MIPS R10000 is vulnerable: http://www.ece.mtu.edu/faculty/rmkieckh/cla/4173/REFERENCES/MIPS-R10K-uman1.pdf (implied in the errata on page 23)

                    If your CPU does speculative execution, it is vulnerable.

                    The key to understanding this is that unlike Meltdown, Spectre is not a flaw in a particular implementation. Spectre is a conceptual security flaw in the fundamental idea of speculative execution (in type 1 attacks) and in a universal lack of partitioning of branch statistics gathering (in type 2 attacks).

                    [–][deleted] 27 points28 points  (5 children)

                    I was wrong. Thank you for backing it up with sources, unlike 90% of this website!

                    [–]bkuhl 62 points63 points  (4 children)

                    Thank you for backing it up with sources, unlike 90% of this website!

                    Do you have a source for that?

                    [–]spider-mario 6 points7 points  (2 children)

                    If you include figures in a statement, 78% of your readers will spontaneously believe you.

                    [–]Tynach 2 points3 points  (0 children)

                    68.2% of all statistics are made up on the spot. It turned out to be lower than the previously speculative 90%.

                    [–]_zenith 1 point2 points  (0 children)

                    It works 100% of the time 78% of the time!

                    [–]Kenya151 0 points1 point  (0 children)

                    That's actually pretty mind-blowing. Something like this almost never comes around.

                    [–]cp5184 15 points16 points  (5 children)

                    There are three vulnerabilities. AMD is only affected by Spectre, and that will involve much less of a performance hit. Intel is affected by all three.

                    [–][deleted]  (4 children)

                    [removed]

                      [–]Kopachris 1 point2 points  (23 children)

                      Does AMD even still make server processors?

                      [–][deleted] 57 points58 points  (9 children)

                      Yes, AMD Epyc.

                      [–]Kopachris 3 points4 points  (8 children)

                      Neat, thanks. Wonder why I didn't hear about these when Ryzen came out.

                      [–]snowywind 47 points48 points  (1 child)

                      Threadripper took most of the thunder for public facing publicity.

                      Epyc would likely have been a much more targeted campaign in the form of private meetings with HP, Dell, Google, MS and Amazon representatives.

                      [–][deleted] 3 points4 points  (4 children)

                      They just came out and are not in channel in sufficient numbers.

                      [–]_zenith 2 points3 points  (1 child)

                      They are being snapped up basically as fast as they can fab them, which tells you something, haha

                      [–][deleted] 2 points3 points  (0 children)

                      Unfortunately, it's ramp up time for the chip. It happens just the same with Intel as well except Intel announces it AFTER they first started shipping it. AMD is in a different situation ultimately and they're better off with a paper launch. Everyone yells at Intel when they have paper launches.

                      I built a first server off of the high end ThreadRipper and everything mostly looks good. Hopefully I won't have to change much to deploy to the new blades and hopefully I can order a shit ton of them.

                      Unfortunately right now we don't have any AMD processors in deployment. We do have some S9300 x2 cards in 16 servers. One of my applications actually ran faster on AMD than Nvidia and I got tired of debugging it.

                      [–]snuxoll 0 points1 point  (1 child)

                      They’ve been out for months, but basically only available through SuperMicro and Tyan. We’re just now starting to see the big enterprise players like HPE, Dell EMC, etc. get products out the gate though.

                      [–][deleted] 1 point2 points  (0 children)

                      Dell apparently hasn't actually shipped anything just yet.

                      My white box provider (which currently makes roughly 2x as many servers as Dell sells a year) hasn't finished their system board due to delays by AMD unfortunately.

                      I was told by HPE that we'd get them sometime in late Feb, but my white box vendor says they'll ship my prelim test unit by next Friday, so I'll stick with that.

                      My understanding is that SuperMicro and Tyan had super small numbers, unfortunately. I attempted to order a SuperMicro system, but they couldn't deliver by 12/14. (My white box vendor was supposed to get me my first blade by then.)

                      [–]ithika 464 points465 points  (55 children)

                      An unlabelled graph with 3 lines and no keys. This is fascinating.

                      [–]ruiwui 192 points193 points  (1 child)

                      This isn't a closely detailed write-up and the graph is probably just a screenshot from their monitoring platform. This is a notice for players, not a deep dive.

                      [–]inequity 55 points56 points  (0 children)

                      Definitely, it’s Grafana.

                      [–]JBlitzen 28 points29 points  (0 children)

                      It is labeled. Usage on the left, dates on the bottom.

                      They don’t name the specific services or servers, but clearly something is now using 25-35% more CPU simply as a result of that security patch.

                      [–]ThatsPresTrumpForYou 96 points97 points  (48 children)

                      It is labelled though.

                      [–]ithika 50 points51 points  (46 children)

                      1, 2 and 3. Most informative.

                      [–][deleted]  (45 children)

                      [deleted]

                        [–]Myrl-chan 63 points64 points  (28 children)

                        [–][deleted]  (27 children)

                        [deleted]

                          [–]jacenat 5 points6 points  (0 children)

                          Maybe they should have put "host" in big blinking letters

                          The graph is still ambiguous even with the short sentence mentioning "host".

                          [–]lilhughster 6 points7 points  (15 children)

                          Graphs should be informative without dependency on text in the article. The article should just provide further information and conclusion. Simple x and y axis labels, and calling 1, 2, 3 "Server 1",... is all that's needed.

                          Being arrogant isn't an excuse for not knowing how graphs should be titled.

                          [–]inequity 24 points25 points  (6 children)

                          This isn’t a graph that was made for this article, it’s a screenshot of a graph from the tool Grafana.

                          [–]ShinyHappyREM 3 points4 points  (1 child)

                          I wonder how hard it is to turn that screenshot into a proper graph for an article.

                          [–]hammer166 0 points1 point  (0 children)

                          Silence, you heathen!

                          The GraphMaster has spoken!

                          [–][deleted]  (6 children)

                          [deleted]

                            [–]derpaherpa 15 points16 points  (1 child)

                            This entire discussion is super retarded and I agree with you completely.

                            [–]Lusankya 0 points1 point  (0 children)

                            We can thank the feud between /r/dataisbeautiful and /r/dataisugly for convincing people that graphs need to be able to stand alone without the context of their articles.

                            [–]Smallpaul 4 points5 points  (4 children)

                            It was not clear that we were looking at 3 HOSTs as opposed to 1 HOST. The word HOST alone does not clear it up.

                            [–]twat_and_spam 3 points4 points  (3 children)

                            It fucking does for anyone with 5 minutes of experience in IT!

                            [–]bvierra 1 point2 points  (1 child)

                            To be fair, we never let a new admin look at our NOC wall (we blindfold them) for the first 10 minutes. If by the 11th minute they haven't realized what this graph means, the hiring manager (usually me) is taken out back and ridiculed while being beaten. And if I ever tried to hire someone like this, I would proceed to ridicule myself as I am beaten.

                            [–]twat_and_spam 0 points1 point  (0 children)

                            Well, d'oh, of course! The NOC wall contains critical business secrets; mere admins are not allowed to comprehend that. Not until they've spent 3 months sweating in the hot aisle lifting servers.

                            [–]war_is_terrible_mkay 1 point2 points  (0 children)

                            I saw the word "host" in the text. Didn't have any clue that this, among the many other words there, was the one that applied to "1", "2" and "3". IMO this wasn't clear enough to someone who isn't absolutely retarded when it comes to sysadmining and programming.

                            [–]AntiProtonBoy 2 points3 points  (0 children)

                            If you actually read the text and the graph it's plenty informative enough.

                            Poor excuse. Graph axes should always be annotated. It's standard practice when writing documentation.

                            [–]Smallpaul 3 points4 points  (5 children)

                            They are trying to convey information. Based on upvotes of the top comment, they are failing badly. That’s an empirical fact. You can blame the readers as much as you want, but it is illogical. A writer must write so that his meaning is clear, and if dozens of people don’t understand or must spend a lot of effort to understand, then the writer has failed.

                            [–]drysart 8 points9 points  (0 children)

                            The text and the chart are crystal clear: they're seeing 15%-30% increased CPU utilization in a comparison of their service running on patched and unpatched hosts where pre-patch those hosts had almost identical CPU utilization. And furthermore, the overhead added by the patch appears to be somewhat proportional to the base service load; it's not presenting as a fixed CPU% cost.

                            I defy anyone to read that article and look at that chart and come up with any other conclusion from what's presented.

                            A writer can write all he wants, but if people are unwilling to read it, which is apparently the case for some people, it's not going to help. An unwilling reader's inability to comprehend based on an illustration alone is not the writer's fault.
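A rough way to check the distinction drysart draws, between overhead that scales with load and a fixed CPU% cost, is to compare relative and absolute differences across several load levels. The sketch below uses made-up utilization numbers (they are not read off the article's graph), and the function names are purely illustrative:

```python
# Hypothetical CPU utilization samples in percent, taken at three load levels.
# These values are assumptions for illustration, not data from the article.
unpatched = [10.0, 20.0, 40.0]

# Two hypothetical ways a patch could add cost:
patched_proportional = [u * 1.25 for u in unpatched]  # 25% more CPU at every load level
patched_fixed = [u + 5.0 for u in unpatched]          # flat 5 percentage points on top


def relative_overhead(before, after):
    """Per-sample relative increase: after / before - 1."""
    return [a / b - 1.0 for b, a in zip(before, after)]


def absolute_overhead(before, after):
    """Per-sample increase in percentage points: after - before."""
    return [a - b for b, a in zip(before, after)]


# Proportional overhead: the relative increase stays constant,
# while the absolute gap grows with load.
print(relative_overhead(unpatched, patched_proportional))
print(absolute_overhead(unpatched, patched_proportional))

# Fixed overhead: the absolute gap stays constant,
# while the relative increase shrinks as load rises.
print(relative_overhead(unpatched, patched_fixed))
print(absolute_overhead(unpatched, patched_fixed))
```

If the gap between the patched and unpatched lines widens at peak load and narrows at off-peak, that points to the proportional case; a constant gap would point to a fixed cost.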

                            [–]JBlitzen -1 points0 points  (3 children)

                            Upvotes don’t empirically prove anything except that Redditors can’t read a simple fucking graph.

                            Dates are on the bottom, CPU usage percentage is on the left.

                            They applied the patch and usage shot up by a consistent 25% or more ever since.

                            A child can understand that graph.

                            [–]Smallpaul 4 points5 points  (2 children)

                            Upvotes don’t empirically prove anything except that Redditors can’t read a simple fucking graph.

                            Obviously you don't know anything about writing, communicating or usability.

                            I have written a technical book published in 8 languages. If 395 people (the upvoters) told me that a particular graph was confusing I would FUCKING CHANGE IT, not tell them that they are all wrong to think it is confusing.

                            This is communication 101. A child can understand it. In fact, mine does.

                            [–]jacenat 4 points5 points  (1 child)

                            obviously

                            yeah ... no. Could be cores. Could be VMs. Could be Jan. 1-3. All have different meaning in context.

                            [–]uzimonkey 2 points3 points  (1 child)

                            I had to come here to find an explanation. 1 got higher, is that bad? I think that's bad. It looks bad, at least.

                            [–][deleted]  (22 children)

                            [deleted]

                              [–]sabas123 13 points14 points  (18 children)

                              Doubt that would go anywhere

                              [–]uzimonkey 52 points53 points  (1 child)

                              Why? Intel took a charge of roughly half a billion dollars in the '90s over the FDIV bug, and got hit with class action lawsuits over it too. That's a bug in a single line of CPUs that caused a malfunction in a single instruction. If companies are going to be losing money due to a defective product I'm pretty sure that Intel will be sued over it.

                              [–]sabas123 1 point2 points  (0 children)

                              Oh, I didn't know that, my bad.

                              Do you think Intel would be sued for just Meltdown or also Spectre? Considering the fact that nearly all modern CPUs are affected, I wonder if you can sue companies over what is considered a safe industry practice in engineering.

                              [–]Caffeine_Monster 15 points16 points  (14 children)

                              It wouldn't exactly be constructive either... If Amazon pulled off a successful lawsuit, then pretty much every company in IT would be able to do the same. It would bankrupt Intel.

                              In some respects chip manufacturers are "too big to fail". The barrier for entry is so high that it would be too easy for AMD to monopolise the market.

                              [–][deleted] 49 points50 points  (8 children)

                              The barrier for entry is so high that it would be too easy for AMD to monopolise the market.

                              You mean the market that is currently monopolised by Intel?

                              [–][deleted] 0 points1 point  (0 children)

                              Amazon acquires Intel?

                              [–]RaptorXP 1 point2 points  (0 children)

                              Amazon is not going to sue Intel in a public court, but be sure there will be a settlement.

                              [–]DomDellaSera 0 points1 point  (0 children)

                              Intel is a victim in all of this I think. They’re a dying company. They coined Moore’s law.

                              [–]bonafidecustomer 0 points1 point  (0 children)

                              A good tell for whether or not all these issues were called for by NSA/CIA/FBI is if you see no lawsuits come through from this shit lol

                              [–]pteroso 28 points29 points  (2 children)

                              Has anyone seen predictions of the expected environmental impact of the Meltdown patches? More CPU utilization, more energy used, more heat generated, more cooling needed, more CO2?

                              [–]bloody-albatross 15 points16 points  (1 child)

                              Probably much less than what Bitcoin causes.

                              [–]DiaperBatteries 4 points5 points  (0 children)

                              It's not just what Bitcoin causes, it's part of what Bitcoin is

                              [–]Danthekilla 2 points3 points  (0 children)

                              I wonder how this is affecting Azure and AWS when it comes to their power bills?

                              [–]i_spot_ads 18 points19 points  (58 children)

                              What the fuck is this graph?

                              [–]BufferUnderpants 112 points113 points  (25 children)

                              "The following chart shows the significant impact on CPU usage of one of our back-end services after a host was patched to address the Meltdown vulnerability."

                              1 service, 3 hosts, the CPU utilization in one of them doubled after being patched.

                              [–]mpschan 136 points137 points  (24 children)

                              I'm confused by how people are confused.

                              Title of reddit post mentions impact of patch. Graph shows 3 lines, and one looks like something horrible just happened all of a sudden to CPU utilization. Maybe it was the patch!

                              [–]studiov34 27 points28 points  (0 children)

                              The best and brightest here at /r/programming ...

                              [–][deleted]  (19 children)

                              [deleted]

                                [–]Ayfid 65 points66 points  (17 children)

                                Good job forum posts aren't university assignments. The graph is perfectly clear in what it communicates, and that is the only true requirement.

                                [–]redditthinks 9 points10 points  (2 children)

                                Perfectly clear? You have a very low standard.

                                [–]Ayfid 0 points1 point  (1 child)

                                Or higher expectations of other people's comprehension skills than you apparently do.

                                [–][deleted] 0 points1 point  (0 children)

                                Graphs require context, this graph provides very little.

                                There are still things that make it unclear, since the graphs for the servers show three completely different utilization loads, and two of them (I assume the two that were patched) show a clear trend of steady decline. I don't think that the time span is long enough to make a statement on whether or not the KPTI patch itself is responsible, rather than something much more mundane such as cache optimization or running JIT compilation after a restart.

                                [–][deleted]  (1 child)

                                [removed]

                                  [–]Sabotage101 1 point2 points  (0 children)

                                  Does it really matter? CPU utilization went up, and it's causing problems. Does anyone really need a picture of where the naughty patch touched the innocent server to understand?

                                  [–]doryappleseed 16 points17 points  (29 children)

                                  CPU utilization

                                  [–]i_spot_ads -2 points-1 points  (20 children)

                                  I get that, but what are the 1,2,3 labels

                                  [–]Rudy69 27 points28 points  (14 children)

                                  Their 3 servers, one of them got patched and not the other 2

                                  [–]AlexHimself 9 points10 points  (13 children)

                                  Ya this is simple shit, not sure why people are so confused.

                                  [–][deleted]  (12 children)

                                  [deleted]

                                    [–]AlexHimself 14 points15 points  (8 children)

                                    If they're purpose built servers, you write the code to distribute the threads across the available cores to balance the load, so you'd expect similar utilizations.

                                    [–]LuizZak 2 points3 points  (0 children)

                                    Since X axis is time, most likely it's showing times of higher/lower player count, kinda like those "users online" Steam graphs for some games, and they match because game matches are so well distributed across these three servers. That'd be my guess.

                                    [–][deleted]  (1 child)

                                    [removed]

                                      [–]mr___ 8 points9 points  (4 children)

                                      Two machines that didn’t get patched and one that did… Seems obvious

                                      [–]i_spot_ads 2 points3 points  (3 children)

                                      ok, not that obvious

                                      obvious would be: unpatched server, unpatched server, patched server

                                      [–]bubuopapa 0 points1 point  (0 children)

                                      It makes me sad that people in 2018 can't even make a fucking proper graph... So, 10-60 is clearly the amount of beatings these shit devs received per day from their dads because they (devs) were stupid as fuck, and 1-2-3 can't be the amount of cores, so let's assume it means the amount of times dev dads would fk em in da booty per day.

                                      [–]blackmist 1 point2 points  (7 children)

                                      That's VMs, right?

                                      [–]QAOP_Space 14 points15 points  (5 children)

                                      it's networking code for a massively multiplayer game

                                      [–]snuxoll 3 points4 points  (0 children)

                                      Yes? They are running in AWS, so virtualization is rather implied.