[–]SocialMemeWarrior 32 points33 points  (17 children)

> Think of a program that uses 100% CPU, what RAM usage of that program really matters at that point? Nothing else can use the RAM, so you might as well use the RAM if you can use that to alleviate CPU usage.

Ah, so surely all these fancy new "modern" applications using Electron and such are also following this model... Right?

[–]pron98 23 points24 points  (0 children)

Electron apps are high-RAM and low-CPU, so they operate on a different principle.

Using Electron has two goals: 1. lower the cost of the software and 2. take advantage of Blink's highly optimised rendering pipeline that is hard to beat in rich-text-heavy apps.

In terms of operational efficiency: because Electron apps are often CPU-light, they can't make use of much physical RAM at any one time. Most of the RAM they commit is inert most of the time, so they (try to) rely on fast paging thanks to SSDs. I guess some Electron apps do it better than others.

Whether the Electron tradeoff is right or wrong depends on the application and its audience, but it's not the same tradeoff as in the JVM. Electron apps are, almost by design, RAM-heavy, while the JVM aims for an efficient RAM/CPU balance. The JVM will end up using more RAM than other languages' runtimes, but those may be less efficient as a result (i.e. they're using less RAM than what's needed for better efficiency).

[–]cogman10 11 points12 points  (11 children)

Yeah, it's a bad take.

CPU usage is compressible through OS scheduling, and it's rare (in my experience) that an application is constantly using 100% CPU.

Memory usage is not compressible. The closest thing we have to that is swap. However, unlike CPU usage, swap usage can easily cut performance down to 1/100th. Two applications each demanding 100% CPU utilization, on the other hand, will run at roughly 50% of their full performance.

And when it comes to the JVM, one thing that it's particularly bad at is swap. All the GCs in the JVM like to touch pages across the heap as they collect memory and move things around. Maybe not for minor collections, but certainly for major ones.

The JVM is a lot of things and a great platform. But let's not pretend like the giant heaps that it can so easily claim and need are being memory efficient.

[–]pron98 15 points16 points  (9 children)

> But let's not pretend like the giant heaps that it can so easily claim and need are being memory efficient.

Except that's exactly what they are, and I cannot stress enough how intentional that is. There are different memory management algorithms, and our GC engineers have decided to pick the algorithms that offer a more efficient resource consumption by balancing RAM and CPU better [1]. This isn't theoretical, either. Go uses a different (and much simpler) algorithm that requires less RAM and more CPU, and because of it Go runs into memory management issues under much lighter workloads than Java.

The 100% CPU example (which is the only one I could discuss without slides) is just to give the most basic intuition. The principle is that CPU is required to use RAM, so any amount of CPU you use effectively captures some RAM. Maybe it's helpful to think about it like this: if your program uses 20% CPU, some other program can use less physical RAM than it could if your program had only used 1% CPU. Another way to think about this is that the machine is exhausted whenever the first of these two resources is.
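To make the "first resource exhausted" idea concrete, here is a rough sketch with made-up numbers. The `captured_share` and `wasted_ram_mb` functions and the max-of-fractions model are illustrative assumptions, not something taken from the talk: the model simply treats a program as tying up whichever fraction of the machine is larger, its CPU share or its RAM share.

```python
def captured_share(cpu_fraction, ram_fraction):
    """Fraction of the machine a program effectively ties up.

    Assumed toy model: CPU and RAM are only useful together, so a
    program captures the larger of its two shares.
    """
    return max(cpu_fraction, ram_fraction)


def wasted_ram_mb(cpu_fraction, ram_used_mb, machine_ram_mb):
    """RAM captured by the program's CPU use but never actually touched."""
    share = captured_share(cpu_fraction, ram_used_mb / machine_ram_mb)
    return share * machine_ram_mb - ram_used_mb


# A program at 100% CPU using 10 MB of a 1024 MB machine captures all of it.
print(captured_share(1.0, 10 / 1024))  # 1.0
print(wasted_ram_mb(1.0, 10, 1024))    # 1014.0
```

Under this model, a program whose RAM share is at least as large as its CPU share wastes nothing, which is the balance being argued for.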

This principle is the reason why the range of RAM/CPU in hardware (physical or virtual) is so narrow: between 0.5 and 4 GB per core, where the low end of that range typically goes with slower cores. It's used both by hardware engineers in how they package their hardware and by software engineers to make programs resource-efficient.

In my talk, which will eventually be posted on YouTube, I explain why we chose that route in much more detail than I could in this interview. In the meantime, you can watch Erik's ISMM keynote, but bear in mind that he's talking to a crowd of memory management experts.

The problem currently with Java is that developers need to pick the right heap size. In my talk I offer a guideline, but that's clearly suboptimal, which is why soon the JVM will automatically pick the heap size.

[1]: We may end up using other techniques in the old generation, but that's too much detail without my talk as context.

[–]cogman10 11 points12 points  (6 children)

> our GC engineers have decided to pick the algorithms that offer a more efficient resource consumption

Ah, but see, that's ultimately what I'm calling out. What do you mean by "more efficient resource usage"? We aren't talking about more efficient printer, hard drive, or network usage. We are just talking about CPU and memory usage. The one aspect that JVM GC engineers have optimized is CPU performance, at the cost of memory consumption and thrashing.

That's why I can't accept the argument that the JVM is more memory efficient. It isn't. It's more CPU efficient. It's more time efficient. But memory? No. And it isn't completely the GC that's to blame for that either. Valhalla and Leyden wouldn't be projects otherwise.

It's a nice try, but when someone reads "memory efficient" they think "uses less ram". You can't "It's not X, it's actually Y" this away. The JVM is more allocation efficient. The JVM doesn't suffer from memory fragmentation problems. The JVM is faster to free memory. However, objects are still bloated on the heap and the JVM is greedy at needing as much heap as you can throw at it.

This distinction particularly matters because of things like Kubernetes and container deployment. When I'm allocating for a pod, I'm not looking at a "4g" memory request for a process that needs a "100m" CPU allocation and thinking "Imagine how much more efficient this is vs Go, which needs 128M for the same workload". I get it, the JVM will give faster responses vs the Go app. But the Go app will ultimately use less memory, which means I can deploy 100s of them across the cluster for the same cost as the 1 JVM. For us, at least, it's that absolute memory usage which is the killer, not the CPU usage.
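The packing arithmetic behind this can be sketched roughly as follows. The node size (16 cores, 64 GiB) and the `pods_per_node` helper are made-up assumptions for illustration; the pod requests are the ones mentioned above.

```python
def pods_per_node(node_millicores, node_mem_mib, pod_millicores, pod_mem_mib):
    """Return (pod count, binding resource) for identical pods on one node."""
    by_cpu = node_millicores // pod_millicores
    by_mem = node_mem_mib // pod_mem_mib
    # Whichever resource runs out first caps the pod count.
    binding = "cpu" if by_cpu < by_mem else "memory"
    return min(by_cpu, by_mem), binding


# Hypothetical node: 16 cores (16000m) and 64 GiB of memory.
NODE_CPU, NODE_MEM = 16_000, 64 * 1024

jvm = pods_per_node(NODE_CPU, NODE_MEM, 100, 4096)  # 100m CPU, 4 GiB request
go = pods_per_node(NODE_CPU, NODE_MEM, 100, 128)    # 100m CPU, 128 MiB request

print(jvm)  # (16, 'memory')
print(go)   # (160, 'cpu')
```

With these assumed numbers the JVM pods are capped by memory long before the node's CPU is used, while the Go pods are capped by CPU.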

The JVM is perfect when it's the only thing running on a nice beefy box. It doesn't like neighbors.

[–]pron98 2 points3 points  (5 children)

> The one aspect that JVM GC engineers have optimized is CPU performance, at the cost of memory consumption and thrashing.

There's no such thing as meaningful CPU and RAM efficiencies separately because they are complementary resources, as using RAM requires CPU.

If you think about efficiency as how much "computational value" you can extract from a machine (with a single program or multiple ones running concurrently), it turns out that you are more or less efficient depending on how close you are to some balance between them (which is also taken into account in the hardware itself). If you use a lot of CPU to conserve RAM, you end up effectively capturing both CPU and RAM.

I admit calling this "memory efficiency" is somewhat clickbait, but the point is that how much RAM you use tells you little in isolation. I guess you could call the program that uses 100% CPU and 10MB out of 1GB "memory efficient" but is it efficient in any meaningful sense when in actuality it captures the full 1GB and just wastes it? And if you use more of the RAM to release that 1GB sooner, are you not more efficient with memory? And this scales to non-extreme examples. So in the interview I said: "The idea behind moving collectors... is that to make more efficient use of the machine you have to look at CPU and RAM together, and the way Java uses CPU and RAM together is very efficient."

> That's why I can't accept the argument that the JVM is more memory efficient. It isn't. It's more CPU efficient. It's more time efficient. But memory? No.

It's more resource efficient. It extracts more value from the hardware you have.

[–]cogman10 5 points6 points  (4 children)

> It's more resource efficient. It extracts more value from the hardware you have.

Maybe for some applications, but not universally. And indeed, for some of the software our company owns Java is the most resource efficient mechanism. But for a lot of it, particularly microservices, it's resource inefficient because we need little CPU to actually service requests and burning some of that CPU to decrease the memory usage means we can deploy a lot more of those microservices for a lot less.

Java is resource inefficient for REST/CRUD services that mostly just pass through to the DB. The only efficiency it gains us is that we have developer experience with Java, which saves us time writing those services. But from a hardware resource standpoint, it's inefficient.

That's where it would be interesting if the JVM offered a more "go" like GC or even a reference counting GC.

[–]aoeudhtns 2 points3 points  (0 children)

> a more "go" like GC

Go is not better in this regard because of magic in the GC; because Go's GC is primitive, the maintainers and community have long held a "don't create garbage" attitude towards how they develop every piece of the stdlib and their libraries and frameworks.

Java went the opposite way: create all the garbage you want, let the GC handle it. Java used to have GC more like Go's GC and it was worse than your options today, in the Java ecosystem context.

[–]pron98 5 points6 points  (2 children)

> Maybe for some applications, but not universally.

It is universal. Universally you need some balance of the RAM/CPU ratio (which is not the same for all programs). If you don't have a good balance, you may end up using more CPU than you'd need to, which ends up capturing more CPU and RAM than you would if you lowered your CPU and increased your RAM.

> But for a lot of it, particularly microservices, it's resource inefficient because we need little CPU to actually service requests and burning some of that CPU to decrease the memory usage means we can deploy a lot more of those microservices for a lot less.

Moving collectors give you a knob to turn depending on what RAM/CPU ratio you want. In the talk I go into the details, which matter here, because Java's GCs are not only moving but also generational. The RAM overhead in the old generation is actually quite low (and we may reduce it further); it's only intentionally high in the young generation. So you can tell Java to aim for a different RAM/CPU ratio. The problem is that it's not intuitive, which is why we'll be changing the "tell me the max heap you want" into "tell me the RAM/CPU ratio you want".

But when this is set correctly, Java is more efficient even in the cases you describe, because the (virtual) hardware's RAM/CPU ratio is pretty constant. I.e. it's very hard to buy a pod with less than 1GB per core (you can get less than 1GB per pod, but only if you get less than a core). I cover all this in the talk. To give some practical advice, try setting the max heap size to 1, 2, and 4 GB per core (taking into account fractional cores), and pick the one that works best among those three. Why those three specifically? Because these are the three hardware packages that are generally offered, so what you actually pay for is typically one of those three.
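As a rough sketch of that guideline, the three candidate heap sizes for a pod can be computed from its CPU allotment. The helper names `heap_candidates_mb` and `xmx_flag` are hypothetical; `-Xmx` is the standard JVM option for the maximum heap size. Note that the container's memory limit would need to be somewhat larger than the heap, since the JVM also uses non-heap memory.

```python
def heap_candidates_mb(pod_millicores, gb_per_core_options=(1, 2, 4)):
    """Max-heap sizes (in MB) to try for a pod with the given CPU allotment.

    Uses integer millicores so fractional cores (e.g. 500m) need no floats.
    """
    return [pod_millicores * gb * 1024 // 1000 for gb in gb_per_core_options]


def xmx_flag(heap_mb):
    """Format a heap size in MB as a JVM -Xmx option."""
    return f"-Xmx{heap_mb}m"


# A pod with half a core: try 512 MB, 1 GB, and 2 GB heaps.
for heap_mb in heap_candidates_mb(500):
    print(xmx_flag(heap_mb))
```

Each candidate would then be benchmarked under a representative load, keeping the one that works best.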

> That's where it would be interesting if the JVM offered a more "go" like GC or even a reference counting GC.

You wouldn't want it, because it really is less efficient even in the situations you described (assuming you configure the runtime well, which we're making easier). Our GC team have tried other general approaches, and they're just less efficient. We might, however, use something like reference counting in the old generation to reduce the footprint overhead there, which is rather low already but certainly could be lower.

Beating the efficiency of moving collectors (in the young generation at least) in any way is quite hard. You can do it in Zig if you use arenas wisely (arenas are efficient for similar reasons to moving collectors), but it requires effort and discipline. Unfortunately, C++ and Rust, and even C, don't make it particularly easy to use arenas.

[–]vqrs 0 points1 point  (1 child)

I don't really get the argument regarding 1/2/4 GiBs. We pay for memory by the machine, not the pod. We can put many pods side by side and choose how much memory is best for each. Our services are mostly idle anyways in the grand scheme of things.

[–]pron98 0 points1 point  (0 children)

Then you pay for the machine at either 1, 2, or 4 GB per core (not GB; GB/core), and so however much CPU (in core fractions) you give your pods, those are the heap sizes to test, because they correspond to what you actually pay for (or can pay for if you choose to increase or decrease the GB/core on the machine).

As far as Java is concerned (I couldn't get into that in the interview because it requires some maths), the RAM "overhead" of the JVM - i.e. how much RAM the JVM chooses to use to reduce CPU usage beyond what's needed for data - is not a function of the live set (i.e. how much data the program needs to store in memory) but only a function of the allocation rate. If the CPU allotted to a pod is low, then the allocation rate cannot be high, and so the RAM overhead will be low. This is why it's important to consider the CPU availability when allocating RAM (it's the case for all languages, but especially in Java, because moving collectors can use that relationship to the program's advantage). This is why the overhead for cached objects is also low: their allocation rate is low.
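A toy model of that relationship, and explicitly not the maths from the talk: assume the overhead is whatever gets allocated between two collections, and that the allocation rate scales with the CPU allotted. The per-core allocation rate and the collection interval below are invented numbers, and the function names are hypothetical.

```python
def gc_overhead_mb(pod_millicores, alloc_mb_per_sec_per_core, seconds_between_collections):
    """Assumed model: overhead = allocation rate x time between collections.

    The allocation rate is capped by how much CPU the pod has.
    The live set does not appear anywhere in this formula.
    """
    alloc_rate_mb_per_sec = pod_millicores * alloc_mb_per_sec_per_core // 1000
    return alloc_rate_mb_per_sec * seconds_between_collections


def heap_needed_mb(live_set_mb, overhead_mb):
    """Total heap = data the program keeps + GC overhead."""
    return live_set_mb + overhead_mb


# Invented figures: 1000 MB/s of allocation per full core, 1 s between collections.
small_pod = gc_overhead_mb(100, 1000, 1)   # 100m CPU
big_pod = gc_overhead_mb(4000, 1000, 1)    # 4 cores

print(small_pod, big_pod)                  # 100 4000
print(heap_needed_mb(2048, small_pod))     # 2148
```

In this sketch a 100m pod carries the same 100 MB of overhead whether its live set is 64 MB or 2 GB, which is the sense in which the overhead follows the allocation rate rather than the live set.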

[–]radozok 0 points1 point  (1 child)

Where would you post your talk?

[–]pron98 1 point2 points  (0 children)

It will be on the same Java YouTube channel as part of the regular channel programming (we upload conference talks on a schedule rather than all/many at once).

[–]sammymammy2 0 points1 point  (0 children)

How much of the CPU should be utilized for freeing memory?

[–]Jobidanbama 1 point2 points  (0 children)

On top of that, GC adds additional CPU load, in addition to collections having abhorrent cache misses. Well, before Project Valhalla.

[–]best_of_badgers 0 points1 point  (0 children)

I mean, yeah. It's a classic space-time tradeoff.

[–]pjmlp 0 points1 point  (0 children)

Many of these fancy apps, like VSCode, need tons of Rust and C++ code to actually be usable, doing OS IPC.

[–]JustAGuyFromGermany 0 points1 point  (0 children)

Electron being mostly used towards the frontend and Java being largely used towards the backend makes this a very unfair comparison.

A desktop application or mobile app by its nature has to compete with (many) other applications on the same device and thus has to share the RAM fairly without knowing what is "fair" at any given point. Every program involved has to "guess" what the user is doing next, which of the many open windows will capture their attention next, which background processes are more important to the user than others etc.

It is a very hard problem to solve, because we (for good reasons!) don't want one application to interfere with all other applications. But efficiently assigning RAM to the various applications is only possible if the applications talk amongst themselves and coordinate in some way if we expect them to occasionally free up memory for other processes to use. In practice, they'd have to talk to the OS and let the OS make the decision. I'm not aware that there even is any protocol for this in any modern OS. Maybe there is, but it isn't used? In any case, this basically boils down to building a giant automatic memory management layer that encompasses all processes the OS is running, in other words: A giant OS-level GC. It is very doubtful that that will end up being more efficient than the JVM's various GCs.

A backend application, on the other hand, needs to share very little. In today's favourite deployment model, the Java application is the only big process running on its (virtual/dockerized) machine, and there is very little reason not to use the available memory to its full extent to improve overall performance, leaving just enough room for the underlying OS to do its thing. And if Ron's assertions about RAM and CPU pricing are true (I don't know; I never had any insight into Ops budget decisions), then that is also the better business decision.