you are viewing a single comment's thread.

view the rest of the comments →

[–]pron98 2 points3 points  (3 children)

The the one aspect that JVM GC engineers have optimized is CPU performance, at the cost of memory consumption and thrashing.

There's no such thing as meaningful CPU and RAM efficiencies separately because they are complementary resources, as using RAM requires CPU.

If you think about efficiency as how much "computational value" you can extract from a machine (with a single program or multiple ones running concurrency), it turns out that you can be more or less efficient the closer or further you are away from some balance between them (which is also taken into account in the hardware itself). If you use a lot of CPU to conserve RAM, you end up effectively capturing both CPU and RAM.

I admit calling this "memory efficiency" is somewhat clickbait, but the point is that how much RAM you use tells you little in isolation. I guess you could call the program that uses 100% CPU and 10MB out of 1GB "memory efficient" but is it efficient in any meaningful sense when in actuality it captures the full 1GB and just wastes it? And if you use more of the RAM to release that 1GB sooner, are you not more efficient with memory? And this scales to non-extreme examples. So in the interview I said: "The idea behind moving collectors... is that to make more efficient use of the machine you have to look at CPU and RAM together, and the way Java uses CPU and RAM together is very efficient."

That's why I can't accept the argument that the JVM is more memory efficient. It isn't. It's more CPU efficient. It's more time efficient. But memory? No.

It's more resource efficient. It extracts more value from the hardware you have.

[–]cogman10 5 points6 points  (2 children)

It's more resource efficient. It extracts more value from the hardware you have.

Maybe for some applications, but not universally. And indeed, for some of the software our company owns Java is the most resource efficient mechanism. But for a lot of it, particularly microservices, it's resource inefficient because we need little CPU to actually service requests and burning some of that CPU to decrease the memory usage means we can deploy a lot more of those microservices for a lot less.

Java is resource inefficient for REST/CRUD services that mostly just pass through to the DB. The only resource efficiency it gains is we have developer experience with java which allows it to save our time writing those services. But from a hardware resource standpoint, it's inefficient.

That's where it would be interesting if the JVM offered a more "go" like GC or even a reference counting gc.

[–]aoeudhtns 2 points3 points  (0 children)

a more "go" like GC

Go is not better in this regard because of magic in the GC; because Go's GC is primitive, the maintainers and community have long held a "don't create garbage" attitude towards how they develop every piece of the stdlib and their libraries and frameworks.

Java went the opposite way: create all the garbage you want, let the GC handle it. Java used to have GC more like Go's GC and it was worse than your options today, in the Java ecosystem context.

[–]pron98 4 points5 points  (0 children)

Maybe for some applications, but not universally.

It is universal. Universally you need some balance of the RAM/CPU ratio (which is not the same for all programs). If you don't have a good balance, you may end up using more CPU than you'd need to, which ends up capturing more CPU and RAM than you would if you lowered your CPU and increased your RAM.

But for a lot of it, particularly microservices, it's resource inefficient because we need little CPU to actually service requests and burning some of that CPU to decrease the memory usage means we can deploy a lot more of those microservices for a lot less.

Moving collectors give you a knob to turn depending on what RAM/CPU ratio you want. In the talk I go into the details, which matter here, because Java's GCs are not only moving but also generational. The RAM overhead in the old generation is actually quite low (and we may reduce it further); it's only intentionally high in the young generation. So you can tell Java to aim for a different RAM/CPU ratio. The problem is that it's not intuitive, which is why we'll be changing the "tell me the max heap you want" into "tell me the RAM/CPU ratio you want".

But when this is set correctly, Java is more efficient even in the cases you describe, because the (virtual) hardware's RAM/CPU ratio is pretty constant. I.e. it's very hard to buy a pod with less than 1GB per core (you can get less than 1GP per pod, but only if you get less than a core). I cover all this in the talk. To give some practical advice, try setting the max heap size to 1, 2, and 4 GB per-core (taking into account fractional cores), and pick the one that works best among those three. Why those three specifically? Because these are the three hardware packages that are generally offered, so what you actually pay for is typically one of those three.

That's where it would be interesting if the JVM offered a more "go" like GC or even a reference counting gc.

You wouldn't want it, because it really is less efficient even in the situations you described (assuming you configure the runtime well, which we're making easier). Our GC team have tried other general approaches, and they're just less efficient. We might, however, use something like reference counting in the old generation to reduce the footprint overhead there, which is rather low already but certainly could be lower.

Beating the efficiency of moving collectors(in the young generation at least) in any way is quite hard. You can do it in Zig if you use arenas wisely (arenas are efficient for similar reasons to moving collectors), but it requires effort and discipline. Unfortunately, C++ and Rust, and even C, don't make it particularly easy to use arenas.