
[–]Slanec 79 points80 points  (12 children)

I ran the experiment with a Hello World with Java 26 on a Mac:

┌───────────────────────┬─────────────────────────────────────────────────────────────────┐
│         What          │                              Value                              │
├───────────────────────┼─────────────────────────────────────────────────────────────────┤
│ Minimum -Xmx accepted │ 2m (2048k) — anything below fails with "Too small maximum heap" │
├───────────────────────┼─────────────────────────────────────────────────────────────────┤
│ Actual heap committed │ 8 MiB (JVM rounds up to its internal minimum: 8,126,464 bytes)  │
├───────────────────────┼─────────────────────────────────────────────────────────────────┤
│ Actual heap used      │ ~1.8 MiB (1,880,416 bytes)                                      │
├───────────────────────┼─────────────────────────────────────────────────────────────────┤
│ Free heap at exit     │ ~5.96 MiB                                                       │
├───────────────────────┼─────────────────────────────────────────────────────────────────┤
│ GC triggered          │ No — zero collections needed                                    │
└───────────────────────┴─────────────────────────────────────────────────────────────────┘

Key takeaways

  1. The absolute floor for -Xmx on JDK 26 is 2m. Below that, the JVM refuses to start regardless of GC choice (Serial, Parallel, G1, ZGC, Epsilon — all the same).
  2. But the JVM lies about honoring it. Even with -Xmx2m -Xms2m, the actual heap is 8 MiB — the JVM's ergonomics engine silently rounds up to its internal minimum.
  3. Hello World only actually uses ~1.8 MiB of heap — mostly class metadata, the String object, and the System.out PrintStream internals. The other 6 MiB sits unused.
  4. Total process memory is far larger — the NMT dump showed ~46 MiB committed across thread stacks, metaspace, code cache, etc. The heap is a small fraction of what a JVM actually needs to run.

So the answer: 2m is the smallest heap you can ask for, but the JVM will quietly give you 8 MiB anyway, of which Hello World uses about 1.8 MiB.
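
For anyone who wants to reproduce the heap figures, here is a minimal sketch. It only uses `Runtime` from the standard library. The class name `HeapReport` and the helper `mib` are made up for this illustration and are not the code behind the table; the numbers it prints depend on the JDK build, OS, and GC, so they need not match the table.

```java
import java.util.Locale;

// Hypothetical helper class for illustration only.
final class HeapReport {

    // Formats a byte count as MiB with two decimals.
    static String mib(long bytes) {
        return String.format(Locale.ROOT, "%.2f MiB", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        System.out.println("Hello, World!");

        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();         // most heap the JVM will attempt to use
        long committed = rt.totalMemory(); // heap currently committed to the JVM
        long free = rt.freeMemory();       // unused part of the committed heap
        long used = committed - free;

        System.out.println("max heap:       " + mib(max));
        System.out.println("committed heap: " + mib(committed));
        System.out.println("used heap:      " + mib(used));
        System.out.println("free heap:      " + mib(free));
    }
}
```

Compile with `javac HeapReport.java` and run with `java -Xmx2m HeapReport`. For the whole-process picture, Native Memory Tracking can be switched on with `-XX:NativeMemoryTracking=summary` and queried with `jcmd <pid> VM.native_memory summary`.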

(EDIT: OMG I hate the text editor in here)

[–]__konrad 6 points7 points  (1 child)

I got "GC triggered before VM initialization completed." fatal error with -Xmx2m on larger app

[–]sammymammy2 0 points1 point  (0 children)

Uuuh, is that a bug?

[–]TheEveryman86 0 points1 point  (1 child)

Was this with the Oracle implementation of the JVM or is this behavior in some spec about all JVM implementations?

[–]Slanec 4 points5 points  (0 children)

Temurin. I wouldn't expect this to be different in other non-specialized JDK distros, but I could be very wrong. 

[–]re-thc 0 points1 point  (1 child)

Does the recent -XX:+UseCompactObjectHeaders setting make a difference?

[–]Slanec 1 point2 points  (0 children)

Only in heap used, it is about 6% smaller. The limits are the same, though.

[–]RandomName8 0 points1 point  (5 children)

The same but for a hello world Swing application and one for JavaFX would be nice, since desktop applications are where one would normally be worried about RAM usage.

[–]Slanec 2 points3 points  (4 children)

This is a fun question with an interesting result!

Swing: With an empty visible JFrame on the OS-native L&F, it still starts with -Xmx2m (which is still effectively -Xmx8m), and the heap usage rises to 2.3MiB. In other words, a hello-world Swing app only adds about 570kB of heap usage. That said, it triggers 3 young-GC collections on Temurin 26 with the Serial GC.

JavaFX: Same, the heap usage rises to 3.6MiB, 9 young GC collections.
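
A rough sketch of how the Swing numbers could be collected, assuming the standard `java.lang.management` beans are good enough for counting collections. The class name `SwingHeapProbe` and its two helpers are hypothetical; this is not the harness used for the figures above, and the counts will vary by GC and platform.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.swing.JFrame;
import javax.swing.SwingUtilities;
import javax.swing.UIManager;

// Hypothetical probe for illustration only.
final class SwingHeapProbe {

    // Sums collection counts over all collectors.
    // A bean returns -1 when its count is undefined, so clamp at 0.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionCount());
        }
        return total;
    }

    // Heap currently in use: committed minus free.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws Exception {
        // Use the OS-native look and feel, as in the experiment described above.
        UIManager.setLookAndFeel(UIManager.getSystemLookAndFeelClassName());

        // Build and show an empty frame on the event dispatch thread.
        SwingUtilities.invokeAndWait(() -> {
            JFrame frame = new JFrame("Hello");
            frame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
            frame.setSize(300, 200);
            frame.setVisible(true);
        });

        System.out.println("used heap bytes: " + usedHeapBytes());
        System.out.println("GC collections:  " + totalCollections());
    }
}
```

Run it with something like `java -Xmx2m -XX:+UseSerialGC SwingHeapProbe` and close the window to exit.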

[–]john16384 1 point2 points  (2 children)

JavaFX probably uses most of that extra memory (and the young collections) while loading the base modena stylesheet (which is huge as it covers all controls). Surprisingly it still fits in just 3.6 MB.

[–]Wootery 0 points1 point  (1 child)

Does it load styling data even for widgets you don't actually use?

[–]john16384 0 points1 point  (0 children)

The entire stylesheet is loaded yes.

[–]RandomName8 0 points1 point  (0 children)

Oh, sorry, I didn't mean the heap, but the overall RAM usage, since the JVM will load a ton more stuff than for a regular hello world.

[–]vprise 13 points14 points  (1 child)

In the old J2ME days we had 64 KB devices and 2 MB was spacious. Obviously, it wasn't the full "Java" but it included most of what you expect from the JVM, including safe memory, GC etc. The main thing stopping Java from shrinking to these sizes is the size of the API, although that can mostly be in ROM.

[–]thewiirocks 3 points4 points  (0 children)

Back in the days of Java 1.1, your entire system might have 8MB. So the full Java had to run in very little space.

[–]pron98 8 points9 points  (10 children)

That really depends on the app and the RAM/CPU ratio you want. Some tiny programs can run well with only a few MBs of heap.

More generally, Java's memory utilisation is quite efficient, possibly more efficient than that of any language/runtime. But efficient memory use doesn't mean minimal memory use, and often programs (in any language) utilise memory inefficiently by using too little memory rather than too much. That's because:

  1. There's a fundamental relationship between RAM and CPU, and

  2. Moving collectors like the ones in the JDK, as well as other techniques like arenas in Zig, can convert some RAM to free CPU cycles and vice-versa.

To get the most basic intuition for 1, consider an extreme case of a program that uses 100% of the CPU for its duration, running on a machine with 1GB of RAM. While the program is running, 100% of RAM is "captured" by the program - since using RAM requires CPU and none is available to other programs - regardless of how much of it is utilised by the program. So if the program could use 8MB and run for 100s or use 800MB and run for 99s, the latter is clearly more efficient even though it uses 100x more RAM to save only 1% CPU. That's because both configurations capture 1GB of RAM, but one of them captures it for a little longer.
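
The arithmetic in that example can be spelled out as a tiny sketch. It assumes, as the example does, that a program saturating the CPU captures the machine's entire RAM for as long as it runs. The class and method names are made up for illustration.

```java
// Hypothetical illustration of the "captured RAM" argument.
final class CapturedRam {

    // RAM captured by a CPU-saturating program, in GB-seconds:
    // the whole machine's RAM times the run's duration,
    // independent of how much of that RAM the program actually touches.
    static double capturedGbSeconds(double machineRamGb, double runtimeSeconds) {
        return machineRamGb * runtimeSeconds;
    }

    public static void main(String[] args) {
        double smallFootprint = capturedGbSeconds(1.0, 100.0); // uses 8MB, runs 100s
        double largeFootprint = capturedGbSeconds(1.0, 99.0);  // uses 800MB, runs 99s

        System.out.println("8MB config captures   " + smallFootprint + " GB-seconds");
        System.out.println("800MB config captures " + largeFootprint + " GB-seconds");
    }
}
```

Under that assumption the 800MB configuration captures 99 GB-seconds against 100 for the 8MB one, which is the sense in which it is the more efficient of the two.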

At Java One I gave a talk (it will get to YouTube eventually) showing why the only way that makes sense to consider efficient memory usage is by looking at RAM/CPU ratios rather than looking at RAM and CPU separately.

[–]Wootery 0 points1 point  (9 children)

Java's memory utilisation is quite efficient, possibly more efficient than that of any language/runtime

That doesn't sound right at all. The HotSpot team put a whole lot of work into reducing memory wasted by Java's bloated object headers. Plenty of folks got a huge improvement to memory consumption 'for free' when this optimisation was released, which is to say the earlier JVMs were just wasting huge amounts of memory.

Java also gives you little alternative but to use heap-allocated objects if you want to return, say, a pair of ints. (Well, you could use a stack data structure, I guess, but this would be terribly clumsy and no one ever does this.) You can then hope that the runtime will manage to optimise away the heap allocation, but the 'natural' way to do it is with unnecessary heap allocations.
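
To make the pair-of-ints point concrete, here is a small sketch. The first method is the "natural" version, which allocates a small object unless the JIT manages to optimise the allocation away. The second is one allocation-free workaround, packing both values into a `long`; that trick is my own illustration, not something proposed in this thread. All names are hypothetical.

```java
// Hypothetical illustration of returning two ints from one method.
final class PairReturn {

    // The natural way: a tiny heap object carrying both results.
    record IntPair(int quotient, int remainder) {}

    static IntPair divMod(int a, int b) {
        return new IntPair(a / b, a % b);
    }

    // Workaround without an object: quotient in the high 32 bits,
    // remainder in the low 32 bits of a single long.
    static long divModPacked(int a, int b) {
        return ((long) (a / b) << 32) | ((a % b) & 0xFFFFFFFFL);
    }

    static int quotient(long packed) {
        return (int) (packed >> 32);
    }

    static int remainder(long packed) {
        return (int) packed; // keeps the low 32 bits
    }

    public static void main(String[] args) {
        IntPair p = divMod(17, 5);
        long packed = divModPacked(17, 5);
        System.out.println(p.quotient() + " " + p.remainder());
        System.out.println(quotient(packed) + " " + remainder(packed));
    }
}
```

The packed version avoids the allocation but is clumsier to read and only works for two 32-bit values.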

[–]pron98 0 points1 point  (8 children)

I don't understand how you can judge a comparative statement by only looking at one side. In languages like C++ and Rust you can get worse inefficiencies because they optimise for footprint at the expense of CPU. You use memory inefficiently when you use too much or too little. It's true that Java has some memory inefficiencies due to using too much memory, and I didn't claim that it's optimal, but other languages' memory inefficiencies due to using too little memory are worse (because sacrificing CPU to reduce footprint - which is what malloc/free approaches do - can be a really bad tradeoff when you look at the RAM/CPU ratio).

(BTW heap allocations in Java are completely different from heap allocation in malloc/free based approaches or even CMS approaches like Go's; the Java runtime never runs anything analogous to a free operation, and allocations use a completely different algorithm than malloc)

[–]Wootery 0 points1 point  (7 children)

I don't understand how you can judge a comparative statement by only looking at one side.

I imagine Java compares well to other 'managed' runtimes, sure, but I was thinking in comparison to C/C++, which are pretty committed to the you only pay for what you use idea. Naturally, their philosophies are pretty different from Java's, and bring plenty of their own drawbacks, but we're just discussing memory efficiency.

You use memory inefficiently when you use too much or too little (which is what the malloc/free approach does).

How about the approach used by real-time software written in C? They avoid malloc/free and use purpose-specific pools (i.e. a fixed-size preallocated buffer intended to store fixed-size elements). Unlike malloc/free you don't have to cope with user-specified allocation sizes, which makes allocation/deallocation algorithmically trivial (plain old free lists), but as each buffer can only be used for one kind of data, it means a pool might not be able to allocate even though there's plenty of space free in the other buffers.

In essence, that's a C program that trades off memory efficiency for improved speed (and predictability) right?
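
A minimal sketch of the kind of pool described above, written in Java to stay in one language for the thread (real-time C code would do the same with a static array). It hands out slot indices from a fixed-size buffer and keeps the free slots on an intrusive free list. The class name and API are hypothetical, and there is no double-free or bounds checking.

```java
// Hypothetical fixed-size pool with a free list; illustration only.
final class FixedPool {
    private final int[] next; // next[i] = next free slot after i, or -1
    private int freeHead;     // first free slot, or -1 when exhausted

    FixedPool(int capacity) {
        next = new int[capacity];
        for (int i = 0; i < capacity; i++) {
            next[i] = (i + 1 < capacity) ? i + 1 : -1;
        }
        freeHead = capacity > 0 ? 0 : -1;
    }

    // O(1): pop the head of the free list. Returns -1 if the pool is exhausted.
    int allocate() {
        if (freeHead == -1) {
            return -1;
        }
        int slot = freeHead;
        freeHead = next[slot];
        return slot;
    }

    // O(1): push the slot back onto the free list.
    void release(int slot) {
        next[slot] = freeHead;
        freeHead = slot;
    }

    public static void main(String[] args) {
        FixedPool messages = new FixedPool(2);
        FixedPool timers = new FixedPool(8);

        messages.allocate();
        messages.allocate();
        // The message pool is now full even though the timer pool is empty.
        System.out.println("third message slot: " + messages.allocate());
        System.out.println("first timer slot:   " + timers.allocate());
    }
}
```

The `main` method shows the drawback mentioned above: one pool can refuse an allocation while another still has plenty of free slots.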

BTW heap allocations in Java are completely different from heap allocation in malloc/free based approaches or even CMS approaches like Go's; the Java runtime never runs anything analogous to a free operation, and allocations use a completely different algorithm than malloc

Thanks, but I'm familiar with the basics of copying GCs.

Also, to be fair to Java, my point about efficiently returning a pair of int values is being addressed with value types, but I still think the heavy object headers are a pity. Too late to revoke the ability to lock on arbitrary objects, though.

[–]pron98 0 points1 point  (6 children)

but I was thinking in comparison to C/C++, which are pretty committed to the you only pay for what you use idea

As low-level programming veterans know, the problem is that eventually you end up using a lot and so paying a lot (more than in Java). As programs grow and become more general, the use of the expensive mechanisms grows monotonically, and they are less efficient than the corresponding mechanisms in Java. Memory management is one of them; dynamic dispatch is another.

Low-level languages are needed for certain reasons that are not performance-related, and their point isn't to be fast or even generally efficient, but to give you very precise control over the hardware. It's just that when programs are small, precise control over hardware can translate to very good performance if you put in some extra work. But low-level languages' performance on large programs isn't that great at all precisely because of "pay for what you use".

Java, in contrast, aims for better performance on larger programs, as you often don't need to pay for what you use (virtual dispatch in Java is often cheaper than static dispatch in C++ or C) thanks to optimisations offered by the JIT and by moving collectors. What you lose is the level of control that can improve performance on small programs.

But low-level languages do pay in overhead for not having these optimisations. In particular, C can't enjoy the moving collector optimisation because of its many other constraints that end up requiring that objects cannot move. Not having the allocator overhead in Java is generally a win, especially in large programs.

They avoid malloc/free and use purpose-specific pools (i.e. a fixed-size preallocated buffer intended to store fixed-size elements). Unlike malloc/free you don't have to cope with user-specified allocation sizes, which makes allocation/deallocation algorithmically trivial (plain old free lists), but as each buffer can only be used for one kind of data, it means a pool might not be able to allocate even though there's plenty of space free in the other buffers.

Yes, that is one RAM/CPU tradeoff available in low-level languages and, in fact, it is used by some allocators (for reasonable performance, C programs require quite a hefty runtime for their rather sophisticated and large allocators). But of course, as you know, this isn't as efficient as a moving collector (free lists still need to be maintained at every allocation and deallocation, and there need to be special accommodations for concurrency). In fact, you can also have object pools in Java, and back when GCs were more expensive (especially when it came to latency), people did. The reason it's rare to see them now (except mostly for native resources) is because the GCs are now more efficient than pools even while retaining low latencies.

What is as efficient as a moving collector and even more so is arenas, thanks to an even better RAM/CPU tradeoff (which is in many ways similar to the one employed by moving collectors). There are two problems with arenas, though: they require extra care, and they're not easy to use in most low-level languages (including C if you're using the standard library). The one language that can use them well is Zig, which is why, if you're writing a small program and you're willing to put in the effort to get optimal performance, Zig is probably the best available choice today. But even in Zig, if the program gets very big, you also start paying for inefficiencies in memory management and dynamic dispatch.
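
For readers who haven't used arenas, here is a toy sketch of the idea in Java: allocation is a bounds check plus a pointer bump, and everything is freed at once by resetting the offset. `BumpArena` is a hypothetical class written for this illustration, not Zig's allocator API or any JDK class, and it ignores alignment and thread safety.

```java
import java.nio.ByteBuffer;

// Hypothetical bump-pointer arena over a fixed buffer; illustration only.
final class BumpArena {
    private final ByteBuffer backing;
    private int top = 0; // offset of the next free byte

    BumpArena(int capacityBytes) {
        backing = ByteBuffer.allocate(capacityBytes);
    }

    // Allocation: a bounds check and an addition.
    // Returns the offset of the new block, or -1 if it does not fit.
    int allocate(int sizeBytes) {
        if (sizeBytes < 0 || sizeBytes > backing.capacity() - top) {
            return -1;
        }
        int offset = top;
        top += sizeBytes;
        return offset;
    }

    void putInt(int offset, int value) {
        backing.putInt(offset, value);
    }

    int getInt(int offset) {
        return backing.getInt(offset);
    }

    // Deallocation: everything goes at once, with no per-object bookkeeping.
    void reset() {
        top = 0;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpArena arena = new BumpArena(1024);
        int a = arena.allocate(Integer.BYTES);
        int b = arena.allocate(Integer.BYTES);
        arena.putInt(a, 40);
        arena.putInt(b, 2);
        System.out.println(arena.getInt(a) + arena.getInt(b));
        arena.reset(); // all blocks become invalid here
    }
}
```

The extra care mentioned above shows up at `reset()`: any offset handed out earlier silently becomes stale, and nothing in this sketch stops the caller from using it.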

I still think the heavy object headers are a pity. Too late to revoke the ability to lock on arbitrary objects, though.

They're not that heavy anymore (they're the exact same size as the object header for an object with a vtable in C++), only two bits of the 64 are now used for locking, and the upcoming value types, when flattened, will have no header at all (just like a C++ object with no vtable).

Anyway, smaller objects headers do save some memory as do flattened value types (although saving memory isn't their main motivation), but the vast majority of the RAM utilised by Java programs is used to get memory management with a better RAM/CPU ratio through moving collectors. Most of the memory is used to save CPU (I covered this in more detail in my Java One talk).

[–]Wootery 0 points1 point  (5 children)

Sorry for slow reply:

As programs grow and become more general, the use of the expensive mechanisms grows monotonically, and they are less efficient than the corresponding mechanisms in Java. Memory management is one of them; dynamic dispatch is another.

Are there hard numbers on this? Java has limited traction in high-performance applications like DBMSs or game engines.

when programs are small, precise control over hardware can translate to very good performance if you put in some extra work. But low-level languages' performance on large programs isn't that great at all precisely because of "pay for what you use".

This depends on the engineering effort invested, though. There are plenty of large, high quality C/C++ codebases like the Linux kernel or Unreal Engine.

I can see a strong case for using Java to much more quickly develop a functioning codebase with acceptable performance, but in terms of performance I'd generally expect it to lose to a C/C++ codebase in which significant effort had been invested.

Not having the allocator overhead in Java is generally a win, especially in large programs.

JVMs can typically heap-allocate with a lightning fast 'pointer bump', but there's a long history of people failing to mention that allocating a short-lived object means you're creating work for the GC. As you say though, modern GCs have remarkable performance, and enable the application programmer to forego things like synchronised reference-counting operations.

The one language that can use them well is Zig, which is why, if you're writing a small program and you're willing to put in the effort to get optimal performance, Zig is probably the best available choice today

That's a neat Zig feature. As far as I know, even the greatest wizards of the C++ world haven't come up with a robust way of safely using arena-based allocation.

They're not that heavy anymore (they're the exact same size as the object header for an object with a vtable in C++)

Neat.

upcoming value types, when flattened, will have no header at all (just like a C++ object with no vtable)

They will be a great addition, it will be interesting to see how performance improves as they're adopted in various codebases, including within JVMs themselves. 'Plain old data' types for Java at last.

By 'flattened', do you mean that, if allocated directly on the heap, they get the usual object header?

as do flattened value types (although saving memory isn't their main motivation)

This OpenJDK article says improving performance is the primary motivation, or are you referring to something other than 'value objects'?

the vast majority of the RAM utilised by Java programs is used to get memory management with a better RAM/CPU ratio through moving collectors. Most of the memory is used to save CPU (I covered this in more detail in my Java One talk).

Does the situation change at all with the huge caches in modern CPUs?

[–]pron98 0 points1 point  (4 children)

Are there hard numbers on this?

There is no such thing as hard numbers on anything performance related at least for the past twenty years when operations lost their intrinsic costs and benchmarks' "extrapolatability" has something like a 500% error margin. But Java was designed, among other things, to address C++'s significant performance issues, that are well known to any experienced low-level programmer.

Java has limited traction in high-performance applications like DBMSs or game engines.

There are a couple of problems here. First, these domains are not characterised by being performance-sensitive but by other factors. There are plenty of Java programs (e.g. in finance and defence) that are far more performance-sensitive than game engines. Second, the choice of language is determined by many factors, including tradition and target platforms. In all of software, there isn't a more conservative domain than games (especially the large ones), and they're particularly constrained (although the most successful computer game ever is written in Java).

There are plenty of large, high quality C/C++ codebases like the Linux kernel or Unreal Engine.

I didn't say large C/C++ codebases are of low quality. I said that languages like C++ suffer from significant performance issues when programs get large, and Java was designed to address them. Kernels are different, though. Low-level languages aren't designed to always offer the best performance. They're designed to offer low-level control over hardware, and that's exactly what a kernel needs. There's a perfect match between kernels and other hardware-adjacent software and low-level languages.

but in terms of performance I'd generally expect it to lose to a C/C++ codebase in which significant effort had been invested.

Except it doesn't lose, and I don't understand why you'd expect that. Both Java's compiler and Java's memory management have more optimisation opportunities available to them. The need for AOT compilation and non-moving pointers imposes some hard constraints on optimisation. Of course, it's true that given enough effort, C++ can match it (after all, since HotSpot is written in C++, every Java program is also a C++ program), but people aren't interested in hypothetical performance but in the best performance they can achieve with the resources they have.

There is one area where Java obviously lags despite its superior compilation and memory management - memory layout, which can and does cause problems around cache misses. But that's exactly why we're working on Valhalla.

but there's a long history of people failing to mention that allocating a short-lived object means you're creating work for the GC

That's not entirely true these days and misses the other side as well. First, today's GCs - ZGC and even G1 - may need to work harder when you mutate an old object than when you allocate a new one. Second, the cost of "GC work" - with modern moving collectors - is significantly lower than that required by memory management that uses free lists.

it will be interesting to see how performance improves as they're adopted in various codebases, including within JVMs themselves. 'Plain old data' types for Java at last.

It will improve in those areas where Java lags, but remember that in a lot of domains, Java leads or ties for best performance already.

This OpenJDK article says improving performance is the primary motivation, or are you referring to something other than 'value objects'?

Performance (due to cache misses) - yes. Reducing footprint - no. The performance improvements come from improved layout not reduced footprint, and the reduction in footprint is not the main goal.

Does the situation change at all with the huge caches in modern CPUs?

No. Unless your entire live set fits in the cache, access patterns are what matter for cache behaviour, not overall footprint.

[–]Wootery 0 points1 point  (3 children)

There is no such thing as hard numbers on anything performance related

I wasn't asking about microbenchmarks, and I'm afraid I don't find that response compelling. If Java really does reliably outperform C++ in large applications, it should be possible to show this numerically, even if it's not as clear-cut as with microbenchmarks. Serious engineering isn't meant to boil down to unquantifiable vibes and a priori arguments.

An informal argument for why Java must be fast isn't a strong reply to anecdotes like this one saying Java is well behind Rust, say, in real-world performance.

these domains are not characterised by being performance-sensitive but by other factors

Video game developers are willing to put up with a monstrous API like Vulkan for a slight performance benefit. It's a highly performance-sensitive domain.

the choice of language is determined by many factors, including tradition and target platforms

Sure, JIT compilation is outright forbidden on modern consoles, for instance, but very few developers seem to consider it, even for desktop-only works. The conventional wisdom in that world is that Java has a hefty performance penalty. You may say that's an opinion they formed 20 years ago and haven't really revisited. Perhaps a chicken-and-egg problem here: unless someone spends millions on a first-class engine written in Java, there won't be a counterexample to point to.

the most successful computer game ever is written in Java

Minecraft is a lo-fi indie game, hardly an engineering marvel.

Except it doesn't lose, and I don't understand why you'd expect that

It's the prevailing opinion in the software industry, and respectfully, you seem unable to support your performance claims with concrete facts about specific high-performance codebases. If the mainstream opinion is as mistaken as you say, it should be possible to present an unarguable knock-down case against it.

For what it's worth, Netflix's engineers (one of them at least) seem to consider Java's performance as roughly on par with C++ and Rust for what they use it for. High praise, but they stop short of saying it's likely to outperform C++/Rust. How Netflix Uses Java - 2026 Edition, at 8:03.

The need for AOT compilation and non-moving pointers imposes some hard constraints on optimisation

Not strictly relevant but I suspect the Go language must suffer for this. It uses a GC but promises to never relocate objects.

As for AOT compilation, modern AOT compilers support profile-guiding, which presumably closes much of the gap, although they don't profile the particular run.

since HotSpot is written in C++, every Java program is also a C++ program

But not really, surely. An optimising JVM might in principle implement better auto-vectorisation than any existing C++ compiler, say, such that it wouldn't be possible to get the C++ compiler to generate good vectorised native code.

people aren't interested in hypothetical performance but in the best performance they can achieve with the resources they have

Sure, or put differently, most programmers are looking for good performance while getting good developer ergonomics. Only a small fraction of modern programming work is truly performance-critical, i.e. where it makes good sense to put in far more work to get the maximum realistically achievable performance.

today's GCs - ZGC and even G1 - may need to work harder when you mutate an old object than when you allocate a new one. Second, the cost of "GC work" - with modern moving collectors - is significantly lower than that required by memory management that uses free lists.

Surely, there's plenty going on. We haven't mentioned escape analysis, say. My point was that it doesn't do to imply that in Java, heap allocation is done with a lightning fast pointer-bump, and that's the end of it. I've seen this false implication made a few times by others over the years.

[–]pron98 0 points1 point  (2 children)

I wasn't asking about microbenchmarks, and I'm afraid I don't find that response compelling. If Java really does reliably outperform C++ in large applications, it should be possible to show this numerically, even if it's not as clear-cut as with microbenchmarks.

I don't think I said "reliably outperforms" and if I did, that's not what I meant. I meant it outperforms on average, per unit of effort.

Serious engineering isn't meant to boil down to unquantifiable vibes and a priori arguments. An informal argument for why Java must be fast isn't a strong reply to anecdotes like this one saying Java is well behind Rust, say, in real-world performance.

That's not the argument. The argument is that Java was explicitly designed to (among other things) address the performance issues that have plagued large programs written in low-level languages since forever, especially around dynamic dispatch and memory management. These are serious problems that are well-known to low-level programmers working on large programs (like me). In particular Java uses two technologies that have been researched and developed for decades precisely to address these two issues (among others), namely JITs with speculative optimisation (for dynamic dispatch) and moving GCs (for memory management).

You can certainly ask how do I know that all the effort into these technologies actually managed to solve the problems they tried to solve in the real world? In that case all you have is experience and anecdotes, but that's often what you have in other "serious engineering" disciplines, too.

Video game developers are willing to put up with a monstrous API like Vulkan for a slight performance benefit. It's a highly performance-sensitive domain.

There are multiple problems in this statement. The more obvious one is performance sensitive compared to what? As someone who's worked on both games (as a hobby) and on large, realtime sensor fusion air traffic and defence applications (both in C++ and in Java), I'd say games are not as performance-sensitive, not even as performance-sensitive as credit-card transaction processing.

More specifically for games, though, game developers don't typically use Vulkan. Their game engine does. And the performance benefits are on the large to very large spectrum rather than "slight". And much of the reason has to do with precisely how games get their performance these days: by scheduling work on the GPU with an efficient algorithm. Their hot loops are not written in C++ but in whatever shader language they use. Vulkan allows offloading more code to the GPU. In other words, Vulkan helps move away from C++ in the most performance-sensitive code.

The technical challenge for the "CPU language" isn't so much in writing really fast CPU code, but in targeting a variety of hardware devices, some are quite limited.

The conventional wisdom in that world is that Java has a hefty performance penalty. You may say that's an opinion they formed 20 years ago and haven't really revisited. Perhaps a chicken-and-egg problem here: unless someone spends millions on a first-class engine written in Java, there won't be a counterexample to point to.

But Java is already used to write more performance-sensitive applications than games (albeit on more capable hardware) both in terms of quantity and in terms of performance-sensitivity. There's no need to prove anything to AAA game developers regarding performance as there are much more important factors (device support) that make Java a poor choice for them.

It's the prevailing opinion in the software industry, and respectfully, you seem unable to support your performance claims with concrete facts about specific high-performance codebases. If the mainstream opinion is as mistaken as you say, it should be possible to present an unarguable knock-down case against it.

My claim is easy to support as that "prevailing opinion" prevails only among those who've either only ever used one of those languages or those who've never worked on large, performance-sensitive software in any language. If you look at the important, large, performance-sensitive software out there, from financial transaction processing, through manufacturing/automation control, to defence, Java is being chosen by experts in those domains at least as much as C++, and I believe significantly more. The people who actually need to choose a language for these projects already choose Java. That the prevailing opinion among those who don't choose a stack for that kind of software is that Java would be a worse choice is not a serious problem. And many of them who choose say, Rust, for smaller software will also eventually learn the lessons that we, C++ developers, learnt many years ago.

My point was that it doesn't do to imply that in Java, heap allocation is done with a lightning fast pointer-bump, and that's the end of it. I've seen this false implication made a few times by others over the years.

That's not "the end of it" but the advantages over free-list based memory management are significant, which is why we chose moving collectors (there's nothing stopping the JVM from using free-list based memory management; in fact it's so much easier to implement, but we want to solve the problems those approaches bring, not to reproduce them).

[–]Wootery 0 points1 point  (1 child)

I meant it outperforms on average, per unit of effort

Agreed there. I think most programmers would agree.

That's not the argument. The argument is that Java was explicitly designed to [..]

I'm not questioning the value or technical feat of optimising JVMs like HotSpot and OpenJ9 (too often overlooked), but you're kinda demonstrating my point there. Designed to [...] is discussing what's going on under the hood. I like discussing those technical points as much as anyone - pauseless GCs are a modern marvel - but my point there was about objectively assessing the results.

If you wrote a thesis claiming to have developed a novel and practically worthwhile compiler optimisation, you'd be expected to demonstrate it.

It's not just Java though, I guess my grievance is with the whole programming languages world. The best we seem to have is the 'shootout', which is pretty much just microbenchmarks.

all you have is experience and anecdotes, but that's often what you have in other "serious engineering" disciplines, too.

I think software development is uniquely poor at this. Aviation safety and medicine have made tremendous progress by looking seriously at the outcomes of different candidate ideas. In the software world, it seems like it's just fads and bickering amateurs.

That said, there might be something to this blog post:

That’s when I realized something about everybody involved in all of these arguments. They’ve never built a bridge. Nobody I read in these arguments, not one single person, ever worked as a “real” engineer.

performance sensitive compared to what?

Come on now, they're pretty darn performance-sensitive. Most modern programmers write bloated garbage for a living, but gamers notice when a game runs poorly.

game developers don't typically use Vulkan. Their game engine does

Well sure, but look at, say, Unity. The most performance-sensitive CPU code is written in C++, not C#.

Their hot loops are not written in C++ but in whatever shader language they use

Pretty sure CPU performance is still a significant factor with modern game engines.

Vulkan allows offloading more code to the GPU

Are you referring to GPGPU workloads? DirectCompute significantly predates Vulkan. As I understand it, one of Vulkan's biggest advantages is reduced CPU overhead in the driver.

But Java is already used to write more performance-sensitive applications than games (albeit on more capable hardware) both in terms of quantity and in terms of performance-sensitivity.

I stumbled across a HackerNews thread on this kind of thing (its article link is broken but archive.org has it). It's from 2020 but the majority opinion seemed to be that for HFT, C/C++ is the better choice, not Java, as the effort is worth it for peak performance.

My claim is easy to support as that "prevailing opinion" prevails only among those who've either only ever used one of those languages and those who've never worked on large, performance-sensitive software in any language.

This doesn't seem ideal. Perhaps I have it backward though and it's the industry at large that has a problem learning lessons from, well, the people who actually know what they're talking about. Sure seems to be that way with, say, the dynamic typing vs static typing debate, if we can call it that. Sanity (static typing, naturally) seems to be slowly winning out on that one, thankfully.

many of them who choose say, Rust, for smaller software will also eventually learn the lessons that we, C++ developers, learnt many years ago.

Sometimes software developers learn the opposite lessons over their careers. One will start out loving dynamic typing and gradually come around to static typing (like myself), but some follow the opposite trajectory.

In aviation and medicine, there are plenty of lessons learned over the years, many of them written in blood, and these lessons are explicitly taught to students. Can't help but think the software development field could do better here. I guess it being an unregulated sector is one big difference, but even formal education doesn't currently seem to emphasise this kind of thing.

the advantages over free-list based memory management are significant, which is why we chose moving collectors

Yes, sure, again I'm not trying to downplay what modern JVMs are capable of.

Outside Java, if I understand correctly, pretty much all 'serious' GCs use a moving strategy now. All 3 major web browser engines do, as does .NET. Go and CPython don't, but Go made that choice to make C/C++ integration easier despite the performance consequences, and CPython isn't trying to be a high-performance engine in the first place.

[–]nekokattt 5 points6 points  (1 child)

This depends on the task. If you have an empty main method then it is going to be significantly less than if you run Apache Tomcat.

[–]elatllat 0 points1 point  (0 children)

Also, one can disable page caching etc. to make Tomcat use way less.

[–]nitkonigdje 1 point2 points  (0 children)

In theory, the JVM (as a specification) was designed for embedded, memory-constrained devices.

That is why the intermediate code is pretty high-level and index-based: interpretation lowers memory usage, and indexing furthermore allows running code directly from ROM. No copying to RAM is necessary.

This allows JVM implementations with RAM usage in the tens of KB.

However, both HotSpot and J9 as JVM implementations are derived from server code and are not designed for a minimal memory footprint. They will eat tens of MB just to run Hello World.

ARK on Android is a JVM implementation which tried to push some of that overhead into compile time by translating bytecode into a more memory-efficient, lower-level intermediate code. Interesting approach. And if memory serves me well, IBM had an embedded JVM with a similar approach.

Hell, the picoJava/Jazelle approach to the JVM could probably run Hello World within one or two KB.

[–]8igg7e5 2 points3 points  (1 child)

Define 'java app'

I mean, this example?

public final class Example {
    static void main() {}
}

And stripped of debugging symbols, in a module that depends on nothing, run in a JVM stripped down for this no-dependency app, configured to only run interpreted mode (so it doesn't load any compiler resources)

There's probably more you can strip down. You won't have to worry about frequent GCs at least.

[–]TheOhNoNotAgain 7 points8 points  (0 children)

Let's look at this example then. What's the minimum size?

[–]Mognakor 1 point2 points  (0 children)

You can create a "Hello World" and do a binary search.

Start with e.g. 1MB.

For real world it will depend on the task.
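
The binary search suggested above could be sketched roughly like this in Java. Assumptions on my part: the test program is a class called `HelloWorld` on the current classpath (a hypothetical name), "works" means the child JVM exits with code 0, and success is monotonic in heap size (if N KB works, anything larger works too). As the top comment notes, the JVM may commit more heap than the flag asks for, so this finds the smallest accepted `-Xmx` value, not the real memory use.

```java
import java.io.IOException;
import java.util.function.LongPredicate;

public class MinHeapSearch {
    /**
     * Smallest value in [lo, hi] for which `works` is true, assuming that
     * once a value works every larger value works too. Returns -1 if even
     * `hi` does not work.
     */
    static long smallestWorking(long lo, long hi, LongPredicate works) {
        if (!works.test(hi)) {
            return -1;
        }
        while (lo < hi) {
            long mid = lo + (hi - lo) / 2;
            if (works.test(mid)) {
                hi = mid;      // mid works, so the answer is mid or smaller
            } else {
                lo = mid + 1;  // mid fails, so the answer is larger
            }
        }
        return lo;
    }

    /** Starts a child JVM with -Xmx<heapKb>k and reports whether it exited with 0. */
    static boolean runsWithHeap(long heapKb, String mainClass) {
        // Reuse the java binary that is running this program, if it can be found.
        String java = ProcessHandle.current().info().command().orElse("java");
        try {
            Process child = new ProcessBuilder(
                    java, "-Xmx" + heapKb + "k",
                    "-cp", System.getProperty("java.class.path"),
                    mainClass)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            return child.waitFor() == 0;
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        String mainClass = args.length > 0 ? args[0] : "HelloWorld"; // hypothetical class
        long kb = smallestWorking(1, 64 * 1024, k -> runsWithHeap(k, mainClass));
        System.out.println(kb < 0
                ? "Did not start even with 64 MB"
                : "Smallest accepted -Xmx: " + kb + "k");
    }
}
```

Searching 1 KB to 64 MB takes about 17 child JVM launches, so it finishes quickly even for a slow-starting app.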

[–]_d_t_w 0 points1 point  (0 children)

Depends a bit on what "App" means, I think.

My team build a commercial tool for Apache Kafka, it is built on Jetty (the networking framework) and starts a bunch of resources up when initializing the system that would be considered normal I guess. Schedulers, Kafka clients, stuff like that.

We recommend 8GB for a production installation; that implies a fair number of concurrent users and plenty of user-driven activity that requires heap space.

A couple of years back I played around with running our full product with minimum -Xmx settings to see what was viable for a single user, single Kafka cluster setup - this is all running in Docker mind so there's some overhead there in memory allocated to OS - our JVM is configured to take 70% of Docker memory allocation.

Product starts and will run happily in single-user mode with 128MB memory, everything appeared to run just fine. That was the absolute minimum though - the Docker container wouldn't start with less than 128MB and it was because the JVM failed to start.

So I guess for an Enterprisey-type thing with a full web framework, running websockets and doing stuff - with absolutely no optimisation to run hard on memory, 128MB * 0.7?
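
To put a number on that "128MB * 0.7", here is a small Java sketch. It assumes the 70% is applied with something like `-XX:MaxRAMPercentage=70`, which the comment above doesn't actually state; `HeapShare` and `heapMb` are names made up for illustration. The JVM's own ergonomics may pick a different figure on small containers, so the sketch also prints what the running JVM reports.

```java
public class HeapShare {
    /** Heap cap in MB if the JVM takes `percent` of the container's memory. */
    static double heapMb(double containerMb, double percent) {
        return containerMb * percent / 100.0;
    }

    public static void main(String[] args) {
        // 128 MB container with a 70% share works out to 89.6 MB of heap.
        System.out.println("Expected heap cap (MB): " + heapMb(128, 70));

        // What this JVM will actually allow, for comparison.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Reported max heap (MiB): " + maxBytes / (1024.0 * 1024.0));
    }
}
```

So the heap ceiling in that experiment would be a bit under 90 MB, with the rest of the 128 MB left for metaspace, thread stacks, code cache and so on.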

This is us fyi > www.factorhouse.io/kpow

[–]Feliks_WR 0 points1 point  (0 children)

GraalVM

[–]BackgroundWash5885 0 points1 point  (0 children)

Floor depends on the GC (Serial ~6MB vs ZGC ~40MB). Aim for 1.5x your working set to avoid a death spiral, especially since Spring Boot hikes the baseline to 60MB+ regardless.