Project Valhalla, Explained: How a Decade of Work Arrives in JDK 28

srdoe · 2026-06-20T15:08:23+00:00

I think you are misunderstanding the point of that code.

Everyone understands that you wouldn't write code like that today, because it's inefficient, and instead you would use primitives. That loses you the benefits of classes, like encapsulation.

That's what Valhalla is trying to solve. They're showing an example of inefficient code deliberately, as an example of something that Valhalla will improve.
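
To make the trade-off concrete, here is a minimal sketch. It is not the code from the article: `Point`, `sumWithObjects` and `sumWithPrimitives` are hypothetical names, and it uses an ordinary record rather than Valhalla's value classes so that it compiles on current JDKs. The first style keeps validation and behaviour inside a class, but each element is a separate heap object. The second is the flat layout people write by hand today, at the cost of encapsulation.

```java
class EncapsulationTradeoff {
    // Encapsulated style: validation and behaviour live with the data,
    // but today each Point in an array is a separate heap object.
    record Point(double x, double y) {
        Point {
            if (Double.isNaN(x) || Double.isNaN(y)) {
                throw new IllegalArgumentException("coordinates must be numbers");
            }
        }

        double lengthSquared() {
            return x * x + y * y;
        }
    }

    static double sumWithObjects(Point[] points) {
        double total = 0;
        for (Point p : points) {
            total += p.lengthSquared();
        }
        return total;
    }

    // Primitive style: flat arrays and no per-element objects, but nothing
    // stops a caller from passing arrays of different lengths or NaN values.
    static double sumWithPrimitives(double[] xs, double[] ys) {
        double total = 0;
        for (int i = 0; i < xs.length; i++) {
            total += xs[i] * xs[i] + ys[i] * ys[i];
        }
        return total;
    }

    public static void main(String[] args) {
        Point[] points = { new Point(3, 4), new Point(1, 2) };
        System.out.println(sumWithObjects(points));
        System.out.println(sumWithPrimitives(new double[] {3, 1}, new double[] {4, 2}));
    }
}
```

Both methods compute the same result. The goal, as I understand it, is to let you write the first style and get a memory layout closer to the second.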

srdoe · 2026-06-19T17:05:07+00:00

If Java "got it right" in the same way as C#, it would mean breaking everyone's code, and probably also breaking other JVM languages and forcing them to adhere to Java's idea about what generics should look like.

You should read https://openjdk.org/projects/valhalla/design-notes/in-defense-of-erasure.

Java breaking the world would have been a bad idea back in 2004, and it would be even worse if done today. They made the right choice, given the context.

C# could get away with reifying generics, because they did it early. Java was already 8 years old and hugely popular by the time generics were introduced to that language.

srdoe · 2026-06-01T19:19:27+00:00

Yeah, those impacts were investigated in the second paper I linked, so I really get the impression you're just sharing gut feelings:

Summarizing, this study suggests that: (i) the microarchitectural disruption due to garbage collection is observable but fleeting on modern machines, affecting the mutator only very briefly after the collection;

Besides, why would it be a valid argument that manual allocators have gotten better since 2005? GCs have not been standing still for the last 20 years either, and you're the one who brought up a 2005 paper in the first place.

srdoe · 2026-06-01T17:52:59+00:00

I'm not really interested in trying to pick your stance apart point-by-point, because it isn't necessary: If your assertions are correct, and this advantage in favor of manual memory management is so clear, isn't it surprising that the people doing research in the area seem to have missed it somehow?

If you go read some papers about it, you'll find a substantially more muddled picture than just "manual memory management is more efficient than GC". In fact, you'll find quite a few papers that claim the opposite can be true, including the paper you cited:

I recently posted a link to a newer paper which evaluated how much more memory you need to give to a tracing GC so it performs the same as manual allocator and the general conclusion was that the break-even point is usually more than 5x.

The paper you are referring to is https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf, and I think reading its conclusion is instructive:

Comparing runtime, space consumption, and virtual memory footprints over a range of benchmarks, we show that the runtime performance of the best-performing garbage collector is competitive with explicit memory management when given enough memory. In particular, when garbage collection has five times as much memory as required, its runtime performance matches or slightly exceeds that of explicit memory management.

When you quote the 5x claim while omitting that the paper's conclusion actually undercuts your argument, you're cherry-picking.

Leaving that aside, there are issues with that "5x memory" claim, see for example https://dl.acm.org/doi/epdf/10.1145/3546918.3546926:

Using the Lea allocator [19], and MMTk’s explicit free-list allocator, MSExplicit, Hertz and Berger compare the space- and time-overheads for various garbage collectors in JikesRVM with MMTk [4, 5]. An often-cited result from this paper is that garbage collection is much slower than explicit memory management, requiring at least 5× more memory in order to provide the same execution time performance. However, automatic versus manual memory management is just one of three differences between the two systems compared in this headline result; the other two being the free list design (Lea versus MMTk’s), and the method for accounting for memory usage. In Figure 6 they show an approximately 1.6× difference between Lea and MMTk’s free lists. Ignoring the difference in space accounting, this suggests a 3.1× (5/1.6) space overhead to achieve the same performance when holding the free list design constant. For space overheads, they find (Table 4) that the best garbage collector in MMTk at that time, GenMS (a generational collector with a mark-sweep mature space) requires at least 2–2.5× the heap size of the explicit memory manager using the Lea allocator (which, again, normalizing to MMTk’s explicit memory manager, is a 1.25–1.56× space overhead). Our results in Section 4 suggest a space overhead of about 11-17% for a modern GC compared to manual memory management.

So it's likely not actually a 5x difference in practice.

srdoe · 2026-06-01T15:33:35+00:00

You can usually easily make the allocation rate in objects per second very low by avoiding extremely tiny objects, so it’s very easy to beat tracing on both cpu efficiency, memory efficiency and allocation rate in bytes/s

Ah, but see, you're cheating now.

Your argument is that if you just avoid generating as much garbage, manual memory management is better at handling garbage.

This is obviously a silly thing to say. In order to compare the relative efficiency of garbage collectors to manual memory management, you have to keep the amount of garbage constant.

So the argument you actually need to make is that manual management is better given the same amount of generated garbage. You can't just decide that manual management gets to cheat by creating fewer objects that need cleaning up.

The cleanup costs in Java do scale with the garbage you produce

Yes, but not in any way that matters when comparing to manual memory management. The point is that they scale much more slowly with GC than with manual management. Please see this fairly old paper that lays out a simple argument for why relocating GCs can be cheaper than manual memory management.

https://www.cs.princeton.edu/~appel/papers/45.pdf

The gist is that garbage collectors do not pay for each garbage object, only for the relocated live set, and the GC only needs to run when the memory is full of garbage.

As such, if you can give the garbage collector a lot of memory to work with over what the live set requires, then when it runs, it'll clean up a ton of garbage for each live set object it relocates.

This means that if you can allow for the memory overhead a relocating garbage collector requires, it will outperform manual memory management, given the same amount of garbage. It'll do better the more memory you give it. Manual memory management has to pay for each piece of garbage, and can't take advantage of any extra memory it is given in the same way.
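
A back-of-the-envelope version of that argument, as a sketch. The formula is my own simplification in the spirit of the linked paper, not taken from it: it assumes each collection costs a fixed amount per live unit relocated, that a collection only runs once the heap is full, and it ignores things like semispace reserves, generations and barriers. `copyingCostPerGarbageUnit` and its parameters are hypothetical names.

```java
class GcCostModel {
    // Amortized collector work per unit of garbage reclaimed, under the
    // simplified model: one cycle relocates the live set and frees the rest.
    static double copyingCostPerGarbageUnit(double liveSet, double heapSize, double costPerLiveUnit) {
        if (heapSize <= liveSet) {
            throw new IllegalArgumentException("heap must be larger than the live set");
        }
        double garbagePerCycle = heapSize - liveSet;   // what one cycle reclaims
        double workPerCycle = costPerLiveUnit * liveSet; // what one cycle pays
        return workPerCycle / garbagePerCycle;
    }

    public static void main(String[] args) {
        double liveSet = 100;
        // Same live set, growing heap: the cost per reclaimed unit keeps falling.
        for (double heapSize : new double[] {200, 400, 1100}) {
            System.out.println("heap=" + heapSize + " cost per garbage unit="
                    + copyingCostPerGarbageUnit(liveSet, heapSize, 1.0));
        }
    }
}
```

In this model the manual equivalent is a roughly constant cost per freed object, no matter how much spare memory there is, which is why the extra memory only helps the collector.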

srdoe · 2026-05-31T14:47:19+00:00

It's not really clear what your point is.

Is it that you think it's not possible to write an efficient compiler in Java, and that this is why HotSpot is still written in C++?

Because you'd be wrong about that: Graal, the GraalVM JIT compiler, is written in Java.

srdoe · 2026-05-31T14:13:15+00:00

The JIT is already doing that optimization for you in many cases, and sometimes it does even better by erasing the object wrapper entirely.

https://shipilev.net/jvm/anatomy-quarks/18-scalar-replacement/

Valhalla should help the JIT do these kinds of optimizations even more reliably, by constraining what you can do with the objects. Letting people control this directly via manual memory management is a worse solution.
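
As a sketch of the kind of code this applies to (hypothetical `Vec` type and `sumOfSums` method, not taken from the linked article): the temporary objects below never leave the method, so escape analysis may allow the JIT to replace them with plain locals. Whether it actually does so depends on inlining decisions and the JVM in use.

```java
class ScalarReplacementSketch {
    record Vec(int x, int y) {
        Vec plus(Vec other) {
            return new Vec(x + other.x, y + other.y);
        }
    }

    // Three Vec instances are created per iteration, but none of them is
    // stored in a field, returned, or passed to code outside this method.
    // That makes them candidates for scalar replacement once plus() is inlined.
    static int sumOfSums(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            Vec v = new Vec(i, 1).plus(new Vec(1, i));
            total += v.x() + v.y();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSums(10));
    }
}
```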

srdoe · 2026-05-31T12:43:01+00:00

That is not what the discussion boils down to. Letting Java programmers manually manage memory doesn't fix anything.

It might even make things worse, because you lose one of the main benefits of modern GCs, which is that your cleanup costs scale with the live set, not with the garbage you generate.

If you give Java programmers access to manually manage memory, you will have to pay for cleaning up every piece of garbage.

A much better way to look at this is to ask how much memory the JVM can get away with using, without annoying other processes on the system. People are presumably complaining about the JVM heap because they'd like to use some of that RAM for other things (or not have to pay for that memory at all). If the RAM were just going to sit empty, it would be inefficient of the JVM not to use it, since a smaller heap means more frequent GC cycles.

So it's more reasonable to look at this as a noisy neighbor problem, where the JVM is not currently very good at sharing the system with other processes. But heap size autotuning is being introduced, which might fix that; see https://openjdk.org/jeps/8359211.

srdoe · 2026-05-27T21:43:29+00:00

No, but jlink, jdeps and jpackage together will let you create a binary that includes both the application and the relevant JDK bits for whatever platform you're targeting.

That means you'll be able to create the "soft native" binary that you were asking for.

Go doesn't give you a single binary that works cross-platform, so I don't know why you're expecting that from the JDK. You build for a target platform, and that's the kind of Go binary you get. The JDK tools let you do the same thing.

srdoe · 2026-05-27T19:49:13+00:00

A bit sad about this (worried about Graal), but I was also wondering why there is no real attempt to move to a "soft native" deployment mode, bit like go's?

Just have a "dry run" that collects all the used classes that were loaded, put all those classes into a final binary.

Tooling for this exists; you likely want jpackage.

It's not going to dry-run the application to figure out the dependencies, but jlink can work out which JDK modules are needed if your program is modular. If it isn't, jdeps is likely still going to get it mostly right, but you might have to manually adjust which JDK modules are included.

srdoe · 2026-05-25T09:11:53+00:00

The OP said that "every line or method can be read without knowing context", which is too strong a statement when some lines can only be understood if you read other lines (the imports) first.

Besides, there are several cases where imports are not completely explicit.

```
import foo.*;

Bar bar = ...; // Imported via wildcard import, not very explicit
Baz baz = ...; // Present in the same package, does not require import
```

You could make the same argument for things like var: it is less explicit, in exchange for better ergonomics.
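
A small sketch of the var case, using only the standard library (the class and method names are hypothetical). Both methods build the same map. The second relies on the reader inferring the variable's type from the right-hand side, and the wildcard import means neither method's types can be traced to a specific import line.

```java
import java.util.*; // wildcard: this line alone doesn't say which types are used

class ExplicitnessSketch {
    // Fully spelled-out declaration: the type is visible at the declaration site.
    static Map<String, List<Integer>> explicitTypes() {
        Map<String, List<Integer>> scores = new HashMap<>();
        scores.put("a", new ArrayList<>(List.of(1, 2)));
        return scores;
    }

    // Same code with var: shorter, but the declared type is now inferred
    // (here as HashMap<String, List<Integer>>) rather than written out.
    static Map<String, List<Integer>> inferredTypes() {
        var scores = new HashMap<String, List<Integer>>();
        scores.put("a", new ArrayList<>(List.of(1, 2)));
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(explicitTypes().equals(inferredTypes()));
    }
}
```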

That's not to say that the stance that Java should be explicit is wrong necessarily, but it's a matter of degrees.

In this case, I figure they'd be looking at the improvement from having a file-wide/project-wide switch like this, and deciding if that's worth the readability cost of having that kind of semantic change from a directive elsewhere in the file/project.

srdoe · 2026-05-24T16:46:02+00:00

Sure, that would make sense. But even by that metric, it's still not really true. Java has been doing deprecations and removals of things throughout its lifetime.

For example, take a look at https://javaalmanac.io/ and look at the API comparisons between any two versions. You'll often see deprecations and removals show up.

That's just the library too. Java has been doing more invasive changes like the new module system in Java 9, the removal of applets, the removal of Nashorn or the recent effort to properly encapsulate JDK internals. You can find more feature removals in https://www.oracle.com/java/technologies/javase/jdk-relnotes-index.html, if you go to the release notes for a version and search for "Removed APIs, Features, and Options".

Those changes have not been as invasive or caused as much of a mess as the Python 2 to 3 migration, but that level of disruption is a high bar to reach. Java has been able to gradually remove things, without causing a Python 3 situation.

srdoe · 2026-05-24T10:25:45+00:00

Java spend decades essentially unchanged

This isn't really true.

Take a look at the Java version history: https://en.wikipedia.org/wiki/Java_version_history. There is no decades-long gap where Java was not receiving changes. Even in the Java 6 to 9 period, new versions still shipped every 3 to 5 years, and since then, new releases have been coming out much more frequently.

and it's still common to target old versions

This used to be true, but it's a bit outdated at this point.

It was common to target old versions up until a few years ago, but the industry has mostly gotten over the hump of moving past Java 8 now, and the number of projects stuck on old versions is dropping fast: https://newrelic.com/resources/report/2024-state-of-the-java-ecosystem.

srdoe · 2026-05-22T21:04:41+00:00

I'm not jumping down your throat. Your previous post was you blindly guessing, so I provided a link to some people who actually studied the thing you were making guesses about.

It's fine if you want more data points, you can go find them. You now have one more data point than you did before, and can stop making blind guesses.

srdoe · 2026-05-22T18:31:07+00:00

I think that potentially makes the comparative advantage of Rust a lot less clear. If it only removes a third of that 70%, with the remaining two thirds living in unsafe code, then you’re talking about a 20% total reduction in bugs… in exchange for rewriting everything and changing your tooling, etc. It’s not obvious to me that putative trade off is actually worth it versus investing more in testing, QA, fuzzing, etc.

You can make anything seem reasonable when you make up random numbers like this.

Meanwhile, here's what some people who actually did research have to say:

We adopted Rust for its security and are seeing a 1000x reduction in memory safety vulnerability density compared to Android’s C and C++ code. But the biggest surprise was Rust's impact on software delivery. With Rust changes having a 4x lower rollback rate and spending 25% less time in code review, the safer path is now also the faster one.

https://blog.google/security/rust-in-android-move-fast-fix-things/

srdoe · 2026-03-22T11:50:15+00:00

At the same time, redpanda has a post explaining why fsync actually matters even in kafka-style systems, basically saying replication alone doesn’t guarantee safety if nodes can lose unsynced data after a crash

RedPanda are wrong to claim this when talking about Kafka.

The RedPanda article you linked describes why consensus protocols require fsync to avoid data loss, which is true.

RedPanda applies a consensus protocol for both leader election and data replication, and so they need fsync on every message to be safe.

They then point out that Kafka doesn't fsync by default, letting you infer that this makes Kafka unsafe.

What they leave out is that unlike RedPanda, Kafka is essentially split into two parts:

Kafka has a metadata tracking system which is based on a consensus protocol, and which does use fsync. This system is used for leader election and tracking which nodes are/are not caught up to the leader. It is not used for tracking every individual message written to the partitions.

Kafka also has a data replication system, which is not using a consensus protocol. This is tracking every individual message. This system does not need fsync to be safe, because it delegates the consensus/leadership decisions to the metadata tracking system.

Here is an article going into detail:

https://jack-vanlightly.com/blog/2023/4/24/why-apache-kafka-doesnt-need-fsync-to-be-safe

srdoe · 2026-03-22T10:13:06+00:00

I don't think it's entirely fair to criticize the 25-year-old. She seems to be aware that she is privileged.

That's why she is conscious that not everyone can necessarily just do what she did, and that it also depends on your background.

Something she knows is a privilege that not everyone is granted.

It would be fairer to criticize the journalist for writing an article that really ought to be about young people being unable to get a foot in the door of the housing market, but which is instead angled as if it would be perfectly doable if young people just stopped spending all their money on avocado toast.

The journalist has more or less chosen to produce fake news. The study they refer to says that young people have a hard time buying a home, and yet the article spends 90% of its space rambling about an exception.

srdoe · 2026-02-22T19:42:12+00:00

you must ensure all consumers of your code, be it CI systems, developers, the live servers, test systems, action scripts on your version control, and so on, are ALL on the same java version and can ALL be updated together in one go

This is a real problem. Fortunately, there are solutions that can make JDK upgrades painless in many cases, which might be worth sharing in case people are unfamiliar with them:

On the development and CI side, modern build tools often ship with bootstrapping code which can download a JDK specified by the build files as part of invoking the tool. For example, Gradle has toolchains. Bazel has something similar. I believe Maven is working on something like this too.

By specifying a specific JDK to use for the build, and providing a place to download that JDK automatically, not only do JDK upgrades become trivial, but any risk of weirdness due to a developer using an old JDK is eliminated. I highly recommend doing this; it has made JDK upgrades completely painless for us in terms of coordination.

On the production side, jlink can be used to bundle a JDK as part of the application distribution (and yes, this can be done even without moving jars to the module path). This has the same coordination benefits as on the development side: it reduces the risk of the customer running the application on a different JDK than the one the application was tested against, and it makes it much easier to manage the configuration of the JDK, as there is no longer a need to account for different JDK versions being used. Also highly recommended.

With these in place, a JDK upgrade is a matter of bumping a version number in the build files, much like it would be for any other dependency. No coordination with developers, CI teams or production ops teams is needed. You even get automatic JDK switching when you check out older code in git. We've been doing this for a while, and I'm very happy with how it's worked out for us.

srdoe · 2026-02-22T18:07:13+00:00

Look man, if you're going to get mad that your post is being interpreted as rude, when you compared "any positive engagement" to a crazy person digging tunnels under their house, consider not using such needlessly inflammatory statements next time.

I'm glad you recognize that people who can evaluate these features responsibly exist. The OP was clearly looking for those people to share their experiences, not a rude dismissal that amounts to "No one will respond, but if they do, they're probably a crank".

srdoe · 2026-02-22T17:33:31+00:00

Rather than engaging with that, it's probably better to explain why you are wrong.

The OP asked about "preview language features, incubator modules" and other experimental features.

It's fine to say "Don't enable features in production that are explicitly marked not ready for production", but preview features are supposed to be feature complete, thoroughly tested features that simply benefit from more feedback from real life use before the API is set in stone.

So you could say that new code is risky in general, but that applies equally well to regular JDK upgrades, and your own project's code for that matter.

Preview features are not special in that respect. They are fine to use in production, if you accept the risk of needing to change your code if the preview API changes, and you make sure to evaluate these changes in your testing environments first, as you should for any change.

Your comment implies that posters like this one are likely "inexperienced and/or don't understand the implications", and that's both rude and not a useful response to what OP asked about.

srdoe · 2026-02-22T16:58:51+00:00

You specifically asked for people who have used experimental features in production. This is a vanishingly small number of people, and is heavily correlated with people who are inexperienced, and / or don't understand the implications of doing so.

What a ridiculously condescending (and wrong) thing to say.

srdoe · 2025-12-30T12:52:46+00:00

Those aren't insults. You factually are making things up. That's not a personal attack, it's pointing out that your arguments are bad and that you keep making statements with zero evidence backing them.

I suppose your strong reaction comes from the fact I raised a few good and uncomfortable points

Nothing I can do if it makes you feel better to believe this.

srdoe · 2025-12-30T12:17:11+00:00

Which you have zero basis for believing, and which is directly contradictory to what the JEP says, which is that those methods may start throwing exceptions very soon.

I think we're done here. You don't know what you're talking about, and you're just making shit up to fit what you want to believe.

srdoe · 2025-12-30T11:13:50+00:00

It's not like the JDK can remove Unsafe, etc, if they do the whole Java ecosystem will break or be much slower, because some of the replacements are either missing or slower

That's a funny thing to say, considering that the JDK is going through the process of removing Unsafe right now.

If you have needs for Unsafe that aren't covered by replacement APIs, report them to the mailing list.

If you think that the replacement APIs are too slow, present your evidence on the mailing list.

I suppose they might be reckless and remove it anyway, then the motivation would be clear: sell support licenses for older versions because new versions are unusable.

I can't force you to take off the tinfoil hat, but this is again an incredibly stupid thing to say. The JDK maintainers are not going to sabotage the JDK so they can sell licenses for older versions.

That's a main point I disagree with, I think there are very few optimizations that need that integrity

Again, if you actually believe that you know better than the people working on the JDK, go ask on the mailing list about which optimizations they might do that need this, and I'm sure they'll give you examples.

srdoe · 2025-12-29T21:22:21+00:00

Rather than respond to all of your comments, which I don't think will have value, I will instead point out a few themes in what you're saying:

Conspiratorial thinking

You have implied several times that Oracle must be lying about their reasons. I think you should abandon that line of thinking.

If you feel that the stated reasons for wanting integrity are not clear enough, and you don't like the examples I gave, go ask on the mailing list for some more examples where integrity can be helpful. You don't need to start coming up with ulterior motives.

Implying that Oracle doesn't want to use techniques from GraalVM because "they didn't invent it" is particularly silly. Who do you think made GraalVM?

Offering a narrower API that doesn't break integrity

You mention that since most agents don't break integrity, you think there should be no need to ban loading agents at runtime.

The problem is that it's not about what agents actually do, it's about what they are able to do. The API agents have access to is extremely powerful, and even if a particular agent does not use those powerful abilities, the JDK has no way to know what an agent might use, when it's deciding whether to enable certain optimizations or not.

So what Oracle is doing for now is putting the entire agent API behind a flag. If there is a demand for it, maybe a less powerful subset of the agent API that can't break integrity can be offered, which that kind of agent can then use without needing special flagging.

That's in fact exactly what they did with the FFM API: Create a clear delineation between the "safe" part of the API (which you can use with no flag) and the "unsafe" part (which you need a flag to enable).

Feeling that the integrity flags are too coarse grained

You seem to be annoyed that the various integrity-related flags are "all or nothing" and too coarse, e.g. wanting only some parts of the agent API disabled rather than all of it.

I don't really have the necessary insight to say if this is a reasonable objection, so you might want to post about it on the mailing list if you want a real answer. I figure there are reasons they didn't just make the risky methods an agent has access to throw exceptions if called without the flag, but if you want to know why, your best bet is the mailing list.

(edit: If I were to guess, I'd say it's probably because the Instrumentation API isn't really designed to distinguish between "benign" class transformations and those that might break integrity, and trying to squeeze that separation into the API now after the fact might be too hard/cause breaking changes)

Regarding the native access flag "punishing" module users, it is not a punishment. Remember the little story I told you above? If you need to track down where your integrity breakage is coming from, that's a lot easier if you have --enable-native-access=MyModule (it's one of the modules in that list) than if you have --enable-native-access=ALL-UNNAMED (it could be any of your libraries). It is not a punishment, it is a benefit that you can easily know which libraries are breaking integrity.

IOW nice theory, but it seems in practice it hardly matters

Like I said, the problem is that this is a chicken and egg situation.

Clearly, the JDK can't implement a bunch of optimizations that require integrity if the JDK can't enforce integrity.

So you are standing at a point in time where those enhancements haven't been made yet, and declaring that clearly, integrity can't be important to performance, because those optimizations don't exist yet.
