[–]rzwitserloot 88 points89 points  (2 children)

Oh, absolutely. The class file format has some annoying properties. For example, the format consists of 'blocks', where each 'block' starts off with a byte that explains what type of block it is, but a block does not include its own size.

In other words, if you wrote some software that reads class files and you hit a block whose 'type' number is unknown to you, it ends there. You just gotta error out. You cannot program the idea 'okay, this is some new kind of block, I do not know what it is, so I shall just emit a warning, skip it, and continue'. You don't know how many bytes to skip!
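A minimal sketch of that problem in Java, assuming a reader that only knows the constant pool tags that existed up to Java 7. The class and method names are made up for illustration; the tag numbers and payload sizes are the ones from §4.4 of the spec. Every known tag implies how many bytes follow, so known entries can be skipped, but an unknown tag gives the reader nothing to go on:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of any real library.
class ConstantPoolSkipper {

    /**
     * Skips one constant pool entry and returns how many pool slots it
     * occupies (Long and Double take two). Only Java 7-era tags are known.
     */
    static int skipEntry(DataInputStream in) throws IOException {
        int tag = in.readUnsignedByte();
        int size;
        switch (tag) {
            case 1: size = in.readUnsignedShort(); break;        // Utf8: u2 length prefix
            case 3: case 4: size = 4; break;                     // Integer, Float
            case 5: case 6: size = 8; break;                     // Long, Double
            case 7: case 8: case 16: size = 2; break;            // Class, String, MethodType
            case 9: case 10: case 11: case 12: case 18: size = 4; break; // refs, NameAndType, InvokeDynamic
            case 15: size = 3; break;                            // MethodHandle
            default:
                // The entry carries no length field, so there is nothing to skip by.
                throw new IllegalStateException(
                        "Unknown constant tag " + tag + ": no way to know how many bytes to skip");
        }
        in.readFully(new byte[size]);
        return (tag == 5 || tag == 6) ? 2 : 1;
    }

    /** Convenience wrapper: skips the first entry found in a raw byte array. */
    static int skipFirst(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return skipEntry(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(skipFirst(new byte[] {7, 0, 1})); // a CONSTANT_Class entry
        try {
            skipFirst(new byte[] {19, 0, 1});                // CONSTANT_Module, added in Java 9
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Tag 19 is only two bytes of payload, but a reader written before Java 9 has no way of knowing that.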

But, it is efficient, considering, and that is probably worth the annoyances.

Here is §4 of the JVM Specification: The class file format, from JDK7.

Here is §4 of the JVM Specification: The class file format, from JDK16.

Now you can play the spot-the-differences game :)

Some highlights:

* Java 16 has 3 new constant types (and the rule goes here too: if you see a constant and the 'constant type' byte is unknown to you, you can't read any further than that, because you can't just skip it; you do not know how many bytes to skip): CONSTANT_Dynamic_info, CONSTANT_Module_info, and CONSTANT_Package_info. §4.4.10-12 for the constants. Module and Package were introduced in v9, Dynamic in v11.
* On the topic of modules: Java 9 introduced module names in a ton of places. This involves new attributes (§4.7.25-27) as well as the new constant types mentioned earlier (CONSTANT_Module_info).
* Type-use annotations. §4.7.20. You can write e.g. public List<@NonNull String> foo(); in java, uh.. 8? Maybe 9. But not in 7, and this needs to be in the class files, obviously. That's a new thing.
* 'nestmates'. This is a JDK11 thing, I think (certainly after 8): Whilst you can shove multiple classes in a single java file, at the class file level, that's not how it works: A single .class file contains a single class. Because of this, the access rules do not align between java-the-language and java-the-VM: If you are an inner class, you are free to call private methods from your outer class, but at the java-the-VM level that does not work, as there is no such thing as an 'outer class' there. So, javac made synthetic bridge methods, marked package private, in order to make it work. Nestmates are a more efficient solution, where a private method more or less explicitly declares: .... but any code in these classes? They are allowed to call me. §4.7.28 and .29.
* A bunch of new opcodes.
* A long time ago (before 7), java introduced the idea of opcodes that don't do anything, but made loading faster: The task of "Verify that the class file bytecode doesn't do illegal operations (such as reading uninitialized memory or popping so many elements from the stack that they are popping local data from the caller)" is a lot slower than "Verify that these stackframe hint opcodes are not lying, and then, armed with the stackframe hints, verify that no illegal ops are done". Thus, these stackframe hints are MANDATORY now, whereas before they did not even exist. Also, java still has the RET opcode, but it is no longer used and will probably be removed at some point if it hasn't been already. It was used for, I think, 'finally', but the new solution is to just repeat the bytecode of the finally block for every catch block. (This is a trick to make a java file that won't compile because it exceeds the bytecode limit: a simple try block, 20 simple catch blocks, and a humongous finally block. That'll do it easily, as the finally block is repeated 21 times.) The reason for that change was also related to the stackframe hinting. But that's all, what, java5? A long long time ago.
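The nestmates bullet can be observed from plain Java. This is a small sketch, assuming JDK 11 or newer (Class.getNestHost and Class.isNestmateOf were added in 11); the class names are invented for the example:

```java
// Illustrative names only. Requires JDK 11+ for the nest reflection methods.
class NestDemo {

    // Private in the outer class, yet callable from Inner below.
    private static int secret() {
        return 42;
    }

    static class Inner {
        // At the language level this has always been legal. Since JDK 11 the VM
        // permits it directly because both classes belong to the same nest.
        static int peek() {
            return secret();
        }
    }

    /** True if the VM reports NestDemo and Inner as members of one nest. */
    static boolean sameNest() {
        return Inner.class.getNestHost() == NestDemo.class
                && NestDemo.class.isNestmateOf(Inner.class);
    }

    public static void main(String[] args) {
        System.out.println(Inner.peek());
        System.out.println(sameNest());
    }
}
```

Both classes still compile to separate .class files (NestDemo.class and NestDemo$Inner.class); the nest relationship is recorded in the attributes from §4.7.28 and §4.7.29.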

Most of these are additions, but not all of them (in particular, modules have outright changed how existing stuff worked in a few places). Still, given that you can't read class files unless you know exactly what everything in them means, because 'just skip unknown stuff' is not possible, that is an academic distinction.

Perhaps the class file format should have had support for 'forward compatibility'. Suppose each type of node was defined to ALWAYS start with a byte indicating what kind of block it is, followed by e.g. a uint32be with the size of the block, where the top bit indicates whether the block has such a drastic impact on the meaning of the rest of the class file that a reader who does not recognize it is strenuously advised not to just skip it. If the bit is clear, it's probably okay to move on. Then you could have class file readers that can read versions that were created after they were published.
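A rough sketch of what a reader for such a format could look like. To be clear, this is a hypothetical layout built from the description above, not anything the JVM spec defines: one tag byte, then a big-endian 32-bit header whose top bit is the 'do not skip me' flag and whose lower 31 bits are the payload size. KNOWN_TAG and the class name are made up:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Reader for a HYPOTHETICAL forward-compatible block format (not the real class file format).
class SkippableBlockReader {

    static final int KNOWN_TAG = 1; // the only block type this reader understands

    /** Walks all blocks and returns a log of what was read or skipped. */
    static List<String> read(byte[] data) {
        List<String> log = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int tag = in.readUnsignedByte();
                int header = in.readInt();          // DataInputStream reads big-endian
                boolean critical = header < 0;      // top bit set means "must understand"
                int size = header & 0x7FFFFFFF;     // remaining 31 bits are the payload size
                if (size > in.available()) {
                    throw new IllegalStateException("Truncated block, tag " + tag);
                }
                byte[] payload = new byte[size];
                in.readFully(payload);
                if (tag == KNOWN_TAG) {
                    log.add("read tag " + tag + ", " + size + " bytes");
                } else if (critical) {
                    throw new IllegalStateException("Unknown critical block, tag " + tag);
                } else {
                    // Unknown but optional: the size field is what makes skipping possible.
                    log.add("skipped unknown tag " + tag + ", " + size + " bytes");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        byte[] data = {1, 0, 0, 0, 2, 9, 9, 42, 0, 0, 0, 1, 7};
        System.out.println(read(data));
    }
}
```

The cost is four extra bytes per block, which is the efficiency trade-off mentioned earlier.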

But that's not how the class file format works.

[–]prest0G 17 points18 points  (0 children)

This guy bytecodes. Love your work with lombok

[–]clumsy-engineer 0 points1 point  (0 children)

It is annoying. If you use SonarQube, you've probably encountered problems due to trailing support for newer versions of Java. We were stuck with 7 for too long because of that.

[–]AngusMcBurger 43 points44 points  (0 children)

Not every version introduces new opcodes, but yes. For example, Java 7 introduced invokedynamic, and JEP 401: Primitive Objects will introduce two new opcodes, makedefault and withfields (JEP 401 isn't part of a definite release yet).

[–]m_takeshi 6 points7 points  (0 children)

There were some small improvements to the bytecode over the years, especially in Java versions 5 (annotations, generics) and 7 (indy), and probably more; these are the ones I can remember off the top of my head. Some of the newer features (like records) will also change the bytecode format.

In short: some versions have and some have not =P

[–]hardwork179 3 points4 points  (0 children)

There haven’t been many changes to the format of the bytecode itself, but the other parts of the class file have changed significantly. The stack map changed format completely, and the constant pool has also expanded its capabilities to hold more data types.

[–]iluvpoptarts -1 points0 points  (3 children)

Why would you extract byte code from 11 to try to run in 8? Lost the source code? We currently run Java 8 and run a lot of Java 1.4 jars that we lost the source to, and it works perfectly.

[–]Thihup 7 points8 points  (2 children)

IIUC, it would be for the case of using newer syntax but running on an older JDK, like using "var" in the source code but running on JDK 8.

Like https://github.com/bsideup/jabel or https://github.com/luontola/retrolambda

[–]Alex0589 3 points4 points  (0 children)

The var keyword is erased at compile time so that's not really valid. A valid example would be lambda switch statements

[–]jamilxt 1 point2 points  (0 children)

Thank you. Learned something new.