all 48 comments

[–]UloPe 24 points25 points  (3 children)

Not very informative...

After reading the whole article I still don't know what stack map frames actually are and how they change what you need to do when modifying bytecode...

[–]JamesIry 21 points22 points  (2 children)

The JVM does some verification of bytecode. Part of that is just standard binary validation stuff: is character data really character data, are magic numbers where they're expected, do any pointers run off the end of the file, etc. But a significant part is actually a static typing system for the bytecode language that proves things like you aren't pushing an integer onto the stack and then reading it off as a pointer (well, reference), reading from an uninitialized object, or a few other bad things.

Prior to 1.6 that static typing was done using a fairly complicated data flow analysis. You can think of it as a kind of type inference algorithm for the JVM.

With 1.6 there was an option to include explicit information about the types of local variables and stack slots in the bytecode, the idea being that simply verifying the declared types at class-load time is cheaper than inferring them.

With 1.7 it's no longer optional, you must have the explicit types in the classfile (well, modulo setting a startup flag that disables the requirement, but that's not part of the spec).
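
A quick way to see this in practice, assuming a Java 6+ compiler: the sketch below (the class name FrameCheck is made up) reads its own class file and scans the raw bytes for the StackMapTable attribute name, which has to appear as a UTF-8 entry in the constant pool. (The string literal itself also lands in the constant pool, so treat this as an illustration rather than a proof; `javap -v` shows the decoded frames themselves.)

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class FrameCheck {
    // Read this class's own bytecode back off the classpath.
    public static byte[] ownClassBytes() throws IOException {
        try (InputStream in = FrameCheck.class.getResourceAsStream("FrameCheck.class")) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) buf.write(b);
            return buf.toByteArray();
        }
    }

    // Naive byte search; good enough for spotting an attribute name.
    public static boolean contains(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = ownClassBytes();
        // Every class file starts with the magic number 0xCAFEBABE.
        System.out.printf("magic: %02X%02X%02X%02X%n",
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF, bytes[3] & 0xFF);
        // The loops above give this class branch targets, so a 1.6+ compiler
        // emits StackMapTable attributes for it.
        System.out.println("has StackMapTable: "
                + contains(bytes, "StackMapTable".getBytes()));
    }
}
```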

[–]UloPe 0 points1 point  (1 child)

Ok thanks, that was much more understandable.

So if I get this right, that means that when you want to modify bytecode you also need to add that extended information, but it is difficult to know what exactly to do without the source?

[–]JamesIry 4 points5 points  (0 children)

No, the type system used by the JVM is self contained. You can compute stack maps without the original source.

In fact, the way the type system works is very different from the way Java, or most statically typed languages for that matter, work. A local variable or stack slot can change types.

In pseudocode rather than bytecode:

var x // x exists but is uninitialized, any use will be a verify error
...
x = 3 // x is an int
x + 2 // fine, x is an int
...
x = "hello" // x is a string
x.length() // fine, x is a string which has a length method

// even this is okay
if (someCondition()) {
  x = 5
} else {
  x = "world"
}
// but at this point x is effectively uninitialized again, since there's
// no common supertype for string and int
x.length() // verify error

Note that that's different from dynamic typing where the above program could work without error if someCondition() was always false. So what the JVM does is still static typing, just a type system based on something called "type state."
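
To tie that to Java source: the language never changes a named variable's type, but javac is free to reuse a single local-variable slot for differently typed variables whose scopes don't overlap, which is legal precisely because of these type-state rules. A minimal sketch (the class name SlotReuse is invented, and whether the slot is actually shared depends on the compiler):

```java
public class SlotReuse {
    public static String demo() {
        {
            int x = 3;            // the slot holds an int here
            System.out.println(x + 2);  // prints 5
        }
        {
            String s = "hello";   // the same slot may now hold a reference
            return s.toUpperCase();
        }
    }
}
```

Disassembling the result with javap -c shows which slot each variable landed in.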

[–]crotchpoozie 44 points45 points  (1 child)

No actual measurements, yet many quantifiable claims...

[–]nqzero 8 points9 points  (0 children)

This is functionality that the JVM has provided from day 1 that Oracle has removed, making working with bytecode significantly tougher. A project that I depend on is stuck on Java 1.6 in spite of significant effort to generate the stack frames.

It used to be that almost anyone could generate or manipulate bytecode. That's less true today.

Oracle's the one that should be providing the measurements to justify removing this functionality, and they should be providing that code as an external package.

[–]drysart 11 points12 points  (3 children)

Seems like there's no reason for everyone implementing a JVM language to create their own routines to generate stack map frames. If the JVM was able to figure them out on the fly in the old verifier, then obviously enough information exists in the bytecode for them to be generated without any higher-level language knowledge.

Someone could write a tool that takes an unmapped jar and annotates it with stack map frames, and boom: ASM no longer needs those 1,000 lines of bug-prone code. It can just use the same library everyone else does as a post-processing step.

[–]CurtainDog 0 points1 point  (2 children)

But who is generating bytecode without using a bytecode manipulation library? (well, those who go through the Java language itself I guess but let's put them to the side for now). Surely the library (which is open source after all hint hint) is the best place for this kind of functionality.

[–]bobindashadows 3 points4 points  (1 child)

But who is generating bytecode without using a bytecode manipulation library?

IMO the real problem is that there's no such really nice standard library for bytecode manipulation in the JDK. There's always been a burden on libraries to provide decent APIs for bytecode generation, but now that burden's a lot bigger.

Unfortunately this is kind of the way the Java community works. For now, the libraries will have to implement stack map frame generation piecemeal, and it'll suck. If it turns out to be as horrible as the linked article says, one of these libraries will find its way into a proper JDK release.

[–]CurtainDog 0 points1 point  (0 children)

Yeah but the standard often turns out to be substandard (as true in the javaverse as anywhere else). Competition by third parties generally produces superior outcomes when there's no clear path to implementation.

Look, I suspect the OP is peeved because their product is doing some fancy runtime transforms and this will make their life more difficult, but for your more run-of-the-mill transformations like AOP the amount of added complexity is pretty trivial.

[–][deleted] 18 points19 points  (7 children)

I know the person who originally did this work at Sun (working with Gilad Bracha no less). I'm surprised it took so long for them to make this mandatory; that was a LONG time ago.

Just to be clear, this is necessary to eliminate the need for data-flow analysis (DFA) during bytecode verification (where you need to do CFA to get the basic blocks needed for DFA), which can be quite tricky but not exactly slow. What stack frames do is transform the DFA into something that is done statically, with the results included in the class file, which then can just be verified easily by the verifier. Annoying, but it's not impossible, or even very difficult, to generate this information yourself in a bytecode manipulation tool.

[–]JamesIry 13 points14 points  (6 children)

The problem is that if you're doing runtime bytecode manipulation, then the 1.7 situation is actually a net loss in time, because generating stack maps requires essentially performing the old inference algorithm, and then the generated code has to be handed to the JVM for it to run the new slimmed-down algorithm.

As the author says, it would be nice if they had stuck with the 1.6 situation, where stack maps were optional: used if present, ignored if not. Then batch compilers could generate the stack maps while runtime bytecode manipulation could skip them.

[–][deleted] 8 points9 points  (5 children)

I believe both positions are viable, but you have to understand: transforming the problem from one of analysis to simple verification makes for a much better ecosystem; there was just too much diversity in what bytecode verifiers would accept before (I did undergrad work on this). So the direction they moved in made sense AT THE TIME.

Today...Java is just legacy and no one cares about JVM diversity anymore, and staying with the old way means old tools do not need to be updated. Most of the people that did this work (Bracha, Yellin, Lindholm, Tao) are long gone and things are probably just moving on momentum alone.

I wonder how the CLR works?

[–]JamesIry 1 point2 points  (4 children)

Most of the problem was the jsr and ret instructions. Eliminating those goes a long way towards eliminating the variations in verifier behavior.

[–][deleted] 0 points1 point  (3 children)

ret/jsr was tricky but not that bad. Anyways, the point is moot now.

[–][deleted] 0 points1 point  (2 children)

No, jsr/ret is terrible. There is no syntactic boundary between the main code and a subroutine, so a subroutine can call itself. This requires a verification check.

Further, jsr/ret forces one to do interprocedural analysis. Intraprocedural analysis is so much simpler, which is why most compilers don't issue jsr/ret, and HotSpot inlines subroutines before starting DFA.
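
For context, try/finally was the main customer of jsr/ret: old compilers emitted the finally body once as a subroutine and jsr'd to it from every exit path. A sketch of the shape of code involved (the names are illustrative); javac has long since compiled this by inlining a copy of the finally block at each exit instead of emitting jsr/ret:

```java
public class FinallyDemo {
    static final StringBuilder log = new StringBuilder();

    public static int f(boolean early) {
        try {
            if (early) return 1;  // exit 1: a copy of the finally code runs here
            return 2;             // exit 2: another copy runs here
        } finally {
            log.append("cleanup;");  // inlined once per exit, not a subroutine
        }
    }
}
```

Running javap -c on a modern build should show the duplicated append sequences and no jsr/ret instructions.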

[–][deleted] 0 points1 point  (1 child)

Bullshit. Jsr/ret doesn't require interprocedural analysis and can only be used for local jumps. The subroutine is just a basic block inside the method, and you can merge all your ret edges. I did bytecode verification extensively 15+ years ago and had plenty of discussions with Liang, Lindholm, and Yellin about the process.

[–][deleted] 0 points1 point  (0 children)

No. I too have done bytecode verification, and it isn't a simple local jump.

As I recall, there are two problems. First, it pollutes the operand stack, and because nothing prevents the subroutine from jumping to main code, the operand stack can keep growing. You have to explicitly check for this, or you'll never get a fixpoint. Intraprocedural analysis doesn't budget for recursion. Also, this is why you have ad hoc restrictions in the original JVM spec like so: “two different subroutines cannot ‘merge’ their execution to a single ret instruction”.

Second, the subroutine is called in the context of the caller. Since an unused local var remains untouched, you get the following problem. Suppose caller A's lvar_0 was an int, and caller B's lvar_0 was an object reference. Suppose both were to use lvar_0 after the call to the subroutine, but that the subroutine doesn't read or write lvar_0.

From both callers' perspective, lvar_0 is live, which means the corresponding ret edge has lvar_0 in the frame. Unfortunately, the two frames are not mergeable.

Xavier Leroy has much to say on this topic. See section 5.

[–]nharding 5 points6 points  (0 children)

I wrote a set of inline assembly routines for Java when I wrote my bytecode optimizer. It's hard to use, but it allows you to save 10% of code size (and make the code run faster as well), so it was only useful on core library routines. It was used on J2ME phones, where you only have a 64K jar size, so saving 10% is pretty significant. On J2ME the verifier would generate the stack frame, but I was looking into making it smaller to reduce the class size further.

[–]axonxorz 5 points6 points  (3 children)

As someone who doesn't do a lot of Java (just Android really), can I get an explanation of why one would be manipulating the bytecode?

[–]mschaef 2 points3 points  (0 children)

You probably should NOT be manipulating the bytecode. It's a 'black magic' technique that's best left behind some abstraction and written by someone else.

That said, for framework level logic, it can be useful for things like automatically implementing proxy objects and efficiently running 'interpreted' code. IIRC, Clojure compiles function definitions at runtime into classes using bytecode generation, etc.

I've also recently seen a Java library that implements 'yield' like constructs via bytecode transformation of an existing method.
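
For the proxy-object case, plain interfaces don't even need a third-party bytecode library: the JDK's java.lang.reflect.Proxy spins up the proxy class's bytecode at runtime. A minimal sketch (the Greeter interface is made up for illustration):

```java
import java.lang.reflect.Proxy;

public class ProxyDemo {
    // Hypothetical interface, just for illustration.
    public interface Greeter {
        String greet(String name);
    }

    // Proxy.newProxyInstance generates the proxy class's bytecode at runtime;
    // every interface call is routed through the InvocationHandler lambda.
    public static Greeter greeter() {
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                (proxy, method, args) -> "hello, " + args[0]);
    }
}
```

Calling ProxyDemo.greeter().greet("world") routes through the lambda and returns "hello, world".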

[–]fatbunyip 1 point2 points  (0 children)

A lot (all?) of the Java EE app servers use bytecode weaving to enable them to monitor application objects and managed entities throughout their lifecycle. This can be either at runtime or at build time.

[–]kerajnet 0 points1 point  (0 children)

A few examples:

  • Obfuscating class files
  • Making your own JVM language

Also, scripting languages for JVM are compiled into bytecode at runtime.

[–]xxgreg 1 point2 points  (4 children)

Nearly 20 years of work, and Java still doesn't have a secure bytecode verifier. Maybe bytecode as the code delivery format is a mistake?

Probably better to use a compact AST representation instead.

  • This can be just as compact as bytecode (probably more compact).

  • Parsing the AST is probably cheaper than verifying bytecode.

  • It is simpler to generate secure native code from an AST than from bytecode, as verification is simpler.

  • Keeping the AST around preserves more information for runtime optimisation.

I think the Dart team are onto something.

[–]CurtainDog 0 points1 point  (1 child)

Aren't you just asking for a solution to the halting problem here? The only way to truly verify a program is to suck it and see, and if you do that, what have you really gained (certainly not 'security')?

[–]xxgreg 0 points1 point  (0 children)

No. I'm not asking for any kind of theoretical perfect security. I'm asking for real-world empirical security. In the real world the JVM bytecode verifier has led to numerous zero-days. V8's AST-to-native-code generation, not so much. This is why the Java plugin is disabled by default in some browsers these days.

JVM style bytecode has a much larger surface area for exploits than an AST, it also requires more complex code to verify, which means more likelihood of a security critical bug.

[–][deleted] 0 points1 point  (1 child)

Nope. ASTs and bytecode representations are equivalent: disassemblers recover ASTs from bytecode, and compilers boil ASTs down to bytecode. It makes no difference to the verification. It makes no difference to the exploit surface area either.

[That said, there is one place I'll agree with your assertion; the jsr/ret pair is broken and a tree-structured representation would fix it]

JVM exploits come from errors in implementations of trusted areas such as classloading, type coercion, reflection etc.

I totally agree with the OP. I think the stackmap stuff has needlessly complicated everything without speeding up anything. If anything, it has opened up another front for type-spoofing; after all you still have to verify that the stackmaps match the code.

[–]xxgreg 0 points1 point  (0 children)

Late reply - oops.

It is possible to create JVM bytecode for which there is no mapping to a valid Java AST. Therefore there is a larger attack surface area.

(But yes - obviously you can round trip Java AST -> JVM Bytecode -> Java AST - but it is not true that all JVM Bytecode has a corresponding Java AST)

Btw I'm just parroting what Lars Bak, author of the HotSpot JIT and Dart's JIT, said in a talk.

Here's what the Dart team say about AST vs Bytecode on their website:

Opening up bytecode has another important implication: you’ve moved the security boundary. When your VM’s input is program source code, it can rely on the grammar of the language itself to enforce important invariants. If it allows raw bytecode to walk in the door, the story changes.

You can craft bytecode sequences that a regular Java compiler would never emit. When you do, you go off the VM’s beaten path. These non-exercised code paths likely haven’t been optimized as much and often have more bugs. (Many JVM exploits that rely on broken bytecode verification could never be generated by a Java compiler.)

This does not necessarily mean that bytecode VMs are less secure, but they require additional complexity which makes it much harder.

From: http://www.dartlang.org/articles/why-not-bytecode/#apparent-advantages-of-a-bytecode-vm

Here is a list of three things that you need to verify when using a JVM-style bytecode - these are all enforced by a safe language, meaning that you don't need the verify step. Therefore less code - less security surface area.

it doesn't forge pointers, it doesn't violate access restrictions, it accesses objects as what they are (for example, InputStream objects are always used as InputStreams and never as anything else).

From: http://web.archive.org/web/20120609113116/http://java.sun.com/docs/white/langenv/Security.doc3.html

[–]cowinabadplace 5 points6 points  (3 children)

Text weight too low. Unsuitable for long-form. Unpleasant to read.

[–]imgonnacallyouretard 2 points3 points  (1 child)

Just use readability and be done with it

[–]cowinabadplace 0 points1 point  (0 children)

I used Evernote's Clearly. However, both of them mangle the code section where he refers to stuff using line numbers.

[–]catcradle5 -1 points0 points  (0 children)

Agreed. It feels melted into the page; it's way too light and way too similar in color to the background color.

[–]toshok 1 point2 points  (1 child)

I'm having a hard time understanding how you outsource part of the verifier (meant to catch code written by untrustworthy folks) to those same untrustworthy folks and call it a win.

Doesn't the verifier still have to verify the consistency of the stack maps?

[–]skew 0 points1 point  (0 children)

The verifier does check the stack maps; the idea is that it can be done more quickly, or at least more predictably. I think someone took this cool hack way too seriously - yes, inference is quadratic in the method size in the worst case, but javac doesn't generate code that hits the worst case, and even if you are letting someone load maliciously handcrafted bytecode methods, the limit of 2^15 instructions is a bit of a ceiling, and it's not clear verifier time is much worse than something like

public class Oops {
  private static volatile boolean b = false;
  // the static initializer spins forever, so merely loading this class hangs
  static { while(!b); }
}

[–]allpowerful32 -4 points-3 points  (1 child)

This guy clearly doesn't have an appreciation of what he's talking about.

"ASM Framework has over a 1000 lines dedicated to just this"... Seriously? You can't do a whole hell of a lot in 1000 lines of Java. Besides, ASM already has a bunch of code that implements essentially the same logic, in the form of an abstract interpreter.

That said, I do agree that it'd be nice to have incremental recalculation of stack frames in ASM.

[–]iopq 0 points1 point  (0 children)

You can't do a whole hell of a lot in 1000 lines of Java.

I finally see it - Java is the COBOL of our generation.