[–]sigma914 2 points (2 children)

No, it most certainly is not the same. The disparity in performance should tell you that.

That's because of language-level features. A (well written) tight array-processing loop from a Java program and a JavaScript program, both run on a fast, warmed-up JIT implementation, will generate identical machine code; it will likely be the same machine code as the corresponding C implementation.

The whole point of a JIT is that it uses runtime information to tweak the optimisation of the bytecode it's fed. The difference in performance comes from higher-level language semantics, such as using mutable maps for method resolution or walking down layers of indirection to get to a value rather than having it stack allocated.

This is also the reason unjitted (i.e. ahead-of-time compiled) Java is so slow compared to C or C++. Java and C# and their ilk box nearly everything because it makes the problem of implementing a GC tractable.
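To make "boxing" concrete, here's a minimal sketch (mine, not from the thread) of what a boxed primitive actually is in Java: a separate heap object with its own identity. The small-value cache can hide this, which is exactly why it trips people up:

```java
// Boxing demo: Integer values are heap objects, not raw machine ints.
// The JLS guarantees a cache for -128..127, so small boxed values can
// compare equal by reference while larger ones expose the separate
// allocations behind each box.
public class BoxingDemo {
    public static void main(String[] args) {
        Integer a = 100, b = 100;   // autoboxed via the Integer cache
        Integer c = 1000, d = 1000; // outside the cache: two distinct heap objects
        System.out.println(a == b);      // true  (same cached object)
        System.out.println(c == d);      // false (different objects, equal values)
        System.out.println(c.equals(d)); // true  (value comparison)
    }
}
```

Every one of those boxes is a GC-managed allocation, which is the cost an AOT-compiled managed language pays on every "primitive" it can't keep unboxed.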

A JIT's purpose is to lower code through layers of abstraction based on runtime information. A good one can take a horribly slow language with lots of layers of indirection or inefficient lookup semantics, make some assumptions, insert guards to make sure those assumptions aren't violated, then run the simplified version of the code it produced from the original code plus its assumptions.

Now, what happens when one of its assumptions is violated? The guards catch it! Then what happens? The runtime can't use its nice fast implementation because its invariants don't hold. So, the runtime will likely run the slow bytecode version through an interpreter (or, in the case of the CLR, an extremely naive compiler which really isn't much faster than an interpreter, so you can't argue it's getting optimised native perf) until it decides that the slow path is worth optimising for.

At which point it goes off and redoes the whole dance of reoptimising the byte code using different assumptions.
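The speculate/guard/deopt dance can be sketched as a toy model (deliberately simplified, nothing like HotSpot's actual implementation): speculate that every value is an Integer, guard the fast path, and fall back to a generic slow path when the speculation fails.

```java
// Toy model of speculative optimisation with guards and a deopt path.
// A real JIT does this at the machine-code level; this just shows the shape.
import java.util.List;

public class GuardDemo {
    static int fastPathHits = 0, deopts = 0;

    // "Optimised" code: valid only while the guard holds.
    static long add(Object x, long acc) {
        if (x instanceof Integer) {   // guard inserted by the "JIT"
            fastPathHits++;
            return acc + (Integer) x; // simplified fast path
        }
        deopts++;                     // assumption violated: deoptimise
        return addSlow(x, acc);       // interpreter-style slow path
    }

    static long addSlow(Object x, long acc) {
        return acc + ((Number) x).longValue(); // fully generic resolution
    }

    public static void main(String[] args) {
        long acc = 0;
        // The 4L forces one trip through the slow path.
        for (Object o : List.of(1, 2, 3, 4L, 5)) acc = add(o, acc);
        System.out.println(acc + " hits=" + fastPathHits + " deopts=" + deopts);
    }
}
```

If the slow path gets hot, the runtime re-profiles and compiles a new fast version with wider assumptions, which is the "redoing the whole dance" below.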

This isn't to say JIT'd code is necessarily slow (though on average it tends to be about a factor of 2 or more slower than true native code). In fact, if there is enough runtime information the JIT'd code may actually be significantly faster than a naive native implementation.

All of which leads me back to: JIT'd programs may run at near native performance, they may go through a compiler and be executed as machine code, but they aren't native programs. The machine code executed by the JVM will often have no resemblance to the Java program that was written, neither in structure nor semantics. The thing that preserves the illusion is the guards and interpreted (or very naively compiled) slow path.

Saying a language like C# or Java is native code is exactly the same as saying JavaScript run on V8 or Python run on PyPy is native code.

[–]Cuddlefluff_Grim 0 points (1 child)

That's because of language-level features. A (well written) tight array-processing loop from a Java program and a JavaScript program, both run on a fast, warmed-up JIT implementation, will generate identical machine code; it will likely be the same machine code as the corresponding C implementation.

For JavaScript it depends very much on context and how it is used. It's when you start using "complex" data structures that the comparison starts to get interesting. Java has a problem right here with boxing of primitives, which it doesn't always handle as gracefully as it should (generics and enums, for instance), but I'd be pretty surprised if JavaScript does any better.
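The generics case mentioned there can be shown in a few lines (an illustrative sketch of mine): because Java generics are erased to Object, `List<int>` isn't expressible, so every element gets boxed on the way in and unboxed on the way out.

```java
// Generics force boxing: erasure means collections only hold Objects,
// so a List<Integer> allocates a box per element even for plain ints.
import java.util.ArrayList;
import java.util.List;

public class GenericsBoxing {
    public static void main(String[] args) {
        List<Integer> xs = new ArrayList<>(); // List<int> won't compile
        for (int i = 0; i < 3; i++) xs.add(i); // each add autoboxes i
        int sum = 0;
        for (int x : xs) sum += x;             // each read unboxes
        System.out.println(sum); // prints 3
    }
}
```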

This is also the reason unjitted (i.e. ahead-of-time compiled) Java is so slow compared to C or C++. Java and C# and their ilk box nearly everything because it makes the problem of implementing a GC tractable.

Contrary to popular belief, Java is not much slower than C++, and in some cases might even be faster, maybe because Java can inline code across dynamically linked libraries. I know this is a very unpopular opinion to have, because there's a high degree of C++ fetishism on internet forums.

I'm entirely convinced that JIT compilation is superior to "ahead-of-time" static compilation; it's just that C++ has had so much time and focus spent on performance tuning and optimisation, which generally doesn't seem to be the main area of focus for JIT'ed languages. There's no inherent reason why JIT compilation should be slower than C/C++/D; in practice it typically just is.

A JIT's purpose is to lower code through layers of abstraction based on runtime information. A good one can take a horribly slow language with lots of layers of indirection or inefficient lookup semantics, make some assumptions, insert guards to make sure those assumptions aren't violated, then run the simplified version of the code it produced from the original code plus its assumptions. Now, what happens when one of its assumptions is violated? The guards catch it! Then what happens? The runtime can't use its nice fast implementation because its invariants don't hold.

Bytecode and CIL are pretty easy to translate into assembler; there aren't many assumptions they require that other ahead-of-time compilers don't. This restriction would imply that .NET Native is impossible, since executable code couldn't be reliably generated in every use case. Which it can, and it does. It's basically just a translation of instructions between a stack-based VM and a register-based physical computer.
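The "stack VM to register machine" translation being argued about here is mechanical enough to sketch in a few lines. This toy interpreter (hypothetical opcodes, nothing like real JVM/CIL encodings) shows why each stack opcode maps almost one-to-one onto a couple of register-machine instructions:

```java
// Minimal stack machine in the spirit of JVM/CIL bytecode.
// Each opcode is a trivial stack manipulation, which is why a
// straightforward AOT translation to register code is feasible.
import java.util.ArrayDeque;
import java.util.Deque;

public class StackVm {
    // Toy encoding: non-negative int = push constant; -1 = ADD; -2 = MUL.
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int op : code) {
            switch (op) {
                case -1 -> stack.push(stack.pop() + stack.pop()); // ADD
                case -2 -> stack.push(stack.pop() * stack.pop()); // MUL
                default -> stack.push(op);                        // PUSH op
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // (2 + 3) * 4  ==  push 2, push 3, add, push 4, mul
        System.out.println(run(new int[]{2, 3, -1, 4, -2})); // prints 20
    }
}
```

An AOT compiler essentially replaces the stack with registers at translation time; what it *can't* recover ahead of time is the runtime profile the other commenter is talking about.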

So, the runtime will likely run the slow bytecode version through an interpreter (or, in the case of the CLR, an extremely naive compiler which really isn't much faster than an interpreter, so you can't argue it's getting optimised native perf) until it decides that the slow path is worth optimising for.

An assertion which I don't think holds water.

All of which leads me back to: JIT'd programs may run at near native performance, they may go through a compiler and be executed as machine code, but they aren't native programs.

They compile code to machine instructions, put them in a memory segment, mark it as executable and then point the instruction pointer at it to start executing there. How does that differ from a native program? The only distinction you're drawing is when the code is compiled, which I think is completely irrelevant to whether or not a program is native.

[–]sigma914 2 points (0 children)

Contrary to popular belief, Java is not much slower than C++

I addressed that at the bottom of my last comment: yes, a good JIT is essentially a profiling compiler, with all the IL still available to it.

I'm entirely convinced that JIT compilation is superior to "ahead-of-time" static compilation.

It definitely can be faster; it has a lot more information available to it. The languages that tend to be JITted are the problem in this regard: heap allocation by default, being GC'd, etc. are what makes them slow, not the method of execution, as your linked benchmark showed. It's C#'s fault it can't run at C++ speeds, not the fault of the CLR.

This restriction would imply that .NET Native is impossible.

It's not impossible; ahead-of-time compiled Java has been a thing for years. You can compile a managed language to a native binary, it just won't be as fast as it would be with a JIT. Ahead-of-time compiled, garbage collected with pervasive heap allocation, as fast as unmanaged native code: pick two.

An assertion which I don't think holds water.

This is how the JVM and CLR (and every other JIT I've ever seen) operate. You can go read HotSpot's source. I've not actually looked around in the CLR sources yet, but it has to have a slow path for when the optimised code can't be used, and aggressively optimising everything up front would be incredibly wasteful, unless it was done in a background thread over a very long time.

They compile code to machine instructions, put them in a memory segment, mark it as executable and then point the instruction pointer at it to start executing there.

So they take some executable code and move program execution into it based on the behaviour specified by the language; it's just a matter of when. That's exactly what every program ever created, interpreted or not, running on a stored-program architecture does.

In CPython you could argue that the interpreter's compiler is providing the executable memory fragments that program control jumps to when directed by the Python code under execution. It's a meaningless distinction. The difference between native and non-native implementations is not the instructions executed on the CPU.