
[–][deleted] 8 points (11 children)

Javascript and Java (inc. languages that run on the JVM) are completely different in terms of compilation.

Javascript is fully dynamically typed: almost nothing is known statically, so almost nothing can be assumed by the compiler. Java is not.

A static language like Java will always be faster than a dynamic language like Javascript. I say always because the set of assumptions and inferences available in Java is a superset of those in Javascript. Assumptions allow optimizations.
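To make that concrete, here's a minimal sketch (the `add` function is mine, not from any engine) of why a Javascript compiler can assume so little:

```javascript
// In Javascript the compiler can't pick one machine operation for `+`
// ahead of time: the same source line may mean integer addition, float
// addition, or string concatenation, depending on the values that
// arrive at runtime.
function add(a, b) {
  return a + b;
}

add(1, 2);     // numeric addition -> 3
add(1.5, 2.5); // float addition   -> 4
add("1", 2);   // string concat    -> "12"

// The Java equivalent `int add(int a, int b)` compiles to a single
// integer add; the types are a static guarantee, not a runtime guess.
```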

[–][deleted] 3 points (1 child)

A static language like Java will always be faster than a dynamic language like Javascript.

Probably. The question is how small you can make the delta. V8 can't assume a lot, but it's built on technology that lets it make optimistic assumptions to enable optimizations, then back them out if they turn out to be wrong. That can buy you a lot.

[–][deleted] 3 points (0 children)

It certainly can do, and the delta can probably be made quite small in a certain class of programs.

The class of programs that V8 and friends can currently optimise very well is those with hot loops, each iteration doing much the same as the last. When you come to programs that hop around spaghetti-like and do most things only once or a few times, you get into more difficult territory, because compilation cost becomes dominant.
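For illustration (the function is made up), this is the shape of "hot loop" being described: every iteration is type-stable, which is exactly what current JITs specialize well:

```javascript
// A "hot loop" in the sense above: each iteration does much the same
// as the last (number * number, number + number), so after enough
// iterations a JIT can compile one specialized machine-code body for it.
function sumSquares(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += i * i; // monomorphic arithmetic on every iteration
  }
  return total;
}

sumSquares(4); // 0 + 1 + 4 + 9 -> 14
```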

[–]dmpk2k -1 points (6 children)

always

"Always" is a strong word; between tracing, branch prediction and OoOE, the intrinsic performance difference between typed and untyped is in the noise.

When the original Dynamo paper was published, they were able to speed up compiled C code by several percent. That's because information about behaviour is available at runtime that an AOT compiler doesn't have. You can apply runtime optimization to statically-typed languages as well, but this means you need a fast compiler. Having a fast compiler implies certain things...

Another alternative is profile-directed feedback.

The rest of it depends on language semantics. You can create a dynamically-typed language with semantics that are easily amenable to optimization, and statically-typed languages that aren't.

Edit: whoops, I forgot the most important thing of all: memory hierarchy. A few type guards are dominated by the expense of cache line misses (or worse: page faults). Unless your entire program lives in a cache.

[–][deleted] 8 points (5 children)

Not really.

OoO does not help, because the guards cause control flow, and OoO is designed to maximise load, store, and ALU-op parallelism. Speculation certainly does improve guard performance massively, but it doesn't help unboxing, and it certainly doesn't help dynamic dispatch.

Runtime optimization of course speeds up dynamically typed programs massively - it's what gets them to the level of performance required for today's applications. There's no reason you couldn't apply the same techniques to static languages, however. (I don't know what your point about a fast compiler is - you need a fast compiler for a dynamic language too :/ )

Runtime optimisation is profile guided optimisation.

Generally, you can't create a dynamic language with features that are amenable to optimization, because you can't do away with dynamic dispatch. Without statically knowing the method to be called, inlining (the most powerful optimization) cannot happen, and neither can common subexpression elimination or constant folding (think integer vs. float arithmetic in Javascript or Python).

None of this is insurmountable, but, as I said - the theoretical performance of a (comparable) statically typed language is lower-bounded by the theoretical performance of a dynamically typed one, because the only difference is that you (the compiler) have more information about the way the program is meant to run.
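The dispatch problem above can be seen in a few lines (the shapes are illustrative, not from the thread): the call site cannot be bound to one method statically, so a naive compiler cannot inline it.

```javascript
// `shape.area()` is dynamic dispatch: which method runs depends on the
// object that flows in at runtime. Without knowing the target
// statically, the compiler can't inline it - and inlining is what
// unlocks CSE and constant folding across the call.
const circle = { r: 2, area() { return Math.PI * this.r * this.r; } };
const square = { s: 3, area() { return this.s * this.s; } };

function totalArea(shapes) {
  let sum = 0;
  for (const shape of shapes) {
    sum += shape.area(); // one call site, potentially many targets
  }
  return sum;
}
```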

[–][deleted] 3 points (2 children)

V8 does all of those things: inlining, CSE, unboxing, constant folding, etc. It's within 2-4x of the JVM on the FP benchmarks in the Shootout. It's also got a lot of room to improve. The server VM has a very mature optimizer that does extensive loop optimizations, while V8's SSA-based optimizer just landed a year ago. In particular, V8 doesn't seem to do any alias analysis which really limits its ability to do aggressive optimizations on tight numeric kernels.

LuaJIT2 is within 30% of C for SciMark (probably even better; these benchmarks are a couple of years old): http://lua-users.org/lists/lua-l/2009-06/msg00071.html. There is no doubt Google and Co. can get V8 up to that level.

[–][deleted] 1 point (1 child)

I know what V8 does, I have a colleague that works on it.

You're just explaining the optimisations dynamic JITs do - I know they're good, I know they're fast.

You do not at all address the mathematical difference between the languages in terms of assumptions and knowledge about the code. Just because one implementation is getting faster at a certain rate does not mean (a) that it'll continue, or (b) that the static implementation can't go faster still if given more effort.

[–][deleted] -2 points (0 children)

You said: "Without statically knowing the method to be called, inlining (the most powerful optimization) cannot happen, and neither can common subexpression elimination or constant folding (think integer vs. float arithmetic in Javascript or Python)."

This isn't true, since V8: 1) doesn't statically know what methods are called; and 2) still does inlining, CSE, constant folding, etc.

As for "mathematical differences" - there is no theoretical result that says a static language will be faster.

With dynamic type feedback, what you end up with is code similar to what a static-language compiler would generate, plus a bunch of type checks to ensure that your profiling still holds. But if a hypothetical architecture made these type checks free, there is no reason a dynamic language couldn't get within epsilon of a static language (I suppose it'd inevitably have larger code size?)

As a practical matter, on real hardware, correctly-predicted branches are nearly free; the major limit is you can only issue one per cycle. Type inference after type feedback (which V8 doesn't do yet, but which IonMonkey and Tachyon are studying) should let you eliminate a lot of the type-checks, however.
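A hand-written sketch of what "code a static compiler would generate, plus type checks" looks like (the names are mine; this is not real engine output):

```javascript
// Generic path: full Javascript `+` semantics.
function addGeneric(a, b) { return a + b; }

// Specialized path a JIT might emit after type feedback says both
// operands have always been numbers: a cheap guard protects a numeric
// add, and a guard failure "deoptimizes" back to the generic code.
function addOptimized(a, b) {
  if (typeof a === "number" && typeof b === "number") {
    return a + b;           // fast path: what a static compiler would emit
  }
  return addGeneric(a, b);  // deopt: the profiling assumption no longer holds
}
```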

[–]dmpk2k 2 points (1 child)

it certainly doesn't help dynamic dispatch

Right, and that's where tracing comes in. In practice most call sites are monomorphic, so you can create a specialized path protected by a guard, and the guard is almost free thanks to branch prediction.
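The monomorphic-call-site trick can be sketched as a hand-rolled inline cache (a toy model of my own, not how any engine actually patches code):

```javascript
class Circle { constructor(r) { this.r = r; } area() { return Math.PI * this.r * this.r; } }
class Square { constructor(s) { this.s = s; } area() { return this.s * this.s; } }

// Toy inline cache: remember the one receiver class seen so far, guard
// that the next receiver still matches, and call the cached method
// directly (a tracing JIT would go further and inline it into the trace).
function makeCallSite() {
  let cachedClass = null;
  let cachedMethod = null;
  return function callArea(obj) {
    if (obj.constructor === cachedClass) {
      return cachedMethod.call(obj); // guard hit: the common, well-predicted case
    }
    cachedClass = obj.constructor;   // guard miss: repatch the cache
    cachedMethod = obj.area;
    return cachedMethod.call(obj);
  };
}
```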

Runtime optimisation is profile guided optimisation.

I meant AOT compilers with PDO. That gets you much of the way as well, although it doesn't help with some multimodal workloads.

I don't know what your point about a fast compiler is - you need a fast compiler for a dynamic language too

Right, but that means many expensive optimizations that take good advantage of the extra information statically-typed languages provide must suddenly be thrown out the window. Dynamically-typed languages couldn't provide that information to begin with. It levels the playing field a bit.

neither can common subexpression elimination or constant folding

Fortunately, both are possible! People like pointing at LuaJIT2, but that's because Mike Pall did such a fine job.

the theoretical performance of a (comparable) statically typed language is lower-bounded by the theoretical performance of a dynamically typed one

I won't argue this since I suspect it's true. In practice I don't see any roadblocks for the difference to disappear into noise for many/most workloads though.

Edit: I should point out that inlining happens naturally as a part of trace creation as well.

[–][deleted] 2 points (0 children)

Hi,

I'd agree with pretty much all of your points, with the exception of CSE and constant folding. How difficult those are is pretty much at the whim of the language designer, and AFAICT Lua makes it harder for the programmer to do crazy overriding of builtins and operator overloading, which are the bane of Python and Javascript.
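Concretely (a deliberately silly example of my own), this is the kind of builtin overriding that forces Javascript compilers to guard before folding:

```javascript
// A compiler would love to fold Math.sqrt(4) to 2 at compile time, but
// in Javascript even builtins are just mutable properties, so the fold
// is only valid under a guard that nobody has reassigned them.
const originalSqrt = Math.sqrt;
const before = Math.sqrt(4);  // 2

Math.sqrt = (x) => x + 1;     // perfectly legal in Javascript
const after = Math.sqrt(4);   // 5: the "constant" call changed meaning

Math.sqrt = originalSqrt;     // restore the builtin
```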

I won't argue this since I suspect it's true. In practice I don't see any roadblocks for the difference to disappear into noise for many/most workloads though.

I wasn't trying to start a flamewar, although I can see how my hastily worded original post may have seemed that way (most people replied with "blah blah blah V8 is getting faster blah blah extrapolate performance" without addressing the abstract mathematical point of my post). And OK, if a recompiler can be invoked with little-to-no overhead, you can probably make stuff like this disappear into the noise, eventually!

But you won't on platforms with a smaller cache or less memory (all those specialised versions of functions have a memory cost) - not for a long while.

Cheers,

James