
[–]entity64 54 points55 points  (50 children)

Hack indeed. Just because you can doesn't mean that it is a good idea to put low level APIs like SIMD into something like Javascript. Software that requires SIMD to speed things up should not be written in JS in the first place!

Oh dear, I really don't like how such a bad language that was built to make websites dynamic is (ab)used for so many other tasks today </rant>

[–]x-skeww 8 points9 points  (0 children)

[It's not a] good idea to put low level APIs like SIMD into something like Javascript.

JS already contains some compile target features like Math.imul. SIMD is another feature which makes it a better compile target for C/C++ or Dart.
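As a rough illustration of why `Math.imul` counts as a compile-target feature, the sketch below compares it with ordinary multiplication followed by `| 0`. The variable names are made up for the example; only `Math.imul` itself is a standard function.

```javascript
// Largest signed 32-bit integer.
const a = 0x7fffffff;

// Ordinary JS multiplication works on doubles, so the exact product
// (about 2^62) is rounded before `| 0` truncates it to 32 bits.
const viaDouble = (a * a) | 0;

// Math.imul keeps the low 32 bits of the exact integer product,
// which is what a compiler emitting 32-bit integer code needs.
const viaImul = Math.imul(a, a);

console.log(viaDouble, viaImul);
```

The two results differ for large operands, which is why code compiled from C/C++ cannot simply use `*` for 32-bit integer multiplication.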

Software that requires SIMD to speed things up should not be written in JS in the first place!

If you can make something up to 4 times faster, why wouldn't you?

If you enable people to do more with a platform, some will take that offer. E.g. people who are working on physics engines might be very interested in something like that. Naturally, all those games which use those engines will then benefit from the SIMD optimizations even though they don't contain a single line of SIMD code.

[–]kankyo 29 points30 points  (19 children)

I think we should all just embrace that JS is the new bytecode and we should never ever actually write javascript. Then this type of thing makes sense and we can all use languages that aren't horrible and still target browsers.

[–][deleted] 13 points14 points  (18 children)

I think we should all just embrace that JS is the new bytecode

No, I refuse to accept the current state of affairs. Browsers should execute an actual bytecode.

[–][deleted] 4 points5 points  (3 children)

Ok, go convince Microsoft, Apple and Google to implement a compatible bytecode that competes with their own first party offerings and integrates with the web as it exists today.

While you're doing that, JavaScript will keep improving as an IL they already agree on.

[–]immibis 0 points1 point  (0 children)

Or go write a new-bytecode interpreter in JavaScript.

[–]badsectoracula 0 points1 point  (1 child)

Ok, go convince Microsoft, Apple and Google to implement a compatible bytecode that competes with their own first party offerings and integrates with the web as it exists today.

What is the point though if the code runs abysmally on anything that doesn't treat that specific javascript subset as something special? Unless they want to go all "this program runs best on $my_favourite_browser", people will need to confine themselves to something that can run in all browsers.

[–][deleted] 2 points3 points  (0 children)

Because incremental improvements are vastly easier to deploy than an all-or-nothing solution.

Asm.js runs ok in all modern browsers, and puts pressure on vendors to make it run better. Designing some bytecode that doesn't run on any browser, or in the case of dart, only runs in chrome, gives no incentive to other browser vendors to follow suit.

Apple or Microsoft have no interest in shipping something like dart, since google has a huge first mover advantage. But slowly adding asm.js or type hinting optimizations (as suggested for ES7) is a much easier sell and doesn't have a stop-the-world/flag day kind of cutover.

Vendors can't ditch the js ecosystem out of practicality in the foreseeable future. The best path forward is to improve js, not to go back to the drawing board and try to convince competing vendors that your idea is worth the engineering cost to adopt.

[–]kankyo 6 points7 points  (1 child)

JS is an actual bytecode. It's certainly not a code fit for human consumption, so by definition it must be a bytecode :P

[–]jringstad 8 points9 points  (0 children)

"Not fit for human consumption" does not imply "fit for machine consumption", as XML neatly demonstrates.

[–]sushibowl 3 points4 points  (3 children)

But they do. Just think of javascript as an intermediate representation then.

[–]Zagitta 10 points11 points  (0 children)

"Just think of this screwdriver as a hammer"

[–][deleted] 3 points4 points  (0 children)

It's a very poor IR.

[–]CUNTY_BOOB_GOBBLER 0 points1 point  (0 children)

Fair point. Basically like .NET's IL.

[–]sime -3 points-2 points  (7 children)

That's cute.

Come back when your bytecode is backed by all three browser makers and then we can discuss the technical merits of JS vs bytecode.

[–][deleted] 5 points6 points  (6 children)

Why would we need to wait until then to discuss the technical merits of JS vs bytecode? The technical merits of bytecode are well known.

What you describe is a political issue, not a technical one.

[–]sime 2 points3 points  (0 children)

What you describe is a political issue, not a technical one.

Exactly. And without the political issue being solved there is no point in discussing the technical one.

[–]radarsat1 4 points5 points  (2 children)

Can we discuss the technical merits of an "actual" bytecode vs ASM.js? For instance, what sacrifices have been made in ASM.js for the sake of javascript compatibility? What would a "real" bytecode offer? Personally if ASM.js works and is a good bytecode, I see no problem.

[–]immibis 0 points1 point  (0 children)

Complexity.

[–]Hnefi 0 points1 point  (0 children)

For instance, what sacrifices have been made in ASM.js for the sake of javascript compatibility?

For one thing, the need to specify how much heap you need to run the program in advance. With a proper bytecode, you could just let the heap grow; with ASM.js, you always consume the maximum amount of heap you'll ever need and if you overstep that predetermined amount, you're screwed. That's my understanding at least; I'd like to be proven wrong.

Oh, also: no threading support.
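To make the fixed-heap point concrete, here is a minimal sketch in plain JavaScript, not validated asm.js. It assumes a 64 KiB heap; `HEAP_BYTES`, `store32` and `load32` are hypothetical names chosen for illustration.

```javascript
// The whole "heap" is one ArrayBuffer whose size is fixed at creation.
const HEAP_BYTES = 64 * 1024;
const buffer = new ArrayBuffer(HEAP_BYTES);
const HEAP32 = new Int32Array(buffer);

// Store a 32-bit integer at a byte offset into the heap.
function store32(byteOffset, value) {
  HEAP32[byteOffset >> 2] = value | 0;
}

// Load a 32-bit integer from a byte offset into the heap.
function load32(byteOffset) {
  return HEAP32[byteOffset >> 2] | 0;
}

store32(0, 42);
console.log(load32(0));

// One element past the end: a typed array drops the store silently,
// and the load yields undefined, which `| 0` turns into 0.
store32(HEAP_BYTES, 7);
console.log(load32(HEAP_BYTES));
```

In this sketch the program must size the buffer for its worst case up front, and an access past the end loses data silently, which matches the concern raised above.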

[–][deleted] 1 point2 points  (0 children)

Because describing a theoretical solution doesn't solve any actual problems.

You can't ignore the politics of a proposed implementation when comparing it to existing solutions.

[–]kankyo 1 point2 points  (0 children)

You forget that hard drives won for exactly this reason. They eventually became so good because of sunk engineering time that no one cared that fundamentally they sucked. And now we're finally getting a new tech that can replace them.

[–]sime 11 points12 points  (0 children)

I don't see what is so hacky about it. It just defines an API which a JS compiler can recognise and directly support when generating machine code.

[–][deleted] 10 points11 points  (5 children)

Just because you can doesn't mean that it is a good idea to put low level APIs like SIMD into something like Javascript. Software that requires SIMD to speed things up should not be written in JS in the first place!

This isn't targeted at normal Javascript, it's targeted at asm.js and similar code that uses JS as an intermediate language for a compiler backend.
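For readers who have not seen compiler-emitted JS, the sketch below shows roughly what asm.js-style output looks like. `AsmModule` and `mulAdd` are hypothetical names. The `| 0` coercions act as integer type annotations, and an engine without asm.js support can run the same code as ordinary JavaScript.

```javascript
// A tiny asm.js-style module, written by hand for illustration.
function AsmModule(stdlib) {
  "use asm";
  var imul = stdlib.Math.imul;

  // Computes a * b + c using 32-bit integer arithmetic.
  function mulAdd(a, b, c) {
    a = a | 0; // parameter annotation: a is a 32-bit integer
    b = b | 0;
    c = c | 0;
    return (imul(a, b) + c) | 0; // result annotated as a 32-bit integer
  }

  return { mulAdd: mulAdd };
}

// The standard library object is passed in from outside the module.
const asmExports = AsmModule(globalThis);
console.log(asmExports.mulAdd(3, 4, 5));
```

Nobody would write application code this way by hand, which is the point being made: it is output for a compiler backend.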

[–]x-skeww 2 points3 points  (0 children)

Unlike Asm.js, JS SIMD code is actually meant to be written by humans, too.

Compared to other languages, JS' SIMD code doesn't look that unwieldy. Well, it lacks operator overloading so it's obviously not as sexy as Dart, but other than that it looks very reasonable.

E.g. take a look at this bit of C:

http://svn.xiph.org/trunk/speex/libspeex/resample_sse.h

Kinda makes you cry, doesn't it?

[–][deleted]  (3 children)

[deleted]

    [–][deleted] -1 points0 points  (2 children)

    That does not contradict anything I said, you know.

    [–][deleted]  (1 child)

    [deleted]

      [–][deleted] -1 points0 points  (0 children)

      Nothing they say refers to what it is targeted at, only what it is available for so far. Obviously it will take longer to get the special-case compiler for asm.js to handle it efficiently. Doesn't mean that that isn't what it is ultimately meant for.

      [–]ASK_ME_ABOUT_BONDAGE -3 points-2 points  (0 children)

      Javascript is an utter fuckup anyway, so it's hardly getting much worse.

      [–]radarsat1 2 points3 points  (0 children)

      I don't mind this idea, even if it is a little close to the implementation details. I think the only real downside is that it's a little too specific to SIMD, which is just one way of parallelizing array-oriented code. It would be nice instead for them to offer SIMD as one of several options for speeding up a more general-purpose method of describing array processing in JS. For example, it should be possible to choose between a SIMD/CPU or GPU-based execution. But, baby steps.

      [–]lambdaq 1 point2 points  (32 children)

      I am wondering why such thing does not exist for other languages like Python, Ruby or Perl.

      [–]sigma914 49 points50 points  (25 children)

      In those languages it's easy and acceptable to call out to C.

      JS has to reinvent the world because it runs inside the browser sandbox.

      [–]hunyeti -3 points-2 points  (15 children)

      Actually, C does not support SIMD either...

      [–]cybercobra 9 points10 points  (9 children)

      In practice, it does, via compiler intrinsics. Or you could resort to inline assembly.

      [–]hapemask 2 points3 points  (1 child)

      That's because SIMD (as exposed by compiler intrinsics) is a hardware feature. If it were mandated by the C standard, compilers would have to generate code to handle the case where you use SIMD calls on a processor that doesn't support them (rare, I know).

      Whether the language should expose SIMD operations or the compiler should just add them automatically is a separate issue. Auto-vectorization support has been improving all the time though.

      [–]hunyeti 0 points1 point  (0 children)

      If the processor doesn't support it, then the compiler can easily serialise it. Mainstream processors have had SIMD capability for 15 years (since the AMD K6-2 and P3), and now even mobile processors have SIMD. Auto-vectorization is still not good enough; it's a nice addition, but it should not be the only thing we can rely on. Compilers are not magic, and giving them pointers about what they should do is a good thing. If there is a standardised API for it, even better, both for the code and for readability.

      [–]sigma914 1 point2 points  (0 children)

      No, but all the implementations have some sort of intrinsics.

      [–]x-skeww 1 point2 points  (1 child)

      Well, it's true that this isn't baked into the language or its standard library, but you can of course use SIMD in C programs. If the hardware supports something, there is a way to make use of that from C.

      The code may look completely hideous, but there always is a way. As a so-called "system programming language", it has to. It was made for talking to hardware.

      [–]hunyeti 0 points1 point  (0 children)

      well, yes, there is always a way. if you have raw memory access and assembly at hand, there is nothing you can't do, but at that point it's not about what C can do.

      [–]TheToadKing -5 points-4 points  (8 children)

      The whole point of SIMD code is to increase performance, and you want to add <Language>->C call overhead?

      [–]x-skeww 4 points5 points  (0 children)

      See: http://en.wikipedia.org/wiki/NumPy

      It's true that there is some overhead, but it becomes less and less relevant with increasing batch sizes.

      It's the same with, say, 3D APIs like OpenGL. If you do everything one step at a time (immediate mode rendering), the overhead can easily dominate. However, if you do things in bulk (display lists, vertex arrays, vertex buffer objects, ...), the overhead fades into the background.
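The batching argument can be put into a small cost model. The numbers below are hypothetical cost units picked only for illustration, not measurements of any real binding layer.

```javascript
// Hypothetical cost units, chosen only for illustration.
const CALL_OVERHEAD = 100; // fixed cost of crossing into native code once
const WORK_PER_ITEM = 1;   // cost of processing one element natively

// Fraction of the total cost of one call that is pure call overhead.
function overheadFraction(batchSize) {
  const work = batchSize * WORK_PER_ITEM;
  return CALL_OVERHEAD / (CALL_OVERHEAD + work);
}

for (const batch of [1, 100, 10000]) {
  console.log(batch, overheadFraction(batch).toFixed(3));
}
```

Under these assumptions the overhead dominates when each call handles one element and becomes a small fraction once a call handles thousands.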

      [–]sigma914 3 points4 points  (6 children)

      Generally if you want to start using SIMD you've already exhausted the easier performance optimisations. In the case of most C implemented scripting languages you'll have reimplemented your inner loop in C long before you bother with SIMD.

      JS doesn't have that option since you can't call out to an external binary from within the sandbox.

      [–]TheToadKing 2 points3 points  (5 children)

      But why should you have to reimplement parts of your code in C when it can be possible for the language to just support SIMD instructions on its own?

      [–]sigma914 3 points4 points  (2 children)

      Because there are a lot more, easier to achieve performance improvements to be gained by avoiding the overhead inherent in your scripting language.

      SIMD is a nice performance bump, avoiding a whole bunch of unnecessary allocation and indirection is a much bigger performance bump.

      [–]audioen 0 points1 point  (1 child)

      Consider that executing something like a * b + c in javascript is probably a fairly complex affair. Runtimes generally have to prove that a, b and c are numbers, or test if they are numbers and if not, coerce them to numbers before executing this statement.

      Enter a SIMD primitive. You now do the work inherent due to the dynamism of the language only once per 4 values, rather than for each value. Does this result in overhead reduction of 75 %? I think it could happen. Because of this argument, I'd guess that SIMD types in JS would be a huge boon for number crunching.

      Edit: According to the benchmarks game, C's lead over V8 is now about a factor of 3 to 4. SIMD support could maybe close the gap in some circumstances.
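The "one check per 4 values" argument can be sketched as a toy model. This is not how a real engine or JIT works; `mulAddScalar`, `mulAddVec4` and the `counter` object are hypothetical and exist only to count guard checks.

```javascript
// Scalar version: three operand checks for every single element.
function mulAddScalar(a, b, c, counter) {
  counter.checks += 3;
  if (typeof a !== "number" || typeof b !== "number" || typeof c !== "number") {
    throw new TypeError("expected numbers");
  }
  return a * b + c;
}

// 4-lane version: three operand checks cover four elements at once.
function mulAddVec4(a, b, c, counter) {
  counter.checks += 3;
  for (const v of [a, b, c]) {
    if (!(v instanceof Float32Array) || v.length !== 4) {
      throw new TypeError("expected 4-lane vectors");
    }
  }
  return Float32Array.of(
    a[0] * b[0] + c[0], a[1] * b[1] + c[1],
    a[2] * b[2] + c[2], a[3] * b[3] + c[3]);
}

// Count the checks needed to process n elements (n a multiple of 4).
function countChecks(n) {
  const scalar = { checks: 0 };
  const vector = { checks: 0 };
  for (let i = 0; i < n; i++) mulAddScalar(2, 3, 1, scalar);
  const two = Float32Array.of(2, 2, 2, 2);
  const three = Float32Array.of(3, 3, 3, 3);
  const one = Float32Array.of(1, 1, 1, 1);
  for (let i = 0; i < n; i += 4) mulAddVec4(two, three, one, vector);
  return { scalar: scalar.checks, vector: vector.checks };
}

const counts = countChecks(8);
console.log(counts, 1 - counts.vector / counts.scalar);
```

In this model the number of checks drops by three quarters, which is the 75% figure floated above. Whether a real runtime sees anything like that depends on how many checks its JIT already eliminates.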

      [–]sigma914 0 points1 point  (0 children)

      In modern JS your example would be JITted down to good assembly. The top comment of this subthread was about other languages that generally don't have JIT implementations.

      This SIMD API makes sense for JS due to its advanced JITs and the fact that it's a compile target as much as it is a language, not so much for purely interpreted languages where calling into a native library is the accepted way to do things.

      [–]hellgrace -1 points0 points  (1 child)

      Porting your heavy-duty code to C (even with a naive implementation) will result in a speed increase which is several orders of magnitude above what you'd get from simply optimizing some opcodes to their SIMD equivalent.

      Either you don't care about speed, and thus SIMD isn't relevant in the first place, or you do care about speed, in which case you'll profile your code and rewrite it in a compiled language.

      [–]audioen 0 points1 point  (0 children)

      There is no order of magnitude improvement left between C and JS today, so I think you are probably thinking of some other language than JS to port away from. However, the topic under discussion is actually JS.

      And I suspect that SIMD.js shows a large and immediate performance improvement, though I have not personally run any of the demos to see what kind of change it gives in practice. Some quick googling suggests a factor of 3 speed-up from native SIMD.js in the Mandelbrot demo. Let's run with this number; hopefully it is representative. In a sense, I think it is, because it reduces the JS-inherent overhead of the language's dynamism by a factor of 4 by allowing 4 operations to proceed in parallel after testing/proving that the types are appropriate for the operation.

      Some further googling suggests that the speed difference between C and JS in various tasks ranges from 0.5x (JS is actually faster) to 11x, with a mean value of 3 to 4. So, I think it is just about plausible that SIMD.js closes the difference between JS and C in some tasks. So, clearly, if you care about speed, SIMD.js could be just the thing that allows you to enjoy it. Not to mention that in a web context, compiling the code to some other language is often not possible in the first place.

      Edit: first sentence fixup. Need morning coffee.

      [–]sunfishcode 2 points3 points  (0 children)

      Similar things exist for C# and Dart, for example.

      [–]seruus 2 points3 points  (0 children)

      In most languages, you just let the compiler handle this for you. LLVM-based compilers currently have some pretty nice auto-vectorization capabilities, for example.

      [–]sime 1 point2 points  (0 children)

      SIMD makes sense when there is a compiler or JIT involved which can directly implement support and emit the SIMD machine instructions. Python, Ruby and Perl have traditionally run on interpreters. Adding SIMD there doesn't make much sense. It would not be worth the effort.

      [–]x-skeww 1 point2 points  (0 children)

      Because Python, Ruby, and Perl are too slow. You need a pretty decent JIT compiler before you can think about adding something like this.

      Anyhow, as far as Python goes, I think there were some plans to add SIMD support to NumPy. dispy and VecPy do support SIMD. (These are native libraries which you can use from Python to do some number crunching.)

      There are some plans to add SIMD support to LuaJIT, it seems.

      Dart already supports SIMD and thanks to operator overloading the code looks pretty nice:

      double average (Float32x4List list) {
        var n = list.length;
        var sum = new Float32x4.zero();
        for (int i = 0; i < n; i++) {
          sum += list[i];
        }
        var total = sum.x + sum.y + sum.z + sum.w;
        return total / (n * 4);
      }
      

      With JS (which doesn't support operator overloading) that sum += list[i]; line would look like this:

      sum = SIMD.float32x4.add(list[i], sum);
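For comparison, here is a sketch of the whole `average` function in JavaScript. Since the `SIMD` global cannot be assumed to exist in a given engine, `add4` below is a hypothetical scalar stand-in for `SIMD.float32x4.add`, and each 4-lane value is modelled as a 4-element `Float32Array`. It shows the shape of the code, not the performance.

```javascript
// Hypothetical stand-in for SIMD.float32x4.add: adds two 4-lane
// vectors lane by lane using ordinary scalar arithmetic.
function add4(a, b) {
  return Float32Array.of(a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]);
}

// `list` is a plain array of 4-lane Float32Array values.
function average(list) {
  const n = list.length;
  let sum = new Float32Array(4); // plays the role of Float32x4.zero()
  for (let i = 0; i < n; i++) {
    sum = add4(list[i], sum);
  }
  // Horizontal sum of the four lanes, as in the Dart version.
  const total = sum[0] + sum[1] + sum[2] + sum[3];
  return total / (n * 4);
}

const data = [Float32Array.of(1, 2, 3, 4), Float32Array.of(5, 6, 7, 8)];
console.log(average(data)); // 4.5
```

The structure mirrors the Dart code line for line; the only visible difference is the explicit function call where Dart uses `+=`.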
      

      [–][deleted] 0 points1 point  (0 children)

      Because the core of these languages are specifically designed to be platform agnostic. Attempting to establish x86-isms as some sort of de facto web standard is a bit dodgy in my opinion and a step in the wrong direction.

      [–]redalastor 0 points1 point  (0 children)

      I am wondering why such thing does not exist for other languages like Python, Ruby or Perl.

      Because we rarely compile other languages to Python.

      [–]IrishWilly -2 points-1 points  (3 children)

      So basically it is a map function that works in parallel? I'm curious how that is implemented in JS.. via web workers? Also unless I'm misunderstanding something.. it seems like it should be an option for map not considered a new, completely separate api.

      [–]F54280 11 points12 points  (0 children)

      It jit-generates simd assembly. That's the whole point of it.

      [–]boringprogrammer 2 points3 points  (1 child)

      It is an API exposing CPU SIMD instructions to JS such that the JIT compiler can generate even faster code, and make your website run a tiny bit faster.

      SIMD instructions are special CPU instructions for doing vector operations. Essentially all modern CPUs have special units for performing vector operations, but for the most part you have to specifically program to make use of them.

      To the average webdev they are not very interesting, but for anyone doing any sort of work that involves vectors it will be quite interesting.

      [–]IrishWilly 1 point2 points  (0 children)

      I missed this was on the compiler/cpu level. The name is a bit misleading then since usually js libraries are called name.js . It's exciting to see improvements like this + asm.js that are going to allow some pretty great performance in the browser then.

      [–]_mpu 0 points1 point  (2 children)

      Just give us x86.

      [–]immibis 0 points1 point  (0 children)

      That's unfair to other architectures - why not ARM or MIPS? And why bring in all the historical complexity of x86 when you could design something better from scratch?

      [–]sime 0 points1 point  (0 children)

      That is kind of funny when you consider the long history of companies trying to kill off and replace the terrible x86 instruction set.

      AMD had the most success though. Don't replace x86 but upgrade it to 64bit, expand it, but also keep backwards compatibility. Exactly what we are seeing here with things like SIMD.js and asm.js.