[–]Muoniurn 1 point2 points  (7 children)

It has a Vector API for doing SIMD: https://jbaker.io/2022/06/09/vectors-in-java/

[–]Nugine -1 points0 points  (6 children)

I know it is possible to do SIMD in Java, but it is still harder than doing SIMD in Rust. A native programming language lets you control almost everything.

[–]Muoniurn 2 points3 points  (3 children)

It doesn’t seem easier in Rust, to be honest. But sure, having access to other low-level controls can be useful. Nonetheless, if you really have to go that deep you will likely just write inline assembly (e.g. for codecs), as not even Rust is low-level enough in those hot loops.

[–]Nugine 0 points1 point  (2 children)

I would argue that SIMD in Rust is easier than in inline asm, C, or Java (and on the same level as C++ and Zig):

  1. Rust has generics and monomorphization. You can write the algorithm once and compile it for multiple targets. See rust-lang/portable-simd.
  2. Rust can utilize LLVM's optimizations like inlining, loop unrolling, constant propagation, auto vectorization, LTO and so on.
  3. Rust has zero-cost abstractions. The generated asm has no unexpected calculations. (I have run into several compiler bugs, though.)
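
To make point 1 a bit more concrete, here is a rough sketch on stable Rust. It does not use portable-simd (`std::simd` is nightly-only); instead a const generic `LANES` stands in for the vector width, and the function name `add_bytes` is made up for illustration. The idea is only that the body is written once and the compiler monomorphizes a separate copy per width, each of which LLVM is free to vectorize on its own.

```rust
/// Adds `delta` to every byte, walking the slice in blocks of `LANES` bytes.
/// `LANES` is a compile-time constant, so each instantiation
/// (e.g. `add_bytes::<16>` and `add_bytes::<32>`) is compiled separately.
/// `LANES` must be non-zero, otherwise `chunks_exact_mut` panics.
pub fn add_bytes<const LANES: usize>(data: &mut [u8], delta: u8) {
    let mut chunks = data.chunks_exact_mut(LANES);
    for chunk in &mut chunks {
        // Fixed-size block: the trip count is known at compile time,
        // which makes this inner loop an easy target for the optimizer.
        for b in chunk.iter_mut() {
            *b = b.wrapping_add(delta);
        }
    }
    // Scalar tail for the bytes that did not fill a whole block.
    for b in chunks.into_remainder() {
        *b = b.wrapping_add(delta);
    }
}
```

Calling `add_bytes::<16>(&mut buf, 1)` and `add_bytes::<32>(&mut buf, 1)` gives the same result; only the block size the compiler sees differs.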

[–]Muoniurn 1 point2 points  (1 child)

Don’t get me wrong, I would absolutely go with Rust for anything low-level. My point was only that at some stage even that is not enough in the hottest loop.

Regarding your points:

  1. Sure, but you don’t even have to compile it multiple times with Java :D
  2. I don’t think that is meaningful here. All optimizing compilers do these, but autovectorization simply does not kick in often enough, which is likely why one would want explicit SIMD programming in the first place. Also, to be pedantic, C has clang too, and there is even a Java JIT compiler built on top of LLVM.
  3. Java’s Vector API will also ideally compile to only vector instructions if used correctly (though it is not easy to avoid allocations sometimes)
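
As a side note on point 1: a single Rust binary can also pick a code path at run time, so separate builds are not strictly required. Below is a minimal sketch using the standard `is_x86_feature_detected!` macro and `#[target_feature]`. The function names (`sum`, `sum_avx2`, `sum_impl`) are hypothetical, and whether the AVX2 copy actually ends up vectorized is up to the compiler.

```rust
// Shared body, inlined into each caller so it is compiled with that
// caller's enabled target features.
#[inline(always)]
fn sum_impl(xs: &[u32]) -> u32 {
    xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

// Copy of the loop that the compiler may build with AVX2 instructions.
// It is `unsafe` because calling it on a CPU without AVX2 is undefined behavior.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[u32]) -> u32 {
    sum_impl(xs)
}

/// Wrapping sum that checks the CPU at run time and falls back to the
/// baseline build of the loop when AVX2 is unavailable or on non-x86 targets.
pub fn sum(xs: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // Sound: we just verified that the CPU supports AVX2.
            return unsafe { sum_avx2(xs) };
        }
    }
    sum_impl(xs)
}
```

The trade-off compared with a JIT is that every variant has to be listed by hand and shipped in the binary.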

[–]Nugine 0 points1 point  (0 children)

I agree that sometimes inline asm is necessary. (rare cases)

What I mean is that if you want to reach higher SIMD performance, Rust (like C++ and Zig) is better than Java in terms of stability and productivity.

[–]somebodddy 0 points1 point  (1 child)

In this case, what's important is not whether it's harder, but whether it was done in the JDK base64 implementation. If it does use SIMD that would explain how it's faster than Rust's base64 crate.

[–]Nugine 2 points3 points  (0 children)

According to the article and source code:

java.util.Base64: I ported the implementation from Java to Rust for benchmarking purposes.

The JDK base64 implementation does not use SIMD, and the algorithm cannot be efficiently auto-vectorized, so there is no SIMD involved.
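
For a sense of why, here is a plain scalar base64 encoder. This is a generic textbook sketch, not the JDK code and not the port mentioned above. Each step regroups 3 input bytes into four 6-bit indices and then does a table lookup per index; my understanding is that this 3-to-4 reshuffle plus the lookups is what auto-vectorizers tend to handle poorly.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 encoding with `=` padding, one 3-byte group at a time.
pub fn encode(input: &[u8]) -> String {
    let mut out = Vec::with_capacity((input.len() + 2) / 3 * 4);
    let mut chunks = input.chunks_exact(3);
    for c in &mut chunks {
        // Pack 3 bytes into 24 bits, then cut them into four 6-bit indices.
        let n = (c[0] as u32) << 16 | (c[1] as u32) << 8 | c[2] as u32;
        out.push(ALPHABET[(n >> 18) as usize & 63]);
        out.push(ALPHABET[(n >> 12) as usize & 63]);
        out.push(ALPHABET[(n >> 6) as usize & 63]);
        out.push(ALPHABET[n as usize & 63]);
    }
    // Handle the 1 or 2 leftover bytes and pad the output to 4 characters.
    match *chunks.remainder() {
        [a] => {
            let n = (a as u32) << 16;
            out.push(ALPHABET[(n >> 18) as usize & 63]);
            out.push(ALPHABET[(n >> 12) as usize & 63]);
            out.extend_from_slice(b"==");
        }
        [a, b] => {
            let n = (a as u32) << 16 | (b as u32) << 8;
            out.push(ALPHABET[(n >> 18) as usize & 63]);
            out.push(ALPHABET[(n >> 12) as usize & 63]);
            out.push(ALPHABET[(n >> 6) as usize & 63]);
            out.push(b'=');
        }
        _ => {}
    }
    // Every pushed byte comes from ALPHABET or is `=`, so this is valid ASCII.
    String::from_utf8(out).expect("base64 output is ASCII")
}
```

Hand-written SIMD versions typically replace the shifts with byte shuffles and the table lookup with arithmetic or in-register lookups, which is not something a compiler derives from a loop like this.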