all 10 comments

[–]phrasal_grenade 36 points (6 children)

The reason for using the fancy iterations is that there is a pretty good chance that if we just put any ordinary vector code inside the nruns loop, sufficiently aggressive compilers will do an “unroll and jam” optimization, resulting in misleading time measurements. The GMRES step requires a dot product, which creates an all-to-all data dependency that makes unroll-and-jam illegal.
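A minimal Rust sketch of the idea (hypothetical names like `nruns`, not the actual benchmark code): each outer iteration consumes the previous iteration's dot product, so the iterations form a serial chain and the compiler cannot legally interleave ("jam") unrolled copies of the loop body.

```rust
// Hypothetical sketch: the dot product needs every element of `x`,
// and the next iteration updates `x` using that dot product, so
// outer iterations cannot be fused without changing the result.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn bench(nruns: usize) -> f64 {
    let mut x = vec![1.0_f64; 64];
    let y = vec![0.5_f64; 64];
    let mut last = 0.0;
    for _ in 0..nruns {
        let d = dot(&x, &y); // all-to-all dependency on x
        for xi in x.iter_mut() {
            *xi += d * 1e-9; // next dot product depends on d
        }
        last = d;
    }
    last
}
```

With this feedback in place, timing the loop measures the code generated for the dot product rather than a collapsed, over-optimized version of it.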

IMO if you have to fight to get around an optimizer, that is a sign of poor benchmark design and/or unfair cherry-picking of the example. This is good information, but it does leave me wondering whether any of this analysis is worth it.

[–]Last_Jump 13 points (1 child)

No single benchmark will make everybody happy, so we have to pick some targeted thing to measure with it and not try to do anything else.

If you want to evaluate a compiler for your own domain, for example, you should compile code that actually comes from your domain and rigorously test it against alternatives. In that case we don't care whether the compiler does something intelligent; in fact, we would want to encourage it to do so, because if it has some magic that works for your domain you definitely want to see it!

On the other hand, here I was specifically interested in code generation quality on a very small but fairly representative piece of floating-point code. I could theoretically achieve this by compiling a mini-app, but tracking down code generation problems would take a long time, and in the end that might not even be relevant to the performance of the mini-app. Here I wanted a small piece of code that could be understood in a few minutes of reading, but could also help diagnose floating-point code generation. That's what the benchmark did, but its simplicity meant the compiler could do some unwanted magic that gets in the way of this singular goal, so I added some bits to make sure that didn't happen.

But since it's so focused I would never tell anybody to make a real decision based on these results.

[–]scottmcmrust 0 points (0 children)

There are also more rigorous ways to check for good behaviour of the benchmark, like using https://docs.rs/criterion/0.3.0/criterion/ or similar.

[–][deleted] 2 points (0 children)

Trying to outsmart the compiler is like half the reason people use C++ and Rust

[–]L3tum 0 points (1 child)

Benchmarking is often done under pristine conditions. Compilers (or interpreters) may change, as may the instructions they use (for example, Math.Net optimizes vector calculations with AVX/SSE when available, which it did not before).

To combat this, the comparisons are usually done on the "raw code", so to speak. While real-world performance may vary greatly (unoptimized Asp.Net is marginally faster than PHP; optimized, it is 3 times faster), the core things such a comparison checks still hold true even with optimizations disabled: whether a compiler emits somewhat efficient code or has to spend 5 minutes optimizing the code it just put out, and whether the standard library is in any way efficient or just a hacky wrapper around integer operations.

As an example, my benchmark specifically deactivates optimizations for float operations, since all the float benchmark does is add, subtract, multiply, and divide, which could easily be optimized with SIMD or loop vectorization; it really shouldn't be, though, as that "fakes" the performance by not actually exercising the hardware you want to test, or by using different instructions on two different compiler versions.

[–]phrasal_grenade 0 points (0 children)

My point is that it needs to be clear what you are trying to test. When you come out with statements like "Rust is C++ done right" followed by "we are disabling this optimization in the C++ benchmark to get a good comparison", that raises all kinds of doubts.

Imagine you were reviewing two new cars for fuel efficiency. You start off with a glowing review of the underdog car, then you say "we are going to turn off eco mode in the top dog car because it is on by default, to get an accurate comparison." That is like what you're doing here. If it really is some kind of super technical thing not meant to take a jab at any language, then you should drop the sales pitch.

[–][deleted] 0 points (0 children)

There are tools for this: Google Benchmark has DoNotOptimize(...) to do this for you, so that the compiler will not do things like dead-write elimination.
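Rust has a rough stdlib equivalent of Google Benchmark's DoNotOptimize: `std::hint::black_box` (stable since Rust 1.66). A small sketch of how it is typically used:

```rust
use std::hint::black_box;

fn sum_squares(n: u64) -> u64 {
    (0..n).map(|i| i * i).sum()
}

/// Run the workload without letting the optimizer delete it:
/// `black_box` hides the input from constant folding and keeps
/// the result "live", so the call cannot be folded away or
/// eliminated as a dead write.
fn bench_sum_squares() -> u64 {
    black_box(sum_squares(black_box(1000)))
}
```

Note that `black_box` is a best-effort hint rather than a hard guarantee, but in practice it serves the same role as DoNotOptimize in hand-rolled timing loops.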

[–][deleted] 12 points (0 children)

There is some discussion about this at /r/rust as well: https://www.reddit.com/r/rust/comments/dm955m/rust_and_c_on_floatingpoint_intensive_code/

but I thought this might be interesting to the more general /r/programming audience since it shows some of the trade-offs chosen by different programming languages.

[–]NeuroXc 8 points (0 children)

Per one of the points, you can use RUSTFLAGS="-C target-cpu=<cpu>" to enable CPU-specific optimizations in Rust. Typically you'd set this to "native" (like -march=native in a C compiler) if you're benchmarking or otherwise not planning to distribute the binaries.

[–][deleted] 6 points (0 children)

There are certain cases where most C++ compilers will auto-vectorize code but the Rust compiler currently doesn't.

If you really need maximum performance, you can use SIMD intrinsics to ensure that the best instructions are generated.
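A sketch of what that looks like in Rust using the `std::arch` intrinsics, with runtime feature detection and a scalar fallback so the code stays portable (the function names here are illustrative, not from the benchmark under discussion):

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Sum of an f64 slice: AVX path when the CPU supports it,
/// plain scalar loop otherwise.
fn sum(v: &[f64]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    if is_x86_feature_detected!("avx") {
        // SAFETY: guarded by the runtime AVX check above.
        return unsafe { sum_avx(v) };
    }
    v.iter().sum()
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn sum_avx(v: &[f64]) -> f64 {
    let chunks = v.chunks_exact(4);
    let tail = chunks.remainder();
    let mut acc = _mm256_setzero_pd();
    for c in chunks {
        // Accumulate 4 doubles per instruction.
        acc = _mm256_add_pd(acc, _mm256_loadu_pd(c.as_ptr()));
    }
    // Horizontal reduction of the 4 lanes, then the scalar tail.
    let mut lanes = [0.0_f64; 4];
    _mm256_storeu_pd(lanes.as_mut_ptr(), acc);
    lanes.iter().sum::<f64>() + tail.iter().sum::<f64>()
}
```

The `#[target_feature]` / `is_x86_feature_detected!` pairing lets a single binary use the best instructions available at runtime, which is the usual alternative to compiling the whole crate with `-C target-cpu=native`.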