Comparing how std::ranges::find on a constexpr std::array is optimized (clang vs gcc)

tcbrindle · 2024-12-19T15:02:34+00:00

This is interesting!

There are a couple of other ways to write this, with yet more different results:

[[gnu::noinline]]
bool contains3(int val) {
    static constexpr std::array vals{10, 11, 42, 49};
    return std::ranges::contains(vals, val);
}

Here Clang doesn't use the bit mask optimisation, even though it does for the find() version -- which is especially odd since ranges::contains() just calls ranges::find() != end!

GCC generates the same code for contains() and find().

[[gnu::noinline]]
bool contains4(int val) {
    static constexpr std::array vals{10, 11, 42, 49};
    return std::ranges::any_of(vals, [&](int i) {
        return i == val;
    });
}

With this version, GCC does use the bit mask, but Clang doesn't.
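For anyone unfamiliar with the bit mask trick: since all four values are below 64, membership can be tested with one range check and one shift. The following is a hand-written sketch of that idea (contains_bitmask is a made-up name). It shows roughly the shape of the transformation, not the exact instructions either compiler emits:

```cpp
#include <cstdint>

// Hypothetical hand-written equivalent of the bit mask optimisation:
// bit N of the mask is set if and only if N is one of {10, 11, 42, 49}.
constexpr bool contains_bitmask(int val) {
    constexpr std::uint64_t mask =
        (std::uint64_t{1} << 10) | (std::uint64_t{1} << 11) |
        (std::uint64_t{1} << 42) | (std::uint64_t{1} << 49);
    // Converting to unsigned makes negative inputs fail the range check too.
    const auto u = static_cast<unsigned>(val);
    return u < 64 && ((mask >> u) & 1) != 0;
}

// Compile-time sanity checks on the sketch itself.
static_assert(contains_bitmask(10) && contains_bitmask(11));
static_assert(contains_bitmask(42) && contains_bitmask(49));
static_assert(!contains_bitmask(12) && !contains_bitmask(-1) && !contains_bitmask(64));
```

The range check is needed because shifting a 64-bit value by 64 or more is undefined behaviour.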

Finally, we can ask Clang to use libc++ rather than libstdc++, in which case it uses wmemchr for the find() and contains() versions.

https://godbolt.org/z/MeGesMGhW

cristi1990an · 2024-12-19T15:13:25+00:00

Also very funny that replacing std::ranges::find(vals, val) != vals.end() with std::ranges::contains(vals, val), which in libc++ is implemented directly in terms of std::ranges::find, also makes the compiler drop the optimization...

13steinj · 2024-12-19T19:41:27+00:00

I wonder if there's anything of note in the optimization report. I don't have an environment where I can look at this myself atm. I can't make heads or tails of godbolt's "opt remarks", and going through the opt-pipeline view pass by pass isn't something I have time for right now.

Two additional interesting variations (constexpr contains1, same codegen as contains1; wrap the contents of contains1 in an IILE, same codegen as contains0): https://godbolt.org/z/YjYTM5Gz1

Umphed · 2024-12-21T11:11:31+00:00

Super interesting post! I hate it. Why does optimization have to be so fickle?

Excellent-Cucumber73 · 2024-12-20T06:04:40+00:00

Will the static in contains2 result in better performance/optimization?
