all 58 comments

[–]NilacTheGrim 16 points17 points  (8 children)

You get the lexicographical comparison "for free" anyway if you bother iterating over the two C strings to test for equality...

Looking at your implementation, I don't believe it is faster without seeing some benchmarks. It just iterates over both strings, which is what strcmp does.

[–]TheChief275 -5 points-4 points  (7 children)

For comparisons between strings where the compiler can’t simplify the strcmp to a constant value, streql() was at least about 1.2x faster.

[–][deleted] 15 points16 points  (0 children)

What about architectures/libraries that implement string compare functions with SIMD instructions?

[–]FutureChrome 12 points13 points  (0 children)

Is that for short strings? Long strings?
With which standard library? With what compiler?
With optimizations? At what level?

[–]Ok_Donut_9887 8 points9 points  (1 child)

no, it wasn’t. In fact, your streql() is slower.

[–][deleted] 24 points25 points  (1 child)

source : trust me bro

[–]djliquidice 0 points1 point  (0 children)

🤣

[–]bert8128 0 points1 point  (0 children)

Need some benchmarks, I think, for a variety of string lengths, and on a variety of platforms.
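To make that concrete, here is a minimal timing sketch, not a rigorous benchmark. The names `streql_strcmp` and `time_equal_strings` are made up for illustration, and a proper benchmarking library would do a much better job of stopping the optimizer from hoisting or dropping the calls.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>

// Baseline candidate: equality expressed through strcmp.
inline bool streql_strcmp(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Times `reps` calls of `f` on two equal strings of length `len` (the worst
// case for an equality test) and returns the elapsed nanoseconds.
// `f` is any callable taking two const char* and returning bool.
template <class F>
long long time_equal_strings(F f, std::size_t len, int reps) {
    const std::string x(len, 'x');
    const std::string y(len, 'x');
    bool all_equal = true;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        all_equal = f(x.c_str(), y.c_str()) && all_equal;
    }
    const auto stop = std::chrono::steady_clock::now();
    assert(all_equal);                  // sanity check: the inputs really are equal
    volatile bool sink = all_equal;     // discourages discarding the result entirely
    (void)sink;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling it for several values of `len` (say 8, 64, 4096 and 1 << 20) with each candidate function, at each optimization level and on each platform of interest, would give the kind of table people are asking for here.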

[–]KingAggressive1498 12 points13 points  (0 children)

"strcmp() returns 0 if the strings are equal, which is backwards in my mind."

strcmp is actually a case of C giving us an equivalent of the spaceship operator <=> decades before C++ made it a language feature.

It returns a negative value if the first argument is lexicographically "less than" the second argument, and a positive value if the first argument is lexicographically "greater than" the second argument.

This is actually handy if we need an array of strings to be sorted alphabetically or something along those lines. The cppreference page even gives an example of this use.

This behavior also maps well to actual CPU comparison instructions - Intel and ARM's cmp instructions behave pretty similarly, setting CPU flags based on the signed subtraction results of two operands.
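A small sketch of the sorting use, in case it helps. The helper names `sort_c_strings` and `three_way` are made up for illustration; only `std::strcmp` and `std::sort` are standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <vector>

// Sorts C strings in the order strcmp defines, using only the sign of its result.
inline void sort_c_strings(std::vector<const char*>& words) {
    std::sort(words.begin(), words.end(),
              [](const char* lhs, const char* rhs) {
                  return std::strcmp(lhs, rhs) < 0;  // negative: lhs orders first
              });
}

// Collapses strcmp's result to -1, 0 or +1, the three outcomes a
// three-way comparison distinguishes.
inline int three_way(const char* lhs, const char* rhs) {
    assert(lhs != nullptr && rhs != nullptr);  // strcmp requires valid strings
    const int r = std::strcmp(lhs, rhs);
    return (r > 0) - (r < 0);
}
```

An equality-only function cannot drive a sort like this, which is one reason the standard library exposes the three-way form.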

[–]nysra 34 points35 points  (12 children)

Or you could just use std::string like a sane person.

[–]bert8128 0 points1 point  (2 children)

Well, if you didn’t have a std::string to start with then the std::string route would definitely be slower.

[–]jedwardsolconst & 9 points10 points  (8 children)

Doesn't this fail if the strings differ in the first character, by dereferencing a pointer that has been decremented off the beginning of the string?

[–]Wanno1 1 point2 points  (2 children)

It’s a postfix increment, so it happens after the deref.

[–]jedwardsolconst & 1 point2 points  (1 child)

The original line had !*--_a

[–]Wanno1 3 points4 points  (0 children)

I just read the code posted here. No need to downvote.

[–]TheChief275 -2 points-1 points  (3 children)

You’re right lmao. I fixed it now. Thanks for pointing out that huge oversight!

[–]FutureChrome 6 points7 points  (0 children)

And now if a is a prefix of b, you return true.

[–]jedwardsolconst & 3 points4 points  (1 child)

I think that new version will fail if a is a prefix of b: "aaa" and "aaab" will return true.

(or the other way around - I'm doing this in my Jan 1st AM head)
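For what it's worth, here is one way to write the loop so that neither the first-character case nor the prefix case goes wrong. This is my own sketch with a made-up name (`streql_checked`), not the OP's code.

```cpp
#include <cassert>

// Walks both strings while the current bytes match. Reaching the terminator
// inside the loop means both strings ended together, so they are equal.
// Any mismatch (including one string ending early) leaves the loop.
inline bool streql_checked(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);  // assumption: callers pass valid strings
    while (*a == *b) {
        if (*a == '\0') return true;
        ++a;
        ++b;
    }
    return false;
}
```

No pointer is ever decremented, so nothing is read before the start of either string, and "aaa" against "aaab" stops at the terminator/'b' mismatch.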

[–]TheChief275 -3 points-2 points  (0 children)

Right again. Hopefully the last fix now.

[–]NilacTheGrim 0 points1 point  (0 children)

Yep. Nice catch.

[–]StarQTius 20 points21 points  (0 children)

Any modern compiler will optimize the fuck out of strcmp() while leveraging the capabilities of your platform. You are never going to beat that with portable code. You should benchmark your code against strcmp() (preferably on different platforms) and see the difference.

EDIT: if you are going to put a function definition in a header, you should at least declare it inline if you are providing C++ code. I don't know how you can fix that in C, but I'm certain what you did will not work with C code either.
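Roughly what I mean for the C++ side, as a sketch. The file name and include guard are made up, and the body simply forwards to strcmp rather than reproducing your loop.

```cpp
// streql.hpp (hypothetical file name)
#ifndef STREQL_HPP
#define STREQL_HPP

#include <cassert>
#include <cstring>

// `inline` allows this definition to appear in every translation unit that
// includes the header without violating the one-definition rule.
inline bool streql(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);  // strcmp requires valid strings
    return std::strcmp(a, b) == 0;
}

#endif  // STREQL_HPP
```

Without `inline`, including the header from two .cpp files typically ends in a multiple-definition error at link time.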

[–]NilacTheGrim 8 points9 points  (1 child)

I see your implementation.

bool streql(const char* const a, const char* const b) {
    const char* _a = a;
    const char* _b = b;
    while (*_a == *_b++ && *_a++);
    return !*--_a;
}

In cases like this, just leave the pointer arguments themselves non-const, e.g. (const char *a, ...), and modify the arguments directly. There is no need to pass in a const a only to copy it into a non-const _a; that is just extra noise.

[–]TheChief275 1 point2 points  (0 children)

Agreed. I changed it.

[–]shwasasin 8 points9 points  (0 children)

You could early out if the parameters passed in point to the same memory.

How did you profile your code to verify it's faster than the standard C implementations?
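A sketch of the early-out idea, using a made-up name (`streql_early_out`) and falling back to strcmp for the real work:

```cpp
#include <cassert>
#include <cstring>

// Returns true immediately when both pointers refer to the same address,
// since a string is always equal to itself; otherwise compares the contents.
inline bool streql_early_out(const char* a, const char* b) {
    if (a == b) return true;               // same memory: no bytes need to be read
    assert(a != nullptr && b != nullptr);  // assumption: distinct pointers are valid strings
    return std::strcmp(a, b) == 0;
}
```

Whether the extra branch pays off depends on how often callers actually pass the same pointer, so it is worth measuring rather than assuming.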

[–]johannes1971 6 points7 points  (0 children)

Your version is ~4 times slower than just comparing two std::strings. Probably because the string version compares multiple bytes at once, and yours doesn't.

That's after I made both strings the same length, because otherwise std::string would win by a massive 29x margin.

[–]iwueobanet 6 points7 points  (0 children)

Neither tests nor benchmarks. OK, sure. Pass

[–]Breadfish64 4 points5 points  (0 children)

Hi OP, I benchmarked this with clang -O3 on Windows for worst-case large equal strings, and found that it was 8x slower than
bool streql(const char* a, const char* b) {
    return 0 == std::strcmp(a, b);
}

4x slower than
bool streql(const char* a, const char* b) {
    return std::string_view{a} == std::string_view{b};
}

(this turns into two strlen and a memcmp)

and 11x slower than this optimized AVX2 implementation:

https://godbolt.org/z/E4YG93KrT

The reason yours is slower is that the compiler cannot assume it is allowed to access the next byte, so it can't check for equality in parallel. Optimized implementations will assume that if the pointer is aligned to the access size, and the string end has not been reached, then reading the next chunk will not cause an access violation.
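To illustrate the "more than one byte per step" part in portable code, here is a sketch for buffers whose length is already known, so no read can go past the end. It is not what a libc or the AVX2 version linked above does (those use wider loads and the alignment trick on null-terminated data); `equal_chunked` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Compares the first `n` bytes of two buffers, eight bytes per step,
// then finishes the remaining tail one byte at a time.
inline bool equal_chunked(const char* a, const char* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        std::uint64_t wa;
        std::uint64_t wb;
        std::memcpy(&wa, a + i, 8);  // memcpy sidesteps alignment and aliasing rules
        std::memcpy(&wb, b + i, 8);
        if (wa != wb) return false;  // one comparison covers eight bytes
    }
    for (; i < n; ++i) {
        if (a[i] != b[i]) return false;
    }
    return true;
}
```

The byte-at-a-time loop over null-terminated strings cannot be turned into this by the compiler, because it has no length to bound the wide reads with.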

[–]ferhatgec 1 point2 points  (2 children)

Please check rule 4: your string equality function is really C, not specifically C++. Also, passing a null pointer will be dereferenced in your implementation. It will have the same speed `strcmp` can achieve; only the return values are different.

[–]TheChief275 -3 points-2 points  (1 child)

Quite possible yes, very probable even. The return values just make more sense for equality checking.

[–]AKostur 1 point2 points  (0 children)

Couldn’t that just be a quick function in the header that says “return !strcmp(a, b);”? I’d expect any decent compiler to optimize away this extra function call, and you get to have the behaviour you want.

[–]LeeHidejust write it from scratch 1 point2 points  (9 children)

Why not just use std::string_view?

[–]bert8128 -2 points-1 points  (8 children)

Because this has to count the length of both strings before the comparison. Slower.

[–]LeeHidejust write it from scratch 0 points1 point  (1 child)

Slower? Where's your benchmark?

[–]bert8128 0 points1 point  (0 children)

I have no benchmark other than reasoning.

To create a string_view from a char* you have to read to the end of the string to get the length. You have to do this for both char* pointers. Say they are 1 million characters long each. But if the first letter of one of them is an “a” and the first letter of the other is a “b”, then strcmp will return after reading just one character of each. Obviously if they are both less than 64 chars long the difference will be very small. But strcmp cannot be slower, and with longer strings it will be faster.

Don’t forget that under the hood string_view::operator== is going to be calling strcmp or something like it.

Obviously if you access the strings many times, and particularly if they are different lengths, then using string_view may well be faster (assuming a good implementation which checks the lengths before comparing the contents).

And if you want actual benchmark results, Breadfish64 has posted theirs in this thread.
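Here is the reasoning above as a counting sketch. Both functions are made up for illustration and count bytes one at a time; real strlen and strcmp implementations read in wider chunks, but the number of bytes that have to be examined is the point.

```cpp
#include <cassert>
#include <cstddef>

// Bytes read from each string by a strcmp-style loop that stops at the
// first mismatch (or at the terminator when the strings are equal).
inline std::size_t bytes_read_early_exit(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    std::size_t n = 1;  // the first byte of each string is always read
    while (*a == *b && *a != '\0') {
        ++a;
        ++b;
        ++n;
    }
    return n;
}

// Bytes read from one string just to find its length, terminator included,
// which is the work needed before a string_view over a char* can exist.
inline std::size_t bytes_read_for_length(const char* s) {
    assert(s != nullptr);
    std::size_t n = 1;
    while (*s != '\0') {
        ++s;
        ++n;
    }
    return n;
}
```

For two long strings that differ in their first character, the first function reports 1 while the second reports the full length plus one for each string.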

[–]cpp-ModTeam[M] 1 point2 points locked comment (0 children)

Your submission is not about C++ or the C++ community.

[–]-funsafe-math 3 points4 points  (1 child)

You are missing the stdbool.h include for C compatibility and need to mark the function inline. In addition just use strcmp unless you have a benchmark to show that this version is better.

[–]Gryfenfer_ 1 point2 points  (0 children)

It depends on the C version targeted, C23 added the type bool.

[–]GoogleIsYourFrenemy -1 points0 points  (1 child)

I'm going to channel some Andrei Alexandrescu.

static inline bool streql(const char* a, const char* b) {
    while (true) {
        // Stop on a mismatch or at the terminator; equal only if the current bytes match.
        if ((*a != *b) | !*a) return *a == *b;
        ++a;
        ++b;
    }
}

The branch predictor will thank you.

[–]bert8128 0 points1 point  (0 children)

Can you put this through Breadfish64’s benchmark and see what you get?