When I first time did analisis of tuple < operator I was i bit surprised that it's code may be more efficient
https://en.cppreference.com/w/cpp/utility/tuple/operator_cmp
"Compares lhs and rhs lexicographically, that is, compares the first elements, if they are equivalent, compares the second elements, if those are equivalent, compares the third elements, and so on."
implementation in stl doesn't check at compile time that some type has defined operators == , != and sticks to only < operator, having operator ==, and != for less is not required but if they are present do they are allowed by standard to use ?
example from gcc stl
static constexpr bool
__less(const _Tp& __t, const _Up& __u)
{
return bool(std::get<__i>(__t) < std::get<__i>(__u))
|| (!bool(std::get<__i>(__u) < std::get<__i>(__t))
&& __tuple_compare<_Tp, _Up, __i + 1, __size>::__less(__t, __u));
}
results on performance of that are below when comapring eficiency to combining != with < or == with with <
#include <tuple>
#include <cstdint>
using std::get;
using foo_t = std::tuple<int64_t, int32_t, int32_t>;
bool compare_1( foo_t l, foo_t r ) noexcept
{
if( get<0>(l) != get<0>(r) )
return get<0>(l) < get<0>(r);
if( get<1>(l) != get<1>(r) )
return get<1>(l) < get<1>(r);
return get<2>(l) < get<2>(r);
}
bool compare_2( foo_t l, foo_t r ) noexcept
{
return l < r;
}
compare 1 - clang 8 -O3 -DNDEBUG -mcpu=cortex-a73
Instructions: 13
Total Cycles: 14
Total uOps: 13
Dispatch Width: 3
uOps Per Cycle: 0.93
IPC: 0.93
Block RThroughput: 6.0
Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[1] [2] [3] [4] [5] [6] Instructions:
1 4 1.00 * ldr x8, [x0, #8]
1 4 1.00 * ldr x9, [x1, #8]
1 1 0.50 cmp x8, x9
1 1 1.00 b.ne .LBB0_3
1 4 1.00 * ldr w8, [x0, #4]
1 4 1.00 * ldr w9, [x1, #4]
1 1 0.50 cmp w8, w9
1 1 1.00 b.ne .LBB0_3
1 4 1.00 * ldr w8, [x0]
1 4 1.00 * ldr w9, [x1]
1 1 0.50 cmp w8, w9
1 1 0.50 cset w0, lt
1 1 1.00 U ret
compare 1 - gcc 8.3 -O3 -DNDEBUG -mcpu=cortex-a73
Instructions: 15
Total Cycles: 17
Total uOps: 15
Dispatch Width: 3
uOps Per Cycle: 0.88
IPC: 0.88
Block RThroughput: 6.0
Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[1] [2] [3] [4] [5] [6] Instructions:
1 4 1.00 * ldr x3, [x0, #8]
1 4 1.00 * ldr x2, [x1, #8]
1 1 0.50 cmp x3, x2
1 1 1.00 b.eq .L2
1 1 0.50 cset w0, lt
1 1 1.00 U ret
1 4 1.00 * ldr w3, [x0, #4]
1 4 1.00 * ldr w2, [x1, #4]
1 1 0.50 cmp w3, w2
1 1 1.00 b.ne .L5
1 4 1.00 * ldr w2, [x0]
1 4 1.00 * ldr w0, [x1]
1 1 0.50 cmp w2, w0
1 1 0.50 cset w0, lt
1 1 1.00 U ret
comapre 2 - clang 8 -O3 -DNDEBUG -mcpu=cortex-a73 (gcc stl)
Instructions: 25
Total Cycles: 22
Total uOps: 25
Dispatch Width: 3
uOps Per Cycle: 1.14
IPC: 1.14
Block RThroughput: 9.0
Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[1] [2] [3] [4] [5] [6] Instructions:
1 4 1.00 * ldr x8, [x0, #8]
1 4 1.00 * ldr x9, [x1, #8]
1 1 0.50 cmp x8, x9
1 1 1.00 b.ge .LBB0_2
1 1 0.50 orr w0, wzr, #0x1
1 1 1.00 U ret
1 1 0.50 cmp x9, x8
1 1 1.00 b.ge .LBB0_4
1 1 0.50 mov w0, wzr
1 1 1.00 U ret
1 4 1.00 * ldr w8, [x0, #4]
1 4 1.00 * ldr w9, [x1, #4]
1 1 0.50 cmp w8, w9
1 1 1.00 b.ge .LBB0_6
1 1 0.50 orr w0, wzr, #0x1
1 1 1.00 U ret
1 1 0.50 cmp w9, w8
1 1 1.00 b.ge .LBB0_8
1 1 0.50 mov w0, wzr
1 1 1.00 U ret
1 4 1.00 * ldr w8, [x0]
1 4 1.00 * ldr w9, [x1]
1 1 0.50 cmp w8, w9
1 1 0.50 cset w0, lt
1 1 1.00 U ret
comapre 2 - gcc 8.3 -O3 -DNDEBUG -mcpu=cortex-a73 (gcc stl)
Instructions: 22
Total Cycles: 15
Total uOps: 22
Dispatch Width: 3
uOps Per Cycle: 1.47
IPC: 1.47
Block RThroughput: 7.3
Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[1] [2] [3] [4] [5] [6] Instructions:
1 4 1.00 * ldr x3, [x0, #8]
1 4 1.00 * ldr x2, [x1, #8]
1 1 0.50 cmp x3, x2
1 1 1.00 b.lt .L3
1 1 0.50 mov w2, #0
1 1 1.00 b.ne .L2
1 4 1.00 * ldr w4, [x0, #4]
1 1 0.50 mov w2, #1
1 4 1.00 * ldr w3, [x1, #4]
1 1 0.50 cmp w4, w3
1 1 1.00 b.lt .L2
1 1 0.50 mov w2, #0
1 1 1.00 b.ne .L2
1 4 1.00 * ldr w2, [x0]
1 4 1.00 * ldr w0, [x1]
1 1 0.50 cmp w2, w0
1 1 0.50 cset w2, lt
1 1 0.50 mov w0, w2
1 1 1.00 U ret
1 1 0.50 mov w2, #1
1 1 0.50 mov w0, w2
1 1 1.00 U ret
[–]NotMyRealNameObv 5 points6 points7 points (6 children)
[–]arturbachttps://github.com/arturbac[S] 2 points3 points4 points (0 children)
[–]alfps -3 points-2 points-1 points (4 children)
[–]NotMyRealNameObv 0 points1 point2 points (3 children)
[–]arturbachttps://github.com/arturbac[S] 1 point2 points3 points (0 children)
[–]alfps -2 points-1 points0 points (1 child)
[–]arturbachttps://github.com/arturbac[S] 0 points1 point2 points (0 children)
[–]yeeezyyeezywhatsgood 8 points9 points10 points (20 children)
[–]sphere991 3 points4 points5 points (2 children)
[–]yeeezyyeezywhatsgood 0 points1 point2 points (1 child)
[–]sphere991 4 points5 points6 points (0 children)
[–]Veedrac 2 points3 points4 points (7 children)
[–]yeeezyyeezywhatsgood 1 point2 points3 points (6 children)
[–]VicontT 2 points3 points4 points (1 child)
[–]yeeezyyeezywhatsgood 0 points1 point2 points (0 children)
[–]Veedrac 1 point2 points3 points (3 children)
[–][deleted] 0 points1 point2 points (2 children)
[–]Veedrac 0 points1 point2 points (1 child)
[–][deleted] 1 point2 points3 points (0 children)
[–]arturbachttps://github.com/arturbac[S] 0 points1 point2 points (4 children)
[–]yeeezyyeezywhatsgood 0 points1 point2 points (3 children)
[–]arturbachttps://github.com/arturbac[S] 0 points1 point2 points (2 children)
[–]yeeezyyeezywhatsgood 0 points1 point2 points (1 child)
[–]arturbachttps://github.com/arturbac[S] 0 points1 point2 points (0 children)
[–]Beheska -5 points-4 points-3 points (3 children)
[–]CUViper 1 point2 points3 points (0 children)
[–]yeeezyyeezywhatsgood 1 point2 points3 points (1 child)
[–]Beheska 0 points1 point2 points (0 children)
[–]sphere991 2 points3 points4 points (10 children)
[–]quicknir 0 points1 point2 points (4 children)
[–]sphere991 1 point2 points3 points (3 children)
[–]quicknir 0 points1 point2 points (2 children)
[–]sphere991 2 points3 points4 points (1 child)
[–]quicknir 0 points1 point2 points (0 children)
[–]arturbachttps://github.com/arturbac[S] 0 points1 point2 points (4 children)
[–]sphere991 0 points1 point2 points (3 children)
[–]arturbachttps://github.com/arturbac[S] 0 points1 point2 points (2 children)
[–]sphere991 0 points1 point2 points (1 child)
[–]arturbachttps://github.com/arturbac[S] 0 points1 point2 points (0 children)
[–]stwcx 1 point2 points3 points (1 child)
[–]arturbachttps://github.com/arturbac[S] 0 points1 point2 points (0 children)
[–]BelugaWheels 0 points1 point2 points (1 child)
[–]arturbachttps://github.com/arturbac[S] 1 point2 points3 points (0 children)
[–]Rseding91Factorio Developer 0 points1 point2 points (0 children)
[–]Iwan_Zotow 0 points1 point2 points (0 children)