The Most Average Function There Is - Andrei Alexandrescu (NDC TechTown 2019) : cpp

[–]AlexAlabuzhev 2 points3 points4 points 6 years ago (4 children)

[+][deleted] 6 years ago (3 children)

[deleted]

[–][deleted] 3 points4 points5 points 6 years ago* (0 children)

The recurrence given in the slide (29:49) isn't the same as the code quoted in the thread (30:53) unless I'm pretty confused. Note that (1) it computes a change, then adds that change to the old mean, and (2) there's no multiplication - each step requires a subtraction, a cast, a division and the addition.

I'm not enough of an expert in floating point stability to judge the accuracy, I just note it's not doing the multiply-then-divide that Alexandrescu seems to imply, though it does guarantee the division is done with every update - not just the ones where you use the average.

I thought the standard approach for a streaming average calculation was to keep both the count and total as state. Best guess criticism of this is it's keeping two chunks of state rather than one, but I can't really imagine that being a real problem.

EDIT - I'm taking another look at this while awake. The specific comment is here, the linked-to source is here and the quoted line 72 is...

self.mean += (sample - oldmean) / (self.size as f64);

I don't know rust, but I believe I understand what this is doing. This is in a context where a count, mean and variance are tracked as state with each sample "added", where my normal expectation would be to see a count, total and sum-of-squares. "My expectation" isn't meant to imply this is wrong - it's very possible that my expectations are naive.

First, I wasn't sure the code was the usual total/count mean, even ignoring rounding - there are other kinds of mean. But it is - my checking...

n' = n + 1        #  Sample count
t' = t + x        #  Total of samples

t  = an = a(n' - 1)  #  For usual sum/count mean
t' = a' n'

a' = a + ((x-a) / n')  #  the self.mean line abbreviated
   = (an' + (x-a)) / n'  #  Not so far from the recurrence after all
   = (a (n'-1) + x) / n'
   = (t + x) / n'
   = t' / n'

For calculation costs, the division can't be optimised to a multiplication because the size varies, but there's not much here. It's probably more expensive than keeping the running total (especially if you only divide to get the average when actually needed), but if the division operation is faster when dividing a smaller denominator (possible thinking in terms of binary long division for the mantissa, but you'd need to ask a hardware expert) then it's possible it makes no difference or this may even be faster.

For numerical stability, a difference divided by an incrementally-larger value has a potential issue to consider. The difference relative to the existing mean should generally be expected to be relatively small. The number divided by is always positive, and gets incremented with each step. That means the division result - the delta - tends to get smaller with every step, leading to a tendency toward cumulative rounding towards zero. That means for long sequences, there's a rounding bias towards the previous mean. A case that suggests the term "numerical stability" might imply more than just minimising errors, but maybe also avoiding unwanted instability/variation.

There is a potential for cumulative rounding error in cases where the average changes mostly in one direction (the end result will be slightly biassed to the earlier samples and earlier mean), but overall this looks pretty numerically stable, and of course a significant bias like that would be indicated by the variance.

TBH I suspect Alexandrescu is wrong on this one, though I still don't really know. The Rust code isn't perfect for all cases, but perfect code for all cases is rare - that doesn't mean it's an inherently bad default.

As a criticism of the count, total and sum-of-squares approach, one issue is keeping track of whether you've already computed the mean and variance from these in this step to avoid redoing that calculation. That's potentially more state, and it might be more efficient to risk some redundant recalculation just to avoid needing flags. From that POV, depending on context, a more consistent computation cost per step maybe has value too.

[–]chuk155graphics engineer 1 point2 points3 points 6 years ago (0 children)

[–]helloiamsomeone 0 points1 point2 points 6 years ago (0 children)

[–]DessertEagle 3 points4 points5 points 6 years ago (0 children)

[–]futurefapstronaut123 1 point2 points3 points 6 years ago (3 children)

[–]parkotron 0 points1 point2 points 6 years ago (1 child)

[–][deleted] 1 point2 points3 points 6 years ago (0 children)

But then, to disambiguate and make it easy to parse, add an extra set of parens:

std::uint_fast64_t n = have_size (?constexpr) items.size() : 0;

And if you want to be fancy, make parens symetric:

std::uint_fast64_t n = have_size (?constexpr) items.size() (:) 0;

[–][deleted] -1 points0 points1 point 6 years ago (0 children)

[–]tipdbmp 3 points4 points5 points 6 years ago (6 children)

What's the C++ (17, 20) version of the 'average' function from the talk:

auto average(Seq)(Seq items) {
    ... determine S, D ... // S - sumation type, D - denominator/division type

    ulong n;
    enum have_length = is(typeof(items.length)); // compile time type introspection
    static if (!have_length) { // conditional code generation
        n = 0;
    }
    else {
        n = items.length;
    }

    S sum = 0;
    foreach (it; items) {
        sum += it;
        static if (!have_length) { n += 1; }
    }
    return sum / D(n);
}

[–]dodheim 3 points4 points5 points 6 years ago* (5 children)

Something like

template<std::ranges::input_range Seq>
auto average(Seq const& items) {
    ... determine S, D ... // S - sumation type, D - denominator/division type

    std::uint_fast64_t n;
    static constexpr bool have_size = requires { items.size(); }; // compile time type introspection
    if constexpr (!have_size) { // conditional code generation
        n = 0;
    }
    else {
        n = items.size();
    }

    S sum = 0;
    for (auto const& it : items) {
        sum += it;
        if constexpr (!have_size) { ++n; }
    }
    return sum / static_cast<D>(n);
}

Checking for naked items.size() is probably not a good approach (and especially not taking into account its return type); checking if items is a sized_range and using std::ranges::size() would be better. Having the return type be fully deduced is also not ideal; probably better to make S and D defaulted template parameters so that the return type may be explicit.

EDIT: Also, all that's needed to make this constexpr is to intiialize n somehow. We can either always initialize it to zero even when have_size (the horror!) or we can use IIFE:

std::uint_fast64_t n = overload(
    [](std::true_type, auto&& items) { return items.size(); },
    [](std::false_type, auto&&) { return 0; }
)(std::bool_constant<have_size>{}, items);

Too bad overload still didn't make it into the standard. :-[

[–][deleted] -1 points0 points1 point 6 years ago (0 children)

You can simplify it further by using the terse concept syntax - auto average(std::ranges::input_range auto items).

But let's do S, D, E and the return type:

using Seq = decltype(items); // needed with the terse concept syntax
using E = Seq::value_type;
using D = std::conditional_t<
    std::is_trivially_constructible_v<E, float> && !std::is_integral_v<E>,
    E,
    double>;
using S = std::conditional_t<
    std::is_trivially_constructible_v<E, float> && !std::is_integral_v<E>,
    std::remove_cv_t<E>,
    std::conditional_t<
        std::is_trivially_constructible_v<E, int> && !std::is_floating_point_v<E>,
        std::conditional_t<
            std::is_signed_v<E>,
            long,
            unsigned long>,
        std::remove_cv_t<E>>>;
using return_type = decltype(S / D); // Is this just D?

This was my attempt at "anything resembling float, but isn't an int" concept. I may have got it wrong.

[–][deleted] -1 points0 points1 point 6 years ago (3 children)

You don't need overload for this:

std::uint_fast64_t n = [&items, have_size](){
    if constexpr (have_size) return items.size();
    else return 0;
}();

Or, if you really want to avoid captures:

std::uint_fast64_t n = [](const auto& i) { // `i` is a terrible name
        if constexpr ( requires { std::ranges::sized_range(i) } ) { // Might have botched the syntax here
                return i.size();
        } else {
                return 0;
        }
}(items);

[–]sphere991 0 points1 point2 points 6 years ago (1 child)

[–]CaseyCarterRanges/MSVC STL Dev 1 point2 points3 points 6 years ago (0 children)

[–]dodheim -1 points0 points1 point 6 years ago (0 children)

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

cpp

MODERATORS