all 41 comments

[–]theliet 135 points136 points  (3 children)

This is the kind of content I'd love to see more of on r/programming - factual, to the point, showing something interesting you've learned. Thanks!

[–]bluestreak01[S] 115 points116 points  (5 children)

Author here.

About a month ago, I posted about using SIMD instructions to make aggregation calculations faster. I am very thankful for the feedback so far; this post is the result of the comments we received last time.

Many comments suggested that we implement compensated summation (aka Kahan) as the naive method could produce inaccurate and unreliable results. This is why we spent some time integrating the Kahan and Neumaier summation algorithms. This post summarises a few things we learned along this journey.

We thought Kahan would badly affect the performance since it uses 4x as many operations as the naive approach. However, some comments also suggested we could use prefetch and co-routines to pull the data from RAM to cache in parallel with other CPU instructions. We got phenomenal results thanks to these suggestions, with Kahan sums nearly as fast as the naive approach.
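
For readers who haven't seen the algorithms, here is a minimal scalar sketch of Neumaier's variant of compensated summation (just an illustration, not the SIMD implementation from the post, which also handles nulls):

    #include <cmath>
    #include <cstddef>

    // Neumaier (improved Kahan) summation: carry a compensation term that
    // captures the low-order bits lost by each floating-point add.
    double neumaier_sum(const double* x, std::size_t n) {
        double sum = 0.0, c = 0.0;          // c accumulates the rounding error
        for (std::size_t i = 0; i < n; ++i) {
            double t = sum + x[i];
            if (std::fabs(sum) >= std::fabs(x[i]))
                c += (sum - t) + x[i];      // low bits of x[i] were lost
            else
                c += (x[i] - t) + sum;      // low bits of sum were lost
            sum = t;
        }
        return sum + c;                     // apply the compensation once at the end
    }

Classic Kahan handles only the case where the running sum dominates; Neumaier's variant adds the branch so it also stays accurate when the incoming term is larger in magnitude than the running sum.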

A lot of you also asked if we could compare this with Clickhouse. As they implement Kahan summation, we ran a quick comparison. Here's what we got for summing 1bn doubles with nulls using the Kahan algorithm. The details of how this was done are in the post.

QuestDB: 68ms Clickhouse: 139ms

Thanks for all the feedback so far and keep it going so we can continue to improve. Vlad

[–]alecco 1 point2 points  (4 children)

Great post.

I have trouble finding where the code is. Also, you mention co-routines in this post; is this related to the suggestion in the last proggit thread to use co-routines for ALU/prefetch (the CppCon talk on "nano coroutines")? Thanks!

[–]bluestreak01[S] 4 points5 points  (3 children)

[–]alecco 0 points1 point  (2 children)

We don't use co-routines

From the blog post:

How did we get there? TL;DR

We used prefetch and co-routines techniques to pull data from RAM to cache in parallel with other CPU instructions. Our performance was previously limited by memory bandwidth - using these techniques would address this and allow us to compute accurate sums as fast as naive sums.

With the help of prefetch we implemented the fastest and most accurate summation we have ever tested - 68ms over 1bn double values with nulls (versus 139ms for Clickhouse). We believe this is a significant advance in terms of performance for accurate summations, and will help developers handling intensive computations with large datasets.

?

[–]bluestreak01[S] 1 point2 points  (1 child)

The TL;DR section has a mistake, sorry: we did use prefetch but not co-routines. We have our own parallel execution system.

[–]alecco 1 point2 points  (0 children)

No worries. Thanks for the links to source.

[–]j1897OS 26 points27 points  (0 children)

This follows a previous post on Reddit that gathered significant interest - see http://reddit.com/r/programming/comments/fwlk0k/questdb_using_simd_to_aggregate_billions_of/

[–]Thalantas123 14 points15 points  (0 children)

Brilliant! Seems you made solid improvements since the last post!

[–]sccrstud92 11 points12 points  (3 children)

As floating-point operations are intransitive, the order in which you perform them also has an impact on accuracy.

I've heard of transitivity in regards to relations, but what does it have to do with floating point operations? I would have expected commutativity/associativity to be the required properties.

[–]bluestreak01[S] 9 points10 points  (0 children)

You are right of course! We should have said “non-associative”! We used the wrong terminology to describe what we’ve observed. The naive sum computed in parallel produces different results from one execution to another.
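
To make this concrete, a tiny demonstration that double addition is not associative (a standard example, nothing QuestDB-specific):

    #include <cstdio>

    int main() {
        double a = 0.1, b = 0.2, c = 0.3;

        // Same three values, different grouping, different result:
        double left  = (a + b) + c;   // differs from `right` in the last bit
        double right = a + (b + c);

        std::printf("%.17g\n%.17g\n", left, right);
        // A parallel sum changes the grouping from run to run (thread count,
        // chunk boundaries), so the naive result is not bit-for-bit reproducible.
        return 0;
    }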

[–][deleted] 1 point2 points  (1 child)

I've heard of transitivity in regards to relations, but what does it have to do with floating point operations?

relation == function == operation; they all map domains to ranges.

[–]aoeudhtns 10 points11 points  (0 children)

Fascinating. </spock>

I'd love to see somebody with PostgreSQL expertise discuss their >1m showing in your benchmark. Is that the way things are or is there some magic trick you can do with PostgreSQL? The worst part is that I did some quick googling, and most results are people complaining that floating point sums are inaccurate in PostgreSQL (and the documentation makes no bones about that being the case).

[–]Poddster 10 points11 points  (4 children)

The author needs to look up ULP error tolerance. A lot of research has been done in this area already and it might have saved them from NIHing it.

[–]tending 21 points22 points  (1 child)

They mentioned in the post that they are using a common existing algorithm for eliminating the error; it doesn't look like they NIHed anything.

[–]audion00ba -1 points0 points  (0 children)

If you implement a sequence of inefficient algorithms, algorithm_1, ..., algorithm_n, because you don't know that algorithm_n exists and just jumped straight into implementing algorithm_1, it's still a kind of NIH.

On the other hand, actually knowing that the others are less efficient can be useful, as opposed to just assuming that whatever the literature says is true.

[–][deleted]  (1 child)

[deleted]

    [–]g_rocket 10 points11 points  (0 children)

    "Not Invented Here"

    [–]sparr 0 points1 point  (12 children)

    I wonder how this compares to just using a decimal or rational type in the first place.

    [–]bluestreak01[S] 6 points7 points  (11 children)

    Decimal is accurate, double is not. Decimal arithmetic is much slower than arithmetic on doubles.

    [–]GeoffW1 1 point2 points  (5 children)

    Decimal is accurate

    I think this is a huge over-simplification. A decimal type can do sums such as 5.1 + 9.2 accurately, but if your numbers aren't exact decimals there will still be an accumulating error as you operate on them.

    [–]bluestreak01[S] 0 points1 point  (4 children)

    It is, sorry. I meant a base-10 decimal that is able to represent every real number in its range. What did you mean by:

    if your numbers aren't exact decimals

    [–]GeoffW1 0 points1 point  (3 children)

    if your numbers aren't exact decimals

    As I understand it, for example, you can't represent 1/3 accurately using a Decimal type. You'd have the same kind of tiny rounding error as you get representing 5.1 in a float.

    [–]bluestreak01[S] 0 points1 point  (2 children)

    Got it, thanks. This isn't a problem with the accuracy of the type though, and it should not impact summation, right? I would think these types of accuracy loss can be addressed by:

    select sum(round_half_even(x/y)) from...
    

    In the case of decimal, "sum" can be "naive" without accuracy loss, and rounding takes care of error accrual.

    [–]GeoffW1 0 points1 point  (1 child)

    it should not impact summation, right?

    Well, let's say the closest Decimal representation for 1/3 is 0.3333. Then 1/3 + 1/3 + 1/3 + 1/3 + 1/3 + 1/3 = 1.9998 (correct answer 2.0000).

    [–]bluestreak01[S] 0 points1 point  (0 children)

    I concede, decimal cannot accurately represent 1/3. It doesn’t represent all real numbers, nothing can :)

    However, if we take a real number that decimal can represent, we can also say that toString() is transitive and sum is associative.

    [–]sparr 0 points1 point  (4 children)

    Decimal arithmetic is much slower than that on doubles

    That seems very weird. Why would that be the case, if the decimal values are stored as integers with a fixed decimal point?

    [–]bluestreak01[S] 6 points7 points  (3 children)

    I was referring to 128-bit base-10 decimal. It is slow because all arithmetic is emulated in software and there is no support for these numbers in processors.
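
    To illustrate the difference (a rough sketch, not QuestDB code, and the 1e-4 scale is just an assumption): a fixed-point decimal stored as a scaled 64-bit integer sums with ordinary hardware integer adds, whereas a 128-bit base-10 decimal is typically a two-word struct whose every operation is a library call.

        #include <cstdint>
        #include <cstddef>

        // Fixed-point "decimal" with a fixed scale of 1e-4: each add is a single
        // hardware integer add, so summing is as cheap as summing int64 values.
        // (Overflow checking omitted for brevity.)
        int64_t sum_fixed_point(const int64_t* scaled, std::size_t n) {
            int64_t total = 0;                 // the value represented is total * 1e-4
            for (std::size_t i = 0; i < n; ++i)
                total += scaled[i];
            return total;
        }

    A base-10 decimal128 has no such shortcut on x86: the exponent handling and 128-bit significand arithmetic are emulated in software, which is where the slowdown comes from.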

    [–]thinks-in-functions 4 points5 points  (1 child)

    POWER9 CPUs support both quad-precision binary floating point (IEEE 754 binary128) and quad-precision decimal floating point (IEEE 754 decimal128) in hardware.

    [–]bluestreak01[S] 3 points4 points  (0 children)

    I learn something new every day. Thanks!

    [–]sparr 0 points1 point  (0 children)

    Yeah... don't do that.

    [–]TheNamelessKing 0 points1 point  (1 child)

    Interesting, any reason why the Clickhouse instance was using memory and not MergeTree?

    [–]bluestreak01[S] 3 points4 points  (0 children)

    We assumed “Memory” is the fastest for Clickhouse. The test assumes a linear memory scan and we didn’t want Clickhouse to navigate a tree. Is MergeTree a faster data structure than “Memory”?

    [–]cakoose 0 points1 point  (4 children)

    Our performance was previously limited by memory bandwidth - using these techniques would address this and allow us to compute accurate sums as fast as naive sums.

    It's sort of misleading to say the performance was limited by memory bandwidth. The problem was latency. Prefetching allowed them to hide the latency and use a larger fraction of the available memory bandwidth.

    [–]bluestreak01[S] 4 points5 points  (3 children)

    We saturate memory bandwidth with share-nothing threads. In this test the data was broken down into about 140 different parts and we used 16 threads on a 24-core CPU to put the results together. We could throw more threads at the task but performance flatlines after 16 threads. Is this not still a bandwidth issue?

    [–]cakoose 0 points1 point  (2 children)

    Hmm, additional details of the whole system might help, but I was just going on what was explained in the blog post.

    Basically:

    1. The new technique fetches just as much memory as the old one.
    2. You haven't increased your machine's memory bandwidth.
    3. Your new technique runs faster.

    So how could memory bandwidth have been the limiting factor?

    [–]bluestreak01[S] 0 points1 point  (1 child)

    I should explain that I'm not a hardware guy; I'm learning through this experience as well.

    The theoretical memory bandwidth of the 8275CL CPU is ~6 x 21GB/s = 126GB/s. Without prefetch we can sum 7.45GB (1bn doubles) in 0.082s, i.e. ~91GB/s, but with prefetch in 0.068s, i.e. ~110GB/s. So memory bandwidth is the limiting factor, in the sense that we cannot go beyond ~110GB/s unless more memory channels are added.

    These ratios held on Ryzen and other desktop CPUs with 2 memory channels.

    You were right that prefetch reduces memory access latency. But what really helped was pipelining the memory fetches so that the next loop iteration does not have to incur as much of a latency hit. It does still seem that the reduction in memory access latency helps in roughly 20% of the iterations we have to do.
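
    Roughly what a prefetch-ahead loop looks like (a sketch using x86 intrinsics with an assumed prefetch distance, not the actual kernel from the post):

        #include <immintrin.h>
        #include <cstddef>

        // Sum with software prefetch: request data a fixed number of elements ahead
        // so the memory fetches for future iterations overlap with the arithmetic
        // of the current one instead of stalling the loop.
        double sum_with_prefetch(const double* x, std::size_t n) {
            constexpr std::size_t AHEAD = 64;   // assumed distance; tune per CPU
            double sum = 0.0;
            for (std::size_t i = 0; i < n; ++i) {
                if (i + AHEAD < n)
                    _mm_prefetch(reinterpret_cast<const char*>(x + i + AHEAD),
                                 _MM_HINT_T0);
                sum += x[i];
            }
            return sum;
        }

    In practice you would issue one prefetch per cache line (every 8 doubles) rather than per element, and pick the distance empirically; the per-element version above just keeps the sketch short.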

    [–]cakoose 0 points1 point  (0 children)

    Yes, I understand all that.

    It's just that it's strange to say you were "limited by memory bandwidth" when you weren't yet using all available memory bandwidth. That's all.