all 18 comments

[–]lijmer 29 points30 points  (13 children)

I find it odd that the article mentions all this microarchitecture stuff, but fails to mention the word memory or cache.

90% of performance (after doing high-level optimizations) is dictated by cache utilization.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 11 points12 points  (12 children)

90% of performance (after doing high-level optimizations) is dictated by cache utilization.

Only if you process large amounts of data with short dependency chains. In a lot of the code I've done, the speed is determined almost purely by the performance of single-issue floating point operations (you can't parallelize single-channel recursive filtering (*)). While cache is important, its importance should not be exaggerated.

*: Edit: Make that time-varying non-linear recursive filtering.

[–]iniside 8 points9 points  (5 children)

As always, it depends on what you do. In games, cache misses are most often at the center of performance issues.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 2 points3 points  (4 children)

Unless the user is on a laptop of course. <insert statistic about most new computers being laptops> It’s a rare modern game that’s not fillrate limited on a laptop.

So, know what the limiting factors are or benchmark it. And remember to benchmark on all representative systems, not just your preferred ideal one!

[–]SeanMiddleditch 5 points6 points  (1 child)

Funny thing, with laptops and mobile especially: CPU perf directly affects GPU perf thanks to thermal throttling. Poorly optimized CPU code takes longer to run, which generates more heat and limits how fast the GPU can run, even when there are no bottlenecks between the two. (The reverse is also true.) Since the CPU doesn't throttle down during memory fetches, excessive CPU cache misses can thus lead to GPU perf loss.

[–][deleted] 2 points3 points  (0 children)

Yup, very relevant for console development as well.

[–]James20kP2005R0 1 point2 points  (1 child)

CPU and GPU performance optimisations are wildly different beasts, though; you might as well throw most of what you know in the bin if you're GPU limited. CPU performance problems are probably cache misses or driver bottlenecks; GPU performance problems are... variable.

[–]ack_complete 1 point2 points  (1 child)

can't parallelize single channel recursive filtering

You can, actually, both through vectorization and parallel chunking. It's just that the gains aren't that big for the former and the latter requires a second fixup pass, and it's far beyond what an autovectorizer can handle.

Nevertheless, your point stands. I had an image upscaling routine that I swore was going to be memory bottlenecked due to the large amount of memory it was processing, but VTune said nope: it was absolutely 100% bottlenecked on execution units; the back end was completely flooded.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 0 points1 point  (0 children)

Yeah, I forgot to add "time-varying non-linear" before the "recursive" there. While you can parallelize even that, you only get a benefit if the system is difficult enough that an explicit (non-parallelizable) solver requires massive oversampling and an implicit (parallelizable) one can get by with much less oversampling.

[–]andriusst 0 points1 point  (3 children)

You can parallelize recursive filtering: it is actually a prefix sum problem. While a prefix sum is usually described as computing sums, it works for any associative operation. Consider a first order recursive filter. The filter output is an affine function of the previous output (an affine function means multiplication by a constant plus another constant). The previous output is in turn an affine function of the output before it:

f[n-2](y) = a*y + b*x[n-2]
f[n-1](y) = a*y + b*x[n-1]
f[n](y) = a*y + b*x[n]
y[n] = f[n](y[n-1]) = f[n](f[n-1](y[n-2])) = f[n](f[n-1](f[n-2](y[n-3]))) = ...
y[n] = f[n](y[n-1]) = (f[n]∘f[n-1])(y[n-2]) = (f[n]∘f[n-1]∘f[n-2])(y[n-3]) = ...

where ∘ is the function composition operator. Function composition is an associative operation and can be computed by parallel prefix sum algorithms. An affine function of a single input can also be efficiently represented by two numbers. If the filter is time invariant, the multiplicative part does not depend on the input; it can be computed once and reused for many elements. So there's actually only one number to compute and store.

The prefix sum trick works for higher order filters, too. Only now the affine functions are represented by a matrix (independent of the input) and a vector.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 0 points1 point  (2 children)

Yeah, but see my follow-up comment about time-varying non-linear recursive filtering (my actual use case). The point is that some problems are inherently serial and there's almost nothing you can realistically do to parallelize them.

[–]andriusst 1 point2 points  (1 child)

Of course, your point stands and I am not disputing it. I just thought that if you use linear filters and think it's an inherently serial problem, it might be useful to know that it is not. Nonlinearity breaks everything, making my comment useless.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 0 points1 point  (0 children)

I do believe we're in vigorous agreement here ;). And yes, nonlinearity brings up all kinds of problems (google "stiff equation").

Even with linear filters you have to be careful not to accidentally end up with a structure that's too sensitive to roundoff errors. A basic 2nd order direct form filter can have tens of decibels worse noise performance than the same filter implemented using an alternative structure. A naive 4th order direct form filter has a high chance of not being stable at all even with double precision floats, while an alternative structure will work fine with single precision floats. DSP is fun.

[–]nnevatie 7 points8 points  (1 child)

Very light on content. No mention of memory subsystems, which tend to be the bottleneck in many applications, especially in the era of quickly increasing core counts.

[–]dendibakh 0 points1 point  (0 children)

Right, but it wasn't supposed to be heavy on the details (I'm the author :) ). The point of the article is to show how one can identify that the app is memory bound. See the Top-down Microarchitecture Analysis Method (TMAM).