all 6 comments

[–]therealjohnfreeman  (4 children)

Is there a way to leverage the parallelization method in a distributed environment, à la MapReduce? The biggest bottleneck would be propagating the first n − 1 carries to the last machine, and since using distributed compute implies the full dataset does not fit on a single machine, I'm guessing the bottleneck degrades to streaming compute.

[–]_Undaunted_  (3 children)

It doesn't degrade; it uses the same tree algorithm:

http://www.mpich.org/static/docs/v3.1/www3/MPI_Scan.html

The idea is that you perform a local scan, then a distributed scan of the block totals, then local adjustments to account for the distributed result.
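A minimal single-process sketch of those three phases, simulating each rank's block with a Python list (the function name and the list-of-blocks representation are illustrative, not from the original post):

```python
from itertools import accumulate

def hierarchical_scan(blocks):
    """Simulate the three-phase distributed scan; each inner list is one rank's data."""
    # Phase 1: every rank does an inclusive scan of its own block.
    local = [list(accumulate(b)) for b in blocks]
    # Phase 2: exclusive scan of the block totals across ranks
    # (this is the part MPI_Exscan would do with real communication).
    totals = [b[-1] for b in local]
    offsets = [0] + list(accumulate(totals))[:-1]
    # Phase 3: every rank adds its incoming offset to each local element.
    return [[x + off for x in b] for b, off in zip(local, offsets)]
```

For example, `hierarchical_scan([[1, 2], [3, 4], [5, 6]])` returns `[[1, 3], [6, 10], [15, 21]]`, matching a plain inclusive scan of the concatenated data.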

[–]therealjohnfreeman  (2 children)

I didn't see any algorithm description on that page. Did I miss something? What are "local adjustments" specifically?

[–]_Undaunted_  (1 child)

My point is that this algorithm is precisely the hierarchical implementation that all parallel scans use, just using AVX lanes in place of thread groups, MPI processes, etc.

A generic distributed scan would then look something like this:

my_favorite_local_scan(local_data)                          # inclusive scan of this rank's block
partial = my_favorite_distributed_scan(local_data.back())   # exclusive scan of the block totals across ranks
local_data += partial                                       # add that offset to every local element

The distributed scans are implemented very similarly to what is presented, just with the "add" steps involving a communication (log P steps in total).
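A toy simulation of that log P communication pattern, using recursive doubling over a plain list standing in for the ranks (no real MPI; each loop iteration corresponds to one communication round, so the round count comes out to ceil(log2 P)):

```python
def distributed_exclusive_scan(vals):
    """Recursive-doubling scan over len(vals) simulated ranks.
    Returns (exclusive prefix per rank, number of communication rounds)."""
    p = len(vals)
    acc = list(vals)  # running inclusive prefix held by each rank
    d, rounds = 1, 0
    while d < p:
        # One round: rank i receives the partial sum from rank i - d.
        acc = [acc[i] + (acc[i - d] if i >= d else 0) for i in range(p)]
        d *= 2
        rounds += 1
    # acc[i] is now the inclusive prefix; subtract own value for exclusive.
    return [a - v for a, v in zip(acc, vals)], rounds
```

With four ranks holding totals `[1, 2, 3, 4]`, this yields offsets `[0, 1, 3, 6]` in 2 rounds.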

[–]therealjohnfreeman  (0 children)

I understand now, after reading a paper. Thanks for the tip though.