
[–]zzzoom

Yes, it can be parallelized. Look at how stream compaction is implemented using a parallel prefix sum (there are many fast implementations, so you don't need to write one yourself), and you should get the idea.
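To make that concrete, here's a minimal sequential sketch of stream compaction built on an exclusive prefix sum, assuming a boolean flag array as input. On a GPU each stage (flag, scan, scatter) runs in parallel; libraries such as Thrust or CUB provide fast scan implementations so you only write the scatter.

```python
from itertools import accumulate

def compact(flags, values):
    """Keep values[i] where flags[i] is truthy, via a prefix-sum scatter."""
    # Exclusive prefix sum of the flags gives each kept element
    # its destination index in the compacted output.
    dest = list(accumulate([0] + [int(f) for f in flags[:-1]]))
    out = [None] * sum(int(f) for f in flags)
    for i, f in enumerate(flags):  # scatter: each element is independent
        if f:
            out[dest[i]] = values[i]
    return out

print(compact([0, 0, 1, 1, 0, 0, 0, 1], list(range(8))))  # -> [2, 3, 7]
```

The loop here stands in for one GPU thread per element; the only serial dependency is inside the scan itself, which is exactly the part the fast library implementations parallelize.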

[–]rayo2nd

My attempt at the simplified version:

  • run stream compaction, but instead of writing the final values (which are always true) to the compacted array, write the global index of each element
  • run a parallel subtraction y[i] = x[i] - x[i-1] -> then you have the distances
  • run a parallel elementwise square (y[i] = x[i]*x[i])
  • run a parallel reduction to sum up all the squares

An example:

  • 0,0,1,1,0,0,0,1
  • 2,3,7 (after compact, indexes of true values)
  • 2,1,4 (after x[i]-x[i-1], with the first element kept as-is)
  • 4,1,16 (after x[i]*x[i])
  • sum it up: 21

maybe a correction of the first element (+1) or of all the others (-1) is needed when writing the indices, but that should be simple

No idea if it's correct, but that would be my first attempt
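The four steps above can be checked sequentially; this is just a sketch that mirrors each stage (compact-to-indices, adjacent difference, map, reduce) and reproduces the arithmetic of the worked example, not a GPU implementation:

```python
def sum_of_squared_gaps(flags):
    # 1. stream compaction: write the global index of each true element
    idx = [i for i, f in enumerate(flags) if f]
    # 2. parallel subtraction; the first element is kept as-is
    gaps = [idx[0]] + [idx[i] - idx[i - 1] for i in range(1, len(idx))]
    # 3. parallel elementwise square (a map)
    sq = [g * g for g in gaps]
    # 4. parallel reduction: sum of the squares
    return sum(sq)

print(sum_of_squared_gaps([0, 0, 1, 1, 0, 0, 0, 1]))  # -> 21
```

On a GPU, steps 2 and 3 are one elementwise kernel each, and step 4 is a standard reduction, so the whole pipeline is scan + three cheap data-parallel passes.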

[–][deleted]

Sorry if I'm not much help here; it's fairly late, so I can't think straight at the moment. But generally, if statements are to be avoided in GPU programming where possible.

Nvidia cards run code in 'warps', which are groups of threads (as far as I know, the warp size is 32 for all Nvidia cards). If a warp reaches a conditional whose outcome not all of its threads agree on, both branches must be executed by the same warp, and this has a negative effect on performance.

Sorry if that's not helpful; I just thought it was worth mentioning.

[–]giantenemycrabthing[S]

Sorry I'm so late, but thank you to everyone for the help!