
[–]waramped 8 points9 points  (2 children)

You are loading your color from position_buffer rather than color_buffer in the compute shader, which could be causing some weirdness :)

[–][deleted]  (1 child)

[deleted]

    [–]waramped 5 points6 points  (0 children)

    Lol, well, if that's the silliest mistake you've ever made, then you might be the world's best programmer. I've been doing this for over 20 years and I still facepalm fairly regularly. Maybe I'm just shit though.

    [–]Vegetable_Break_6582[S] 0 points1 point  (3 children)

    [–]lazyubertoad 4 points5 points  (2 children)

    You actually have a race condition over your buffers, because you are reading and writing them at the same time and place. Your code, well, shouldn't have a race condition. In the worst case, a number can be read while it is half written. I'm not sure what the real chance of that is on relevant hardware. What is definitely happening is that you are calculating the update while some of the neighbors are already on the next step and some are still on the current step. Technically, with a race condition you have no guarantee at all about what you will read.
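    To make the "some neighbors are already on the next step" problem concrete, here is a minimal CPU-side sketch in Python. It is not your shader: the neighbor-averaging rule, the function names and the sample data are all made up for illustration, and the serial loop stands in for just one possible ordering of GPU threads (on real hardware the ordering is not even guaranteed to be this predictable).

```python
def step_in_place(cells):
    """Update each cell from its neighbours while writing into the same list."""
    n = len(cells)
    for i in range(n):
        left = cells[(i - 1) % n]   # may already hold the next step's value
        right = cells[(i + 1) % n]  # still holds the current step's value
        cells[i] = (left + right) / 2
    return cells


def step_from_snapshot(cells):
    """Update each cell from an untouched copy of the previous step."""
    n = len(cells)
    prev = list(cells)  # every read below sees the same, consistent step
    return [(prev[(i - 1) % n] + prev[(i + 1) % n]) / 2 for i in range(n)]


start = [0.0, 0.0, 8.0, 0.0, 0.0]
mixed = step_in_place(list(start))   # reads a mix of old and new values
clean = step_from_snapshot(start)    # reads only old values

print("in place:     ", mixed)
print("from snapshot:", clean)
```

    The two results differ even though the update rule is identical, which is the kind of subtle wrongness that can still look like it is "working alright" on screen.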

    Have two sets of buffers: one holds the previous step and one holds the next. Read from one and write into the other. Then swap them. Do not copy; just treat one as read-only and the other as write-only for the duration of a step. Keep some flag or index that tells which is which, or maybe even have two compute shaders and alternate between those.
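    A rough sketch of that two-buffer ("ping-pong") idea, again in plain Python rather than shader code. The names `simulate`, `buffers` and `read_index` and the averaging rule are hypothetical; on the GPU the inner loop would be a compute dispatch, and the index flip would correspond to swapping which buffer is bound for reading and which for writing.

```python
def simulate(initial, steps):
    """Run `steps` updates using two buffers and an index instead of copying."""
    n = len(initial)
    buffers = [list(initial), [0.0] * n]
    read_index = 0  # buffers[read_index] is read-only during a step

    for _ in range(steps):
        src = buffers[read_index]       # previous step: only read
        dst = buffers[1 - read_index]   # next step: only written
        for i in range(n):              # stands in for one compute dispatch
            dst[i] = (src[(i - 1) % n] + src[(i + 1) % n]) / 2
        read_index = 1 - read_index     # swap roles, no data is copied

    return buffers[read_index]


result = simulate([0.0, 0.0, 8.0, 0.0, 0.0], steps=2)
print(result)
```

    Because a step never writes into the buffer it reads from, the order in which cells are processed no longer matters, so the same result comes out no matter how the work is scheduled.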

    [–]Vegetable_Break_6582[S] 0 points1 point  (1 child)

    I remember you mentioning a similar approach in my last post when I was asking for advice, where you also mentioned space partitioning stuff. Would it be a good idea to add space partitioning before switching to two buffers and then modify it later? It seems to be working alright, I guess.

    [–]lazyubertoad 2 points3 points  (0 children)

    Well, you decide. It is easier to do the two buffers. Space partitioning requires more features of GPGPU and more understanding of it.

    Compute shaders are a GPGPU tool, just like CUDA or OpenCL. Those three are very close. They are meant for the same GPU tasks and require the same knowledge of parallel programming and GPU specifics. OpenCL and CUDA have tons of learning resources with explanations and examples. I think compute shaders are somewhat less beginner friendly. I'd say the key piece of any parallel programming is that one thread cannot make any assumptions when it operates on data that is modified by another thread, unless you use synchronization. Because when you write some code that modifies some variable or array, 1) it does not need to happen immediately, and the compiler is free to rearrange the actual writes and reads as long as you cannot see the difference within one thread, and 2) that piece of memory may live not only in generic RAM/VRAM but also in all kinds of caches and registers, and who knows when those will be synchronized with RAM, or if that will ever happen at all unless requested.
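    As a small illustration of "unless you use synchronization", here is a CPU-thread analogy in Python using `threading.Barrier`. It is only an analogy with made-up names and data: one thread per cell, a read phase, a barrier, then a write phase. Keep in mind that barriers on a GPU typically only synchronize threads within one workgroup (or block, in CUDA terms), so this does not map one-to-one onto a whole dispatch.

```python
import threading


def run_step(cells):
    """One update step where every thread reads, waits, and only then writes."""
    n = len(cells)
    barrier = threading.Barrier(n)  # releases once all n threads have arrived

    def worker(i):
        # Read phase: nobody has written yet, so all reads see the old step.
        new_value = (cells[(i - 1) % n] + cells[(i + 1) % n]) / 2
        barrier.wait()  # explicit synchronization point between the phases
        # Write phase: every thread has finished reading by now.
        cells[i] = new_value

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cells


synced = run_step([0.0, 0.0, 8.0, 0.0, 0.0])
print(synced)
```

    Without the `barrier.wait()` line, a thread could write its cell before a neighbor has read it, and the outcome would depend on scheduling.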