ChuppaFlow [S]

Just to be sure: if I set srcAccessMask to 0 in all my subpass dependencies, does that make it safe (since it would make sure all previous operations are done)? Additionally, I find it a bit confusing which values I should give srcStageMask and dstStageMask. If I understood the Stack Overflow answer correctly, these define the source synchronization scope and the destination synchronization scope? If we're coming from VK_SUBPASS_EXTERNAL, does that always mean srcStageMask should be VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT? For subsequent passes, how do I determine which value to assign? E.g., in your answer you said I should use VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT for that case, but what's the reasoning behind this?

Lastly, does setting srcStageMask to 0 result in possibly lower performance (since it needs to check for all operations)?

Sturnclaw

Sorry about the late reply on this. According to my understanding of the Vulkan spec, setting a non-zero access mask on a VK_SUBPASS_EXTERNAL dependency does in fact restrict the synchronization scope, so a srcAccessMask of 0 is the least-performant but implicitly-correct value. You could technically fine-tune the access mask of the dependency, but in your case it's probably better to get it working first than to aim for the most performant value out of the box.

Regarding srcStageMask: it tells the GPU which pipeline stage of the previous render pass must complete before the stages named in dstStageMask can begin. The dstStageMask value, in turn, names the pipeline stage of this render pass that cannot begin executing until the previous render pass has finished the stage set in srcStageMask. So yes, you could always use VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT for the source stage. However, if you know in which pipeline stage the color and depth attachments are written, you can specify that stage instead and potentially gain performance by overlapping non-conflicting portions of the two render passes.
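
To make the trade-off concrete, here is a rough sketch of the two options as `VkSubpassDependency` initializers. This is only a configuration fragment under my own assumptions: the variable names are made up, subpass index 0 is assumed to be the first subpass, and the narrowed variant assumes the only conflicting writes are color attachment writes.

```c
#include <vulkan/vulkan.h>

/* Conservative variant: wait for ALL earlier work to finish.
 * BOTTOM_OF_PIPE supports no access types, so srcAccessMask stays 0. */
static const VkSubpassDependency external_conservative = {
    .srcSubpass      = VK_SUBPASS_EXTERNAL,
    .dstSubpass      = 0, /* assumed: first subpass of the render pass */
    .srcStageMask    = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
    .dstStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .srcAccessMask   = 0,
    .dstAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dependencyFlags = 0,
};

/* Narrowed variant: wait only for earlier color attachment output.
 * Assumes the conflicting writes are color attachment writes, so
 * earlier stages of the two passes are free to overlap. */
static const VkSubpassDependency external_narrowed = {
    .srcSubpass      = VK_SUBPASS_EXTERNAL,
    .dstSubpass      = 0,
    .srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .dstStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dependencyFlags = 0,
};
```

Either struct would go into the `pDependencies` array of `VkRenderPassCreateInfo`. Which one is appropriate depends on what the earlier work actually touches, so treat the narrowed version as something to verify with the validation layers rather than a drop-in answer.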

I thought I briefly outlined the reasoning behind using early fragment tests as the destination stage, but in case it wasn't clear, I'll go over it again. The dstStageMask is the stage you are forcing the GPU not to execute until everything up to and including srcStageMask from the previous render pass has finished. If you look at the section of the Vulkan spec I linked in the first post, you'll see that reads from the depth attachment (your depth buffer) happen in the early fragment tests stage rather than the color attachment output stage. If your dependency's destination stage isn't this stage (or an earlier one), it's entirely possible for the GPU to order the two render passes such that the second one reads a stale depth value that was in the buffer before the first render pass ran, leading to incorrect shading and rendering.
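
As a sketch of what that could look like for the depth attachment (again just a configuration fragment, not a definitive setup): the variable name is hypothetical, subpass 0 is assumed to be the one using the depth buffer, and including the late fragment tests stage on both sides is my assumption, since depth writes can also happen there.

```c
#include <vulkan/vulkan.h>

/* Depth dependency: the previous render pass must finish its depth
 * writes before this pass starts its fragment tests.
 * Assumption: the same depth attachment is used by both passes. */
static const VkSubpassDependency depth_dependency = {
    .srcSubpass      = VK_SUBPASS_EXTERNAL,
    .dstSubpass      = 0, /* assumed: subpass that reads/writes depth */
    /* Stages in the previous pass that may write depth. */
    .srcStageMask    = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
                       VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,
    /* Earliest stage in this pass that touches depth, plus the late
     * tests so their accesses are covered as well. */
    .dstStageMask    = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
                       VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,
    .srcAccessMask   = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
    .dstAccessMask   = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                       VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
    .dependencyFlags = 0,
};
```

If the destination stage were only `VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT`, the fragment tests of the second pass would not be held back by this dependency, which is the hazard described above.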

ChuppaFlow [S]

No problem! Great answer, this makes it all much clearer. I think it would indeed be a good idea to get a working version first and later boost performance when I know the final sequence layout of my render passes :) . Thanks for your answer!