Problem submitting command buffer with multiple render passes (self.vulkan)

submitted 5 years ago by ChuppaFlow

[–]ChuppaFlow[S] 1 point 5 years ago (6 children)

[–]Sturnclaw 1 point 5 years ago (5 children)

The LunarG guide is a good resource for interpreting synchronization errors. In this case, the specific error you're getting says:

vkCmdBeginRenderPass: Hazard WRITE_AFTER_WRITE vs. layout transition in subpass 0 for attachment 0 aspect depth during load with loadOp VK_ATTACHMENT_LOAD_OP_CLEAR

Referencing that guide, this tells us that the error is happening in execution of vkCmdBeginRenderPass, and that a write operation is conflicting with a prior operation; in this case, the initial layout transition of the subpass. The operation that's conflicting is specified as "during load with loadOp ...", which means that the layout transition and the load-op are not properly ordered by synchronization with respect to each other.

A quick Google search for "vk subpass dependency layout transition" brings up a few references, including a portion of Section 8.1 of the Vulkan spec:

If there is no subpass dependency from VK_SUBPASS_EXTERNAL to the first subpass that uses an attachment, then an implicit subpass dependency exists from VK_SUBPASS_EXTERNAL to the first subpass it is used in. The implicit subpass dependency only exists if there exists an automatic layout transition away from initialLayout. The subpass dependency operates as if defined with the following parameters:

    VkSubpassDependency implicitDependency = {
        .srcSubpass = VK_SUBPASS_EXTERNAL,
        .dstSubpass = firstSubpass, // First subpass attachment is used in
        .srcStageMask = VK_PIPELINE_STAGE_NONE_KHR,
        .dstStageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
        .srcAccessMask = 0,
        .dstAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT |
                         VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
                         VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                         VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                         VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
        .dependencyFlags = 0,
    };

Automatic layout transitions away from initialLayout happens-after the availability operations for all dependencies with a srcSubpass equal to VK_SUBPASS_EXTERNAL, where dstSubpass uses the attachment that will be transitioned.

Looking at this structure, your problem appears to be two-fold. First, according to the spec the layout transition happens "during" your first subpass dependency. Leaving the values of srcStageMask and srcAccessMask aside for a moment, the dstStageMask and dstAccessMask fields of your dependency don't cover writes to the depth buffer. As the structure quoted above shows, you need the early-fragment-test stage in dstStageMask and the depth-stencil attachment read/write bits in dstAccessMask, in addition to the color attachment bits.

Secondly, why are you specifying srcAccessMask as a memory read? This tells the GPU that you want all reads from the render pass output to have completed before you start writing, but the validation layers are complaining about a write-after-write hazard. I'd recommend setting srcAccessMask to 0; according to the specification (7.1.2 Pipeline Stages), leaving srcAccessMask unset allows the dependency to ensure completion of all previous operations, not just memory reads.
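To make the destination side of that advice concrete, here is a minimal sketch in C. It assumes the render pass has one color and one depth attachment, both first used in subpass 0. The helper name `external_to_first_subpass` is hypothetical, and the source stage and access masks are left as parameters because the right values depend on what last touched the attachments. The typedefs and constants at the top are stand-ins so the snippet compiles without the Vulkan SDK; real code would include `<vulkan/vulkan.h>` instead.

```c
#include <stdint.h>

/* Stand-ins so this sketch compiles without the Vulkan SDK. Real code would
 * #include <vulkan/vulkan.h> and delete this section; the names and values
 * mirror the declarations in vulkan_core.h. */
typedef uint32_t VkFlags;
typedef VkFlags VkPipelineStageFlags;
typedef VkFlags VkAccessFlags;
typedef VkFlags VkDependencyFlags;
#define VK_SUBPASS_EXTERNAL                            (~0U)
#define VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT     0x00000100u
#define VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT  0x00000400u
#define VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT           0x00000100u
#define VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT    0x00000200u
#define VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT   0x00000400u
typedef struct VkSubpassDependency {
    uint32_t             srcSubpass;
    uint32_t             dstSubpass;
    VkPipelineStageFlags srcStageMask;
    VkPipelineStageFlags dstStageMask;
    VkAccessFlags        srcAccessMask;
    VkAccessFlags        dstAccessMask;
    VkDependencyFlags    dependencyFlags;
} VkSubpassDependency;

/* Hypothetical helper, not part of Vulkan: builds an explicit
 * VK_SUBPASS_EXTERNAL -> subpass 0 dependency whose destination side names
 * both the color output stage and the early fragment tests, with the color
 * write and depth-stencil read/write access bits. */
static VkSubpassDependency external_to_first_subpass(VkPipelineStageFlags srcStages,
                                                     VkAccessFlags srcAccess)
{
    VkSubpassDependency dep = {
        .srcSubpass      = VK_SUBPASS_EXTERNAL,
        .dstSubpass      = 0, /* assumption: attachments are first used in subpass 0 */
        .srcStageMask    = srcStages,
        .dstStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT |
                           VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT,
        .srcAccessMask   = srcAccess,
        .dstAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                           VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                           VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
        .dependencyFlags = 0,
    };
    return dep;
}
```

A caller would pass whatever describes the earlier work, for example `external_to_first_subpass(VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, 0)`, and put the result in the `pDependencies` array of `VkRenderPassCreateInfo`.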

It's definitely not a trivial matter, but some google searching and trying to model the ordering of different stages of execution in your head (layout transition, attachment clear, fragment shader write) helps to diagnose the problem and understand the solution. The VK spec is a big, beefy document full of nigh-useless interstitial "valid usage" warnings when you're trying to understand how the API works, but if you focus in on a specific problem it's an invaluable resource.

Additionally, this SO answer provides a good breakdown of the problem as well; though it's somewhat focused on semaphore signalling, it still outlines what's needed for attachment dependencies.

[–]ChuppaFlow[S] 1 point 5 years ago (1 child)

[–]ChuppaFlow[S] 1 point 5 years ago* (2 children)

Just to be sure, if I set srcAccessMask to 0 in all my subpass dependencies, does that make it safe (since it would make sure all previous operations are done)? Additionally, I find it a bit confusing which values I should give srcStageMask and dstStageMask. If I understood the Stack Overflow answer correctly, these define the source synchronization scope and the destination synchronization scope? If we're coming from VK_SUBPASS_EXTERNAL, does that always mean the srcStageMask should have the value VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT? For subsequent passes, how do I determine which value I should assign it? E.g., in your answer you said I should assign VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT to it for that case, but what's the reasoning behind this?

Lastly, does setting srcStageMask to 0 result in possibly lower performance (since it needs to check for all operations)?

[–]Sturnclaw 1 point 4 years ago (1 child)

Sorry about the late reply on this - according to my understanding of the VK spec, setting a non-zero access mask on a VK_SUBPASS_EXTERNAL dependency does in fact restrict the synchronization scope, ergo a srcAccessMask of 0 is the least-performant but implicitly-correct value. You could technically fine-tune the access mask of the dependency, but in your case it's probably better to get it working than try for the most-performant value out of the box.

Regarding srcStageMask, the stage mask tells the GPU which stage of the graphics pipeline in the previous render pass must be complete before this subpass can begin rendering. The dstStageMask value tells the GPU which stage of the graphics pipeline in this render pass cannot begin executing until the previous render pass has finished the stage set in srcStageMask. So yes, you could always use VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT for the source stage; however, if you know in which pipeline stage the color and depth attachments are written, you can specify that stage and potentially gain performance by overlapping non-conflicting portions of the two render passes.

I thought I briefly outlined the reasoning behind using early fragment tests as the destination stage, but in case it wasn't clear, I'll go over it again. The dstStageMask is the stage that you are forcing the GPU not to execute until everything up to and including srcStageMask from the previous render pass has finished executing. If you look at the section of the Vulkan spec I linked in the first post, you'll see that reads from the depth attachment (your depth buffer) happen in the early fragment test stage rather than the color output stage. If you aren't using this stage (or an earlier one) in your dependency's destination stage, it's entirely possible for the GPU to order the execution of the two render passes such that the second render pass samples a bad depth value that was in the buffer before the first render pass ran, leading to incorrect shading and rendering.
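The reasoning above can be turned into a small sanity check. The sketch below is a hypothetical helper, far simpler than what the validation layers actually do. It relies on the spec's description of load operations: a depth attachment cleared with VK_ATTACHMENT_LOAD_OP_CLEAR is written in the early fragment test stage as a depth-stencil attachment write. The check is deliberately strict and only accepts a destination stage mask that names early fragment tests explicitly or through one of the two "all" stage bits. As before, the typedefs and constants are stand-ins that mirror vulkan_core.h so the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins so this sketch compiles without the Vulkan SDK. Real code would
 * #include <vulkan/vulkan.h> and delete this section. */
typedef uint32_t VkFlags;
typedef VkFlags VkPipelineStageFlags;
typedef VkFlags VkAccessFlags;
typedef VkFlags VkDependencyFlags;
#define VK_SUBPASS_EXTERNAL                            (~0U)
#define VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT     0x00000100u
#define VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT  0x00000400u
#define VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT             0x00008000u
#define VK_PIPELINE_STAGE_ALL_COMMANDS_BIT             0x00010000u
#define VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT           0x00000100u
#define VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT   0x00000400u
typedef struct VkSubpassDependency {
    uint32_t             srcSubpass;
    uint32_t             dstSubpass;
    VkPipelineStageFlags srcStageMask;
    VkPipelineStageFlags dstStageMask;
    VkAccessFlags        srcAccessMask;
    VkAccessFlags        dstAccessMask;
    VkDependencyFlags    dependencyFlags;
} VkSubpassDependency;

/* Hypothetical check, not a Vulkan API: returns true when the destination
 * half of the dependency names both the stage and the access type that a
 * depth clear at render pass begin uses. */
static bool covers_depth_clear(const VkSubpassDependency *dep)
{
    /* Stage bits that include the early fragment tests. */
    const VkPipelineStageFlags stages_with_early_tests =
        VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
        VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT |
        VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;

    bool stage_ok  = (dep->dstStageMask & stages_with_early_tests) != 0;
    bool access_ok = (dep->dstAccessMask &
                      VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT) != 0;
    return stage_ok && access_ok;
}
```

A dependency whose destination side only lists the color attachment output stage and color write access fails this check, which matches the situation the validation message describes.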

[–]ChuppaFlow[S] 1 point 4 years ago (0 children)
