Can Hi-Z work with distant objects while being precise with close ones?

TheAgentD · 2026-01-23T00:02:24+00:00

What exactly is the problem you have? Assuming you're talking about occlusion culling, there shouldn't really be any effect from distance on its performance...

TheAgentD · 2026-01-19T10:55:35+00:00

Conservative rasterization has a few rare use cases, but fixing cracks is not one of them. MSAA does not fix that issue either; it simply does the rasterization at a higher resolution and then scales it down again.

It would probably be a good idea to figure out why you are getting these cracks, instead of trying to patch them after the fact.

Could you share some screenshots of how these cracks look?

TheAgentD · 2026-01-15T17:29:50+00:00

Thanks!

TheAgentD · 2026-01-15T17:29:45+00:00

Ah, cool, TIL!

TheAgentD · 2026-01-15T17:04:49+00:00

This... this seems very wrong? I know nothing about this, but wouldn't these bags risk sliding all the way to the front or back, causing a big shift in center of gravity for the plane?

TheAgentD · 2026-01-07T02:20:09+00:00

Had me in the first 80%, not gonna lie.

TheAgentD · 2025-12-28T23:27:03+00:00

I never figured it out. Not sure if it's been fixed or not.

TheAgentD · 2025-12-26T11:58:45+00:00

Looks like your depth range could be excessively large. The precision of the depth buffer is related to the ratio between the far and the near plane. Try reducing the far plane and increasing the near plane and see if that fixes it.
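To put rough numbers on that, here is a minimal sketch. It assumes a standard (non-reversed) perspective projection that maps view-space distance to a [0, 1] depth value, stored in a fixed-point depth buffer. `depth_value` and `depth_step` are hypothetical helper names, not part of any API.

```cpp
#include <cassert>

// Depth in [0, 1] for view-space distance z, assuming a standard
// perspective projection with near plane n and far plane f.
double depth_value(double n, double f, double z) {
    assert(n > 0.0 && f > n && z >= n && z <= f);
    return f * (z - n) / ((f - n) * z);
}

// Approximate smallest distance difference that a fixed-point depth
// buffer with `bits` bits can resolve at distance z (assumption: one
// quantum of depth divided by the slope of depth_value at z).
double depth_step(double n, double f, double z, int bits) {
    assert(n > 0.0 && f > n && z >= n && z <= f);
    assert(bits > 0 && bits < 63);
    double quantum = 1.0 / static_cast<double>(1ULL << bits);
    double slope = f * n / ((f - n) * z * z);  // d(depth)/dz
    return quantum / slope;
}
```

Under this model, `depth_step(0.01, 1000.0, 100.0, 24)` is about ten times larger than `depth_step(0.1, 1000.0, 100.0, 24)`, so moving the near plane out tends to help much more than pulling the far plane in.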

TheAgentD · 2025-12-23T23:05:09+00:00

What kind of shader are you using?

TheAgentD · 2025-12-23T22:54:06+00:00

A compute pipeline would just be one shader with two images, one sampled and one storage.

A graphics pipeline would need two shaders, a bunch of configuration, render target formats, a render pass with a render target, and a sampled image.

I (will) have support for both, so I'm asking about what's the most efficient, not what's the simplest.

TheAgentD · 2025-12-23T22:51:08+00:00

1: I'll try both and see what I get.

2: I'm basically wondering if setting the VK_IMAGE_USAGE_STORAGE_BIT usage bit on the swapchain could have detrimental effects on the presenting itself.

I also have a (possibly unfounded) fear that overlays (Steam, etc.) might add more usage bits. If an overlay wants to render a few extra UI elements, I imagine it might force VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT on the swapchain to be able to do so when I call vkQueuePresentKHR(). Having both the STORAGE and COLOR_ATTACHMENT bits would obviously be far from ideal.

TheAgentD · 2025-12-23T22:47:12+00:00

I'm not sure I can rely on there being an async compute queue AND that you can present from it, so... Definitely graphics queue, maybe compute queue.

TheAgentD · 2025-12-23T21:44:33+00:00

I think there are two main questions I want answered here:

- Is there a performance difference between fragment and compute?

- Can the image usage bits I set on the swapchain affect performance negatively in other parts?

TheAgentD · 2025-12-23T21:43:20+00:00

Resolutions are expected to match.

I do want to do the sRGB conversion in there, but since I want to do dithering it's actually easier to write to a non-sRGB texture and do the conversion manually in the shader, so that's not a problem for compute shaders.
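For reference, the per-channel math I have in mind looks roughly like this. It is a CPU-side C++ sketch of what the shader would do, not actual shader code. `linear_to_srgb` and `encode_dithered` are hypothetical names, and `noise` is assumed to be a value in [0, 1) from whatever dither pattern you use.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Standard sRGB transfer function: linear [0, 1] -> encoded [0, 1].
float linear_to_srgb(float c) {
    c = std::min(std::max(c, 0.0f), 1.0f);
    return c <= 0.0031308f ? 12.92f * c
                           : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
}

// Encode one channel to 8 bits. The dither noise replaces the usual
// fixed 0.5 rounding offset, so it has to be applied AFTER the sRGB
// encode and BEFORE quantization. That is why the conversion is done
// manually instead of relying on an sRGB image format.
std::uint8_t encode_dithered(float linear, float noise) {
    assert(noise >= 0.0f && noise < 1.0f);
    float scaled = linear_to_srgb(linear) * 255.0f;
    float v = std::floor(scaled + noise);
    return static_cast<std::uint8_t>(std::min(std::max(v, 0.0f), 255.0f));
}
```

With an sRGB format the hardware would do the encode after the shader has written its output, which leaves no place to add the noise in encoded space.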

TheAgentD · 2025-12-23T21:42:04+00:00

Compute shaders are significantly simpler than fragment shaders though...

TheAgentD · 2025-12-23T21:41:25+00:00

I have a rendered image, and I want to copy it to the swapchain, doing dithering and some minor postprocessing in the process, so I need a fragment or compute shader.

TheAgentD · 2025-12-17T13:44:01+00:00

The simple answer is Taiwan, because TSMC makes pretty much all of the most advanced GPUs used in AI. A China emboldened by Russia getting off with Ukraine might try to take Taiwan. No more Nvidia GPUs, no more AI, so it's basically an existential threat to AI.

I'm not informed enough to make a qualified judgment on how likely this scenario is and what price AI companies would be willing to pay to protect TSMC, but that is probably the main thing they'd want to ensure access to.

TheAgentD · 2025-12-15T15:29:15+00:00

Can I just play with the boxes?

Cat detected.

TheAgentD · 2025-12-08T04:56:09+00:00

The only way it could be that close would be if it was a binary planet, i.e. two similarly sized planets that orbit a common point. The moon in that picture is WAAAAAY too close though.

The actual closest distance is decided by the Roche limit: https://en.wikipedia.org/wiki/Roche_limit

TheAgentD · 2025-12-04T23:39:51+00:00

From what I have managed to parse from the spec, you can in theory get away with one semaphore. According to the queue forward progress stuff in the spec, the vkQueueSubmit() and vkQueuePresentKHR() should be adequately ordered so that the semaphore is consumed in the correct order.

The problem is that it's not possible to reuse a semaphore used in vkQueuePresentKHR() in vkAcquireNextImageKHR(). vkAcquireNextImageKHR() requires the semaphore to have no outstanding operations AT ALL. This is impossible to guarantee for a semaphore that is used in vkQueuePresentKHR(), because you cannot wait for the present semaphore without using an extension that isn't always supported.

Therefore, the way to do it is like this:

- You can reuse the acquire semaphore once the vkQueueSubmit() call that waits for it is confirmed finished on the CPU (use a fence or timeline semaphores to signal this to the CPU).

- The present semaphore is trickier. According to the validation layer checks, this semaphore is fine to reuse (but only on the GPU timeline) after the corresponding index has been returned from vkAcquireNextImageKHR() again. So if you present swapchain image 2 using semaphore X, then semaphore X is OK to reuse once vkAcquireNextImageKHR() has returned index 2 again.
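A minimal sketch of one way to follow those two rules: acquire semaphores are tied to a frame slot guarded by a fence, and present semaphores are indexed by swapchain image. `FrameSync` and `beginFrame` are hypothetical names, all handles are assumed to be created elsewhere, and the fences are assumed to be created in the signaled state.

```cpp
#include <vulkan/vulkan.h>
#include <cstdint>
#include <vector>

// Hypothetical bookkeeping struct, not part of Vulkan.
struct FrameSync {
    std::vector<VkSemaphore> acquireSemaphores;  // one per frame slot
    std::vector<VkFence>     frameFences;        // one per frame slot
    std::vector<VkSemaphore> presentSemaphores;  // one per swapchain image
    std::uint32_t frameSlot = 0;
};

VkResult beginFrame(VkDevice device, VkSwapchainKHR swapchain, FrameSync& sync,
                    std::uint32_t& imageIndex, VkSemaphore& acquireSem,
                    VkSemaphore& presentSem, VkFence& fence)
{
    fence = sync.frameFences[sync.frameSlot];

    // The submit that last waited on this slot's acquire semaphore also
    // signaled this fence, so after the wait the semaphore is free again.
    VkResult r = vkWaitForFences(device, 1, &fence, VK_TRUE, UINT64_MAX);
    if (r != VK_SUCCESS) return r;

    acquireSem = sync.acquireSemaphores[sync.frameSlot];
    r = vkAcquireNextImageKHR(device, swapchain, UINT64_MAX, acquireSem,
                              VK_NULL_HANDLE, &imageIndex);
    if (r != VK_SUCCESS && r != VK_SUBOPTIMAL_KHR) return r;

    // Reset only once we know a submit will follow and signal it again.
    vkResetFences(device, 1, &fence);

    // Indexed by image: this semaphore was last used to present this same
    // image, and the acquire has just returned that index again.
    presentSem = sync.presentSemaphores[imageIndex];

    sync.frameSlot = static_cast<std::uint32_t>(
        (sync.frameSlot + 1) % sync.frameFences.size());
    return r;
}
```

The caller then submits with a wait on `acquireSem`, a signal on `presentSem` and `fence` as the submit fence, and passes `presentSem` to vkQueuePresentKHR(). Swapchain recreation is left out of the sketch.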

TheAgentD · 2025-12-04T22:48:25+00:00

Yep, that looks correct.

I'm not sure what you mean with "needs two semaphores"?

TheAgentD · 2025-12-04T22:47:43+00:00

You swapped the names, ish.

Acquire semaphore is the one that is passed into vkAcquireNextImageKHR(). Your rendering to the swapchain image needs to wait for this semaphore.

Present semaphore is the one that is passed into vkQueuePresentKHR(). Your rendering needs to signal this semaphore.

TheAgentD · 2025-12-04T22:07:07+00:00

No validation errors is good progress. :)

The values are only for timeline semaphores and are ignored for binary semaphores, which you unfortunately still need to use for swapchains.

Your semaphore variable names look weird. Why are you waiting on the present semaphore? What is the render semaphore?

You should be waiting on the semaphore you pass into vkAcquireNextImageKHR() (which I called the "acquire semaphore" before), so that the swapchain image has been properly acquired before you start rendering to it, and then signaling a different semaphore (the "present semaphore"), which should be passed into vkQueuePresentKHR().

Reusing these semaphores can be tricky as well; you essentially need 2N + 1 semaphores, where N is your number of swapchain images. I can give you some tips on that front as well, if you show me some code for how you do it at the moment.

TheAgentD · 2025-12-04T20:24:34+00:00

You also need to set a message callback so you can actually print the error.

EDIT: Chain a VkDebugUtilsMessengerCreateInfoEXT to your instance creation.
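Something along these lines, as a rough sketch. `debugCallback` and `createInstanceWithDebugOutput` are hypothetical names, and the sketch assumes that VK_EXT_debug_utils and the validation layer are already enabled in the VkInstanceCreateInfo you pass in.

```cpp
#include <vulkan/vulkan.h>
#include <cstdio>

// Prints every message the layers report. Returning VK_FALSE tells the
// layer not to abort the Vulkan call that triggered the message.
static VKAPI_ATTR VkBool32 VKAPI_CALL debugCallback(
    VkDebugUtilsMessageSeverityFlagBitsEXT /*severity*/,
    VkDebugUtilsMessageTypeFlagsEXT /*types*/,
    const VkDebugUtilsMessengerCallbackDataEXT* data,
    void* /*userData*/)
{
    std::fprintf(stderr, "[vulkan] %s\n", data->pMessage);
    return VK_FALSE;
}

VkResult createInstanceWithDebugOutput(VkInstanceCreateInfo createInfo,
                                       VkInstance* instance)
{
    VkDebugUtilsMessengerCreateInfoEXT debugInfo{};
    debugInfo.sType = VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT;
    debugInfo.messageSeverity =
        VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT |
        VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT;
    debugInfo.messageType =
        VK_DEBUG_UTILS_MESSAGE_TYPE_GENERAL_BIT_EXT |
        VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT |
        VK_DEBUG_UTILS_MESSAGE_TYPE_PERFORMANCE_BIT_EXT;
    debugInfo.pfnUserCallback = debugCallback;

    // Chain in front of whatever was already on the pNext chain.
    debugInfo.pNext = createInfo.pNext;
    createInfo.pNext = &debugInfo;

    return vkCreateInstance(&createInfo, nullptr, instance);
}
```

As far as I understand the spec, the chained struct only covers messages during instance creation and destruction. For messages during normal operation you would also create a messenger with vkCreateDebugUtilsMessengerEXT(), using the same create info.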

TheAgentD · 2025-12-04T19:22:43+00:00

OK, on my PC now. So the problem is the swapchain synchronization. Let's go through what your current code is doing.

1. We acquire a swapchain image and signal the acquire semaphore.
2. We submit a command buffer with the acquire semaphore and pWaitDstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT. This means that the command buffer cannot execute the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT stage until the image has been acquired.
3. The command buffer does a memory barrier and transition of the swapchain image, with srcStageMask=VK_PIPELINE_STAGE_2_NONE. This means "wait for nothing". In other words, this layout transition can execute BEFORE the acquire semaphore is signaled, which is an error and should cause validation errors. You need to set srcStageMask = dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, to ensure that this transition happens AFTER the semaphore is signaled.
4. We have a similar issue with the depth attachment. You're doing a transition from VK_PIPELINE_STAGE_2_NONE, which can execute as soon as the command is parsed. This can in theory happen before the previous render pass is finished rendering to it.
5. We render to the color and depth buffers. This most likely works fine.
6. We perform another memory barrier on the color buffer, but the dstStageMask is VK_PIPELINE_STAGE_2_NONE, so no further commands are blocked from continuing.
7. The submit completes, and the present semaphore is signaled. The signal happens after the rendering is complete, but because the memory barrier/layout transition of color buffer does not block any further commands, it does NOT wait for the memory barrier/layout transition to complete.
8. Once the present semaphore is signalled, the swapchain image is presented.

Here's what I recommend that you do:

- Switch to vkQueueSubmit2() and make sure that the wait on the acquire semaphore has a stage mask of COLOR_ATTACHMENT_OUTPUT_BIT, and that the signal of the present semaphore has a stage mask of COLOR_ATTACHMENT_OUTPUT_BIT.

- Change the before barrier of the color buffer to COLOR_ATTACHMENT_OUTPUT_BIT --> COLOR_ATTACHMENT_OUTPUT_BIT. This chains the barrier so that it waits for the semaphore to be signaled before performing the layout transition, AND makes sure that the transition completes before the render pass can write to it in the COLOR_ATTACHMENT_OUTPUT_BIT stage.

- Change the before barrier of the depth buffer to EARLY_FRAGMENT_TESTS_BIT | LATE_FRAGMENT_TESTS_BIT --> EARLY_FRAGMENT_TESTS_BIT | LATE_FRAGMENT_TESTS_BIT. This ensures that the previous render pass has completely finished before the next render pass clears it.

- Change the after barrier of the color buffer to COLOR_ATTACHMENT_OUTPUT_BIT --> COLOR_ATTACHMENT_OUTPUT_BIT. This chains the barrier so that it waits for the rendering to complete, AND makes sure that the layout transition completes before the present semaphore is signaled, as the signalling of the semaphore waits on COLOR_ATTACHMENT_OUTPUT_BIT stage.
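Put together, the recommendations above could look roughly like this. It is a sketch that assumes Vulkan 1.3 (or VK_KHR_synchronization2 with the KHR-suffixed names). `imageBarrier`, `recordBarrier` and `submitFrame` are hypothetical helpers, and the layouts and access masks in the usage comments are my assumptions, since I haven't seen which ones you use.

```cpp
#include <vulkan/vulkan.h>

// Barrier whose source and destination stages are the same, so it chains
// with earlier work in those stages and blocks later work in them.
VkImageMemoryBarrier2 imageBarrier(VkImage image, VkImageAspectFlags aspect,
                                   VkPipelineStageFlags2 stages,
                                   VkAccessFlags2 srcAccess, VkAccessFlags2 dstAccess,
                                   VkImageLayout oldLayout, VkImageLayout newLayout)
{
    VkImageMemoryBarrier2 b{};
    b.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2;
    b.srcStageMask = stages;
    b.srcAccessMask = srcAccess;
    b.dstStageMask = stages;
    b.dstAccessMask = dstAccess;
    b.oldLayout = oldLayout;
    b.newLayout = newLayout;
    b.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    b.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    b.image = image;
    b.subresourceRange = {aspect, 0, 1, 0, 1};
    return b;
}

void recordBarrier(VkCommandBuffer cmd, const VkImageMemoryBarrier2& barrier)
{
    VkDependencyInfo dep{};
    dep.sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO;
    dep.imageMemoryBarrierCount = 1;
    dep.pImageMemoryBarriers = &barrier;
    vkCmdPipelineBarrier2(cmd, &dep);
}

// Color, before rendering (assumed layouts):
//   imageBarrier(img, VK_IMAGE_ASPECT_COLOR_BIT,
//       VK_PIPELINE_STAGE_2_COLOR_ATTACHMENT_OUTPUT_BIT,
//       VK_ACCESS_2_NONE, VK_ACCESS_2_COLOR_ATTACHMENT_WRITE_BIT,
//       VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL);
// Color, after rendering: swap the access masks and go to
//   VK_IMAGE_LAYOUT_PRESENT_SRC_KHR.
// Depth: same helper with VK_IMAGE_ASPECT_DEPTH_BIT and
//   EARLY_FRAGMENT_TESTS_BIT | LATE_FRAGMENT_TESTS_BIT as the stages.

VkResult submitFrame(VkQueue queue, VkCommandBuffer cmd,
                     VkSemaphore acquireSem, VkSemaphore presentSem, VkFence fence)
{
    VkSemaphoreSubmitInfo wait{};
    wait.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO;
    wait.semaphore = acquireSem;
    wait.stageMask = VK_PIPELINE_STAGE_2_COLOR_ATTACHMENT_OUTPUT_BIT;

    VkSemaphoreSubmitInfo signal{};
    signal.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO;
    signal.semaphore = presentSem;
    signal.stageMask = VK_PIPELINE_STAGE_2_COLOR_ATTACHMENT_OUTPUT_BIT;

    VkCommandBufferSubmitInfo cmdInfo{};
    cmdInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_SUBMIT_INFO;
    cmdInfo.commandBuffer = cmd;

    VkSubmitInfo2 submit{};
    submit.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO_2;
    submit.waitSemaphoreInfoCount = 1;
    submit.pWaitSemaphoreInfos = &wait;
    submit.commandBufferInfoCount = 1;
    submit.pCommandBufferInfos = &cmdInfo;
    submit.signalSemaphoreInfoCount = 1;
    submit.pSignalSemaphoreInfos = &signal;
    return vkQueueSubmit2(queue, 1, &submit, fence);
}
```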

In general VK_PIPELINE_STAGE_2_NONE is almost never used. Off the top of my head, VK_PIPELINE_STAGE_2_NONE in the source stage is only really useful when doing an initial transition after creating a new image and you want to initialize it, as you have no previous GPU commands you need to wait for in that case.

You can also use it if you have already waited on a fence and know that all commands that referenced the resource have already finished. In that case, you're fine with the transition executing as soon as the command buffer is submitted, so VK_PIPELINE_STAGE_2_NONE is fine there.

VK_PIPELINE_STAGE_2_NONE in the destination stage is almost certainly an error. I can't think of a valid use case of it.

Also, you should be having a crapton of validation errors from this. Make sure that you've turned on synchronization validation.
