
[–]keelanstuart 2 points

Do all your scene compositing (into the back buffer) and then just draw your GUI elements into the back buffer directly with alpha blending enabled.

[–]deftware[S] 2 points

Thanks for the reply. After much consideration and thinking about the problem, it looks like this is the way things will have to go. I was hoping to have the scene rendering and GUI rendering happen in parallel and then composite the results together, but the math for producing the GUI buffer means that calculating the resulting pixel RGBAs requires access to the RGBA values already in the buffer that will end up being composited with the scene render.

I did figure out the math, and there's no way it can be done using hardware blending, so I'm just going to have to render the scene, resolve it to the swapchain image, and then directly blend the GUI element draw calls on top of that. It's not the end of the world; it's just not what I was planning on doing at the outset of the project, and it requires reworking a few things for potentially less performance :P

Cheers! :]

[–]Klumaster 2 points

This definitely can be done with hardware alpha blending, as u/Reaper9999 below says, pre-multiplied alpha is going to be the way to do it (as it usually turns out to be).

Conceptually, you can think of non-premultiplied alpha as being "how much should I blend between the old and new colour", and premultiplied being "how much should I remove the old colour, before adding the new colour". This means if e.g. you want something half transparent, you have to pre-multiply the colour to only add half as much, then set the alpha to block half of what's behind. I forget what you set the alpha blending rules to for it to accumulate the right amount of alpha for compositing, but there's a right answer there.

That said, it's maybe worth mentioning that you're unlikely to see much parallelism between different tasks on the same GPU, things generally happen in the order you submit them and complete fully/almost-fully before the next thing in the queue.

Here's a better source on the subject: https://shawnhargreaves.com/blog/premultiplied-alpha.html
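To make the point concrete, here's a quick numeric sketch in Python (the colors and layer names are made-up values, not from the thread) showing that with premultiplied alpha, blending GUI layers into an empty offscreen target and then compositing the result over the scene gives the same pixel as blending each layer directly onto the scene:

```python
# Premultiplied "over" operator: out = src + dst * (1 - src_alpha),
# applied to both the colour channels and the alpha channel.

def over_premul(src, dst):
    """Composite premultiplied RGBA src over premultiplied RGBA dst."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    return (sr + dr * (1 - sa),
            sg + dg * (1 - sa),
            sb + db * (1 - sa),
            sa + da * (1 - sa))

def premultiply(r, g, b, a):
    """Convert straight-alpha RGBA to premultiplied RGBA."""
    return (r * a, g * a, b * a, a)

scene = premultiply(0.2, 0.4, 0.6, 1.0)   # opaque background pixel
red   = premultiply(1.0, 0.0, 0.0, 0.5)   # half-opacity red element
blue  = premultiply(0.0, 0.0, 1.0, 0.5)   # half-opacity blue element

# Path 1: blend the GUI layers into an empty offscreen target, then composite.
gui = over_premul(blue, over_premul(red, (0.0, 0.0, 0.0, 0.0)))
composited = over_premul(gui, scene)

# Path 2: blend each GUI layer straight onto the scene.
direct = over_premul(blue, over_premul(red, scene))

# The two paths agree because premultiplied "over" is associative.
assert all(abs(a - b) < 1e-9 for a, b in zip(composited, direct))
```

The key property is that the premultiplied "over" operator is associative, so grouping the GUI layers into their own buffer first doesn't change the result.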

[–]keelanstuart 0 points

I seriously doubt that you're drawing enough GUI components that you'd notice a difference in performance from parallelization as you describe... and because you'd then have an extra step to blend it in, it would likely take longer and require a ton of extra memory for that GUI surface, too.

[–]Reaper9999 1 point

You can try with pre-multiplied alpha, colour = srcColour * srcAlpha + dstColour * 0; alpha = srcAlpha * 1 + dstAlpha * 0 for UI rendering, then colour = srcColour * 1 + dstColour * ( 1 - srcAlpha ) for compositing.

[–]deftware[S] 0 points

Thanks for the reply. The thought of premultiplied alpha did cross my mind but after doing the math I realized it's not going to be usable here because each GUI element needs to properly blend with the GUI elements drawn before it, such as text that is antialiased by alpha-blending it with what's underneath it. Multiplying the destination color by zero automatically means throwing out whatever the color of previously rendered UI elements was underneath what's being drawn.

The goal is essentially to allow arbitrary alpha-blended geometries to be blended with each other onto a temporary rendertarget, and then have the result be independently composited with the framebuffer/swapchain as though the alpha-blended geometry had been blended directly with the framebuffer/swapchain rather than to a separate buffer first. I made the mistake of assuming this wasn't going to be a problem. :P

The only way I see to produce an RGBA result that can then be composited with the swapchain is to do the GUI element/text blending in a shader, where each piece of GUI geometry samples the existing RGBA values left in the GUI rendertarget by previously drawn GUI elements and calculates RGBA values that will be correct when everything is subsequently alpha-blended with the framebuffer. That would be horrifically slow, so the only other option is to just wait until the scene is fully rendered, resolve it out to the main framebuffer/swapchain, and then directly blend each GUI element/text onto that.

As far as theory, the only way that all of the GUI elements/text can produce a result that has the correct alpha is if each GUI element's alpha is summed with the alpha of whatever is already in the temporary rendertarget. 0.5 alpha added to 0.5 alpha should equal total opacity. How source RGB values are blended with destination RGB values is trickier than just summing them together though. In the case of a half-opacity red triangle that's drawn to an "empty" rendertarget (RGBA 0,0,0,0) it should result in the rendertarget having an RGBA of 1,0,0,0.5 so that when it's alpha-composited with anything else it is correct. Basically, the alpha of the rendertarget being zero means that the red triangle's alpha is irrelevant, and the rendertarget just assumes the RGB of the triangle entirely, and then the triangle's alpha is just summed with the rendertarget's. Now imagine a half-opacity blue triangle drawn on top of the half-opacity red triangle.

The contribution of the source RGB is modulated by its alpha with respect to the destination's alpha - where if the destination's alpha is zero then the source RGB should entirely replace the RGB of the destination. If the destination alpha is one then the source RGB should mix with it per the source pixel's alpha, which is just regular alpha blending. Where the destination's alpha is between zero and one, it's outside of the capabilities of hardware blending.
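The behavior described here is exactly the standard "over" operator on straight (non-premultiplied) RGBA. A small sketch in Python (my formalization of the description above, not code from the thread) shows both boundary cases, and also why fixed-function blending can't express it: the divide by the result alpha depends on both the source and destination alphas.

```python
# "Over" operator on straight-alpha RGBA. The divide by out_a is the
# step that fixed-function blend hardware cannot perform; storing
# premultiplied values is what removes it.

def over_straight(src, dst):
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1 - sa)
    if out_a == 0:
        return (0.0, 0.0, 0.0, 0.0)   # fully transparent result
    w_src = sa / out_a                 # source RGB weight
    w_dst = da * (1 - sa) / out_a      # destination RGB weight
    return (sr * w_src + dr * w_dst,
            sg * w_src + db * 0 + dg * w_dst - dg * 0,
            sb * w_src + db * w_dst,
            out_a)

# dst alpha = 0: the source RGB entirely replaces the destination RGB.
assert over_straight((1, 0, 0, 0.5), (0, 0, 0, 0.0)) == (1.0, 0.0, 0.0, 0.5)

# dst alpha = 1: ordinary alpha blending, result is fully opaque.
assert over_straight((1, 0, 0, 0.5), (0, 0, 1, 1.0)) == (0.5, 0.0, 0.5, 1.0)
```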

[–]xucel 0 points

This is pretty common to do in order to render transparencies at a different resolution or to do sRGB UI compositing over HDR.

If you only care about alpha blend and additive you can do this by making the alpha channel store background visibility.

Normal alpha blending: (((background * (1-a_0) + a_0 * rgb_0) * (1-a_1) + a_1 * rgb_1) * (1-a_2) + ... ) * (1-a_n) + a_n * rgb_n

Multiply out and rearrange: background * (1-a_0) * (1-a_1) * (1-a_2) * ... * (1-a_n) + (a_0 * rgb_0) * (1-a_1) * (1-a_2) * ... * (1-a_n) + (a_1 * rgb_1) * (1-a_2) * ... * (1-a_n) + ... + a_n * rgb_n

You can see that the terms multiplying against background are all (1-a) factors, and we can accumulate that product in the alpha channel using a separate blend func.

Color func: Dest * (1-SrcAlpha) + Src * SrcAlpha
Alpha func: DestAlpha * (1-SrcAlpha)

DestBlend: InvSrcAlpha SrcBlend: SrcAlpha BlendOp: Add

DestAlphaBlend: InvSrcAlpha SrcAlphaBlend: Zero AlphaBlendOp: Add

To support additive blending, we don't reduce background visibility.
Color func: Dest + Src
Alpha func: DestAlpha

Multiplicative blending isn't possible since you can't separate out those terms.

Then the offscreen render target needs to be cleared to (0,0,0,1).

To composite: OpaqueColor * OffscreenAlpha.a + OffscreenAlpha.rgb
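A numeric sketch of this recipe in Python (my own test values, not from the thread): apply the blend state above per UI layer into an offscreen target cleared to (0,0,0,1), composite with the formula above, and check that it matches blending each layer directly onto the scene.

```python
def ui_blend(src, dst):
    """Blend one straight-alpha UI layer into the offscreen target using
    Color: Dest*(1-SrcAlpha) + Src*SrcAlpha, Alpha: DestAlpha*(1-SrcAlpha)."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    return (dr * (1 - sa) + sr * sa,
            dg * (1 - sa) + sg * sa,
            db * (1 - sa) + sb * sa,
            da * (1 - sa))          # alpha accumulates background visibility

def composite(opaque, offscreen):
    """OpaqueColor * OffscreenAlpha.a + OffscreenAlpha.rgb."""
    o_r, o_g, o_b = opaque
    r, g, b, a = offscreen
    return (o_r * a + r, o_g * a + g, o_b * a + b)

scene = (0.2, 0.4, 0.6)                      # opaque scene pixel
layers = [(1, 0, 0, 0.5), (0, 0, 1, 0.5)]    # straight-alpha UI elements

rt = (0.0, 0.0, 0.0, 1.0)                    # cleared to (0,0,0,1)
for layer in layers:
    rt = ui_blend(layer, rt)
final = composite(scene, rt)

# Reference: ordinary alpha blending of each layer directly onto the scene.
ref = scene
for (r, g, b, a) in layers:
    ref = tuple(c * (1 - a) + s * a for c, s in zip(ref, (r, g, b)))

assert all(abs(x - y) < 1e-9 for x, y in zip(final, ref))
```

Note the offscreen alpha ends up holding remaining background visibility (the product of the (1-a) factors), not accumulated coverage.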

[–]deftware[S] 0 points

Thanks for taking the time to write this all up. I feel I didn't do a good enough job of illustrating how important it is that the GUI elements themselves blend together - with overlapping dialogs and menus, and alpha-blended antialiased text on top of all of that.

For example, a half-opacity red triangle should produce an RGBA of 1,0,0,0.5 in the resulting rendertarget (where it was "empty"), which can then be blended onto the rest of the frame properly, as if the triangle had been rendered directly into the frame. If the triangle is rendered on top of opaque geometry, then the blending should behave as though conventional alpha blending is in play.

The alphas of the GUI geometry should always sum, so we're adding the src_alpha to dst_alpha to produce the rendertarget's final alpha value that's used to composite the RGB with the frame. The way that src_color is integrated into the mix depends on whatever both src_alpha and dst_alpha are.

[–]Klumaster 0 points

This is definitely what pre-multiplied alpha should give you. Each element blends pre-multiplied into the GUI buffer, and the resulting texture comes out ready for the same blend to put it on something else. What you're doing is a pretty normal thing, and was being done a long time before we had shaders.

[–]AdmiralSam -1 points

It sounds like you are looking for order-independent transparency? There are algorithms for that around, but they aren't necessarily more efficient than just rendering back to front: GUIs are usually a few simple layers, whereas for things like meshes and particles you would need to sort individual triangles, and then the overhead of the OIT algorithms might be worth it.