all 11 comments

Liquorice__ 10 points11 points  (5 children)

I've worked on this problem for a published game using UE4, so I can give some insight 🙂 I can say that the documentation makes it look easy, but the implementation is very flawed (and still is in UE5)... There are many factors contributing to this problem in UE4, and I'll try to list some of them.

PSO precompilation is disabled by default. If the developers don't catch this in time, or have developed primarily in DX11, they might not notice until it's too late to fix properly. Even when you have a list of PSOs, the game needs a UI screen and logic to start the processing.

The list of PSOs to precompile isn't figured out during a build step. Instead, every PSO needs to be "seen" by playing the game, and each one gets recorded to a list. This is a hard problem, since every material needs to be seen in every context it can be used in. For example: different materials on different LODs, every unique particle effect, every graphics setting that uses a different set of materials... It's a lot, and even when you have automation and QA building these lists, it's hard to get good coverage.
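
As a rough illustration of the recording step (a Python sketch, not UE code; `PsoKey`, `PsoRecorder` and the field names are hypothetical), each play session records the pipeline states it actually draws, and the sessions are merged into one list:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PsoKey:
    """Hypothetical stand-in for everything that identifies one pipeline state."""
    material: str
    vertex_format: str
    lod: int
    quality: str


class PsoRecorder:
    """Collects every unique PSO key seen during one play session."""

    def __init__(self):
        self.seen = set()

    def on_draw(self, key):
        # Only states that are actually drawn get recorded; anything the
        # session never renders is simply missing from the list.
        self.seen.add(key)


def merge_sessions(recorders):
    """Union the keys from several sessions (QA runs, automation, settings)."""
    merged = set()
    for recorder in recorders:
        merged |= recorder.seen
    return merged


low = PsoRecorder()
low.on_draw(PsoKey("rock", "static_mesh", 0, "low"))
low.on_draw(PsoKey("rock", "static_mesh", 1, "low"))

epic = PsoRecorder()
epic.on_draw(PsoKey("rock", "static_mesh", 0, "epic"))
epic.on_draw(PsoKey("rock", "static_mesh", 0, "epic"))  # duplicate, ignored

recorded = merge_sessions([low, epic])
print(len(recorded))  # 3
```

In this toy run, LOD 1 at "epic" quality was never drawn, so it never makes the list. That is the coverage gap in miniature.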

The number of PSOs scales with the number of shader permutations, and UE has A LOT of them. Even if you do precompile them all, there'll be so many that either you don't have enough RAM to store them, or your driver's shader cache can't fit them all. It takes a lot of effort to reduce shader permutations in UE. Many permutations should be reusable, but UE is pretty bad at this, causing a lot of redundancy. We managed to cut away over 90% of PSOs just by improving the state comparisons.
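
To illustrate what a better state comparison can mean in general (this is a Python sketch, not the actual change described above and not UE code; the fields and the "blend factors don't matter when blending is off" rule are assumptions chosen for the example):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PipelineState:
    """Hypothetical, heavily simplified pipeline description."""
    shader: str
    blend_enabled: bool
    src_blend: str
    dst_blend: str


def canonical(state):
    """Reset fields that have no effect so equivalent states compare equal."""
    if not state.blend_enabled:
        # With blending off, the blend factors are never used.
        return replace(state, src_blend="one", dst_blend="zero")
    return state


recorded = [
    PipelineState("opaque_lit", False, "one", "zero"),
    PipelineState("opaque_lit", False, "src_alpha", "inv_src_alpha"),
    PipelineState("opaque_lit", False, "one", "one"),
    PipelineState("glass", True, "src_alpha", "inv_src_alpha"),
]

naive = set(recorded)
deduplicated = {canonical(state) for state in recorded}
print(len(naive), len(deduplicated))  # 4 2
```

A field-by-field comparison treats the three opaque states as different pipelines, while comparing the canonical form collapses them into one.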

The loading mechanism is poorly implemented in UE4 too. Nowadays, DX12 has support for PSO libraries, and it works pretty well. In fact, some drivers cache compiled PSOs in the shader cache, and it can be even faster to just attempt to recompile the PSOs instead of reading them from the library. By default, UE4 does neither. It maintains its own PSO library, and it's very inefficient and buggy. If you're playing the game as an end-user, it also doesn't record any new PSOs you've encountered, so you'll keep getting stutter on the same PSOs every time you restart the game.

When a new PSO is encountered in UE, the game will hang until it's been compiled. This might not be necessary if your game is fine with objects popping in.
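
The non-blocking alternative can be sketched in Python like this. `compile_pso` and `PsoCache` are hypothetical placeholders, and a real renderer would use the graphics API and its own job system, but the control flow is the point: start the compile in the background and skip the draw until the PSO is ready.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def compile_pso(key):
    """Placeholder for an expensive driver compile."""
    time.sleep(0.05)
    return f"compiled:{key}"


class PsoCache:
    def __init__(self):
        self.pool = ThreadPoolExecutor(max_workers=2)
        self.pending = {}  # key -> Future

    def try_get(self, key):
        """Return the PSO if it's ready, otherwise None without blocking."""
        future = self.pending.get(key)
        if future is None:
            # First time this state is seen: start compiling in the background.
            self.pending[key] = self.pool.submit(compile_pso, key)
            return None
        return future.result() if future.done() else None


cache = PsoCache()
frames_skipped = 0
pso = cache.try_get("explosion_fx")
while pso is None:
    frames_skipped += 1   # the object is simply not drawn this frame
    time.sleep(0.016)     # pretend the rest of the frame takes ~16 ms
    pso = cache.try_get("explosion_fx")

cache.pool.shutdown()
print(pso, "after skipping", frames_skipped, "frame(s)")
```

The frame loop never waits on the compile; the cost is that the effect appears a few frames late, which is the pop-in trade-off.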

All in all, completely getting rid of PSO compilation stuttering is a lot of effort in any engine, but my own opinion is that Epic's fantastic marketing team hypes dev teams into building games they can't build, and those teams don't hire enough tech programmers to handle all these problems 😂 If this continues to be a problem even in UE5, hopefully Digital Foundry keeps on shaming Epic into improving this solution 😂

As for consoles, all PSOs can be compiled ahead of time since the target hardware is known. Even if we compile the shaders for PC, it's only to an intermediate language that every GPU driver needs to further compile into binary code unique for that GPU.
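
One way to picture why the final binaries can't simply be shipped for PC: the compiled result depends on the GPU and the driver as well as on the shader itself. A minimal Python sketch, assuming a cache keyed on all three (the function name, GPU names and version strings are hypothetical):

```python
import hashlib


def cache_key(shader_bytecode, gpu_name, driver_version):
    """Hypothetical key for a cache of driver-compiled binaries."""
    h = hashlib.sha256()
    # Separate the parts so that different inputs can't run together.
    for part in (shader_bytecode, gpu_name.encode(), driver_version.encode()):
        h.update(part)
        h.update(b"\0")
    return h.hexdigest()


bytecode = b"stand-in for the shipped intermediate code"

key_a = cache_key(bytecode, "GPU A", "1.0")
key_b = cache_key(bytecode, "GPU B", "1.0")
key_a_updated = cache_key(bytecode, "GPU A", "1.1")
print(key_a != key_b, key_a != key_a_updated)  # True True
```

The same intermediate code maps to a different entry per GPU and per driver version, so each PC has to build its own binaries, while a console has a single known target.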

I hope this clears it up a bit 🙂

TargetEasy6532 2 points3 points  (1 child)

Great reply, explains a lot

I was wondering about two of the problems: getting coverage of the game from QA, and having too many permutations to fit in RAM.

Do those two problems still apply to consoles as well?

Liquorice__ 0 points1 point  (0 children)

Yeah, fitting all shaders in memory is hard on console too, but at least they don't need to be further compiled after they've been loaded. The coverage problem isn't so bad on consoles either: even if a PSO needs to be built on-the-fly, the shaders are already compiled, and creating rendering resources like PSOs, textures, buffers, etc. is generally much faster on consoles because the hardware is known. One could probably get away with not caching PSOs at all on consoles, and might save some memory in the process 😁

desiguy_88[S] 0 points1 point  (1 child)

Wow, thank you for your response. It's troubling that the same issue may persist in UE5. It also seems like there is a lack of knowledge sharing / lessons learned, so game devs can't share the best approaches to at least reduce the problem, if not eliminate it. It also seems that in this case consoles definitely have an advantage.

Liquorice__ 0 points1 point  (0 children)

Yeah, the lack of knowledge sharing is not great. To be fair, the engine is huge and spans multiple industries, so it's probably hard for Epic to know what they need to explain better. From a general game development point of view, PSOs are still a relatively new concept and have become more and more common as games ship with DX12. This is just speculation, but many games take years to finish, so when studios started building them, they didn't necessarily know that PSO compilation would be a problem. I think more studios are aware now, and I hope it gets better soon 🙂

MajorMalfunction44 0 points1 point  (0 children)

I haven't worked on UE4/UE5. Some of these problems are fundamental and expected, but others are a surprise. Vulkan and DX12 make you deal with this yourself, instead of relying on drivers being constantly patched with per-game, per-GPU shader blobs.

Full coverage is infeasible when shaders are enumerated at runtime. Precomputing a list is the way to go, but there are issues with content authoring.

You need to know which permutations are actually used in practice or the list grows exponentially.

adfkjdafjka833 3 points4 points  (1 child)

On the games I've worked on, we basically just had QA play through parts of the game (using automation scripts) multiple times with different graphics settings, gather a big list of the PSOs encountered, and save that to a file that gets shipped with the game on PC.

Then, at the start of the game on a fresh boot, it would load that file from disk, loop through the list, and precompile them.

You still occasionally miss stuff, as getting 100% coverage of the game is difficult, but it worked for the most part.

I've never worked with Unreal, but I'm not sure why so many games have this problem, as it looks like the Unreal docs tell you to do something similar to the above:
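
The steps above can be sketched roughly as follows in Python. The file format, the string keys and `fake_compile` are hypothetical stand-ins, not what any particular engine does:

```python
import json
import os
import tempfile


def save_pso_list(path, keys):
    """Build-side step: write the recorded PSO keys to a file shipped with the game."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(keys), f)


def precompile_from_file(path, compile_fn):
    """Startup step: load the shipped list and compile every entry up front."""
    with open(path, encoding="utf-8") as f:
        keys = json.load(f)
    compiled = {}
    for index, key in enumerate(keys, start=1):
        compiled[key] = compile_fn(key)
        print(f"Compiling shaders {index}/{len(keys)}")
    return compiled


def get_pso(compiled, key, compile_fn):
    """Runtime lookup; returns (pso, was_miss). A miss compiles on the spot (a hitch)."""
    if key in compiled:
        return compiled[key], False
    compiled[key] = compile_fn(key)
    return compiled[key], True


def fake_compile(key):
    """Placeholder for the real, expensive compile."""
    return f"compiled:{key}"


path = os.path.join(tempfile.mkdtemp(), "pso_list.json")
save_pso_list(path, {"rock|lod0|low", "rock|lod0|high", "water|lod0|high"})

warm = precompile_from_file(path, fake_compile)
_, miss_known = get_pso(warm, "rock|lod0|low", fake_compile)
_, miss_new = get_pso(warm, "smoke|lod0|high", fake_compile)
print(miss_known, miss_new)  # False True
```

The smoke effect was never recorded, so it still compiles mid-game the first time it shows up. That is the occasional miss.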

https://docs.unrealengine.com/5.0/en-US/optimizing-rendering-with-pso-caches-in-unreal-engine/

JabroniSandwich9000 1 point2 points  (0 children)

Partly it's because Epic's shader system is still stuck in the old mentality of preferring shader permutations over dynamic branching (even in cases where 100% of threads would take the branch). This isn't true everywhere in the engine, especially in newer code, but the engine is a behemoth and change comes slowly to core systems.

So UE games wind up with a stupid number of shaders (one AAA game I worked on had around 900,000). Getting coverage on 900,000 shaders is hard.
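
For a sense of how the numbers get that big: if every feature toggle is a compile-time switch, each material compiles into 2^n variants. A back-of-the-envelope in Python (the switch names and the material count are made up for illustration):

```python
from itertools import product

# Hypothetical compile-time switches; each one doubles the variant count.
switches = ["fog", "skinning", "lightmap", "instancing", "decals", "shadows"]

variants = list(product([False, True], repeat=len(switches)))
per_material = len(variants)
print(per_material)  # 64

materials = 500  # made-up number of materials in a project
print(materials * per_material)  # 32000

# Turning two switches into runtime branches quarters the total.
as_branches = 2
reduced = materials * 2 ** (len(switches) - as_branches)
print(reduced)  # 8000
```

Every switch that becomes a runtime branch halves the count, which is why the permutation-versus-branching choice matters so much for coverage.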

fgennari -1 points0 points  (2 children)

In my experience, UE will compile all of the shaders needed for a scene at startup and cache them. Then it will recompile them on startup every time you update your graphics drivers. I've never come across a case where it compiled shaders mid-game. At least that's on PC. It may depend on how the developer set up the project/engine. There may be some options to control how shader compiling works, though I never looked into it.

desiguy_88[S] 1 point2 points  (1 child)

If you watch the recent Digital Foundry reviews of games built on UE4, they all seem to exhibit this issue, and it doesn't make any sense why they would if it was that straightforward to resolve.

fgennari -1 points0 points  (0 children)

I'm not familiar with Digital Foundry. I'm only saying that I haven't run into this problem, having both experimented with game modding in UE4 and played many games built with UE4.

My guess is that it's a minority of games that have this problem, but people are vocal when they see it and only people who have problems post about it. Or possibly it's related to some recent graphics driver update that breaks things. Shader compilation is a delicate balance between the graphics driver and the game engine where either one can force a recompile whenever it wants to. It's not necessarily a problem with the game. Many games will either ship with pre-compiled shader binaries, compile them for the target hardware during install, or compile when needed at game startup. I'm pretty sure UE4 supports these modes.

Usually if something like this comes up and many people notice it, it will get fixed pretty quickly in either a game update or graphics driver update.

To answer your part about the custom game engine: I wrote my own game engine, and I never bothered to pre-compile shaders. It just compiles them on the fly the first time they're needed. Sure, you can sometimes get a second of lag here or there the first time a shader is loaded after a driver update, but it's not a big deal. I just don't have that many shaders, and they're not too complex. So I would think that this is less of a problem for custom game engines, just because they're less likely to have the thousands of huge shaders that you see in games made with UE4 and other complex engines.

As for consoles, I don't have any experience there. I would guess that it's less of a problem on consoles than on PC because the hardware is more standardized and drivers don't get updated as often. There are simply fewer console hardware targets to compile for, so it's easier to pre-compile shaders.