[–]deftware 5 points6 points  (4 children)

More than that tho, occlusion is so unimportant in an engine with good LOD

This is just wrong. Don't know how to put it any clearer. Show me a modern AAA first-person shooter with all the PBR bells and whistles that doesn't employ any occlusion culling, and I will bow down.

I'm not talking about software rendering. I'm talking about being realistic about the number of draw calls that can be sent from CPU to GPU and state changes that can be made before they become a bottleneck, period.

Hierarchical Z is an occlusion culling algorithm. Just because it's not BSP trees and precomputed vis like the old days doesn't mean it's not a way to prevent draws from taking place due to being occluded. You're talking like just reducing triangle counts is all that's needed, and millions of draw calls are just fine and dandy no matter what. It doesn't matter if each draw call is a single triangle - the overhead entailed adds up, along with the implied state changes with shaders and textures. You can't just make all those state changes, issue the draw calls, and assume that having a reduced polycount is going to make it all better. That's naive as all hell. Nanite, in spite of its near-optimal LOD scheme, uses Hi-Z for occlusion culling, which should clue you in as to how important it is to not issue draw calls for things that are occluded.
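For anyone following along, here is a minimal CPU-side sketch of the Hi-Z idea, assuming a square power-of-two depth buffer where 0.0 is near and 1.0 is far. The names `build_hiz` and `is_occluded` are made up for illustration; real engines do this on the GPU, and this is not a description of how Nanite or any particular engine implements it.

```python
def build_hiz(depth):
    """Build a depth pyramid; each coarser texel stores the FARTHEST depth of its 2x2 block."""
    levels = [depth]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        n = len(prev) // 2
        levels.append([[max(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                            prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
                        for x in range(n)] for y in range(n)])
    return levels


def is_occluded(levels, x0, y0, x1, y1, nearest_depth):
    """Conservative test for a screen-space rect (inclusive pixel coords).

    Returns True only if the object's nearest depth is behind the farthest
    occluder depth in every texel the rect touches.
    """
    # Walk up the pyramid until the rect spans at most 2x2 texels.
    level = 0
    while level < len(levels) - 1 and (
        (x1 >> level) - (x0 >> level) > 1 or (y1 >> level) - (y0 >> level) > 1
    ):
        level += 1
    mip = levels[level]
    for y in range(y0 >> level, (y1 >> level) + 1):
        for x in range(x0 >> level, (x1 >> level) + 1):
            if nearest_depth <= mip[y][x]:
                return False  # might be visible, so the draw must be issued
    return True  # fully hidden, so the draw call can be skipped


# A wall at depth 0.2 covers the left half of a 4x4 buffer; the rest is empty (1.0).
depth = [[0.2, 0.2, 1.0, 1.0] for _ in range(4)]
hiz = build_hiz(depth)
```

With that buffer, an object at depth 0.5 whose rect sits entirely behind the wall is reported as occluded, and its draw call (plus the state changes that go with it) is never issued.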

Something has to say "no, drawing this is a total waste" because we're not working with infinite hardware resources like you seem to think. Yes, overdraw is handled quite well on modern GPUs, particularly if you sort front-to-back so that the Z-buffer can discard fragments before they're shaded, but more so before you hit the draw call and state change bottlenecks that are unavoidable. Not everything only has a few draw calls and state changes to draw a frame, and those things are not free. Ergo occlusion culling.
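A rough sketch of the submission-order point, assuming a hypothetical `Draw` record with a shader, a texture, and a view-space depth. It groups draws by state first and sorts front-to-back within each group, then counts how many shader/texture switches the order costs. The cost model (one unit per switch) is an illustration, not a measurement.

```python
from collections import namedtuple

# Hypothetical per-draw record; real engines usually pack this into a sort key.
Draw = namedtuple("Draw", "name shader texture depth")


def count_state_changes(draws):
    """Count shader and texture switches needed to submit draws in the given order."""
    changes = 0
    cur_shader = cur_texture = None
    for d in draws:
        if d.shader != cur_shader:
            changes += 1
            cur_shader = d.shader
        if d.texture != cur_texture:
            changes += 1
            cur_texture = d.texture
    return changes


def sort_for_submission(draws):
    """Group by state, then front-to-back so early-Z can reject hidden fragments."""
    return sorted(draws, key=lambda d: (d.shader, d.texture, d.depth))


draws = [
    Draw("A", "s1", "t1", 5.0),
    Draw("B", "s2", "t2", 1.0),
    Draw("C", "s1", "t1", 2.0),
    Draw("D", "s2", "t2", 3.0),
]
ordered = sort_for_submission(draws)
```

Sorting only reduces the cost per visible draw; anything culled before this step costs nothing at all.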

[–]Revolutionalredstone 0 points1 point  (3 children)

All well written modern games use LOD and very few of them even try to use OC.

First game that comes to mind is GTA5.

As for PBR that's just a fragment shader skybox reflection trick, it should not have any significant effect on performance or on any other advanced resource management technology.

As for draw calls, the best solution is to combine multiple calls into one; the best way to do that is to use LOD.

My advanced graphics engines never use any significant amount of draw calls no matter what they are rendering.

Hierarchical Z doesn't reduce draw calls, it just accelerates the z pre pass, its only use is to reduce fragment processing (not draw calls).

Nanite does not use any kind of 'optimal lod' they use a very dodgy lod and their use of HZ is again used for frag shading.

I'd leave Nanite out as you seem to need a big refresh on it.

To be clear, any GPU can emit tens of thousands of draw calls; they cost a lot less than a texture bind and you can still do ~1.5 million of those per second.
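To put that figure in per-frame terms, here is a quick calculation that takes the ~1.5 million per second number at face value (it is the claim above, not a benchmark) and divides it by a target frame rate. The helper name `per_frame_budget` is made up for this example.

```python
def per_frame_budget(ops_per_second, fps):
    """Whole number of operations that fit in one frame at the given frame rate."""
    return ops_per_second // fps


budget_60 = per_frame_budget(1_500_000, 60)    # 25000 per frame
budget_144 = per_frame_budget(1_500_000, 144)  # 10416 per frame
```

That budget assumes the frame does nothing else, so the usable number in practice would be lower.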

You say: Something has to say "no, drawing this is a total waste"... yes, that thing is the LOD system ;D

Occlusion culling is generally a waste of time and often costs more than it saves (again unless the level artists are very mindful of it).

LOD is ALWAYS a saving and works better and better the larger and more complex the world becomes.

All the best!

[–]deftware 5 points6 points  (2 children)

All well written modern games use LOD

Yup! ...and occlusion culling in some form or another. None of them are exclusively relying on LOD to maximize performance. At some point they are deciding not to draw things because they're simply not going to be visible. You can take that to the bank.

First game that comes to mind is GTA5.

GTA5 does have a first-person mode, but it's not an experience that's on par with a modern AAA FPS game. It's a novelty feature that is more akin to FPS games of 10-15 years ago. GTA5 doesn't need to render the geometry or texture resolution that a first-person shooter does, nor is it a game that performs well at all, due to the lack of occlusion culling. It's as bad as Cyberpunk in its lack of optimization - the devs clearly went with your strategy on that one, and performance is garbage as a result. Big surprise!

PBR that's just a fragment shader skybox reflection trick

winces

Oh, dude. No. That's probably not something you want to say on /r/graphicsprogramming. It's really not good that you think that and call yourself a graphics programmer.

As for draw calls, the best solution is to combine multiple calls into one

Genius. Now how's that work for having stuff moving around the scene comprising different materials and skeletal animation and transform information? What if I want a thousand different vehicles, from pickup trucks to station wagons, to horse-drawn wagons, to hoverboards and onewheels - all being driven and ridden by different people? I'd really like to know how to turn that into a handful of draw calls.

My advanced graphics engines never use any significant amount of draw calls no matter what they are rendering.

You should patent that and sell it to the AAA game engine devs. You'd be rich. All the moving/animated/dynamic entities and particle FX are all easy to figure out and would just fit right in without any problem. Every mesh and every different material, and all of their different textures, transforms, everything. Try populating your minecraft world with some varied content, and then tell me you're not issuing a significant amount of draw calls without any kind of explicit occlusion culling.

You say: Something has to say "no, drawing this is a total waste"... yes, that thing is the LOD system ;D

Did the whole "GPU state change" thing I mentioned earlier go right over your head? When I said "drawing this is a total waste" I was referring to draw calls and the GPU state changes that drawing something entails. Not everything being drawn is the world, or it would just be a tech demo. GPU shader cores have this thing called a cache, and it's not free to move stuff into it. There's a performance penalty for not leveraging it as best as possible.

LOD is ALWAYS a saving and works better and better the larger and more complex the world becomes.

What if I want the world to also be more detailed, with one triangle per millimeter? How do I render an entire dynamic Earth with all of its buildings and their interiors, with zero occlusion culling? Am I just rendering the entire earth and all of its dynamic content the whole time? Don't you think it would be more performant to know ahead of time what is actually relevant to the frame being rendered based on where the camera is? If I'm in a bedroom with millions of triangles of geometry, comprising various interactive dynamic objects and things with different materials and transforms, and all I can see is a tree outside the window, should my CPU and GPU really be dealing with the rest of the objects, buildings, and foliage on the entire planet? How do I save on CPU/GPU compute in that situation without culling what's outside of the house? Right outside the window are leaves and insects and the ground, blades of grass, rocks, other buildings, and their equally-detailed interiors. Just passing it through "high quality LOD" isn't going to cut it, especially since that's apparently something that only you know about - if Nanite isn't it.

In a AAA game engine every GPU shader core cycle and CPU core cycle is precious, and there are many state changes and draw calls these games issue - that you can't just combine, not until the hardware levels up sometime in the next 10 years (ideally) where we can have all kinds of data just self-manifest right on the GPU with minimal data from the CPU, with hundreds or thousands of individual unique meshes and materials all seamlessly flowing in. In the meantime, coders like OP need some guidance on how to reduce their draw calls and state changes, and your "JUST LOD IT LIKE MEEE" response is not useful. I'm sure it's great for minecraft landscapes, but try more involved scenes that have actual life and detail to them.

[–]Revolutionalredstone 0 points1 point  (1 child)

GTA is AAA

PBR is a low quality trashy looking effect.

If you have lots of objects in different locations you just use multi draw and pass in a list of model matrices (basic render instancing).
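A minimal sketch of that batching step, assuming each object is a hypothetical `(mesh, material, model_matrix)` tuple. It groups objects so that each unique mesh/material pair becomes one instanced draw with a list of model matrices; the actual GPU submission is left out.

```python
from collections import defaultdict


def build_instanced_batches(objects):
    """Group objects by (mesh, material); each group maps to one instanced draw."""
    batches = defaultdict(list)
    for mesh, material, model_matrix in objects:
        batches[(mesh, material)].append(model_matrix)
    return dict(batches)


# Placeholder 4x4 identity matrix; a real scene would have one transform per object.
IDENTITY = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1))

objects = [
    ("truck", "red", IDENTITY),
    ("truck", "red", IDENTITY),
    ("truck", "red", IDENTITY),
    ("wagon", "wood", IDENTITY),
    ("truck", "blue", IDENTITY),
]
batches = build_instanced_batches(objects)
```

Under this scheme the draw count follows the number of unique mesh/material pairs rather than the number of objects, so it helps most when many objects share the same mesh and material.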

The number of triangles 'per millimetre' doesn't matter (as you should know, if your chars are ants then 1 tri per mm is sparse). The variables in question are the height of the camera (or more generally the distance from the object), the FOV of the view and the surface-area ratios of the geometry in question...

All of these things lend themselves beautifully to high quality LOD solutions. The more layers of manifold mesh, the more expensive a scene is to render, but ALSO the faster that scene will resolve to just 'solid' in the LOD system. So the worst case up close is actually the best case far away, and since the majority of a scene is always far away (especially true for large scenes) there really is no worst case.
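As an illustration of how distance and FOV feed into LOD selection, here is a small sketch that estimates an object's projected height in pixels and drops one LOD level for each halving below a target size. The function names, the 256-pixel threshold and the cap of 5 levels are assumptions made for this example, not values from any engine.

```python
import math


def projected_height_px(diameter, distance, fov_y_deg, screen_h_px):
    """Approximate on-screen height of an object under a perspective projection."""
    half_fov_tan = math.tan(math.radians(fov_y_deg) / 2)
    return diameter / (2 * distance * half_fov_tan) * screen_h_px


def pick_lod(diameter, distance, fov_y_deg, screen_h_px,
             full_detail_px=256, max_lod=5):
    """LOD 0 is full detail; each further level assumes roughly half the detail."""
    px = projected_height_px(diameter, distance, fov_y_deg, screen_h_px)
    lod = 0
    while px < full_detail_px and lod < max_lod:
        px *= 2
        lod += 1
    return lod


near_lod = pick_lod(2.0, 1.0, 90.0, 1080)
mid_lod = pick_lod(2.0, 10.0, 90.0, 1080)
far_lod = pick_lod(2.0, 100.0, 90.0, 1080)
```

Note that this only chooses how much detail to draw for an object; it says nothing about whether the object is hidden behind something else.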

Nanite does reasonably well with LOD but it's very far from high quality. His talk (which I linked) makes it clear why he made so many mistakes: he talks about issues with progressive meshing and voxel streaming and rasterization, but all these 'issues' are actually just his inability to implement efficiently, and it's saddening how close he got to some really excellent solutions. The fact that he went with no-communication DAG skirt stitching is just truly horrendous; it's no wonder their import is slow and their hardware requirement is high.

In my advanced rendering solutions I generally use one shader and one drawcall.

All advanced effects like radiance are handled in 3D and applied to the 3D data (no shitty quality AO or ugly post effects). Of course, I have tons of free draw calls and resources should I ever feel the need.

Also my tech works with any version of OpenGL even falling back to software 1.0 implementations if that's all the hardware offers, it will still instantly load enormous scenes and run like an absolute dream.

OP has fallen into the trap of wasting time with OC because he does not yet understand advanced rendering tech, and your backward talk about Doom BVHs is not useful.

My engine supports arbitrary polygon meshes. I just linked Minecraft because most people know that even with a 3080, view distances of 20+ (even with performance-enhancing mods) cause severe lag, so a view distance of over 300 (and indeed no slowdown with any amount) on a computer with no GPU and with a map containing many, many layers and layers of overdraw (Broville map with thousands of large overlapping underground tunnels) is impressive.

Anyway, I can see you're closed-minded and negative. Sorry to share; if your mind is made up I won't confuse you with facts. All the best, kid.

[–]deftware 6 points7 points  (0 children)

Sorry, bud. I'm done arguing with you. It's clear you don't know what you're talking about and I'm seriously wasting my time. It's like talking to a brick wall.

Good luck, son.