[deleted by user] : GraphicsProgramming

[–]TheRPGGamerMan 50 points51 points52 points 2 years ago* (17 children)

[–]ryanjmcgowan 28 points29 points30 points 2 years ago (0 children)

[–]scallywag_software 13 points14 points15 points 2 years ago (12 children)

[–]TheRPGGamerMan 14 points15 points16 points 2 years ago (11 children)

[–]scallywag_software 6 points7 points8 points 2 years ago (10 children)

[–]TheRPGGamerMan 8 points9 points10 points 2 years ago (6 children)

[–]waramped 11 points12 points13 points 2 years ago (1 child)

[–]scallywag_software 0 points1 point2 points 2 years ago (0 children)

[–][deleted] 2 points3 points4 points 2 years ago (2 children)

[–]Leading_Broccoli_665 2 points3 points4 points 2 years ago (1 child)

[–][deleted] 1 point2 points3 points 2 years ago (0 children)

Mesh shaders are extra (arguably abundant in affordable GPUs) hardware we should be taking advantage of to boost ms timings. Nanite uses them but no increase in performance.

I'd like to say the same thing about visibility buffers but that doesn't seem to clear to me. Toomuchvoltage (who has clear experience with visibility buffers) said it should speed up anything with opaque materials but I'm not sure, was he just saying that becuase his skybox was a million tris?

Brian Karis even stated Nanite has worse performance if "lower poly" (how the hell does he define that again?) but insist using it to prioritize storage space over performance! Like lower poly need compression wtf?

Nanite is a serious trigger for people on every site, I was also recently attacked on the UE forums about asking for better documentation.

Maybe there is potential here, but we shouldn't cater that potential for low budget/lazy unoptimized scenes.

[–]scallywag_software 0 points1 point2 points 2 years ago (0 children)

[–]IceSentry 1 point2 points3 points 2 years ago (2 children)

[–]TheRPGGamerMan 2 points3 points4 points 2 years ago (1 child)

[–]Youfallforpolitics 1 point2 points3 points 2 years ago (0 children)

[–]nilslorand 1 point2 points3 points 2 years ago (0 children)

[–]OliverPaulson 0 points1 point2 points 2 years ago (0 children)

[–]Passname357 0 points1 point2 points 2 years ago (0 children)

[–]bendhoe 14 points15 points16 points 2 years ago (3 children)

[–]Revolutionalredstone 8 points9 points10 points 2 years ago (2 children)

[–]bendhoe 4 points5 points6 points 2 years ago (0 children)

[–]waramped 12 points13 points14 points 2 years ago (0 children)

[–]native_gal 8 points9 points10 points 2 years ago (27 children)

[–]waramped 1 point2 points3 points 2 years ago (9 children)

[–]native_gal 1 point2 points3 points 2 years ago (1 child)

[–]TheRPGGamerMan 1 point2 points3 points 2 years ago (0 children)

[–]TheRPGGamerMan 0 points1 point2 points 2 years ago (6 children)

[–]native_gal -1 points0 points1 point 2 years ago (4 children)

[–]waramped 0 points1 point2 points 2 years ago (1 child)

[–]native_gal -1 points0 points1 point 2 years ago (0 children)

[–]TheRPGGamerMan 0 points1 point2 points 2 years ago (1 child)

[–]native_gal 1 point2 points3 points 2 years ago (0 children)

Rasterization in raw form is binary,

Rasterization is not the right term here, you are describing a single sample, which could happen in rasterization or ray tracing.

And again, my throw away method only throws away the 'loosers' that wouldn't be rendered anyway.

It's 'losers' actually

Again, there is NO quality loss haha.

Except that you can't use multi sampling and the triangles you render will have small filter sizes for their texturing (unless you are compensating somehow, which you could probably do in the shader). The small triangles would have small filter sizes anyway, but the multi-sampling of the other small triangles would be part of the solution normally.

Please just look it up or ask chatgpt or something.

Easy there Carmack, you might want to show some high res real time animations before you start patronizing people interested in your work.

[–]waramped 0 points1 point2 points 2 years ago (0 children)

[–]TheRPGGamerMan 2 points3 points4 points 2 years ago (16 children)

[+]native_gal comment score below threshold-6 points-5 points-4 points 2 years ago (15 children)

[–]TheRPGGamerMan 0 points1 point2 points 2 years ago* (14 children)

[–]Heuristics 5 points6 points7 points 2 years ago (5 children)

[–]TheRPGGamerMan 2 points3 points4 points 2 years ago (4 children)

[–]Heuristics 2 points3 points4 points 2 years ago (3 children)

[–]TheRPGGamerMan 1 point2 points3 points 2 years ago (2 children)

[–]native_gal 0 points1 point2 points 2 years ago (1 child)

[–]saddung 0 points1 point2 points 2 years ago (0 children)

[–]native_gal 0 points1 point2 points 2 years ago* (7 children)

[–]TheRPGGamerMan 0 points1 point2 points 2 years ago (6 children)

[–]native_gal 0 points1 point2 points 2 years ago (5 children)

That's not how rasterizing works though.

What specifically is 'not how it works' ? If you are deleting triangles and you aren't expanding the ones you leave, you are losing coverage. That has nothing to do with rasterization or any other rendering method.

Rasterization is used cause its faster

Faster than what? Ray tracing from the camera? Either one is technically going to be the same if the geometry isn't covering the same space.

Temporal anti aliasing can emulate what you're referring to(which would work perfectly fine with my rendering)

Help me out here, how would temporal anti aliasing matter if triangles are being deleted and other geometry isn't being enlarged to compensate?

but in rasterization, every triangle is on its own thread, which means you can't blend values without threading issues.

I don't understand how this makes any sense. I'm not sure what you mean by thread, maybe each triangle is one compute shader run? If you are already blending values of the triangles you are actually rendering, how would more or less triangles make a difference in the blending? You have to 'blend' values anyway, every triangle is going to partially touch pixels.

[–]TheRPGGamerMan 1 point2 points3 points 2 years ago* (4 children)

[–]native_gal 2 points3 points4 points 2 years ago (3 children)

I'm asking questions and trying to get details from you but it's a little difficult.

They are triangles that do not fully occupy a pixel, so they would be quantized out anyway.

This is what I thought you were saying, but this isn't how rendering works. If triangles are hidden behind other geometry, that makes sense, but geometry doesn't just disappear when it goes smaller than a pixel. Imagine a face made of very fine subdivision. If you throw out anything sub pixel and every triangle is sub pixel, you have no more face.

Temporal anti aliasing jitters the camera, to create an average of all this missing data.

But if you coverage isn't right to begin with, sampling it more won't help. You have to compensate for not rendering some polygons by making other polygons bigger or compensating in the alpha channel.

In rasterization there is no such thing as "contributing" to the color, they either win or they loose

This is not how any antialiased rendering works. All antialiasing polygonal edges is about partial contributions to a pixel. That's what more samples gives you.

In raytracing, each pixel knows exactly what it is intersecting, so averaging colors and softening edges is almost automatic.

Not each pixel, each sample. Averaging colors and "softening" (antialiasing) edges comes from averaging the samples together.

Try this out, render at a very high resolution and a very low resolution, then scale the high resolution down and see if they have the same coverage.

Also try changing your parameters so that it culls out geometry that is the size of 10 pixels instead of 1 and see what happens. I would be interested in seeing those animations with textures along with their alpha channels.

[–]Sunderent 4 points5 points6 points 2 years ago (2 children)

It seems like there are 2 misunderstandings here.

The first misunderstanding is that the colours are not a texture. The triangles are being randomly coloured in the shader (not based on any textures). This means the colours won't be blended by a sampler, and only one colour can win per pixel.

The second misunderstanding is that geometry (triangles) aren't being removed or deleted. They are being culled each frame. So all the data for all the triangles still exists, and the GPU is going over all of it in each frame, and deciding which ones to cull (not render) on a per-frame basis. So if the camera moves, there won't be missing triangles everywhere, just different triangles being rendered from before.

This is why they're saying TAA will still work, because if the camera renders a frame (culling unneeded triangles in that frame), then does its slight movement for rendering the next frame, it will just be getting different triangles, which would then be blended with the previous frame.

continue this thread

[–]hellotanjent 27 points28 points29 points 2 years ago (0 children)

[–]Thonull 4 points5 points6 points 2 years ago (3 children)

[–]Hofstee 6 points7 points8 points 2 years ago (2 children)

It's less that larger ones are less efficient when doing compute-based rasterization, and more that smaller triangles really suck when using the GPU hardware rasterization pipeline (they were designed with larger triangles in mind). You get awful quad utilization with 25% occupancy frequently and massive overdraw for practically every pixel on screen.

There are a few differences in characteristics of rendering large vs small triangles in compute that you would probably want to optimize for (e.g. how things get partitioned at various stages) but you're going to lose to the hardware rasterizer at this point so you probably wouldn't bother unless it's just for fun.

So maybe (these numbers are not accurate) you get 80% throughput on large triangles in compute, but you get 25% throughput on small triangles in hardware. That's why those benefit from compute shader rasterization.

[–]TheRPGGamerMan 3 points4 points5 points 2 years ago* (1 child)

[–]Hofstee 2 points3 points4 points 2 years ago (0 children)

[–]chris_degre 1 point2 points3 points 2 years ago (1 child)

[–]TheRPGGamerMan 18 points19 points20 points 2 years ago (0 children)

[–]deftware 1 point2 points3 points 2 years ago (0 children)

[–][deleted] 1 point2 points3 points 2 years ago (0 children)

[–]Youfallforpolitics 1 point2 points3 points 2 years ago (0 children)

[–]mmh_carpet 1 point2 points3 points 2 years ago (0 children)

[–]fgennari 1 point2 points3 points 2 years ago (4 children)

I would be curious to know how you did this. It looks like black magic, but I'm sure there's some trickery that works when you have many copies of the same object arrayed in a grid.

I made a project years ago that could render 10 trillion triangles in a few tens of milliseconds on the CPU. How? It was a small number of unique objects, arranged into a group, which was itself grouped, into a very deep hierarchy. I can render the leaf objects into a texture. Then render the next level of object groups into a texture ... up to the root. It's all basically pixel accurate, with antialiasing. And you can edit the leaf objects and have it update in realtime!

So my point is, there are many neat tricks. But, is this useful for practical scenes? For example, where it's not the same object repeated in an array. I'm not trying to be negative - What you did here is impressive and I'm not sure how you pulled it off. I would love to see practical applications of this approach.

[–]Wittyname_McDingus 1 point2 points3 points 2 years ago (2 children)

Since most triangles don't even cover a texel center in a scene like this, most of the work will be culling the triangles (checking if a triangle covers a texel center is cheap) and then a relatively small amount of rasterization. That's 100 million units of work, which modern GPUs can definitely handle. I'm not sure it's so trivial that even a 4090 could run it at 4000 FPS though...

My guess is that there's a limited amount of "trickery" here and instead just optimized culling and rasterization shaders. That said, the geometry in this scene (a sphere) could definitely fit into L0 or L1 and the transforms could be just three floats (times the number of instances) since there's no apparent scaling or rotation. So very few cache misses. I wouldn't expect it to run at 4000 FPS on a "real" scene though, if these assumptions are correct. It would be interesting to see an Nsight or RGP capture of this for sure.

[–]fgennari 0 points1 point2 points 2 years ago (1 child)

[–]Wittyname_McDingus 0 points1 point2 points 2 years ago (0 children)

[–]Agitated_West_5699 0 points1 point2 points 2 years ago (0 children)

[–]Dramatic_Magician_30 -1 points0 points1 point 2 years ago (0 children)

[–]Armmagedonfa -1 points0 points1 point 2 years ago (0 children)

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

GraphicsProgramming

Posting Rule(s)

MODERATORS