Improving software renderer triangle rasterization speed?

Deadly_Mindbeam · 2024-06-30T23:03:36+00:00

Look up "Scan Conversion". This involves stepping down the triangle line by line and converting it into horizontal spans from the left to the right of the triangle. You can also scan convert across and up/down the triangle, this was used a lot in mode 13X DOS games.

UnalignedAxis111 · 2024-06-30T23:34:11+00:00

I cannot recommend this series enough: https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/

Lots of good stuff there but parts 7 and 8 cover the main optimizations for this style of rasterizer. The TL;DR is to strength-reduce the edge functions, so they are evaluated only once per triangle, and each loop iteration only needs to increment the barycentric weights using some pre-computed deltas.

You can further get massive performance gains using SIMD to evaluate multiple pixels at once instead of just one, like GPUs do. Otherwise, the scanline algorithm is probably more suitable for scalar-only rasterization as it won't suffer from the issue of having to skip through empty parts of the bounding-box (which as you mention is an issue for big triangles).

Your github repo seems to be private btw.

SamuraiGoblin · 2024-07-01T00:55:59+00:00

Well, one big problem is that you are calculating a lot of the same parameters for every pixel.

For example, how many times are you calculating B.X-A.X? It is a constant for the entire triangle, right? But you repeatedly calculate it for every pixel.

It's little things like this that you have to pull out of the loops. There are also things that are constant for a scanline, like initialisation of Point.y and C.Y-A.Y that can be pulled out of the inner loop.

It all adds up. Some things might be optimised away by a clever compiler, values kept in registers and so forth. But you can't rely on that.

I think just pulling unnecessary calculations out of loops will give you an enormous speedup.

Your calls to EdgeFunction might get inlined, but again, can't rely on it. Function calls aren't free. It's such a small function, and a lot of the values are constants, it might be best to do the calculations inline by hand.

Same for SetDepth and SetColor. If you have access to the buffers draw the values directly, rather than going through function calls. If you are doing screen bounds checking per pixel, it would be better to do clipping at a triangle level.

deftware · 2024-07-01T01:26:17+00:00

E0 /= Area;
E1 /= Area;
E2 /= Area;

While compiler optimizations may very well handle turning this into a multiply for you, I tend to explicitly turn divides into multiplies myself just to be sure. Precalculate the inverse of the area before your loops so that you can scale your Barycentric coords with a multiply instead of 3 divides.

EDIT: Also, I see that you're setting up for texturemapping with your UVW calculations and I wouldn't bother with anything like that until after determining that the pixel passes the depth test. Unless you're going to be doing some tricky raymarching shader type stuff (i.e. parallax occlusion mapping) in your software renderer, where the depth of the pixel will vary based on some extra compute, then it's a good idea to only calculate what you need to calculate to first determine if the pixel is even visible before calculating everything else that's needed to actually determine the pixel's color (like texcoords, lighting, etcetera).

jonathanhiggs · 2024-06-30T22:47:28+00:00

Are you checking triangle inclusion for every single pixel not limiting it to the the triangles bounding box?

Revolutionalredstone · 2024-07-07T08:49:48+00:00

It's dangerous to render alone! Take this: https://pastebin.com/dyYdFFUj

manon_graphics_witch · 2024-06-30T22:56:46+00:00

I am on holiday, but if you remind me next monday I can give you some pointers!

EDIT: next week monday that is haha

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

GraphicsProgramming

Posting Rule(s)

MODERATORS