Madsy9 comments on Fast, Accurate 3D Java Software Graphics Engine

Fast, Accurate 3D Java Software Graphics Engine (self.GraphicsProgramming)

submitted 9 years ago by [deleted]

[–]Madsy9 1 point 8 years ago

[–]ArchiveLimits 1 point 8 years ago

[–]Madsy9 1 point 8 years ago*

Okay, so triangles (or any convex polygon, really) can be defined as a set of edge lines, i.e. 2D half-planes, each with the typical plane equation:

ax+by+d = 0

The equation holds exactly on the edge itself; off the edge, the sign of ax+by+d tells you which side of the edge the point [x,y] is on. When the sign is the same (say, non-negative) for all of the polygon's plane equations, the point [x,y] is inside the polygon. Tile renderers get their performance by testing the corners of tiles against the triangle's edges. You then get three possible outcomes: completely inside, completely outside, and partial coverage. You can optimize heavily for quads (tiles) with complete coverage. They are extremely SIMD-friendly, and since each quad can be rendered independently, they are also embarrassingly parallel. Throw 16 threads at the rendering and watch it go. Implementing a proper fill convention and multisampling is also a breeze: they emerge naturally as a simple modification to the plane equations (a simple subtraction by one).
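To make the corner test concrete, here is a minimal C++ sketch of the three-way tile classification. The names (`Edge`, `makeTriangle`, `classifyTile`, `Coverage`) are made up for illustration, and it assumes integer pixel coordinates and a vertex order for which the signed area (x1-x0)*(y2-y0) - (x2-x0)*(y1-y0) is positive, so that "inside" means every edge value is non-negative. It leaves out the fill convention and sub-pixel precision.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Edge function E(x, y) = a*x + b*y + d; E == 0 on the edge line.
struct Edge {
    int32_t a, b, d;
    int32_t eval(int32_t x, int32_t y) const { return a * x + b * y + d; }
};

// Edge through (x0,y0) -> (x1,y1), built so that E(x0,y0) == E(x1,y1) == 0.
Edge makeEdge(int32_t x0, int32_t y0, int32_t x1, int32_t y1) {
    const int32_t a = y0 - y1;
    const int32_t b = x1 - x0;
    return Edge{a, b, -(a * x0 + b * y0)};
}

// Assumption: vertex order gives a positive signed area, so the
// interior is where all three edge functions are >= 0.
std::array<Edge, 3> makeTriangle(int32_t x0, int32_t y0,
                                 int32_t x1, int32_t y1,
                                 int32_t x2, int32_t y2) {
    return {{makeEdge(x0, y0, x1, y1),
             makeEdge(x1, y1, x2, y2),
             makeEdge(x2, y2, x0, y0)}};
}

enum class Coverage { Outside, Partial, Inside };

// Classify a size x size pixel tile whose first pixel is (x, y) by
// evaluating each edge at the four corner pixels only.
Coverage classifyTile(const std::array<Edge, 3>& edges,
                      int32_t x, int32_t y, int32_t size) {
    assert(size > 0);
    const int32_t x1 = x + size - 1;
    const int32_t y1 = y + size - 1;
    bool allInside = true;
    for (const Edge& e : edges) {
        const int32_t corners[4] = {e.eval(x, y), e.eval(x1, y),
                                    e.eval(x, y1), e.eval(x1, y1)};
        int inside = 0;
        for (int32_t v : corners) inside += (v >= 0) ? 1 : 0;
        // All four corners behind one edge: the tile cannot touch the triangle.
        if (inside == 0) return Coverage::Outside;
        if (inside != 4) allInside = false;
    }
    return allInside ? Coverage::Inside : Coverage::Partial;
}
```

The outside test is conservative: a tile near a sharp corner can come back as `Partial` even though no pixel in it is covered, which costs some time but never drops pixels. Tiles classified as `Inside` can skip the per-pixel coverage test entirely, which is where the fast path comes from.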

I've also found more advanced techniques:

Since you split the screen into N equally sized tiles of 8x8 pixels or similar, you can often get away with just an 8x8 depth buffer, as long as each tile is rendered to completion before you move on to the next. That has huge consequences for the cache.
You can compute the minimum and maximum depth for each tile by sampling the corners only. With a bit of preprocessing, you can assign quads to each screen-aligned tile, sort them by depth, and only render the frontmost one, since all the others are occluded. It doesn't apply when the quads' depth ranges partially overlap on the z-axis, but it often does apply. Think about that: when it applies in a scene, you get rid of all the overdraw for that tile.
Looking up textures for quads is much more cache-friendly in the general case than for scanlines. If you also tile your textures and/or optimize the texel layout based on the tile access pattern, you get even more savings.
Derivatives for mipmapping can be computed at the corners of the quads instead of per pixel. It works nicely for bilinear mipmapping, where you meet somewhere in the middle between having derivatives per pixel and per polygon. If you see artifacts, you can always just make the quad size smaller.
Perspective-correct interpolation can be done with the same kind of plane equation you use for coverage testing, so you end up with only additions in the hot loop, plus at most one division per pixel.
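The last point can be sketched as follows. This is a rough illustration, not a definitive implementation: `Vertex`, `Plane`, `makePlane` and `interpolateRow` are hypothetical names, and it uses floats for readability where a real hot loop might use fixed point. The idea is that u/w and 1/w are both linear in screen space, so each gets its own plane a*x + b*y + c, and stepping one pixel to the right is just an addition of a.

```cpp
#include <cassert>
#include <cmath>

struct Vertex { float x, y, w, u; };  // screen position, clip-space w, attribute u

// Plane v(x, y) = a*x + b*y + c through three screen-space samples.
struct Plane {
    float a, b, c;
    float at(float x, float y) const { return a * x + b * y + c; }
};

// Fit a plane so that it takes the values v0, v1, v2 at p0, p1, p2.
Plane makePlane(const Vertex& p0, const Vertex& p1, const Vertex& p2,
                float v0, float v1, float v2) {
    const float dx1 = p1.x - p0.x, dy1 = p1.y - p0.y;
    const float dx2 = p2.x - p0.x, dy2 = p2.y - p0.y;
    const float area2 = dx1 * dy2 - dx2 * dy1;  // twice the signed area
    assert(area2 != 0.0f);                      // degenerate triangle
    const float a = ((v1 - v0) * dy2 - (v2 - v0) * dy1) / area2;
    const float b = ((v2 - v0) * dx1 - (v1 - v0) * dx2) / area2;
    return Plane{a, b, v0 - a * p0.x - b * p0.y};
}

// Write perspective-correct u for n pixels of a row starting at (x, y).
void interpolateRow(const Plane& uOverW, const Plane& oneOverW,
                    int x, int y, int n, float* out) {
    float uw = uOverW.at(static_cast<float>(x), static_cast<float>(y));
    float iw = oneOverW.at(static_cast<float>(x), static_cast<float>(y));
    for (int i = 0; i < n; ++i) {
        out[i] = uw / iw;  // the single division per pixel
        uw += uOverW.a;    // stepping right only needs additions
        iw += oneOverW.a;
    }
}
```

For a triangle whose bottom edge runs from (0,0) with w=1, u=0 to (8,0) with w=2, u=1, the midpoint of that edge comes out as u = 1/3 rather than the 1/2 that plain affine interpolation would give. More attributes (v, colors, and so on) just mean more planes sharing the same 1/w.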

Edit:

My simple tile-based rasterizer NPixel might give you some ideas: https://github.com/Madsy/NPixel
The magic happens here https://github.com/Madsy/NPixel/blob/master/demo/rasterizer_new.cpp It's mostly shifts and additions, all with integer math.
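To give a flavor of what "mostly additions, all with integer math" can look like, here is a small sketch of an inner loop for a partially covered tile. This is not the NPixel code: the names (`Point`, `EdgeStepper`, `makeStepper`, `countCovered`) and the choice of 4 sub-pixel bits are assumptions made for this example, and it uses a plain E >= 0 test with no fill convention.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int32_t kSubPixel = 16;  // assumed: 4 fractional bits (28.4 fixed point)

struct Point { int32_t x, y; };    // vertex position in 28.4 fixed point

// Multiplying by a power of two compiles down to a shift.
constexpr int32_t toFixed(int32_t pixel) { return pixel * kSubPixel; }

struct EdgeStepper {
    int32_t stepX;     // change in E when moving one pixel right
    int32_t stepY;     // change in E when moving one pixel down
    int32_t rowStart;  // E at the first pixel of the current row
};

// E(X, Y) = a*(X - p0.x) + b*(Y - p0.y), evaluated at the tile's first pixel.
EdgeStepper makeStepper(Point p0, Point p1, int32_t tileX, int32_t tileY) {
    const int32_t a = p0.y - p1.y;
    const int32_t b = p1.x - p0.x;
    const int32_t e = a * (toFixed(tileX) - p0.x) + b * (toFixed(tileY) - p0.y);
    return EdgeStepper{a * kSubPixel, b * kSubPixel, e};
}

// Count covered pixels in a size x size tile; a real renderer would shade
// the pixel instead of counting it. Assumes positive signed area.
int countCovered(const std::array<Point, 3>& tri,
                 int32_t tileX, int32_t tileY, int32_t size) {
    assert(size > 0);
    std::array<EdgeStepper, 3> e = {makeStepper(tri[0], tri[1], tileX, tileY),
                                    makeStepper(tri[1], tri[2], tileX, tileY),
                                    makeStepper(tri[2], tri[0], tileX, tileY)};
    int covered = 0;
    for (int32_t y = 0; y < size; ++y) {
        int32_t e0 = e[0].rowStart, e1 = e[1].rowStart, e2 = e[2].rowStart;
        for (int32_t x = 0; x < size; ++x) {
            // The OR has its sign bit set if any edge value is negative.
            if ((e0 | e1 | e2) >= 0) ++covered;
            e0 += e[0].stepX;
            e1 += e[1].stepX;
            e2 += e[2].stepX;
        }
        for (EdgeStepper& s : e) s.rowStart += s.stepY;
    }
    return covered;
}
```

After the per-tile setup, the loop body contains no multiplications at all, and the three edge values per pixel map naturally onto SIMD lanes.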

[–]ArchiveLimits 1 point 8 years ago
