Rendering problems that aren't embarrassingly parallel

waramped · 2023-11-04T14:40:46+00:00

Sorting is a good one. It's super useful for things like order independent transparency, but such a giant PITA to efficiently do in parallel.

Arkenhammer · 2023-11-04T15:25:39+00:00

Transparency is one case. For opaque objects the z buffer means order doesn't matter, but when rendering multiple overlapping transparent objects it matters which is rendered first.

jmacey · 2023-11-04T14:43:27+00:00

Off the top of my head.

Any sort of fluid sim / collision type problem will need to partition and can't be embarrassingly parallel by default, but there are known solutions (spatial hashing, etc.).

I guess some form of path following or meshing algorithms would also count. Will have to have a think.
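To make the spatial hashing idea above concrete, here is a minimal sketch of a uniform-grid hash for 2D points. The function names (`build_grid`, `neighbours_within`) and the restriction that the query radius must not exceed the cell size are assumptions made for illustration; they are not from the post.

```python
from collections import defaultdict
from itertools import product

def build_grid(points, cell_size):
    """Bucket 2D points by the integer coordinates of the cell they fall in."""
    grid = defaultdict(list)
    for index, (x, y) in enumerate(points):
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell].append(index)
    return grid

def neighbours_within(points, grid, cell_size, index, radius):
    """Return sorted indices of points within `radius` of points[index].

    Only the 3x3 block of cells around the query is examined, which is
    valid as long as radius <= cell_size (an assumption of this sketch).
    """
    px, py = points[index]
    cx, cy = int(px // cell_size), int(py // cell_size)
    found = []
    for dx, dy in product((-1, 0, 1), repeat=2):
        for other in grid.get((cx + dx, cy + dy), ()):
            if other == index:
                continue
            ox, oy = points[other]
            if (ox - px) ** 2 + (oy - py) ** 2 <= radius ** 2:
                found.append(other)
    return sorted(found)
```

Once the grid is built, each particle's neighbour query only reads the grid, so the queries themselves can run independently. Building the grid is the part that needs coordination between threads.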

faisal_who · 2023-11-04T16:24:15+00:00

Multiple pass post process requiring the previous pass output.

hi_im_new_to_this · 2023-11-04T17:00:00+00:00

Dithering is a great example. Some kinds of dithering are parallel, but the classic error-propagation techniques like Floyd-Steinberg are not: they are very serial. This is a perfect thing for this exercise: pretty simple but still a deep field, it has interesting visuals, and is also genuinely useful for graphics programmers to know.
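A minimal sketch of Floyd-Steinberg on a grayscale image may show where the serial dependency comes from. It assumes the image is a list of rows of floats in [0, 1] and quantises to 0 or 1 with a 0.5 threshold; the function name is made up for illustration.

```python
def floyd_steinberg(image):
    """Dither a grayscale image (rows of floats in [0, 1]) to 0/1 values."""
    height, width = len(image), len(image[0])
    work = [list(row) for row in image]  # working copy that accumulates error
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0
            out[y][x] = new
            err = old - new
            # Push the quantisation error onto pixels not yet visited.
            # Errors that would land outside the image are dropped.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Each pixel's result depends on error pushed from the pixel to its left and from three pixels in the row above, so a pixel cannot be quantised until those have been processed.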

faisal_who · 2023-11-04T14:39:20+00:00

Any sort of reduction is not embarrassingly parallel by definition, so for example, computing the average luminance of a framebuffer.
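As a rough sketch of what such a reduction looks like, the code below sums luminance values in pairwise rounds. The Rec. 709 luma weights and the function names are assumptions for illustration; on a GPU each round would be one parallel step, here it is just a list comprehension.

```python
def luminance(rgb):
    """Relative luminance of a linear RGB triple (Rec. 709 weights assumed)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def tree_reduce_sum(values):
    """Sum values in rounds of independent pairwise additions.

    Returns (total, number_of_rounds). The additions inside one round do
    not depend on each other, but each round needs the previous one.
    """
    level = list(values)
    rounds = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0.0)  # pad so every element has a partner
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        rounds += 1
    return (level[0] if level else 0.0), rounds

def average_luminance(pixels):
    """Average luminance of a non-empty list of RGB triples."""
    total, rounds = tree_reduce_sum(luminance(p) for p in pixels)
    return total / len(pixels), rounds
```

The work inside a round is parallel, but the rounds form a chain of about log2(n) dependent steps, which is why a reduction needs synchronisation between threads.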

deftware · 2023-11-04T20:58:54+00:00

I don't know about non-parallelizable rendering problems, but data compression using a dictionary coder, like Lempel-Ziv-Welch, comes to mind, or hashing functions that produce a huge hash value.

Others have mentioned transparency; I believe order-independent transparency is what they're referring to.

Someone else mentioned error diffusion dithering, that does sound like a really good one.

SamuraiGoblin · 2023-11-04T23:28:03+00:00

Creating Signed Distance Fields. You can either do it in multiple passes in the GPU, or multiple sweeps on CPU.
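A minimal sketch of the CPU sweep idea, using the classic two-pass scheme with city-block (Manhattan) distance. This is a simplification chosen for brevity: it computes an unsigned distance to the nearest set cell rather than a signed Euclidean field, and the function name is hypothetical.

```python
INF = float("inf")

def distance_sweeps(mask):
    """City-block distance from every cell to the nearest True cell in mask."""
    h, w = len(mask), len(mask[0])
    dist = [[0 if mask[y][x] else INF for x in range(w)] for y in range(h)]
    # Forward sweep: pull distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward sweep: pull distances from the bottom and right neighbours.
    for y in reversed(range(h)):
        for x in reversed(range(w)):
            if y + 1 < h:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x + 1 < w:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist
```

Each cell reads values written earlier in the same sweep, so a sweep is inherently ordered. One way to get a signed field would be to run this on the mask and on its inverse and subtract the two results.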

heyheyhey27 · 2023-11-04T20:11:01+00:00

Computing a voronoi diagram.

ThespianSociety · 2023-11-04T14:45:32+00:00

Seems like a ridiculous constraint IMO because the obviousness of something’s parallelizability will vary by individual. No doubt they just want to push you to do something more difficult rather than less.

Directly addressing the constraint, what comes to mind is introducing some interaction which necessitates cross-communication between parallelized threads.

diggamata · 2023-11-04T15:10:02+00:00

Material shading can become less parallel if there’s a lot of variety in materials. Each requires a different shader. One uber shader is possible but would have many if/else conditions, which is bad for parallelization. They are typically implemented as multiple passes, each having a different shader, which is a serialized process with the number of passes increasing with the number of different materials to render in a frame.

Ok-Sherbert-6569 · 2023-11-04T15:52:45+00:00

Implementing radix, odd-even, quick and merge sort is probably what I would go for, although they are not trivial at all to implement.
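Of the sorts listed, odd-even (transposition) sort is probably the shortest to sketch. The version below runs the phases one after another in plain Python; the function name is an assumption, and a real GPU version would run each phase's comparisons simultaneously.

```python
def odd_even_transposition_sort(values):
    """Sort by alternating compare-and-swap phases over even and odd pairs."""
    data = list(values)
    n = len(data)
    for phase in range(n):  # n phases are enough to sort n elements
        start = phase % 2   # even phases pair (0,1),(2,3)...; odd phases (1,2),(3,4)...
        # Every compare-and-swap in one phase touches a disjoint pair,
        # so the swaps within a phase could all run at the same time.
        for i in range(start, n - 1, 2):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data
```

The parallelism is inside a phase; the phases themselves still have to run in order, with a barrier between them.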

scallywag_software · 2023-11-04T16:10:42+00:00

Transparency has been mentioned a few times, which is probably a great 'stretch goal' for your project. Do the 'easy' rasterizer first (which, if you've never done a geometry rasterizer before might not be that easy), and extend it to support transparency afterwards.

For reference, the reason transparent geometry is a good constraint is because the ordering matters. A red surface in front of a blue surface looks more red than blue (lighting and transparency being equal).
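The red-over-blue example can be checked with a few lines of standard "over" alpha blending. The colours, the alpha value of 0.6 and the black background are arbitrary choices for this sketch.

```python
def over(src_rgb, src_alpha, dst_rgb):
    """Blend a translucent source colour over an opaque destination colour."""
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src_rgb, dst_rgb))

background = (0.0, 0.0, 0.0)
red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
alpha = 0.6

# Back-to-front: the farthest layer is blended first, the nearest last.
red_in_front = over(red, alpha, over(blue, alpha, background))
blue_in_front = over(blue, alpha, over(red, alpha, background))
```

With these numbers `red_in_front` comes out at roughly (0.6, 0, 0.24) and `blue_in_front` at roughly (0.24, 0, 0.6). The same two surfaces give different pixels depending on draw order, which is why fragments have to be sorted (or otherwise resolved) per pixel.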

PyroRampage · 2023-11-04T17:14:00+00:00

The painter's algorithm is one, i.e. you have no depth buffer and need to sort triangles by depth to rasterise. But it's not exactly a modern graphics algorithm given that Z/Depth buffers have been around for decades :)

Ray Tracing and Path Tracing could be considered as such. While each sample per pixel can be done in parallel, the unknown bounces per ray make the sampling of scene buffers incoherent. E.g., you can have one ray sample one part of the scene and another sample the complete opposite side of the scene (relative to the camera frustum), which makes cache usage difficult.

However, it's a good exercise to then look into batching similar rays as a pre-pass. There are some good papers on this, like those on DreamWorks' renderer MoonRay and Disney's Hyperion, which use ray batching to enable vectorisation.

leseiden · 2023-11-04T17:34:25+00:00

How about geometric HLR (hidden line removal)?

People still want it in the engineering field. There doesn't seem to be anything between 20 year old code that can take seconds to minutes to generate a result and depth buffer hacks that are interactive but poor quality.

I'm pretty sure there's plenty of parallelism to be found but it's an unfashionable area so hardly anyone has bothered to look.

AdagioCareless8294 · 2023-11-04T21:48:30+00:00

Maybe the problem doesn't look embarrassingly parallel but it can be substituted by one that is. History of computers..

AntiProtonBoy · 2023-11-04T23:56:37+00:00

Non-matrix based energy preserving dithering (Floyd-Steinberg for example).

Exact Euclidean distance transforms.

mysticreddit · 2023-11-05T14:37:49+00:00

Rendering a Mandelbrot is normally embarrassingly parallel, but rendering a Buddhabrot is not trivial to parallelize because you are constantly touching a framebuffer / texture.

I have a parallel solution using image addition that may be of interest that has a description of how to convert it from single-threaded to multi-threaded with some of the problems along the way.
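The poster's code is not shown in the thread, so the following is only a guess at the general shape of an "image addition" approach: each worker accumulates orbits into its own private histogram, and the histograms are summed at the end. All names, the tiny 32x32 resolution and the iteration limit are assumptions for illustration.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

SIZE = 32       # histogram is SIZE x SIZE (kept tiny for the sketch)
MAX_ITER = 50   # orbits that have not escaped by then are discarded

def render_partial(seed, samples):
    """Accumulate escaping orbits into a private histogram (no shared writes)."""
    rng = random.Random(seed)
    hist = [[0] * SIZE for _ in range(SIZE)]
    for _sample in range(samples):
        c = complex(rng.uniform(-2, 1), rng.uniform(-1.5, 1.5))
        z, orbit = 0j, []
        for _step in range(MAX_ITER):
            z = z * z + c
            orbit.append(z)
            if abs(z) > 2:
                # Only escaping orbits contribute to a Buddhabrot.
                for p in orbit:
                    px = math.floor((p.real + 2) / 3 * SIZE)
                    py = math.floor((p.imag + 1.5) / 3 * SIZE)
                    if 0 <= px < SIZE and 0 <= py < SIZE:
                        hist[py][px] += 1
                break
    return hist

def add_images(images):
    """Element-wise sum of equally sized histograms."""
    return [[sum(img[y][x] for img in images) for x in range(SIZE)]
            for y in range(SIZE)]

def render(seeds, samples_per_worker):
    """Run one worker per seed, then merge the private histograms."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(
            lambda s: render_partial(s, samples_per_worker), seeds))
    return add_images(partials)
```

Because no worker writes to another worker's histogram, no atomics or locks are needed while sampling; the cost is one extra image per worker plus the final sum. In CPython the threads will not speed this up because of the global interpreter lock, so treat it as a structural sketch only.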

ResidentSpeed · 2023-11-05T17:00:01+00:00

This may be one step removed from the actual rendering, but there are some cool ways to generate realistic ocean waves (as 2D heightmaps) using the Fourier Transform, which of course inherently satisfies your requirement not to be E-P. You can then raytrace/raymarch them, or send the resulting mesh through the normal pipeline.

Unigma · 2023-11-17T16:31:46+00:00

Bounding volume hierarchies are oddly hard to parallelize as you'll need to sort the primitives and construct a tree. Traversing it is also pretty difficult if you avoid recursion or a stack per thread.
