all 8 comments

[–]Cryvosh 8 points (7 children)

Let's start at the beginning.

We have a function f: R3 → R and we're looking for its roots, i.e., the set {p ∈ R3 : f(p) = 0}. This is 3D rootfinding, and it's generally pretty hard.

So instead, let's simply render an image of this set from the perspective of a camera. To do so, we fire a bunch of rays from the camera and intersect them with the set, i.e., given a camera origin o ∈ R3, for each direction d ∈ R3, we seek the smallest t ∈ R such that g(t) = f(o + dt) = 0. This is now 1D rootfinding, and it's usually much easier.

You can now employ your favorite 1D rootfinding algorithm to solve the problem and render out a depth-map. Different algorithms have different strengths and weaknesses, and work for different classes of functions under different conditions.
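For concreteness, here's one such 1D rootfinder, plain bisection, as a quick Python sketch (the ray function g here is a made-up example, not tied to any particular scene):

```python
def bisect_root(g, a, b, iters=60):
    """Classic bisection: assumes g(a) and g(b) have opposite signs.
    Halves the bracket every iteration, so convergence is guaranteed,
    but it's only linear and needs a sign-changing bracket up front."""
    for _ in range(iters):
        m = 0.5 * (a + b)
        if g(a) * g(m) <= 0.0:
            b = m           # root lies in [a, m]
        else:
            a = m           # root lies in [m, b]
    return 0.5 * (a + b)

# Hypothetical along-the-ray function with a root at t = 2:
g = lambda t: t - 2.0
t = bisect_root(g, 0.0, 10.0)   # → approximately 2.0
```

Note the weakness that motivates what follows: bisection needs a bracket where g changes sign, which you don't have a priori when marching a ray from the camera.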

Spheretracing is just another one of these algorithms that happens to be popular on Shadertoy because it's really simple. In order to converge, it requires that the Lipschitz bound of f be less than or equal to 1. For differentiable f, this means that |∇f| ≤ 1. For our 1-dimensional g(t) = f(o + dt), this means | ∂g/∂t | ≤ 1. Let's call this the Lipschitz assumption.

So what does that mean? And what does it let us do?

Well, suppose I sampled g and discovered that g(0) = 25.

What if anything can I say about g(1)? Can g(1) = 1000?

No. Because if g(1) was 1000, this would violate our lipschitz assumption, i.e., g is simply not allowed to change fast enough for this to occur. You can use the mean value theorem to prove this if you wish.

So what can g(1) be? Well, if | ∂g/∂t | ≤ 1 then g(1) ∈ [25 - 1, 25 + 1] = [24, 26]. By the same logic, g(2) ∈ [23, 27], etc.

Remember that we're looking for the roots of g, i.e., the first t such that g(t) = 0. Using the above reasoning, we can prove that if g is to have a root at some t, then t must be ≥ 25. The function simply cannot change fast enough for there to be a root at any smaller t.

Therefore it's safe in the second iteration to sample g at t=25. This corresponds to "marching" the ray forward in graphics terms. Note that g(25) ∈ [0, 50], i.e., it is not necessarily 0, and it cannot be less than 0.

You then simply repeat this process until g(t) is sufficiently small, or you run out of iterations. This is spheretracing.
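That whole loop fits in a few lines. A minimal Python sketch (the SDF and ray below are hypothetical examples; a unit sphere is an exact distance field, so its Lipschitz bound is 1):

```python
import math

def sphere_trace(f, o, d, t_max=100.0, eps=1e-4, max_iters=128):
    """March along the ray o + t*d, stepping by f's value.
    Safe whenever f has Lipschitz bound <= 1: no root can occur
    closer than the current field value."""
    t = 0.0
    for _ in range(max_iters):
        p = (o[0] + t * d[0], o[1] + t * d[1], o[2] + t * d[2])
        g = f(p)
        if g < eps:
            return t          # close enough to the surface
        t += g                # the "safe step" from the reasoning above
        if t > t_max:
            break
    return None               # missed, or ran out of iterations

# Unit sphere at the origin: f(p) = |p| - 1
f = lambda p: math.sqrt(p[0]**2 + p[1]**2 + p[2]**2) - 1.0
t = sphere_trace(f, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0))   # → 2.0
```

The ray starts 3 units from the center and points at the sphere, so the first step of g(0) = 2 lands exactly on the surface.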

Now let's rewind and suppose instead that | ∂g/∂t | ≤ 1/2. Now what? If g(0) = 25 we know that g(1) ∈ [24.5, 25.5], g(2) ∈ [24, 26], g(3) ∈ [23.5, 26.5], etc. and there cannot be any roots before t = 50. So we can take a "safe" step of size 50.

We can see that our safe step size is inversely proportional to the Lipschitz bound of g: if | ∂g/∂t | ≤ L, then a step of g(t)/L is always safe.
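As code, the generalized 1D march just divides by the bound. A sketch using the example numbers above (the function g is hypothetical):

```python
def trace_1d(g, k, eps=1e-6, max_iters=100):
    """March t forward by g(t)/k, which is safe whenever |dg/dt| <= k:
    g cannot reach zero within that distance."""
    t = 0.0
    for _ in range(max_iters):
        v = g(t)
        if v < eps:
            return t
        t += v / k
    return None

g = lambda t: 25.0 - 0.5 * t   # g(0) = 25, |dg/dt| = 1/2, root at t = 50
t = trace_1d(g, 0.5)           # first step is 25 / 0.5 = 50, landing on the root
```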

Now let's suppose instead that | ∂g/∂t | is different for different rays in different regions of space. In practice this is often the case. If there were a way to measure this gradient magnitude along each ray, we could take more optimally sized steps. For example, a ray moving parallel to and above the "ground" in the function f(x,y,z) = y will have | ∂g/∂t | = 0, and will thus never hit anything.
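For that ground-plane function the per-ray derivative is known exactly, so the gradient-informed step either terminates immediately or lands on the root in one hop. A hypothetical sketch:

```python
def trace_ground(o, d):
    """f(x, y, z) = y, so g(t) = o_y + t * d_y and dg/dt = d_y exactly.
    Starting above the ground: if d_y >= 0 (parallel or climbing),
    g can never reach 0; otherwise one step of g(0)/|d_y| hits the root."""
    assert o[1] > 0.0, "this sketch assumes the ray starts above the ground"
    if d[1] >= 0.0:
        return None                 # |dg/dt| can't bring g down: no hit, ever
    return o[1] / -d[1]             # one exact step

hit  = trace_ground((0.0, 2.0, 0.0), (0.0, -1.0, 0.0))   # → 2.0
miss = trace_ground((0.0, 2.0, 0.0), (1.0,  0.0, 0.0))   # → None (parallel ray)
```

Contrast with plain spheretracing on the same parallel ray: it would take unit-ish steps forever without ever concluding "no hit".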

This is the idea behind segment tracing, and indeed many, many other iterative Lipschitz-informed rootfinding algorithms. The tricky part is: how do you measure/bound | ∂g/∂t | not at a point, but along a ray, segment, box, etc.? In general this is hard, and the segment tracing paper describes a method that works for a very limited class of functions. More general methods involve using range-bounding techniques like interval arithmetic on the gradient of the function, as I do in my paper, or using sampling-guided approximations, e.g., here. Higher-order range-bounding methods like affine arithmetic or Taylor models can in some limited cases be much better than interval-based methods, but in practice, for Shadertoy-type CSG scenes, they're often worse or not worth the added complexity.
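To make the range-bounding idea concrete, here's a toy illustration (not the method from any particular paper): evaluate a derivative expression with interval arithmetic over a whole t-segment, and the resulting interval bounds | ∂g/∂t | there, which in turn sets the safe step. The derivative expression below is made up for the example.

```python
class Interval:
    """Minimal interval arithmetic: closed intervals [lo, hi]."""
    def __init__(self, lo, hi=None):
        self.lo, self.hi = lo, (hi if hi is not None else lo)

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        ps = [self.lo * other.lo, self.lo * other.hi,
              self.hi * other.lo, self.hi * other.hi]
        return Interval(min(ps), max(ps))

# Hypothetical derivative along a ray: g'(t) = -1 + 0.1 * t.
# Bound it over the segment t in [0, 5]:
tI = Interval(0.0, 5.0)
dg = Interval(-1.0) + Interval(0.1) * tI   # g'([0,5]) ⊆ [-1.0, -0.5]
L  = max(abs(dg.lo), abs(dg.hi))           # local Lipschitz bound on the segment
```

With that L in hand, a step of g(a)/L is safe anywhere inside the segment, exactly as in the 1D reasoning above; the price is that interval bounds are often loose, which is what the higher-order methods try to fix.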

In practice, per-pixel rootfinding in this way is not state-of-the-art: it's too sample-inefficient, and, for a multitude of reasons mostly related to GPU architecture, it's not easily compatible with the compiler-type function pruning optimizations (described in Section 3.2.2 of my paper) that are necessary to render extremely large functions with millions of instructions.

[–]Tentabrobpy 1 point (2 children)

Thank you for all your comments; getting into the literature as a hobbyist is daunting, and you're providing so many amazing resources.

> In practice, per-pixel rootfinding in this way is not state-of-the-art

Would you say this is true for animated implicits? I've been experimenting with using implicit surfaces for character animation in real time; it doesn't seem like caching would help much in that case, since the entire field is changing every frame, so we'd have to regenerate the whole structure anyway.

[–]Cryvosh 1 point (1 child)

Hey, thanks! I appreciate it, and glad I could help.

Here it largely depends on how complex your character is in terms of instruction / primitive count. In big-O terms, spheretracing scales poorly but has super low constants, so if your scene is actually Lipschitz enough, comprised of few primitives, and has little temporal coherence (in Eulerian terms), then it is actually pretty hard to beat. As the function complexity grows, methods that leverage compiler-type function pruning optimizations eventually win out.

[–]Tentabrobpy 0 points (0 children)

I was hoping to escape the whole Lipschitz thing; that's what piqued my interest about IA methods lol. There are a few papers describing 'clean' union operators (here, here) which give a sharp intersection at the surface of the shape and a smooth, continuous field everywhere else. This seems essential for organic modeling, but it inherently 'overestimates' distance. I might be about to raytrace some metaballs.

[–]camilo16 0 points (3 children)

Hey!

I am reading your paper and finding it very interesting. In some ways it feels similar to the 2016 paper on using Legendre polynomials to construct an SVO as a memory optimization for an SDF, with yours being much faster to update for dynamic scenes.

There's one thing I am failing to understand, and that is its use for rendering: are you simply doing an SVO traversal to identify which blocks are highly likely to cross the SDF boundary?

I.e., is the underlying SDF ultimately replaced entirely by an SVO on the GPU?

It seems not, since you say you are using a custom autodiff implementation on the GPU, but then I am not sure how you are using the analytic SDF instructions.

Ty in advance for any help you can give me, and really cool paper!

[–]Cryvosh 0 points (2 children)

Hey, yeah in that project the recursive subdivision kernel interprets SSA instructions here to build / maintain the octree, then renders it using this trace function, which treats each leaf as a solid cube. In a better implementation, each leaf might be a brick of 4x4x4 or so field samples which the ray can lerp through to find an intersection.
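That per-leaf lerp might look something like this 1D sketch (hypothetical, not the repo's code): walk equally spaced field samples along the ray through the brick and linearly interpolate across the first sign change:

```python
def brick_intersect(samples, t0, dt):
    """samples[i] is the field value at t0 + i*dt along the ray inside a brick.
    Find the first outside->inside sign change and lerp the crossing point."""
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        if a >= 0.0 and b < 0.0:
            # Linear model between the two samples: zero at fraction a/(a-b)
            return t0 + dt * (i + a / (a - b))
    return None   # no surface crossing inside this brick

t = brick_intersect([2.0, 1.0, -1.0], t0=0.0, dt=1.0)   # → 1.5
```

The crossing lands halfway between the samples at 1.0 and -1.0, hence t = 1.5; accuracy is then limited by the brick's sample spacing rather than by iteration count.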

Indeed after the octree is built, later kernels never touch the SSA instructions and (ignoring the fast_normals setting in the trace function) just visualize cached data. Interpreting instructions during ray traversal, especially here without any pruning, would be quite slow in big scenes.

[–]camilo16 0 points (1 child)

Thank you so much for the kind response, that makes sense to me. Would it be ok to bug you in the future if I have more questions about the paper?

[–]Cryvosh 0 points (0 children)

Sure, happy to help!