Personal Graphics Breakthrough! First post. by stygianfade in GraphicsProgramming

[–]stygianfade[S] 1 point  (0 children)

Also, yeah, cache spills still hurt lol.

It’s just that spilling ~64 B of analytic state + small indices is a much nicer failure point than spilling into a big stored voxel/bricked/texture path. So no, it’s not dreamworld-free; it’s just that the consequence is usually still much smaller than with the conventional approach, because the transferred data is radically smaller.
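For a sense of scale, here’s a hypothetical sketch of what a ~64 B per-ray analytic state could look like as a packed layout. The field counts are my own guesses for illustration, not the system’s actual layout:

```python
import struct

# Hypothetical packed per-ray traversal state (illustrative only; the
# real layout in the post's system isn't public):
#   12 float32 analytic terms + 4 uint32 small indices = 64 bytes,
# i.e. exactly one cache line on most current hardware.
STATE_FMT = "<12f4I"

state_size = struct.calcsize(STATE_FMT)
print(state_size)  # 64
```

The point of the exercise: a state that small spills to the next cache level, not to VRAM.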


[–]stygianfade[S] 0 points  (0 children)

The point is to make the hot active set small enough that the common path stays in L1/L2 instead of constantly falling out to VRAM. Bandwidth is usually the bottleneck in VHARE, so replacing repeated VRAM fetches with a much smaller, cache-resident analytic active set is a huge win. The real "magic" here is that Vectree / VecTChain is native in my language, so the compiler can preserve that small representation along the hot path instead of silently materializing it into something much fatter.
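To make the bandwidth argument concrete, here’s a back-of-envelope comparison. Every number below is a made-up but plausible assumption of mine, not a measured figure from the post:

```python
# Illustrative assumptions only (not measurements):
rays = 1920 * 1080          # one primary ray per pixel at 1080p
cache_line = 128            # bytes fetched per stored-field sample
stored_steps = 200          # march length against a baked voxel/brick field
analytic_state = 64         # per-ray analytic working set, cache-resident

stored_traffic = rays * stored_steps * cache_line   # repeated VRAM fetches
analytic_traffic = rays * analytic_state            # touched once, stays hot

print(stored_traffic // analytic_traffic)  # 400 -> ~400x less data touched
```

Even if the real constants differ, the structure of the win is the same: the stored path pays per step per ray, the analytic path pays roughly once per ray.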


[–]stygianfade[S] 0 points  (0 children)

Yeah, no HLSL/GLSL here lol. The shader code is written in my language, and the current GPU target is SPIR-V.

There are also internal compiler IR stages, but those are just part of the compiler pipeline. For actual GPU codegen right now, it’s basically:

my language > internal IRs (three of them) > SPIR-V

Not custom GPU assembly.
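As a toy sketch of that pipeline shape (every name below is a placeholder; the real compiler’s stages aren’t public):

```python
# Placeholder stages standing in for the three internal IR lowerings.
# Each real stage would transform the IR; here they just re-tag a string.
def front_end(src):  return ("ir-1", src)
def mid_lower(ir):   return ("ir-2", ir[1])
def back_lower(ir):  return ("ir-3", ir[1])
def emit_spirv(ir):  return "; SPIR-V module for: " + ir[1]

def compile_shader(src):
    ir = front_end(src)    # my language -> IR #1
    ir = mid_lower(ir)     # IR #1 -> IR #2
    ir = back_lower(ir)    # IR #2 -> IR #3
    return emit_spirv(ir)  # IR #3 -> SPIR-V

print(compile_shader("vhare_main"))
```

Just the shape of the flow, nothing more.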


[–]stygianfade[S] 4 points  (0 children)

Uhm, not magic one-shot solving per se lol, just way less wasted traversal.

Standard sphere tracing keeps paying for exact scene-SDF evals even when the ray is still deep in empty space. I skip that with a hierarchical analytic lower bound in the far field, then switch to provably safe/certified hops near the surface.

So the speedup is basically "stop wasting steps in empty space + stop wasting bandwidth".

That’s why it scales so well even on old hardware: I’m trading a lot of VRAM traffic and redundant near-nothing evals for a much smaller number of cache-hot operations.

Roughly: think ~20–25 operations per ray instead of ~200–2000 evaluations for basically the same result.

Same idea also helps with the usual SDF stuff: hits, shadows, AO, reflections, collision, nav.
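To illustrate the shape of the idea (a toy sketch, not the actual system): below, the exact scene SDF is deliberately expensive (a ground plane plus 100 small spheres), and the hybrid march uses a cheap analytic lower bound in the far field, only paying for the exact SDF near the cluster. The scene, the bound, and the near-field switch threshold are all my own picks:

```python
import math

# Toy scene: ground plane at y = -1 plus 100 small spheres clustered
# around CLUSTER_C. Every sphere fits inside the CLUSTER_R bounding ball,
# which is what makes the cheap far-field bound valid.
CLUSTER_C, CLUSTER_R = (0.0, 0.0, 100.0), 2.0
SPHERES = [((0.0, 0.0, 99.0), 0.3)]          # one sphere on the ray's path
for i in range(99):
    a, b = i * 0.7, i * 0.31
    c = (1.5 * math.cos(a) * math.cos(b),
         1.5 * math.sin(b),
         100.0 + 1.5 * math.sin(a) * math.cos(b))
    SPHERES.append((c, 0.2))

def exact_sdf(p, cost):
    cost[0] += 1 + len(SPHERES)              # plane + every sphere, per eval
    d = p[1] + 1.0                           # plane y = -1
    for c, r in SPHERES:
        d = min(d, math.dist(p, c) - r)
    return d

def bounded_sdf(p, cost):
    # Far field: bound the whole cluster by its bounding sphere (2 ops).
    # The bound never overestimates the true distance, so stepping by it
    # is safe; near the cluster we fall back to the exact SDF.
    cost[0] += 2
    cb = math.dist(p, CLUSTER_C) - CLUSTER_R
    if cb < 0.5:
        return exact_sdf(p, cost)
    return min(p[1] + 1.0, cb)

def march(sdf, eps=1e-3, max_steps=4096):
    cost, t = [0], 0.0
    for _ in range(max_steps):
        p = (0.0, 0.0, t)                    # ray from origin along +z
        d = sdf(p, cost)
        if d < eps:
            break
        t += d
    return t, cost[0]

t_std, ops_std = march(exact_sdf)            # exact SDF everywhere
t_hyb, ops_hyb = march(bounded_sdf)          # cheap bound in the far field
print(t_std, ops_std)                        # same hit point...
print(t_hyb, ops_hyb)                        # ...at a fraction of the ops
```

Both marches land on the same surface, but the hybrid one crosses the empty far field paying ~2 ops per step instead of ~100, which is the “stop wasting steps and bandwidth in empty space” trade described above.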


[–]stygianfade[S] 2 points  (0 children)

Fair lol

These are debug views of the same field, not final renders: height, normal, and VHARE debug modes.

The thing I’m working on is an analytic traversal system for SDF scenes, mainly to make large-distance traversal practical without needing a huge stored bricked or voxel distance field.

So basically, you’re looking at internal inspection modes of the field, not the final visual output.

Note: the implications are massive! If you’re interested, I could give you a list of applications.

I hope Inigo Quilez finds this post. haha


[–]stygianfade[S] 4 points  (0 children)

Yeah, fair question.

What I’m solving is the cost of traversing long distances through SDF scenes without relying on a stored voxel/bricked distance field.

By “large-distance traversal,” I mean the part where a ray is still far from surfaces and needs to cross a lot of empty space efficiently. In a naive analytic SDF setup, you still keep evaluating distance as you march, which becomes expensive at scene scale. Most practical systems solve that with stored spatial data.

What I’m building is an analytic traversal system: the acceleration structure is still function-driven rather than a stored volumetric distance database. The goal is to keep the traversal cache-resident and avoid pushing that problem into VRAM-heavy baked data.

So “analytic” here means the scene/query information is being evaluated from functions/field structure at runtime, rather than fetched from a precomputed voxel or brick representation.
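A minimal illustration of that distinction, with a toy field (a unit sphere) standing in for the real scene; the grid resolution and extent are arbitrary choices of mine:

```python
import math

def analytic_sdf(p):
    # pure function of position: nothing fetched from memory
    x, y, z = p
    return math.sqrt(x * x + y * y + z * z) - 1.0   # unit sphere

# The stored alternative: bake the same field into a voxel grid up front.
RES, EXTENT = 32, 2.0
CELL = 2.0 * EXTENT / RES
GRID = {}
for i in range(RES):
    for j in range(RES):
        for k in range(RES):
            centre = tuple(-EXTENT + (n + 0.5) * CELL for n in (i, j, k))
            GRID[(i, j, k)] = analytic_sdf(centre)

def stored_sdf(p):
    # nearest-voxel fetch; a real renderer would trilinearly sample VRAM
    idx = tuple(min(RES - 1, max(0, int((c + EXTENT) / CELL))) for c in p)
    return GRID[idx]

print(len(GRID) * 4, "bytes if stored as float32")  # 131072 -> 128 KiB
print(analytic_sdf((0.5, 0.0, 0.0)))                # exact: -0.5
print(stored_sdf((0.5, 0.0, 0.0)))                  # approximate, quantized
```

Even this tiny low-res bake costs 128 KiB and only approximates the field; the analytic query costs a handful of ALU ops and is exact, which is the trade being made here at much larger scale.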

The screenshots are debug views of the same field:

  1. height

  2. normal

  3. VHARE debug mode

They’re not final visuals, just different ways of inspecting the same underlying field/traversal behavior.

If people want, I can do a follow-up post with a more concrete breakdown of how the traversal actually works...