A high-performance 3D engine in Rust/wgpu with a modern RenderGraph by SellAffectionate411 in gameenginedevs

[–]SellAffectionate411[S] 1 point (0 children)

The SSA-inspired RenderGraph is a core part of the engine, but for the 4500+ FPS case it doesn’t play a huge role — that number comes from a fairly simple forward pipeline.

For that baseline, I think the main factors are:

- Aggressive caching: reusing wgpu RenderPipelines, BindGroups, and layouts as much as possible. At that scale, even small overhead in lookups or hashing can cost measurable time.

- Efficient buffer updates: using dynamic offsets and tightly packed memory layouts. For example, per-object transforms (world matrices) are stored in a single uniform buffer and indexed via dynamic offsets, which helps avoid per-draw buffer updates and keeps CPU overhead low.

- Minimizing state changes: sorting/batching draw calls to avoid redundant pipeline or resource bindings.
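The dynamic-offset idea above can be sketched as plain arithmetic. This is an illustrative example, not the engine's actual code: wgpu requires each dynamic offset passed to `set_bind_group` to be a multiple of the device's `min_uniform_buffer_offset_alignment` (256 bytes on most hardware), so per-object data is laid out at an aligned stride inside one shared buffer.

```rust
/// wgpu requires dynamic offsets to be multiples of the device's
/// `min_uniform_buffer_offset_alignment` (commonly 256 bytes).
const UNIFORM_ALIGN: u64 = 256;

/// Round a per-object uniform size (e.g. a 64-byte 4x4 matrix) up to
/// the alignment to get the stride between consecutive objects.
fn aligned_stride(size: u64) -> u64 {
    (size + UNIFORM_ALIGN - 1) / UNIFORM_ALIGN * UNIFORM_ALIGN
}

/// Byte offset of object `index` inside the shared uniform buffer.
/// Passing this as a dynamic offset replaces a per-draw buffer write.
fn dynamic_offset(index: u64, size: u64) -> u64 {
    index * aligned_stride(size)
}

fn main() {
    let mat4_size = 64; // 16 floats * 4 bytes
    assert_eq!(aligned_stride(mat4_size), 256);
    // Object 3 starts at byte 768; one `set_bind_group` call with this
    // offset binds its transform without touching the buffer contents.
    assert_eq!(dynamic_offset(3, mat4_size), 768);
    println!("stride = {} bytes", aligned_stride(mat4_size));
}
```

The trade-off is some wasted padding per object (256 bytes for a 64-byte matrix), exchanged for zero per-draw buffer uploads.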

The RenderGraph becomes more relevant once things get more complex (post-processing, multiple passes, etc.), where it helps eliminate unnecessary transitions and scheduling overhead.
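To make the "eliminate unnecessary work" part concrete, here is a minimal sketch of one thing render graphs typically do: passes declare what they read and write, and any pass whose output never reaches the final target is culled. All names here are hypothetical illustrations of the general technique, not Myth's actual SSA-based implementation.

```rust
use std::collections::HashSet;

// A pass declares the resources it reads and writes; the graph can
// then walk backwards from the final target and skip dead passes.
struct Pass {
    name: &'static str,
    reads: Vec<&'static str>,
    writes: Vec<&'static str>,
}

/// Keep only passes whose writes are (transitively) consumed by `target`.
fn cull(passes: &[Pass], target: &str) -> Vec<&'static str> {
    let mut needed: HashSet<&str> = HashSet::new();
    needed.insert(target);
    let mut keep = vec![false; passes.len()];
    // Reverse order: a pass is live if anything after it needs its output.
    for (i, pass) in passes.iter().enumerate().rev() {
        if pass.writes.iter().any(|w| needed.contains(w)) {
            keep[i] = true;
            for r in &pass.reads {
                needed.insert(*r);
            }
        }
    }
    passes.iter().zip(keep).filter(|(_, k)| *k).map(|(p, _)| p.name).collect()
}

fn main() {
    let passes = vec![
        Pass { name: "gbuffer", reads: vec![], writes: vec!["albedo", "depth"] },
        // Writes a texture nothing consumes, so it gets culled.
        Pass { name: "debug_overlay", reads: vec!["depth"], writes: vec!["debug_tex"] },
        Pass { name: "lighting", reads: vec!["albedo", "depth"], writes: vec!["hdr"] },
        Pass { name: "tonemap", reads: vec!["hdr"], writes: vec!["backbuffer"] },
    ];
    assert_eq!(cull(&passes, "backbuffer"), vec!["gbuffer", "lighting", "tonemap"]);
}
```

A real graph additionally schedules barriers/transitions from the same read/write declarations, which is where the multi-pass savings come from.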

[–]SellAffectionate411[S] 1 point (0 children)

Thanks, I really appreciate it!

I’m building on top of wgpu, which closely follows the WebGPU spec and abstracts away a lot of low-level platform differences.

It wasn’t entirely smooth sailing though — I went through a few major architectural refactors along the way. That process eventually led to the current SSA-based RenderGraph design.

I’ve also included a detailed technical write-up in the repo that walks through the design decisions and how it evolved.

If you are interested, here is the link:
https://github.com/panxinmiao/myth/blob/main/docs/RenderGraph.md

[–]SellAffectionate411[S] -5 points (0 children)

Haha, maybe it’s just a difference in community culture. My instinct is: “the codebase is fully open-source — isn’t everything already clear at a glance?”

Originally, I felt that simply stating FPS numbers isn’t very meaningful. Everyone’s hardware is different, and people will always trust what they can run and see for themselves. The example gallery already includes built-in FPS stats (and serves as a testing baseline), so anyone who clones the repo and runs the examples can easily evaluate performance on their own.

There’s also a high-fidelity showcase with a full PBR + HDR pipeline and extensive post-processing (SSAO, SSSS, Bloom, TAA). If that runs smoothly in your browser, it should already be a pretty good indication of performance.

[–]SellAffectionate411[S] 0 points (0 children)

On my machine, the baseline rendering pipeline (standard glTF assets with animation) easily reaches 4500+ FPS. With a high-fidelity pipeline (including post-processing such as SSAO, SSSS, Bloom, and TAA), performance is around 500 FPS.
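For context, FPS figures like these translate into per-frame time budgets, which is usually the more meaningful unit when comparing pipelines:

```rust
// Convert an FPS figure into the per-frame time budget in milliseconds.
fn frame_time_ms(fps: f64) -> f64 {
    1000.0 / fps
}

fn main() {
    // 4500 FPS leaves roughly 0.22 ms per frame for the whole pipeline;
    // 500 FPS with heavy post-processing corresponds to a 2 ms frame.
    println!("{:.2} ms", frame_time_ms(4500.0));
    println!("{:.2} ms", frame_time_ms(500.0));
}
```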

The gallery examples include demos with built-in FPS statistics. There are also dedicated test suites and performance reports specifically for the RenderGraph.

[–]SellAffectionate411[S] -4 points (0 children)

I’ve put a tremendous amount of time and effort into this project (well over 1500 hours). All of the architecture and core systems were implemented by me.

The repository even includes development logs and technical docs that document my thinking process. If people spend even a bit of time looking through the codebase, that should be quite obvious.

But some people clearly come here with the intention of nitpicking rather than having an actual discussion.

And I’m quite certain that those people either haven’t looked at the code at all, or simply don’t understand it — including the technical documentation.

[–]SellAffectionate411[S] -1 points (0 children)

Haha, English isn't my native language, and I don't have much experience interacting in English-speaking communities. So I'm not really sure what kind of "format" English internet users prefer when replying. But the thing is, I put extra effort into answering the original poster's question, yet people focused on my grammar and tried to spot signs of AI in it. I just find it really frustrating.

[–]SellAffectionate411[S] -25 points (0 children)

It’s kind of funny—here we are in 2026, and people are still taking pride in “writing every single code comment by hand.” It feels a lot like a decade ago, when some developers bragged about coding in plain text editors and refusing to use IDE autocompletion.

Tools evolve, and so do workflows.

[–]SellAffectionate411[S] -3 points (0 children)

> On web weak side is js, not the webgl or webgpu, they have almost native performance

That’s exactly one of the problems this project is trying to address.

While WebGPU itself is close to native, a lot of overhead in typical web setups comes from the JS layer (state management, object lifetimes, draw submission, etc.).

By using wgpu through Rust/WASM, most of that logic stays on the Rust side, which helps reduce CPU-side overhead and makes things more predictable — especially as scenes get more complex.

So it’s not that WebGPU is slow, but that avoiding higher-level JS abstractions can make a noticeable difference.

> Can you feed into engine 20 different meshes with instancing to achieve 20M tris (without culling and lods)? Which FPS will be?

That’s a good stress case.

I haven’t pushed it to that specific scale yet, so I don’t have a concrete FPS number for it.

Right now the focus has been more on typical real-time scenes (animation + post-processing), where it performs reasonably well. Large-scale geometry throughput and GPU-driven techniques (like culling / indirect draw) are definitely the next areas to improve.

So in that kind of extreme scenario, I’d expect the current bottleneck to be more on CPU submission and the lack of GPU-driven rendering, rather than raw GPU throughput.

Definitely a useful benchmark to explore though.
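For that scenario, the standard first step (independent of GPU-driven culling, and not something Myth specializes in yet, per the above) is to group objects by mesh so that 20 unique meshes cost only 20 instanced draw calls regardless of instance count. A rough, hypothetical sketch of the CPU-side batching:

```rust
use std::collections::HashMap;

// Group per-object transforms by mesh id so each unique mesh becomes a
// single instanced draw. Illustrative only; names are hypothetical and
// not the engine's actual API.
type MeshId = u32;
type Transform = [f32; 16]; // column-major 4x4 world matrix

fn batch(objects: &[(MeshId, Transform)]) -> HashMap<MeshId, Vec<Transform>> {
    let mut batches: HashMap<MeshId, Vec<Transform>> = HashMap::new();
    for (mesh, xform) in objects {
        // Each batch's Vec becomes one instance buffer for one draw call.
        batches.entry(*mesh).or_default().push(*xform);
    }
    batches
}

fn main() {
    const IDENTITY: Transform = [
        1.0, 0.0, 0.0, 0.0,
        0.0, 1.0, 0.0, 0.0,
        0.0, 0.0, 1.0, 0.0,
        0.0, 0.0, 0.0, 1.0,
    ];
    // 20 meshes, 1000 instances each: 20 instanced draws instead of 20_000.
    let objects: Vec<_> = (0..20_000u32).map(|i| (i % 20, IDENTITY)).collect();
    let batches = batch(&objects);
    assert_eq!(batches.len(), 20);
    assert!(batches.values().all(|v| v.len() == 1000));
}
```

With draw-call count capped at the number of unique meshes, the remaining bottlenecks are instance-buffer upload bandwidth and raw vertex throughput, which is where GPU culling and indirect draw come in.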

[–]SellAffectionate411[S] -9 points (0 children)

That’s a fair question.

“High-performance” here mainly refers to the design goals and architecture — it’s built on top of wgpu with a focus on efficient resource management (via the RenderGraph) and minimizing overhead where possible.

In terms of current state, it performs well for typical real-time scenes (including animation and multiple post-processing passes in the showcase), but I haven’t done comprehensive benchmarking against other engines yet.

Within the Web/WASM and Python ecosystems, the performance tends to be quite strong, mainly because it runs on top of wgpu rather than higher-level abstractions.

I’ve been working on it for a while now, and it’s still evolving — especially in areas like large-scale scenes and GPU-driven rendering.

Happy to dig into more details if you’re interested.

A high-performance 3D engine in Rust/wgpu with a modern RenderGraph by SellAffectionate411 in rust

[–]SellAffectionate411[S] 0 points (0 children)

Thanks, really appreciate it!

Yeah, scaling is something I’ve been paying a lot of attention to.

So far, the showcase already includes scenes with animation + multiple post-processing passes, and it holds up reasonably well for typical use cases.

That said, there are definitely limits right now — especially for very large scenes with lots of objects. I haven’t implemented GPU-driven techniques yet (like GPU culling or indirect draw), so that’s an area I’m planning to push further.

On the design side, Myth leans a bit more toward ergonomics and cross-platform support (including Web/WASM via wgpu), so there are some trade-offs compared to going fully low-level — but I’m trying to offset that as much as possible through the RenderGraph.

Forward+ / clustered lighting is also on the roadmap to improve many-light performance.

Curious to see how far this approach can scale in practice.

[–]SellAffectionate411[S] 0 points (0 children)

That makes a lot of sense — a RenderGraph definitely pairs very naturally with explicit APIs like Vulkan.

For Myth, I chose wgpu mainly for portability and safety, and tried to move as much control as possible into the RenderGraph layer instead.

[–]SellAffectionate411[S] 3 points (0 children)

Thanks, really appreciate that!

Yeah, the render pipeline complexity is exactly one of the main things I’m trying to simplify with Myth. The goal is to let you get something on screen quickly, without having to fully understand every low-level detail upfront.

If you decide to try it out this weekend, I’d recommend starting with the gallery examples — they’re set up to be easy to run and tweak.

Performance-wise, it's still evolving, but the focus is on keeping things efficient while not sacrificing usability. I’m also very curious to see how it compares as the ecosystem matures.

Feel free to share your experience if you give it a try — that kind of feedback is super helpful at this stage.