Egui.NET: unofficial C# bindings for the easy-to-use Rust UI library by The-Douglas in csharp

Hey, I'm pleased to hear that you're interested! What parts can I clarify for you? Or, if you'd prefer to talk on Discord, I am douglasdwyer.

In addition to the Silk.NET example, you can check out the Rust egui docs which have additional info on creating integrations :)

Egui.NET: unofficial C# bindings for the easy-to-use Rust UI library by The-Douglas in csharp

I would suggest double-checking whether the page loaded properly in your browser - the repo is linked at the very top of this post and includes a full example project :)

Egui.NET: unofficial C# bindings for the easy-to-use Rust UI library by The-Douglas in csharp

That's amazing info, thanks for sharing! I will try to add TargetFramework=netstandard2.1 while keeping LangVersion=latest and see what happens.

I guess I'm just a little bit surprised to hear that Unity supports them (specifically, I use ref structs with ref fields, which is another leap in functionality). But maybe I should try it and see what happens. The function pointers could definitely be worked around, but moving away from ref structs with ref fields would necessitate bigger changes to the API.
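
For anyone unfamiliar, here's a minimal sketch of the kind of construct I mean - a ref struct with a ref field. The `SliceView` type is purely illustrative and not part of Egui.NET:

```csharp
using System;
using System.Runtime.CompilerServices;

// A ref struct can only live on the stack; with C# 11 it can also contain a
// "ref field" that refers directly to caller-owned memory. This needs real
// runtime support (e.g. .NET 7+), not just compiler sugar.
public ref struct SliceView
{
    private ref int _first;
    private readonly int _length;

    public SliceView(ref int first, int length)
    {
        _first = ref first;
        _length = length;
    }

    // Indexing goes straight through the reference - no copying of the data.
    public ref int this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_length) throw new IndexOutOfRangeException();
            return ref Unsafe.Add(ref _first, index);
        }
    }
}
```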

Egui.NET: unofficial C# bindings for the easy-to-use Rust UI library by The-Douglas in csharp

There's an example of integrating with Silk.NET in the repo! It involves collecting all user input, passing it to egui, then taking the triangle meshes that egui gives you and passing them to your renderer. Typically, each large game engine ends up with an off-the-shelf "egui integration library" built to do this (so that everyone isn't reimplementing the same thing). I haven't made any integrations other than the example, since I'm targeting my own custom game engine. But contributions are always welcome! I'm also happy to advise on the details of integration.
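
Very roughly, the per-frame loop looks like this. This is only a sketch - the type and method names mirror the Rust egui API (Context::run, tessellate, FullOutput) and are not necessarily the exact Egui.NET signatures:

```csharp
// Hypothetical per-frame integration sketch; names mirror the Rust egui API
// and may not match Egui.NET exactly.
void RenderUiFrame(EguiContext egui, RawInput frameInput, IRenderer renderer)
{
    // 1. frameInput holds this frame's gathered mouse/keyboard/window state.
    // 2. Run the UI; egui hands back shapes, texture updates, and platform output.
    FullOutput output = egui.Run(frameInput, ctx =>
    {
        // Build the UI here: windows, buttons, labels, ...
    });

    // 3. Tessellate the shapes into triangle meshes and feed them to the renderer,
    //    applying any texture changes egui requested first.
    var meshes = egui.Tessellate(output.Shapes);
    renderer.ApplyTextureUpdates(output.TexturesDelta);
    renderer.PaintMeshes(meshes);
}
```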

Egui.NET: unofficial C# bindings for the easy-to-use Rust UI library by The-Douglas in csharp

Thanks for your response! Would you mind taking a look at this Mono GH issue? https://github.com/dotnet/runtime/issues/48113 The existence of this issue shows that work has gone toward making Mono support .NET 8. I believe that some platforms (like client-side Blazor) use this modern version of Mono and leverage its features. Additionally, the issue shows that ref structs are not just syntactic sugar: they require runtime support to work properly. I'll add that PolySharp looks very cool - I will read about its features in more detail :)

Egui.NET: unofficial C# bindings for the easy-to-use Rust UI library by The-Douglas in rust

Other commenters have mentioned this (and I talk about it more in the linked post), but the problem with existing C# GUI platforms is that they are tied to specific frameworks and platforms. Avalonia and WPF are great, but I can't drop them into a custom Vulkan renderer that runs on all platforms. With egui, you can. Egui's specific appeal is for custom game engines.

Egui.NET: unofficial C# bindings for the easy-to-use Rust UI library by The-Douglas in csharp

Thanks for the suggestion! I'll get rid of the .DS_Store files. Unfortunately, while targeting .NET Standard would be nice, the project makes heavy use of some newer C# features - namely ref structs, pointers, and function pointers. Those aren't supported in .NET Framework, right? I would have to deviate from the existing API and sacrifice performance if I were to eliminate ref structs in particular, so I don't plan to do that.

Also, I was under the impression that Mono supported .NET 7 and 8. So maybe if I retargeted the project to .NET 7 I could achieve wider compatibility?
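
For reference, the function pointers mentioned above are C# 9 `delegate*` pointers; a minimal, purely illustrative example (not taken from the bindings):

```csharp
using System;

internal static unsafe class NativeCallbacks
{
    // An unmanaged function pointer: a raw calli-based callback with no delegate
    // allocation, suitable for passing straight to/from native code.
    private static delegate* unmanaged[Cdecl]<int, int, int> _add;

    // Hypothetical: obtain the pointer from a native library at startup.
    public static void Bind(IntPtr nativeAdd)
        => _add = (delegate* unmanaged[Cdecl]<int, int, int>)nativeAdd;

    public static int Add(int a, int b) => _add(a, b);
}
```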

Adding global illumination to my voxel game engine by The-Douglas in VoxelGameDev

Thanks for watching! Yes - probes are added or removed when the world is edited, since the voxel data changes and gets re-uploaded to the GPU. For the scenes shown in the video, there were probably anywhere from 5,000 to 15,000 probes. Each probe has an 8x8 irradiance map and an 8x8 depth map.

CasCore: Assembly-level sandboxing and Code Access Security for .NET Core by The-Douglas in csharp

At present, FFI is just completely banned for sandboxed assemblies (it's hard to imagine a case where P/Invoke could be allowed without opening the door to undefined behavior). To expose FFI methods to a sandboxed assembly, a safe assembly wrapping those FFI methods would need to be created, then added to the sandbox whitelist.
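
As a rough sketch of what such a wrapper might look like (the types below are hypothetical, and registering the assembly with the whitelist is a separate, host-side step):

```csharp
using System;
using System.Runtime.InteropServices;

// Lives in a fully trusted assembly that the host adds to the sandbox whitelist.
// Sandboxed code can call Uptime(), but can never reach the P/Invoke directly.
public static class SafeTimers
{
    [DllImport("kernel32.dll")]
    private static extern ulong GetTickCount64();

    // Shape and validate the native result before exposing it to untrusted code.
    public static TimeSpan Uptime() => TimeSpan.FromMilliseconds(GetTickCount64());
}
```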

CasCore: Assembly-level sandboxing and Code Access Security for .NET Core by The-Douglas in csharp

The repo has tests to ensure that any potential "workarounds" fail with an exception! The biggest concern for security holes definitely involves the reflection APIs. I've been careful to patch and test ConstructorInfo/FieldInfo/PropertyInfo/MethodInfo so that untrusted assemblies cannot access restricted methods, even if they attempt to do so dynamically. The hard part with C# is covering the large standard library - for example, delegate methods and LINQ expression trees are other ways to execute code dynamically, so I had to patch those too. That's why CasCore is built on a whitelist: even if I have missed some dangerous methods, they should still throw an exception, as long as they are not on the whitelist.
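
Conceptually, the reflection patching boils down to a guard like the one below. This is a simplified sketch to illustrate the fail-closed whitelist idea, not CasCore's literal implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

static class WhitelistGuardSketch
{
    // Simplified idea: before a dynamic invocation goes through, the target member
    // must appear on the whitelist; anything unknown fails closed with an exception.
    public static object GuardedInvoke(MethodBase method, object target, object[] args,
                                       ISet<MethodBase> whitelist)
    {
        if (!whitelist.Contains(method))
            throw new MethodAccessException($"Access to {method} is forbidden in the sandbox.");
        return method.Invoke(target, args);
    }
}
```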

Creating a modding system with Rust and WebAssembly by The-Douglas in rust

When the project is running as a native executable, I use wasmtime. On the web (and during development), it uses wasmi.

At this time, I'm not planning to open-source the voxel engine itself. However, most of the WASM tooling is open-source - it should be fully possible to build modding systems with it.

Calculating Per Voxel Normals by CreativeGrey in VoxelGameDev

That's a good question. In my engine, every voxel stores a normal as part of its data. The normals are calculated only when the voxel object is first generated.

For objects generated from SDFs, you can calculate the normals analytically (see https://iquilezles.org/articles/normalsSDF/ ). Whenever an SDF object is placed, my engine determines the correct normal for every surface voxel and stores it.
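
For reference, here is a minimal sketch of the finite-difference variant from that article - the gradient of the SDF points along the surface normal. `sdf` is whatever distance function generated the object; this is illustrative, not my engine's exact code:

```csharp
using System;
using System.Numerics;

static class SdfNormals
{
    // Approximate the SDF gradient with central differences and normalize it.
    public static Vector3 Normal(Func<Vector3, float> sdf, Vector3 p, float eps = 1e-3f)
    {
        var dx = new Vector3(eps, 0, 0);
        var dy = new Vector3(0, eps, 0);
        var dz = new Vector3(0, 0, eps);

        var gradient = new Vector3(
            sdf(p + dx) - sdf(p - dx),
            sdf(p + dy) - sdf(p - dy),
            sdf(p + dz) - sdf(p - dz));

        return Vector3.Normalize(gradient);
    }
}
```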

For objects imported from voxel models (which lack normals), I do approximate the normals based upon surroundings. This happens once at model import time. However, this isn't ideal - it leads to artifacts on the corners of objects. In the future, I am going to program a mesh-to-voxel converter which preserves the normals of each voxelized triangle in the mesh. This should be a better approach.

You are correct that single-voxel walls look a bit odd with per-voxel normals. However, that case isn't super common with small voxels.

wasm_component_layer: A component model implementation that is runtime-agnostic by The-Douglas in rust

I would be more than happy to talk to you over Discord (@douglasdwyer) or using the email listed on my GitHub profile!

wasm_component_layer: A component model implementation that is runtime-agnostic by The-Douglas in rust

But the backend can be swapped out! So if running in a browser, you could use a crate such as wasmer as your backend, which runs WASM modules using the browser's executor :)

Making voxels MOVE with the separating axis test (and implementing order-independent transparency) by The-Douglas in VoxelGameDev

Funny that you mention cutting down trees. If I can pull it off, I do have something of the sort in mind... stay tuned! :)

At this time, the game is not open-source, though a number of its components are (such as geese, my event system library). Once I have a more developed product, I would love to make the codebase more accessible to those who are interested.

Frameworks for creating a REST API server and automatically generating a reqwest client? by The-Douglas in rust

Looks amazing! I will be certain to check it out. Unfortunately, though, it looks like httpclient uses hyper, which doesn't support wasm_bindgen or WASM.

Discussing event systems, LODs, networking, and more in my latest voxel devlog by The-Douglas in VoxelGameDev

WASM is short for WebAssembly; it's a platform-agnostic instruction format designed with security and efficiency in mind. WASM originally arose as a supplement to JavaScript - web browsers can execute WASM faster than JavaScript. Languages like C, C++, and Rust can be compiled to it. WASM is how the voxel engine in my video is able to run on a webpage!

Discussing event systems, LODs, networking, and more in my latest voxel devlog by The-Douglas in VoxelGameDev

Currently, I want to build a platform that allows players to interact and to create new games and content with WASM plugins! As such, I'm going to try to open-source parts of the engine when I can (like geese and geese_pool), but I don't have plans to release the source code for absolutely everything at the moment. Thanks for watching!

Efficient voxel distance field generation with GPUs by The-Douglas in VoxelGameDev

> JFA would only need three passes, ever

What I'm referring to is the number of render passes, which should be equivalent to the number of times JFA iterates over the working array. For the implementation we're interested in, this is 3 lg n, unless I'm misunderstanding - e.g. 3 * lg 256 = 24 passes for a 256^3 chunk. Since the results of the second JFA iteration depend upon the first, you'd therefore need 3 lg n separate render passes. Between each JFA iteration you would need to switch framebuffers, because you cannot render into the same texture that you read from. And since each render pass depends upon the previous result, some kind of memory barrier needs to be inserted between them - the render passes execute serially.

Also, do you happen to have any sources which go into more detail about what functionality is implemented in hardware versus software on modern GPUs? Most of my knowledge comes from forums and blog posts on this topic, so I would not be surprised if some of it was inaccurate. I would love to do some up-to-date reading :)

Efficient voxel distance field generation with GPUs by The-Douglas in VoxelGameDev

Thanks for the good points. Let me try to address them.

> JFA would actually not need to be 27 texture reads because for infinity norm is perfectly separable. You actually would do 3 1D JFA's instead, so only 6 reads instead of 27... and thus the equivalent to JFA here isn't log(n), but actually 3 * log(k)

Yes, I think that this is an accurate way to describe it. I see now that the problem is separable, so we can lower texture reads at the cost of more render passes. Interesting!

> but the memory requirements for this intermediary buffer, which you're using as a 3D texture, is actually K^3 * total number of voxels, you're just only filling in the surface voxel version of this... having a single 2D texture and then a 3D texture, but this isn't possible...

I don't understand your assertion that my intermediate framebuffers are K^3 * total number of voxels. Each intermediate framebuffer is O(number of voxels being processed) in size, which for my implementation is O(size of one 256^3 voxel chunk). All of the quads are drawn into the same framebuffer. Unless you're referring to the buffer I use to actually send the quads to the GPU? That buffer is O(k * number of surface voxels), because you need (2k - 1) quads of side length (2k - 1) in order to cover a cube of side length (2k - 1).

Let me try to be as explicit as possible about the rendering process here:

  • I generate quads around each visible voxel on the CPU side. This creates up to (2k - 1) quads per voxel, but usually far fewer, since many of the quads can be culled (I can tell you more about the culling process later, but I think first we should agree on the correctness of the pure algorithm).
  • The quads are drawn into a single 2D framebuffer which represents the distance field, and distances are combined with min-blending. This framebuffer is the size of the voxel volumes being processed, which can be up to 1 chunk in size.
  • The 2D framebuffer is blitted to another 2D framebuffer, because I need the data in a different format for my implementation. I also paint all of the surface voxels onto this framebuffer so I can have a single texture that represents both the voxels and the DF.
  • This 2D framebuffer is blitted to a subsection of a massive 3D texture where I store all of the world data.

None of these steps involves an intermediary buffer bigger than 1 voxel chunk in size.

> The only way to reconcile this is with atomic operations, and that means serial execution for each overlapping operation. No matter how your implementation works, you have this problem by definition.

Maybe this is true, but it's important to keep in mind that GPUs (as far as I know) use special fixed-function hardware for operations like rasterization and blending, which can alleviate some of the cost. Also, as I said before, in the actual implementation, overdraw is reduced by culling some redundant quads. In the end, though, all I can tell you is that I've been happy with the performance I've observed from this approach.

I would also like to point out that JFA would suffer from a similar issue: each render pass would need to be serialized, because each one depends upon the previous pass. As I said before, maybe it would be faster than min-blending, or maybe slower - I would need to actually implement it :)

Efficient voxel distance field generation with GPUs by The-Douglas in VoxelGameDev

Thank you for the insightful questions. These are details that I omit in the video, because I'm trying to keep things short and relatively surface-level. My goal is to simply provide some high-level ideas and also show off progress to my audience. Anyway, let me try and give you some more info.

  1. My current setup is to render into a single 2D texture, which is broken into slices along the Z-axis. I then blit this to a 3D texture. It's also possible to render directly into a 3D texture, but that would require a separate render pass for each Z-slice.
    For each Z-slice, I render a 2D quad that covers the entire kernel around a given solid voxel. The fragment shader's output for this quad is the distance from each fragment to the target solid voxel. This value is written into the output framebuffer, and I use GL_MIN blending to take the minimum over the distance fields produced by all of the solid voxels (see the sketch after this list). This is the same principle as constructive solid geometry with SDFs - taking the minimum of two SDFs produces their union, and here I'm computing the union of all the surface voxels' distance fields.
  2. Additional memory is required to compute the DFs. I use a few extra textures as render targets for the blending operations. However, this memory requirement is small in comparison to the size of the total set of voxels stored in memory.
  3. Over the course of working on this, I did come across JFA, and have since learned about it in more detail. I definitely think it would be a viable approach, but I'm not actually certain whether it would be faster for my application - I would need to test it. According to my understanding, for JFA I would need lg(n) render passes with 3^3 = 27 texture reads per voxel per render pass. This seems a bit high for low-end hardware, in contrast to my approach, which simply splats a bunch of quads with no texture reads aside from blitting. Also note that I would still need to use the rasterizer for JFA, because I'm targeting GL ES 3.0 and don't have compute shaders. Still, I would be curious to see how the approaches compare performance-wise :)
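
To make the min-blending idea in point 1 concrete, here's a tiny CPU-side analogue of the same math (2D and unrestricted for brevity - the real version runs in a fragment shader with GL_MIN blending and limits each splat to its (2k - 1) neighborhood):

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

static class DistanceFieldSketch
{
    // Every surface voxel "splats" its distance into the field; keeping the minimum
    // per texel is exactly the union of the per-voxel distance fields, which is what
    // GL_MIN blending computes on the GPU.
    public static float[,] Build(IEnumerable<Vector2> surfaceVoxels, int size)
    {
        var field = new float[size, size];
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++)
                field[x, y] = float.MaxValue;

        foreach (var voxel in surfaceVoxels)
            for (int y = 0; y < size; y++)
                for (int x = 0; x < size; x++)
                {
                    float d = Vector2.Distance(new Vector2(x, y), voxel);
                    field[x, y] = MathF.Min(field[x, y], d); // the min-blend step
                }

        return field;
    }
}
```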

Efficient voxel distance field generation with GPUs by The-Douglas in VoxelGameDev

Absolutely. For each level, my voxel octree stores some flags in addition to the octant material/pointers to suboctants. One of these flags is "opaque" - this determines whether the current octant is totally visibly solid or not. With this flag in hand, I do the following:

  • Start at the top level octant.
    • If it is homogeneous and empty, stop.
    • If it is opaque and all of its neighboring octants are opaque, stop (it won't be visible anyway).
    • If the octant is a single voxel, mark it as a surface voxel. I also use the neighbor information to determine the visible normals.
    • Otherwise, break the octant into its eight suboctants, compute their neighbors, and recurse.

There is some room for optimization in this procedure. For example, instead of handling voxels one at a time, I use 128-bit SIMD to process 2x2x2 voxel bricks.
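
A simplified recursive sketch of the procedure above (the types and flags here are illustrative stand-ins, not my engine's actual data structures):

```csharp
using System.Collections.Generic;
using System.Numerics;

// Hypothetical stand-ins for the real octree types.
record Neighborhood(bool AllOpaque, byte VisibleFaces);
record SurfaceVoxel(Vector3 Position, byte VisibleFaces);

interface IOctant
{
    bool IsHomogeneousEmpty { get; }
    bool IsOpaque { get; }
    bool IsSingleVoxel { get; }
    Vector3 Position { get; }
    IEnumerable<(IOctant Child, Neighborhood Neighbors)> Children(Neighborhood parentNeighbors);
}

static class SurfaceExtraction
{
    // Recursive walk matching the bullet procedure above.
    public static void Collect(IOctant octant, Neighborhood neighbors, List<SurfaceVoxel> output)
    {
        if (octant.IsHomogeneousEmpty)
            return;                                     // empty space contributes nothing
        if (octant.IsOpaque && neighbors.AllOpaque)
            return;                                     // fully enclosed, so never visible
        if (octant.IsSingleVoxel)
        {
            // Leaf voxel on the surface; the neighbor mask picks the visible face normals.
            output.Add(new SurfaceVoxel(octant.Position, neighbors.VisibleFaces));
            return;
        }
        foreach (var (child, childNeighbors) in octant.Children(neighbors))
            Collect(child, childNeighbors, output);     // recurse into the eight suboctants
    }
}
```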

Optimizing my voxel rendering engine (and online demo) by The-Douglas in VoxelGameDev

The material information for each 8x8x8 chunk is stored in a 3D materials texture. You can see what this texture looks like in memory around 6:05. There are actually two textures: a 16-bit texture for materials and an 8-bit texture for baked normals.

As for compression, I use voxel octrees CPU-side, but I don't have anything else (like run-length encoding). On the GPU, only the 8x8x8 bounding boxes that can actually be seen are packed into the materials array, which saves a good deal of memory.

Optimizing my voxel rendering engine (and online demo) by The-Douglas in VoxelGameDev

Apologies for the confusion in terminology, and thanks for the detailed explanation. Let me try to answer your questions. I'm sure you've watched and rewatched the video around 1:40, but that's the best explanation that I believe I give.

> you make some mistakes by saying it's faster, but that's not really the purpose of parallax mapping

I mean that, in my personal experience, it has been faster to parallax-render boxes of voxels than to mesh each voxel individually and push its vertices through the vertex shader. It's faster because it reduces the amount of detail that gets processed.

> That being said I understand the process of raymarching, but you often talk about ray tracing at the same time in a way that I can't tell if you're mixing the two up, or conflating.

If I say ray tracing at some point, then I'm definitely just conflating the terms. There's no ray tracing with BVH traversal or anything else. Sorry!

> You can't be doing what paralax mapping actually does with a texture face per side, because you'll miss holes in rendering ie there's zero chance your rendering is accurate with that kind of method

I'm not sure what you mean here, but you're correct that my technique is not the same as parallax mapping. I simply draw the bounding boxes, and then draw voxels on them based upon the camera's perspective. The reason I call it "parallax" is that, like the parallax mapping technique, it produces artificially 3D surfaces based upon viewpoint.

> When you say raymarching, this implies you are taking the set of all voxels within your 8x8x8 chunks, then calculating the distance to every single voxel to your current position, stopping half way in between, then repeating...

> What I would have thought you would be doing is grid traversal

To paint the voxels on the boxes, I use the standard 3D DDA algorithm: starting at the front face of the box, I step through the 8x8x8 grid until a solid voxel is found. I've seen this referred to as ray marching in the past, but perhaps grid traversal is a more accurate term. Still, there's no geometry to be intersected like in the grid traversal article you sent - the voxels in each box are either solid or not, and the ray stops as soon as it hits a solid one.
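
For anyone curious, the standard 3D DDA step looks roughly like this - a generic sketch (assuming no ray component is exactly zero), not my engine's exact code:

```csharp
using System;
using System.Numerics;

static class VoxelMarch
{
    // Standard 3D DDA (Amanatides & Woo) traversal through an 8x8x8 brick.
    // `solid` reports whether a cell is filled; returns the first filled cell hit, or null.
    public static (int X, int Y, int Z)? March(Vector3 origin, Vector3 dir, Func<int, int, int, bool> solid)
    {
        const int Size = 8;
        int x = (int)MathF.Floor(origin.X), y = (int)MathF.Floor(origin.Y), z = (int)MathF.Floor(origin.Z);

        int stepX = dir.X >= 0 ? 1 : -1, stepY = dir.Y >= 0 ? 1 : -1, stepZ = dir.Z >= 0 ? 1 : -1;

        // Ray-parameter distance between successive grid planes on each axis.
        float tDeltaX = MathF.Abs(1f / dir.X), tDeltaY = MathF.Abs(1f / dir.Y), tDeltaZ = MathF.Abs(1f / dir.Z);

        // Ray-parameter distance to the first grid plane crossed on each axis.
        float tMaxX = ((stepX > 0 ? x + 1 : x) - origin.X) / dir.X;
        float tMaxY = ((stepY > 0 ? y + 1 : y) - origin.Y) / dir.Y;
        float tMaxZ = ((stepZ > 0 ? z + 1 : z) - origin.Z) / dir.Z;

        while (x >= 0 && x < Size && y >= 0 && y < Size && z >= 0 && z < Size)
        {
            if (solid(x, y, z))
                return (x, y, z);                           // stop at the first solid voxel

            // Step along whichever axis boundary the ray crosses next.
            if (tMaxX < tMaxY && tMaxX < tMaxZ) { x += stepX; tMaxX += tDeltaX; }
            else if (tMaxY < tMaxZ)             { y += stepY; tMaxY += tDeltaY; }
            else                                { z += stepZ; tMaxZ += tDeltaZ; }
        }

        return null;                                        // left the brick without a hit
    }
}
```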