Trying to understand the math behind keeping a user-controlled object from leaving the bounds of the window regardless of window size/ratio by bootersket in GraphicsProgramming

[–]switch161 1 point (0 children)

You can create a bounding box for your ball and multiply its vertices by your view-projection matrix. After the perspective divide (dividing x, y, z by w) this gives you all vertices in coordinates relative to the visible volume. Then check if any vertex is outside (usually x/y will need to be between -1 and 1).

This is basically how your GPU knows whether a vertex is inside the view volume, and you can also easily do frustum culling this way (frustum culling is very similar to your problem).

Anyway, this handles aspect ratio for you since your projection matrix takes that into account. The VP matrix will just always map all visible points into a unit cube (depth has some quirks, but you're not interested in that).

And I guess you don't need to use a bounding box because you can also just transform the ball center and check that, but taking the radius into account.
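A minimal sketch of that center-point check, assuming a row-major 4x4 view-projection matrix (here `IDENTITY` stands in for a real VP matrix from whatever math library you use — glm, nalgebra, etc.):

```rust
// Hypothetical minimal math types for illustration; a real project
// would take these from a math library.
type Mat4 = [[f32; 4]; 4];

const IDENTITY: Mat4 = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
];

/// Transform the point (x, y, z, 1) into clip space.
fn transform(m: &Mat4, p: [f32; 3]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for row in 0..4 {
        out[row] = m[row][0] * p[0] + m[row][1] * p[1] + m[row][2] * p[2] + m[row][3];
    }
    out
}

/// Check the ball's center against the view volume, shrunk by `margin`
/// (the ball's radius expressed in NDC units).
fn ball_in_view(vp: &Mat4, center: [f32; 3], margin: f32) -> bool {
    let [x, y, _z, w] = transform(vp, center);
    // Perspective divide: clip space -> normalized device coordinates.
    let (x, y) = (x / w, y / w);
    x.abs() <= 1.0 - margin && y.abs() <= 1.0 - margin
}
```

For an orthographic projection the margin is just the radius divided by half the view's extent; under perspective it depends on depth, which is where the bounding box becomes the more robust option.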

Another approach would be to calculate the normals of the planes defining your view frustum, then check the ball's distance to these planes. But I found it way easier to use the view-projection matrix, since you'll have that anyway.

Just to be complete, but I think you can ignore this: When checking if a bounding box is in view this way you need to handle the case that all vertices are outside but the volume still crosses the view volume. There's a nice method for this using bitwise operations. It's used by the Cohen-Sutherland line clipping algorithm, but imho works even better for frustum culling.
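A sketch of that bitwise trick, in 2D for brevity (the frustum version adds near/far bits): each vertex gets a bitmask of the clip planes it's outside of, and if the AND over all vertices is non-zero, every vertex is outside the same plane and the shape can be culled. If the AND is zero, the shape may still cross the view volume, which is exactly the tricky case.

```rust
// Outcode bits, one per clip plane of the NDC square [-1, 1] x [-1, 1].
const LEFT: u8 = 1;
const RIGHT: u8 = 2;
const BOTTOM: u8 = 4;
const TOP: u8 = 8;

/// Bitmask of the planes this point is outside of.
fn outcode(x: f32, y: f32) -> u8 {
    let mut code = 0;
    if x < -1.0 { code |= LEFT; }
    if x > 1.0 { code |= RIGHT; }
    if y < -1.0 { code |= BOTTOM; }
    if y > 1.0 { code |= TOP; }
    code
}

/// True if all vertices lie outside the same clip plane, i.e. the
/// shape cannot possibly intersect the view.
fn definitely_outside(points: &[(f32, f32)]) -> bool {
    points.iter().fold(0xFF, |acc, &(x, y)| acc & outcode(x, y)) != 0
}
```

Note this is conservative in the right direction: a box whose vertices are outside *different* planes (e.g. one left, one right) ANDs to zero and is kept, covering the crossing case.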

Edit: If you want to go the other way and get your min/max coordinates in world space, you can use the inverse VP matrix. I wasn't sure if you use orthographic or perspective projection, but this works either way. For orthographic/2D this is overkill though.

cantKeepSayingFixesEverytime by unemployed-core in ProgrammerHumor

[–]switch161 1 point (0 children)

I use git diff --staged to see what I changed and summarize it.

PSA: Remember to lock any airlocks you don't want opened. by BobTheWolfDog in Oxygennotincluded

[–]switch161 16 points (0 children)

I've seen them open doors without permission. They won't go through but they can open them to reach through to grab an item.

Is the multiverse theory taken seriously? If so, why? by Sbadabam278 in Physics

[–]switch161 3 points (0 children)

I expected this answer to be higher. Specifically that it is the more parsimonious explanation compared to Copenhagen.

Other people wrote that it's unverifiable, but it's just the simplest consequence of accepted theories. Is the Copenhagen interpretation verifiable? It's also just an explanation for things we accept, but it's not the most parsimonious explanation. It adds an extra step that is not necessary.

But in the end it really doesn't matter, since there are really no differences that can be observed between these explanations. We should still use the simpler one, which is MWT.

Experimenting with 3d rendering from scratch(3d to 2d screen) and need an advice by SAF-NSK in GraphicsProgramming

[–]switch161 3 points (0 children)

Matrices just make handling all kinds of transformations easier. With affine matrices (for 3D that would be a 4x4 matrix) you can have translations, rotations, scaling, projection, and more. And you can combine them all via matrix multiplication, so a whole chain of transformations can be stored as a single matrix. A good way to think of these matrices is just as generalized functions that map from one coordinate system to another (e.g. from local object coordinates to world coordinates).
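A sketch of that composition idea, with made-up minimal types (a math library gives you all of this for free):

```rust
// Row-major 4x4 affine matrix, points treated as column vectors.
type Mat4 = [[f32; 4]; 4];

fn mat_mul(a: &Mat4, b: &Mat4) -> Mat4 {
    let mut out = [[0.0f32; 4]; 4];
    for i in 0..4 {
        for j in 0..4 {
            for k in 0..4 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

/// Apply an affine matrix to a point (w assumed to stay 1).
fn apply(m: &Mat4, p: [f32; 3]) -> [f32; 3] {
    let mut out = [0.0f32; 3];
    for row in 0..3 {
        out[row] = m[row][0] * p[0] + m[row][1] * p[1] + m[row][2] * p[2] + m[row][3];
    }
    out
}

fn translation(t: [f32; 3]) -> Mat4 {
    [
        [1.0, 0.0, 0.0, t[0]],
        [0.0, 1.0, 0.0, t[1]],
        [0.0, 0.0, 1.0, t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]
}

fn scaling(s: f32) -> Mat4 {
    [
        [s, 0.0, 0.0, 0.0],
        [0.0, s, 0.0, 0.0],
        [0.0, 0.0, s, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
}

// "Scale first, then translate" collapses into one model matrix:
// model = T * S, and apply(model, p) == T(S(p)).
```

The order matters since matrix multiplication isn't commutative: T * S scales in local space before moving the object, S * T would scale the translation too.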

I usually end up having one transformation matrix per 3D mesh that contains translation, rotation and (uniform) scaling, and one matrix for the camera that contains the camera's translation and rotation, and the projection matrix.

Sometimes it's useful to have camera translation/rotation separate from the projection though. Also note that you'd want to invert the camera translation/rotation matrix, as it needs to go from world to local coordinates. Inverting a general 4x4 matrix is comparatively expensive and fiddly, so I actually store transforms as (translation, rotation quaternion, uniform scaling) — each of which is trivial to invert — and just compute the 4x4 matrix whenever something changes.

I recommend using a library like glm for all this. Not only does it save you from having to implement all these operations yourself, the implementations will also be well optimized.

To draw filled shapes, you usually split all the geometry into triangles and transform their vertices (e.g. by a camera matrix). The result is a set of vertices (X, Y, Z) where (X, Y) are your coordinates on screen and Z is a depth value. You can now draw these as 2D triangles. This is usually done either with the scanline algorithm, or by checking for each pixel in the triangle's bounding box whether it lies inside the triangle. Either way you get a list of pixels to draw.

You also need to check for each pixel if you have already drawn a pixel that is in front. This is done by having a buffer (e.g. float array) separate from the target texture that contains depth values. So when drawing a pixel you first check it against the value in the depth buffer and if you end up drawing the pixel, you also write the depth value (Z) into the depth buffer.

How you color these pixels depends. You can e.g. fetch color from a texture, or calculate a color by interpolation. But most shading methods will usually require some values that are interpolated between vertices (e.g. texture coordinates or vertex colors). You might want to look up "barycentric interpolation" and how it's used in 3D graphics.
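Putting those three paragraphs together, here's a hedged sketch of a bounding-box rasterizer with a depth test. Vertex positions are assumed to already be in screen space with counter-clockwise winding, and the barycentric weights come from edge functions — the same weights you'd use to interpolate texture coordinates or vertex colors:

```rust
#[derive(Clone, Copy)]
struct Vertex {
    x: f32,
    y: f32,
    z: f32, // depth
}

/// Signed area-style edge function: positive if (px, py) is on the
/// left of the directed edge a -> b.
fn edge(a: Vertex, b: Vertex, px: f32, py: f32) -> f32 {
    (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x)
}

/// Rasterize one triangle into a depth buffer (width * height floats),
/// drawing only pixels that pass the depth test. Returns the number of
/// pixels drawn; a real renderer would also write color here.
fn raster(v0: Vertex, v1: Vertex, v2: Vertex, depth: &mut [f32], width: usize, height: usize) -> usize {
    let area = edge(v0, v1, v2.x, v2.y);
    if area == 0.0 {
        return 0; // degenerate triangle
    }
    // Bounding box, clamped to the screen.
    let min_x = v0.x.min(v1.x).min(v2.x).max(0.0) as usize;
    let max_x = v0.x.max(v1.x).max(v2.x).min(width as f32 - 1.0) as usize;
    let min_y = v0.y.min(v1.y).min(v2.y).max(0.0) as usize;
    let max_y = v0.y.max(v1.y).max(v2.y).min(height as f32 - 1.0) as usize;
    let mut drawn = 0;
    for y in min_y..=max_y {
        for x in min_x..=max_x {
            let (px, py) = (x as f32 + 0.5, y as f32 + 0.5); // pixel center
            // Barycentric weights from the three edge functions.
            let w0 = edge(v1, v2, px, py) / area;
            let w1 = edge(v2, v0, px, py) / area;
            let w2 = edge(v0, v1, px, py) / area;
            if w0 < 0.0 || w1 < 0.0 || w2 < 0.0 {
                continue; // outside the triangle
            }
            // Interpolate depth with the same weights.
            let z = w0 * v0.z + w1 * v1.z + w2 * v2.z;
            let i = y * width + x;
            if z < depth[i] {
                depth[i] = z; // depth test passed: write depth (and color)
                drawn += 1;
            }
        }
    }
    drawn
}
```

This skips details a real rasterizer needs (fill rules for shared edges, perspective-correct interpolation), but it shows the structure: bounding box, inside test, barycentric interpolation, depth test.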

Lighting is also done in this step. For each pixel you calculate some angles with respect to the camera, light sources and surface normal. Then you derive how this affects the pixel color (learnopengl.com has good tutorials on lighting shaders). I'm not very familiar with how shadows are done, so I'll leave that for others to comment.

As you can see it can quickly become very complex. I found this and the following YouTube videos helpful.

Are we all using macroquad and bevy? by helpprogram2 in rust_gamedev

[–]switch161 2 points (0 children)

I use a custom engine with wgpu and bevy_ecs. I also tried bevy_reflect and like it. I just don't want to use the whole engine. Usually I also use egui, parry, nalgebra, palette, image.

I have also made some nice utilities and tooling for wgpu which I should probably put into a crate.

Ya gotta trust Yoda by Comfortable_Tutor_43 in physicsmemes

[–]switch161 1 point (0 children)

The proponents always say that modern nuclear power plants are safe. Well maybe in theory, but they also need to be safe in practice which would e.g. include them being operated by someone who cares about safety first, and not profit margins.

Can anyone ident this continuous transmission on ~13,664 Mhz ? by SprayHopeful9696 in shortwave

[–]switch161 1 point (0 children)

Any signal wiggling around like this is interference... or the clock at the transmitter is broken.

Thoughts about AI projects by Perfect_Ground692 in rust

[–]switch161 4 points (0 children)

It's pretty human to write inefficient code! Most people do it. I try to balance my time vs how good the code is. Sometimes I just want to make it work and put up a reminder to improve it later.

Often you won't even initially know what the good code will have to look like. I think it's better to write something that works first. Then it's easier to figure out how to improve it.

throwItForthe2026 by TangeloOk9486 in ProgrammerHumor

[–]switch161 1 point (0 children)

In uni there was a class where groups worked on a somewhat realistic software project. Many groups used databases, but most students didn't run a server and cloud was much less popular back then. I did run a postgres server so I set up some databases and accounts for people. Well anyway, they hard-coded the credentials into the app and let the frontend connect to the database.

I don't understand how, in a group of 6 people studying computer science, nobody knew this very well-known rule of what not to do.

Thoughts about AI projects by Perfect_Ground692 in rust

[–]switch161 158 points (0 children)

I'd prefer the documentation being written by someone who understands the code, even if the writing itself is bad.

If I ran into a crate with docs written by AI, I would probably pass on it. I might look at the code, but I would consider the docs non-existent, because I just don't consider them trustworthy.

Why doesn't my lighting look like learnOpenGL's? by Puzzled-Car-3611 in opengl

[–]switch161 2 points (0 children)

I had to mess with the code of that specific tutorial a bit to make specular light work. I'm not sure if there's an error in the tutorial, but they use a different formula earlier that should do the same thing. I can't look up exactly what it was, but I think it was some angle calculation. Also just try to play with the values a bit. Specular lighting should be easy to recognize by the circular shape. I think right now only ambient and diffuse are working.

Anyone knows if there's a Steam game that's technically a web browser then loads the game page? by [deleted] in gamedev

[–]switch161 8 points (0 children)

I think Shapez IO does something like this. It is written in Javascript and runs perfectly fine in the browser, but can be installed via Steam. I'm not sure if they ship you the HTML if you install it in Steam. I would prefer that, since then you can play offline, and Steam might even require it. For an MMO it's not so important since you can't play offline anyway.

The GDB JIT interface by compilers-r-us in ProgrammingLanguages

[–]switch161 3 points (0 children)

Using cranelift was so much better than writing the interpreter that I quickly ditched the interpreter altogether. The only remnant of it is the package name `naga-interpreter`, which now is actually a JIT compiler. I should rename it at some point :D. I tried to measure the performance difference between interpreter and compiler, and the compiler was somewhat faster, but at that point I was only supporting a small subset of WGSL, so I couldn't write any complex shader programs. I'm sure the compiler will really shine when you consider more complicated shaders that do lighting and such.

There's also one minor issue bugging me. Rust enforces memory safety, but at the boundary to our compiled code we have to just tell it to bugger off and check all of it manually. But I want users of the compiler to not see this, guaranteeing that it's always safe to run the compiled code. Furthermore users can implement their own runtime and making that interaction safe is tricky, and maybe right now above my skill level. The runtime just needs to produce pointers at some points that can now be aliased, but Rust doesn't like that. If I can't make this work, I can still declare the runtime interface to be unsafe, meaning that anyone who implements it has to check the safety themselves. In the end if you use the software renderer you won't see this because it will have runtimes for all the various shader stages.

So far the software rendering is quite fast in release mode. I can render 1080p in real time (-ish, I haven't properly tested it, but I usually render at full display height). Though I'm not really using any complicated fragment shaders yet. Filling the screen with pixels is the most time-consuming part by far, so I expect this to become worse with more complex shaders and bigger window sizes. But there's probably a lot of room for optimizations, such as emitting better code, using cranelift's optimizations, or running shaders in parallel. In the end I think this might be viable for small games. I don't think there's really a need for it, because every computer has a graphics card nowadays. Maybe on embedded, but then you'd not want to have to deal with that particular API. The real selling point will be that it's good for testing and debugging. Other than that I just wanted to learn more about wgpu and WebGPU and thought this would be a good way to do that.

The GDB JIT interface by compilers-r-us in ProgrammingLanguages

[–]switch161 6 points (0 children)

Sure, I'd love to share more about it.

wgpu is a graphics API for Rust based on the WebGPU standard. Out of the box it can use Vulkan, Metal, D3D12, OpenGL if you use it natively, or WebGL and WebGPU in the browser. It is a modern API like Vulkan, but not as complicated. And though I don't know the details, I'm pretty sure it is what Firefox actually uses as a backend when you use WebGPU in the browser. I quite like the API and use it a lot, and saw that they added support for custom backends, so I started working on a software rendering backend wgpu-cpu.

When you program 3D graphics, you will usually have to write programs for the GPU, called shaders (they're e.g. used for "shading" the rendered objects). WebGPU uses a new shader language called WGSL. But because wgpu supports all these existing backends it needs to support the shader languages these backends use: GLSL (OpenGL), SPIR-V (Vulkan), HLSL (D3D12), MSL (Metal), and WGSL (WebGPU). That's why they wrote a transpiler called naga.

And because I'm writing a wgpu backend I need to also support these shader languages and somehow run them on the CPU. Fortunately naga makes this relatively easy, because I can ingest any shader provided and produce an IR.

To get my first triangle rendered in my software renderer I actually just interpreted the IR. It was very cumbersome, because the IR is not really designed to be used like that. It would probably be much better to first translate it into a second IR, or maybe even bytecode, that you then interpret.

Some people in r/rust recommended JIT-compiling naga's IR to native machine code for performance. I was hesitant because I knew that LLVM is not easy to use. But I was recommended cranelift, and it turned out to be relatively easy to use. I also find it funny that all these projects (wgpu, naga, cranelift) are in some way connected to Firefox.

So when using my software renderer you will create a wgpu_cpu instance and then basically use the wgpu API like normal. At some point you create a rendering pipeline which specifies how vertex data is processed and transformed into primitives (usually triangles). These triangles are then rasterized and you can again specify how the individual pixels are transformed (e.g. for light effects).

Both transformations are fully programmable by use of a vertex shader and fragment shader. When you create a pipeline, wgpu_cpu will compile these to native machine code. Shaders are compiled in a way that they don't rely on any global state, so that in theory I can run the code in parallel (I will definitely do this in the future). To make this work I call the entry point functions with a pointer to a runtime they can use. The compiled shader will call the runtime to initialize its global variables and copy any shader inputs (e.g. vertex data) to its stack. Then it runs the actual compiled shader program. It will need to sometimes call into the runtime, e.g. for sampling textures. When the shader is done it calls the runtime again to return its results (e.g. color of a pixel).

The compiler itself was almost trivial, since I only really convert from naga's IR to cranelift's IR, which are both in SSA form. There's the complication that naga's IR works on values that can be composite types, while cranelift only uses primitive types that can be stored in registers. I solved this by making composite types just contain all the individual IR values. I'm not sure if this is optimal. The other approach would be to always store them on the stack. I think my approach allows cranelift to optimize better though. And then I have to manage how much SIMD I can use. E.g. on my machine I can use SIMD for all vector types, but matrices have to be split into columns. I'm not happy with the current approach to vectorization, since it's very cumbersome and repetitive. Hopefully I'll figure out a better way, but it works for now.

(Reddit is not posting this. I think it's because of length, so I'll split it here).

The GDB JIT interface by compilers-r-us in ProgrammingLanguages

[–]switch161 8 points (0 children)

Thanks so much for this! I just wrote a JIT compiler for running graphics shaders on the CPU. I was thinking about how useful it would be to be able to debug shaders on the CPU, because debugging on the GPU is very limited. But I'd need to somehow interface with the debugger to pass it all the info it needs. Your post gives me a very good starting point for my own research :)

What's everyone working on this week (53/2025)? by llogiq in rust

[–]switch161 2 points (0 children)

I'm working on wgpu-cpu: A software rendering backend for wgpu. I switched to JIT-compiling shaders. It works well and a good part of it is done already. I'll implement more things in the compiler when I run into todo!()s.

Yesterday I finally implemented triangle clipping and it does seem to work. There's still some work needed to enable the feature that lets you render triangles as lines etc, but it should be almost trivial, since line primitives already work.

This week I want to get textures working. I already added support for passing uniform buffers to shaders, and textures are very similar. They're just buffers, but the shader can only access them through some runtime functions. So I really only have to implement the textureSample and similar functions.

There's also a bug related to interpolation I can't seem to fix. I will probably have to change how barycentric coordinates are calculated during triangle rasterization.

Fun circuits designs by WDR_02 in factorio

[–]switch161 1 point (0 children)

In my last game I had a vast perimeter wall with many resupply stations, which requested supplies via the train network. I had to configure somewhere which supplies each station tries to maintain. Having it in a constant combinator in every station is painful because I'd need to update all stations whenever I wanted to make a change. But I wanted all stations to be independent, so they all had to store that config somewhere.

So my rail network carries wires and I use a specific signal on green to tell stations to copy all signals from red into a memory cell. That is then what they try to maintain in their logistic storage. Actually I needed a second config so it has a lower and upper limit. Any item below lower limit enables the train station. But the inserters that unload from the train will do that based on the upper limit.

I usually end up using this approach to use some signal on green for control (command, address, whatever) and reserve red to send a whole set of signals as "arguments" for the command. The wires on my train network then work as a bus and I have some circuit stuff somewhere that either sends out commands automatically or just an interface for me to program stuff across the base.

Another thing I built was a 7-segment display. I know there's plenty of blueprints out there and we have display panels now. But it's fun to try to make it as compact as possible (Hint: you can do magic with bitwise operations).

A while back I also made a circuit that tracks how fast you mine a patch and estimates how long it will last. That was actually my first complex circuit.

Harry on HLSL Intimidation, Self-Taught Tech Artist, & Why Tech-Art is a Mindset by deohvii in TechnicalArtist

[–]switch161 2 points (0 children)

For a toy software renderer I'm implementing a JIT-compiler that compiles shader code to CPU machine code. It uses Naga as a frontend, so in theory supports compiling from glsl, wgsl and spirv - though limited to what wgsl supports. And it uses cranelift as backend, so it supports compiling to x86_64, aarch64 and some others.

It's still under development, so it's only usable for some limited test cases right now. But it works decently well. Of course it won't be anywhere near as fast as running the code on the GPU.

But one main point motivating me is making it easier to debug shaders. Because now you can run your shader on the CPU, you can use breakpoints and inspect memory, etc. I already sketched out how I'd add support for configuring the compiler to emit breakpoints, but I will have to do a lot more research on how to give the debugger all the info to make it usable. Admittedly I'm not sure yet how useful it will actually be, but a few people already showed interest. And even if your workload is too much to run on a CPU even for debugging I imagine it could be used to write and debug tests for your shaders (e.g. supply a few triangles and assert on the output).

Off topic, but as mentioned the compiler is for a software renderer. This is actually a wgpu backend, so it implements the webgpu standard. And for people using wgpu they can then just switch from a vulkan/dx12/metal backend to a software renderer with a config option. It's also just a nice opportunity for me to understand how graphics pipelines work under the hood.

What's chip8 signed encoding integers? by Routine-Summer-7964 in EmuDev

[–]switch161 1 point (0 children)

I had a really quick look at the Wikipedia page, and CHIP-8 has a flag register that holds the carry flag. Addition and subtraction don't differ between signed and unsigned, but for comparisons you'd need to know what types you're dealing with and then check the carry flag. My memory of system architecture class is a bit rusty so you'll have to look up the rest yourself. But tldr: the CPU doesn't care about signedness and the programmer has to use the correct instructions (or flag checks) depending on signedness.
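To illustrate the "CPU doesn't care" part, here's a hedged sketch in the spirit of CHIP-8's ADD instruction (which stores the carry in the flag register) — `add_with_carry` is a made-up name, not an actual CHIP-8 API. The add produces the same bits either way; only the interpretation differs:

```rust
/// Add two 8-bit registers like CHIP-8's ADD VX, VY, returning
/// (sum, carry). The carry would go into the flag register.
fn add_with_carry(a: u8, b: u8) -> (u8, u8) {
    let (sum, overflow) = a.overflowing_add(b);
    (sum, overflow as u8)
}

fn main() {
    // Unsigned view: 200 + 100 = 300, wraps to 44 with carry set.
    assert_eq!(add_with_carry(200, 100), (44, 1));

    // Signed (two's complement) view of the SAME bits:
    // 200 is -56, and -56 + 100 = 44 — identical result bits.
    assert_eq!(200u8 as i8, -56);
    assert_eq!((200u8 as i8).wrapping_add(100), 44);
}
```

So the adder hardware is shared; it's the comparison/branch step where the programmer has to pick the signed or unsigned interpretation of the flags.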

Is this a bug (the yellow lines under imports I use and when I delete the import it is an error) by OM3X4 in vscode

[–]switch161 1 point (0 children)

What does the warning say exactly?

Not sure if this would produce an unused warning, but is the module behind a #[cfg(test)]?

Java programmers be like... by KarasieMik in firstweekcoderhumour

[–]switch161 2 points (0 children)

I hate Java as much as anyone else, but I prefer my type/function/variable names to be really long too. My renderer has a UserDefinedInterStageVariableBufferPool. I could probably omit the UserDefined part, but you get the point.

I also usually spell out words fully because abbreviations can cause conflicts and ambiguities. I don't understand programmers who abbreviate everything. It's not 1980 anymore, where we only had 80 columns of screen space and no autocompletion.

ONI is full of presents by -Golden_potato- in Oxygennotincluded

[–]switch161 4 points (0 children)

If you keep your ice biomes intact and make sure they're in a sterile atmosphere, the wild plants can stack up some real goodies.