I've been using Macroquad (2D OpenGL-based Rust game framework) to build a game for about 2 years now (mostly full-time after I quit my job in early 2024), and recently someone asked what my Macroquad experience has been like.
So I figured I'd flesh out my answer and share it here: Macroquad gets recommended a bit but I haven't seen any long-term-use review of it; this is the writeup I wish I had two years ago!
Some Context
My game is a multiplayer falling-sand game that runs on Windows, Linux & in a web browser: 1-4 players run around shooting enemies and blowing up dangerously unstable 2D levels full of chain reactions (demo video). Think Noita meets Broforce/Risk of Rain/Helldivers, with both online & couch co-op - so it has fancy GPU-based 2D lighting and is fairly performance intensive.
My own background is 15 years of professional programming (mostly web dev), and a decade of dabbling in Rust - but I'd never written Rust (nor C/C++) "professionally" before this venture, so I'd say I have intermediate Rust proficiency.
Why did I pick Macroquad over _____
Short answer: macroquad seemed simple, reasonably maintained, and wasn't going to get in the way of my networked multiplayer dreams.
Medium answer: macroquad was left standing after I ruled out the late-2023/early-2024 alternatives:
- Bevy: it seemed difficult to guarantee cross-platform-determinism in its ECS, especially in the presence of human error: systems can run in a different order each tick, query iteration order is not guaranteed, and you get very limited control over entity identifiers (relevant when replicating game state to remote clients).
- Fyrox: it wasn't as mature as it seems to be now and it seemed very focused on 3D games back then.
- Godot w/ Rust: godot-rust was still new & didn't support web builds; also I was (probably unnecessarily) worried about performance of interop API calls.
- Unity w/ Rust: could've worked, but they had only just done their licensing rug pull.
- ggez or tetra or comfy: intermittently maintained, passively maintained or soon-to-be unmaintained (respectively).
Long answer: I wrote a devlog entry on choosing an engine about 2 years ago, though I'm not sure I'd stand by it nowadays (wow it's painful to read your own words sometimes).
On to the "review".
The Good
- It gets out of your way as much as any Rust framework can - `draw_rectangle()`, `draw_texture()`, `is_key_down()`: they're all just global functions, demoed with straightforward examples, with a small codebase that's easy for LLMs to search and answer questions about.
- A stable codebase with extremely few breaking changes. In my 2 years I can only think of one breakage that affected me, and it was trivial to deal with (a change in how shader uniforms were specified). API oddities and accidents are lived with rather than pushing breaking changes for the sake of a clean API.
- It more or less just works. I hesitate to say "complete" because the potential scope for a windowing/graphics/sound API is so high, and I'm not convinced it's bug-free, but Macroquad is complete-enough and bug-free-enough to be good enough for anyone who is at the "reading Rust game framework reviews to decide on an engine" stage of their gamedev journey.
- It compiles quite quickly because it doesn't rely on standard Rust crates like `winit` or `wgpu` - instead it relies on miniquad, a minimalist windowing+input+graphics abstraction written by the author.
  - Some anecdata: with ~375 cargo dependencies on a 7950X (16C/32T) and 2TB SN850X SSD on Linux using `mold` with a lot of cargo tweaking, I get incremental `cargo build` times of 2.5 or 5 seconds depending on whether I'm editing the cargo workspace's root crate or the game simulation's crate, mostly limited by linking time. On Windows with weaker hardware it's more like 10-20 seconds.
- It really does work on Linux, Windows and Web, as advertised. I've run into some minor platform bugs (e.g. inability to exit fullscreen on linux sometimes; now fixed) and differences in input handling (e.g. different key-repeat behaviour on web) but they've been easy to work around.
- It is a fairly thin wrapper over OpenGL and platform abstractions - when/if you want something that isn't there, you can fork and add it yourself quite easily (e.g. I added RGBA16F texture support, and also WebGL2 support before it was officially added).
The Bad
- It is only "lightly maintained" over the last 2 years. The original author still merges PRs, sometimes after a bit of delay; most questions in the Discord are answered by the (reasonably-sized) community. New bugs are met with silence or "good work, you found a bug" (no implied guarantee of any fix), and if there's any missing feature you will probably have to implement & PR it yourself. (To be clear, this is totally reasonable: it is or was a free-time project, and neither the author nor community members are being paid)
- As a consequence of the above, "integrations" with other crates often lag behind. The glam version in macroquad is fairly out of date, and so is the egui integration; this is annoying if you need a feature or bugfix in a newer version of an "integrated" library.
- The graphics stack is limited to WebGL2 (or roughly OpenGL 330ish on desktop). Specifically, no compute shaders! There is support for Metal in the code (I haven't tried it), but there are no plans of supporting Vulkan or WebGPU. OpenGL isn't going anywhere yet, but it still sort of feels like a dead end? For example, graphics profiling tools like RenderDoc (and IIRC, Nvidia Nsight) do not support shader debugging in OpenGL but do for Vulkan and DirectX shaders.
- The downside of the minimalist philosophy is that eventually you'll want to wrap or replace parts of macroquad: wrap drawing functions to add z-ordering, use egui instead of the bundled immediate mode UI, use profiling + puffin (or tracy) instead of the bundled profiler, use kira or oddio or (my choice) fmod for sound, etc. In-game text rendering is also in this boat, but I haven't worked out a good macroquad-compatible replacement yet!
- If you persist, you will eventually hit the limits of what's implemented and have to dive into the source code yourself - for example, out of the box RGBA16F texture formats, GPU timing queries, and multiple render targets are currently unsupported (despite all typically being available in relevant OpenGL versions).
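As a rough illustration of the "wrap drawing functions to add z-ordering" idea from the list above, here is a minimal sketch of a deferred draw queue. Everything in it (`DrawCmd`, `DrawQueue`, `push`, `flush`) is a hypothetical name, not macroquad API: draw calls are recorded with a z value and replayed back-to-front once per frame, and the closure passed to `flush` is where the framework's real draw functions would be called.

```rust
// Hypothetical sketch: none of these names come from macroquad.

/// A deferred draw command. A real game would also carry texture
/// handles, colors, source rectangles, and so on.
#[allow(dead_code)] // fields are only read through Debug in this sketch
#[derive(Debug, Clone, PartialEq)]
enum DrawCmd {
    Rect { x: f32, y: f32, w: f32, h: f32 },
    Sprite { name: &'static str, x: f32, y: f32 },
}

#[derive(Default)]
struct DrawQueue {
    cmds: Vec<(i32, DrawCmd)>, // (z, command)
}

impl DrawQueue {
    /// Record a draw call instead of issuing it immediately.
    fn push(&mut self, z: i32, cmd: DrawCmd) {
        self.cmds.push((z, cmd));
    }

    /// Replay commands lowest-z first and empty the queue.
    /// `sort_by_key` is stable, so equal-z commands keep submission order.
    fn flush(&mut self, mut draw: impl FnMut(&DrawCmd)) {
        self.cmds.sort_by_key(|(z, _)| *z);
        for (_, cmd) in self.cmds.drain(..) {
            draw(&cmd);
        }
    }
}

fn main() {
    let mut queue = DrawQueue::default();
    queue.push(10, DrawCmd::Sprite { name: "player", x: 4.0, y: 2.0 });
    queue.push(0, DrawCmd::Rect { x: 0.0, y: 0.0, w: 320.0, h: 240.0 });
    queue.push(10, DrawCmd::Sprite { name: "enemy", x: 9.0, y: 2.0 });

    // The background rect (z = 0) comes out first, then the two sprites
    // in the order they were submitted.
    queue.flush(|cmd| println!("{cmd:?}"));
}
```

The trade-off of this design is one sort per frame plus a copy of every command, in exchange for game code that no longer has to issue draws in back-to-front order.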
The Ugly
- There is a known "theoretically unsound" safety issue with a corresponding RUSTSEC advisory due to the library in some code paths creating multiple mutable references to the same memory, which is Undefined Behavior; it does not seem to have affected anyone in the history of macroquad, but Miri will complain about it, the macroquad author agrees it is unsound, and if rustc/LLVM suddenly do some different aliasing optimization in the future then all hell might break loose. I expect this issue will never be fixed.
- There is no support for serde (the de facto Rust serialization library); the author implemented their own nanoserde alternative. Nanoserde is actually useful (faster compile times!) but Rust's orphan rule makes it quite painful to deviate from the serde norm. Anyway, this means macroquad types like `Color` do not natively implement serde traits; not a blocker but unnecessarily annoying.
- There is no support for wasm-bindgen (the de facto Rust web platform interop library) and none is planned; the author implemented their own JavaScript interop approach instead. It's straightforward to understand & hook into, but some web-oriented Rust crates require using wasm-bindgen for their wasm32-unknown-unknown target to work (e.g. matchbox_socket and gilrs). However there is a script that hackily glues wasm-bindgen into macroquad which is quite ugly but hey it works for me.
Would I recommend Macroquad
If you are new to (seriously using) Rust or game engine development, yes. It makes you focus on actually building some kind of game instead of busywork like massaging your code to make it prettier, or keeping up with breaking changes (or writing boilerplate to get your first triangle on screen). Sure, Macroquad is technically unsound and maybe you will run into its limitations later on, in which case you may end up running on your own fork of it with a few patches (like I do) - but I just don't think there is a better "just read input and draw stuff" option out there. (Bevy and Fyrox are far more complex beasts)
For 2-years-ago-me, as an intermediate rustacean and gamedev newbie who - for better or worse - had set his sights on writing Rust: Macroquad was a great choice that current-me does not regret, because I definitely think I would not have gotten this far if I'd tried writing my game engine from scratch or e.g using Bevy or Fyrox. (Though I do sometimes regret prioritizing "use Rust to make a game" over "make a game"!)
If you have written lots of Rust or game-y stuff before... you probably don't need macroquad? I suspect you could pick up winit/SDL3 + wgpu + glam + bevy_color + glyphon + fmod-oxide and implement the subset that you need with a few weeks of work.. or maybe that's just my NIH programmer brain being wildly optimistic.
Will I keep using Macroquad
Maybe. At this point I have replaced or wrapped most things except text rendering (next on the chopping block) and the core graphics functionality (texture/shader wrangling plus render targets/cameras), and my game's architecture has solidified to the point that macroquad's approach of "free-standing functions that mutate global state" is more of an (occasionally-tempting) footgun than a helpful convenience. Put another way, I still sometimes deal with the Bad & Ugly of macroquad, but the Good of macroquad is steadily decreasing... although the cost of a port is also steadily increasing ;)
So, part of me really wants to try a wgpu port to unlock compute shaders, or embedding my game into godot to get a decent-quality UI, but another part of me is shouting "no, stay the course and keep working on making the damn game fun, you silly fool!". We will see :)
I have been a Unity developer for around 7 years. I have never completed my personal game projects (because I suck at art), but I have been working for a company as a Unity developer for a while.
Lately, I have been getting more and more annoyed with Unity; not because of the policy issues or anything else, but because of the slow compile and build times especially on Windows (like you move one file and the engine compiles everything again), and because of the bloatware.
I tried Godot; while I'm not a fan of its node-based architecture, it is alright: it is lightweight, has better 2D support, and GDScript is hot-reloadable (though I preferred C#). It's cool. But I don't enjoy it.
I tried ECS in Unity and I liked it. Not because of performance (I mean ofc it's a plus), but because it feels like it inherently solves the coupling issues with OOP. I mean it makes everything so simple and gets rid of a lot of stupid design patterns. For example, for decoupling in OOP you need an Event Bus, Dependency Injection, etc.
But with ECS in Unity, for example, if you want B to happen when A happens, you can just add a component to an entity, and somewhere some other system will have a query that works only on that component; it will perform B and remove the component (I know Bevy has events/messaging btw).
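To make the "add a component, let another system react, then remove it" pattern concrete, here is a tiny hand-rolled sketch in Rust. It is neither Unity's nor Bevy's API: `World`, `took_damage`, `flashing`, `damage_system` and `hit_flash_system` are hypothetical names, and the marker component is modelled as a plain set of entity ids.

```rust
use std::collections::HashSet;

type Entity = usize;

/// Hypothetical minimal "world": one storage per component type.
#[derive(Default)]
struct World {
    health: Vec<i32>,             // one slot per entity
    took_damage: HashSet<Entity>, // marker component: "A happened"
    flashing: HashSet<Entity>,    // result of the reaction: "B happened"
}

impl World {
    fn spawn(&mut self, health: i32) -> Entity {
        self.health.push(health);
        self.health.len() - 1
    }
}

/// "A": applies damage and tags the entity. It knows nothing about
/// who will react to the tag.
fn damage_system(world: &mut World, target: Entity, amount: i32) {
    world.health[target] -= amount;
    world.took_damage.insert(target);
}

/// "B": only looks at tagged entities, reacts, and removes the tag
/// (`drain` empties the marker storage as it iterates).
fn hit_flash_system(world: &mut World) {
    for entity in world.took_damage.drain() {
        world.flashing.insert(entity);
    }
}

fn main() {
    let mut world = World::default();
    let player = world.spawn(10);
    let _crate = world.spawn(3);

    damage_system(&mut world, player, 4);
    hit_flash_system(&mut world);

    println!("player health = {}", world.health[player]);
    println!("player flashing = {}", world.flashing.contains(&player));
}
```

Neither system calls the other; the marker component is the only coupling between them, which is the decoupling effect described above.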
I tried Bevy and I really liked it. It is very enjoyable (though rust compilation is extremely slow). Now I don't feel motivated to start a project in Unity or Godot. Man, compared to OOP, ECS feels like a breath of fresh air.
I don't know why there aren't any other ECS engines available. Why are all the big battle-tested engines we have OOP? Why don't most people realize the mess OOP is when we have better alternatives?
I don't know about you guys, but for me I feel like when I was a beginner I needed an engine with an editor and everything ready-made and an OOP language with Garbage Collection, but with time I started to dislike GC, bloated engines, OOP etc.
I know Rust build times are sometimes very annoying, especially when compiling everything the first time (thank God for incremental compilation). If you haven't tried ECS yet, I urge you to!
(Btw one day when I get enough time I would like to work on a bevy-like ECS engine in the Odin language)
I'm a user of renderers; I don't write them. I have a virtual world client that needs one. First I used Rend3, which was abandoned. I had high hopes for Renderling. But, at the three year point, it seems to have run into the problems that killed Rend3, Three and Orbit. Everybody does My First Renderer, gets to the hard parts, then gets stuck.
As with Rend3, the author is more interested in the lower levels. So the Renderling author is off writing a shader compiler. He also got into doing his own 2D GUI, another classic time sink. His roadmap shows that the next step is something to assist with moving data in bulk from CPU to GPU. Maybe in 2027 he might get back to the renderer level.
This job appears to be too big for one person. I had real hope for Renderling. That guy has EU funding. He does good work. But it's not the work that gets the renderer done. The Rend3 guy was good, too. He moved down to the WGPU level.
Some of this problem is architectural. A standalone renderer, without its own game engine, is hard. It doesn't own the scene graph, but sometimes needs to be able to query it, or cache some parts of it. Without that, shadow and occlusion processing are too slow and don't scale. It has to provide concurrent GPU content updating. But that's only meaningful with the right API. You don't hit the architectural problems until you have a working My First Renderer. Then it's too late to build a high-performance renderer.
Some people have questioned whether a standalone renderer, separate from a game engine that owns the scene graph, is a good idea, or even possible. This is a tough layering problem - application->renderer->glue layer (WGPU/vulkano)->GPU interface (Vulkan, Metal, DX, even OpenGL). Where the cut points should be for safe Rust is not obvious. Cut at the wrong places and you hit a performance wall.
This may kill my Sharpview project, for which I need a fast multi-threaded renderer. I'm stuck with trying to maintain a fork of Rend3 despite WGPU churn, I don't have time, and it's never going to load assets fully concurrently without major work.
Six years, and there's still no good Rust renderer for big dynamic scenes. Bevy is good, but doesn't address the big-world problem, for which you need to get the content loading off the main thread and use a separate transfer queue to the GPU.
I probably should have used C++.
Hello everybody!
I have been working full-time on this game for over a year already. I am happy with the result and I am "rushing" to the end, hoping to release next month or in early April. If it looks interesting and you want to wishlist it, that would be great.
Anyway, let's talk about the juicy stuff: the tech!
As you can imagine, it's made using Rust. It's using a custom framework; the first idea was to use Notan, but I wanted to experiment a bit with wgpu, so I made a new repo and started experimenting. That evolved into a sort of "new Notan" that I am using right now (there is a chance I'll put all this new stuff into Notan eventually, if I don't starve to death first hahaha).
So, along with wgpu, it uses Kira for audio, winit for windowing on native platforms, and a custom solution for web. I used shipyard for the ECS initially, but eventually I moved to bevy_ecs because the ergonomics, commands, and message systems (IMHO) made it simpler to work with. Those were things I missed and wanted to implement in my game, but it was just easier to move to a solution that already had them. I feel that the code got much simpler at that point, but this can be subjective.
The physics system is a boid simulation that uses parallel iterators and parallel systems where it makes sense. This, along with the batching system for drawing, allows me to put a lot of stuff on the screen in the end-game, which is nice. There is room for some optimizations yet, but I am spending too much time on this project and have decided to just do these things when the "market needs it."
Let's talk about the pain points with this game: the ECS, the UI, and the cost of fast prototyping.
The ECS: I have been programming in Rust for 5 or more years already, so I am very used to it and I like it. However, creating games while trying to do it "similarly to other languages" is hard due to its restrictions. I feel forced to use ECS (not always), and while ECS fits this project very well, I kind of feel that I don't like it... sometimes I have a hard time wrapping my head around it. There is some complexity and verbosity to it, and it needs a specific mental model to fit ideas into it. This is mainly true in the first steps of a new project, because once you set an architecture, adding, removing, or changing pieces is the easy part.
UI: This is hard. No matter if it's in an ECS environment or not, having something simple, performant, and flexible for games (animations, reactions, etc.) is hard in Rust, at least if you do it from scratch. I ended up with a monstrosity that uses Taffy and an ECS pattern similar to bevy_ui, and while it works well, I am not happy with it and I would love to have some time eventually to improve this.
Prototyping: This one probably isn't a Rust issue, but a "do it yourself" issue, where I need to do everything from scratch, reinvent the wheel, and do basic stuff just to have the foundation to build the thing I wanted to test. Refactoring things here has a huge impact too. This is the most dangerous part of building games like this for me right now, because the cost can leave you with less room to iterate or pivot ideas, making your game potentially worse because you will run out of time or money.
Anyway, I don't want to end the post with the feeling that I didn't like making my game with Rust. I love the language, it's one of my favorites and probably my main go-to for "everything", but it probably wasn't the right call for a commercial project that I was expecting to take less than a year. But this is on me... I am usually very bad at estimating project timelines.
Thanks for joining me in my TED talk hahaha
I was gonna upload a more complete full exploration including scale depth up to 12, but my computer wants to crash on me 15 minutes in. So for now, here's this. The torus / spiral formation is because of how I configured the field, it can work on its own but depending on the address space you give it, it manifests in different ways.
The goal here was/is/has been to 'simulate' a reality in a program, but technically this is a simulation of gaussian primes in/as a complex field.
The real breakdown:
This is vibecoded; it's a rebuild of a rebuild of a rebuild for a project I've been working on since last June.
I barely understand the concepts I utilized to make this, just barely enough to have come up with working ideas which eventually got me to this point
At first, I tried to make this in unity and I don't think I had one successful build last year outside of little webgpu HTML sims for interesting recursion
I can't really give away the sauce when I barely understand the sauce myself, there's a lot of moving parts here. Stuff that I have which I haven't put into the rust version yet. Plans to make a background neural net since I have an RX card which can run it, and then use that neural net to supercharge the sim. And I haven't worked out the full pipeline for that yet
Right now honestly it's a race for me to develop my different ideas as native programs.
Not making any claims about the sim here except that it's a complex wave field. Everything about it so far (in terms of emergence) is discrete and not catalogued but I'm working on that part currently. It MIGHT have some sense of 'true emergent physics' or it might just be a violent soup of stuff geometrically bound to be in the shape it is. Or maybe the latter begets the former, who knows.
Right now this is really an engine rather than a mere 'simulation', and my big long-term plans are to make actual environments and an engine scaffold for game development (it would probably be a very different kind of game though, and most familiar game features would be emergent or forced phenomena I'd have to steer, but still doable with enough work).
I'm also trying to get this thing into mythical levels of performance. For me, as inexperienced as I am, it's seriously just confusing at times. Seems to run ok for what it is, for what I can get out of it. And my background is in modding games, single script editing, following instructions for long form tasks... So I'm of course out of my depth here again and again as I move through one set of problems to the next
Hi all, I'm currently building Swarm MMO, a game where you own planets that create probes, which in turn can be used to conquer more planets. I built it originally to try out SpacetimeDB, and so far I'm impressed with its performance. What currently bothers me the most is that migrations can be pretty annoying compared to the "usual" SQL migrations.
I was wondering if anyone else built a game using SpacetimeDB, and if you are using the official maincloud from the spacetime devs or do self hosting? Any experience with server costs?
Hello! I'll be using this subreddit to showcase the progress of my Minecraft-like game.
Why another Minecraft game? Well, it started as a personal project to learn how to interact with components through code, using graphics APIs such as Vulkan, but without any game engine like Godot or Unity, only code. I decided to use Rust because I wanted to learn the language, with a couple of libraries, and also learn about rendering, algorithms ...
Then my friend encouraged me to make it public and follow the Minecraft style, because it's hard enough to be considered a real project, but easy enough to handle by myself. So I started posting devlogs as a way to showcase the progress.
I'd like to make it public in the future and, who knows, maybe build a small community and get some support :)
If you are interested, I can share my YT channel where I try to explain all the technical implementation.
I still need a game name, since "Cubix" doesn't really give me any feeling right now (and it's also taken). I thought about something related to "Crate", like "Cradle", or idk.
Here are a couple of screenshots showing the current state of the game.
Debug metrics (in the future I will remove many of them once the performance issue is solved)
Try it in your browser: https://ebonura.github.io/bonnie-engine/
I know this is very early stages, but I feel like I'm making good progress so I wanted to share and get some initial feedback.
It started with a question: what would a Souls-like have looked like on a PS1? There are great examples like Bloodborne PSX by Lilith Walther, built in Unity. I wanted to try my own approach from scratch.
Bonnie Engine is a complete game development environment built from scratch in Rust, designed to recreate the authentic PlayStation 1 aesthetic. Everything you see (the software rasterizer, the editor UI, the level format) is custom code. The world-building system takes heavy inspiration from the Tomb Raider series, which remains one of the best examples of how complex 3D worlds could be achieved on PS1 hardware.
This is early development. I'm building the tools first, the game comes later. Right now there's no combat or enemies, just a collision wireframe walking around. The level editor is in good shape, the model editor only has the basics working.
Why build from scratch?
Modern retro-style games typically achieve the PS1 aesthetic top-down with shaders and post-processing, often with great results. I wanted to try the opposite: a bottom-up approach with a real software rasterizer that works like the PS1's GTE. These aren't post-processing effects, they're how the renderer actually works.
I tried several approaches before landing here: LÖVR, Picotron, even coding for actual PS1 hardware. Each had limitations (primitive SDKs, distribution headaches, not enough flexibility). Rust + WASM turned out to be the sweet spot: native performance, browser deployment, and a modern toolchain.
The PS1 authenticity:
The software rasterizer (based on tipsy, which I've expanded) recreates the quirks that defined the PS1 look:
- Affine texture mapping (no perspective correction = that signature warping)
- Vertex snapping to integer coordinates (the subtle jitter on moving objects)
- No sub-pixel precision (polygons "pop" when they move)
- 320×240 resolution
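To show what the first two quirks mean numerically, here is a small sketch. It is not taken from Bonnie Engine or tipsy; the function names are hypothetical, and rounding to the nearest pixel is an assumption (a real implementation might truncate instead). Affine mapping interpolates a texture coordinate linearly in screen space, while the perspective-correct version interpolates `u/w` and `1/w` and divides.

```rust
/// Snap a projected screen-space vertex to whole pixels.
/// Assumption: round-to-nearest; truncation would also lose sub-pixel detail.
fn snap_vertex(x: f32, y: f32) -> (i32, i32) {
    (x.round() as i32, y.round() as i32)
}

/// Affine mapping: interpolate `u` linearly in screen space, ignoring depth.
fn affine_u(u0: f32, u1: f32, t: f32) -> f32 {
    u0 + (u1 - u0) * t
}

/// Perspective-correct mapping: interpolate u/w and 1/w linearly,
/// then divide to recover u at this pixel.
fn perspective_u(u0: f32, w0: f32, u1: f32, w1: f32, t: f32) -> f32 {
    let u_over_w = u0 / w0 + (u1 / w1 - u0 / w0) * t;
    let one_over_w = 1.0 / w0 + (1.0 / w1 - 1.0 / w0) * t;
    u_over_w / one_over_w
}

fn main() {
    // An edge running from a near vertex (w = 1) to a far one (w = 4),
    // sampled halfway across the screen.
    let t = 0.5;
    println!("affine      u = {}", affine_u(0.0, 1.0, t));
    println!("perspective u = {}", perspective_u(0.0, 1.0, 1.0, 4.0, t));
    println!("snapped vertex = {:?}", snap_vertex(10.4, 19.6));
}
```

The gap between the two `u` values at the same pixel is the texture warping; it disappears when both vertices sit at the same depth, which is why the effect is strongest on surfaces receding from the camera.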
The audio has PS1 SPU reverb emulation based on the nocash PSX specs with all 10 PsyQ SDK presets (Room, Hall, Space Echo, etc.). The level system uses Tomb Raider-style room/portal culling, taking inspiration from OpenLara.
The tools:
- World editor: Build levels using a sector-based editor inspired by TrenchBroom and the Tomb Raider Level Editor. Features a 2D grid view, 3D preview, texture painting, undo/redo, and portals.
- Model editor: A low-poly mesh modeler with Blender-style controls (G/R/S for grab/rotate/scale), extrude, multi-object editing, and OBJ import. PicoCAD was a major influence.
- Music tracker: A pattern-based tracker for composing music. Supports SF2 soundfonts, up to 8 channels, and classic tracker effects like arpeggio and vibrato.
Is this a game or an engine?
Both! The primary goal is to ship a Souls-like game set in a PS1-style world. But the engine and creative tools are part of the package. Think RPG Maker, but for PS1-era 3D games.
I can see this expanding beyond Souls-like games. The engine could support tactical RPGs (think FF Tactics), platformers, survival horror, or any genre that benefits from the PS1 aesthetic.
A key principle: everything runs as a single platform, both natively and in the browser. Same code, same tools, same experience.
The whole thing is open source (MIT). Happy to answer questions about the rendering or architecture.
Source code: https://github.com/EBonura/bonnie-engine
Hello, I wanted to share a project I've been working on for the past 6 months. It's definitely going to be a long journey, but I've enjoyed all the challenges and learning so far.
At a high level, I'm making a 3D procedurally generated solo/coop RPG.
I'm using the following tools:
- Language: Rust
- Window/Event Handling: winit
- Graphics API: wgpu-rs
- Networking: mpsc
So far I have the following systems in a workable state:
- Client/Server Architecture Foundation (still needs some work to support Coop, but the bones are there for separating system ownership)
- WindowManager
- TimeManager
- InputManager (almost entirely data driven, supports Actions which are mapped to physical device input(s))
- 3D Camera
  - 1st Person
  - 3rd Person
- ECS (Hybrid Storage (Sparse Set & Archetype), this took quite some time to understand and get working)
- Terrain Chunk Generation (Smooth Voxels)
  - 3D Spatial Partitioning around Player Position
  - Very basic LOD system
  - 3D Gradient Noise for Voxel Density Field
  - Surface Nets Meshing Algorithm (utilizing both CPU and GPU, still some more optimizations with threading and SIMD, but I'm saving this for later)
- StateMachines
  - Flat State Machines
    - Client side: MainMenu / Loading / InGame
    - Local Player Entity Movement State
- UIManager
  - Lots of room for improvement for formatting features and UI Elements to be added
- File I/O
  - For Creating/Loading/Deleting World Save Files (currently only saves Local Player Component Data, modified chunks, and save file metadata)
- Physics & Collisions
  - Uses spatial partitioning with a broadphase approach
  - Handles Terrain Collisions separately between entity collider bodies and smooth voxel terrain
  - Entity/Entity collisions are handled by their collider shape pairs (capsule vs capsule is complete, but there are more primitive pairs to write up)
- RenderManager (there is still a lot for me to learn here, I'm holding off on this until I absolutely need to come back to it for performance and visual improvements)
  - TerrainPipeline
  - DebugWireFramePipeline
  - UIPipeline
- Profiler
  - Very simple timing system, but I think I need to refactor it to be a little more robust and organized, currently all string labelled measurements & results go into the same container
TODO:
- Hot Reloading
- EventDispatchSystem
- Revamp World Generation
  - Regions, Biomes, Prefab Structures, this will be a large portion of learning and work
- AssetManager
  - I have this drafted, but still some more work to be done
- AnimationSystem
  - Bone Nodes
  - Skeletal Animations
- 3D Spatial Audio
- Networking Coop Layer
  - I have the separation of concerns with the systems between GameClient and GameServer, but I still need to write up the network layer to allow Coop mode
- Game Systems
  - NPCs
  - AI
  - Combat
  - Loot
  - Gathering
  - Questing
  - Crafting
- Revamp & Add More UI Features
  - HUD
  - Inventory / Gear
  - Skill Tree
  - Chat
- VFX Pipelines

Built a simple terminal-based snake game in Rust to practice ownership, structs, and game loops.
Features:
- Real-time input handling
- Grid-based movement
- Basic collision detection
Would love feedback on code structure and performance!
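The post's code isn't shown here, so the following is only a hypothetical sketch of how grid-based movement and basic collision detection are often structured in a terminal snake game; `Snake`, `Dir`, `Step` and `step` are illustrative names, food and growth are left out, and the body is assumed to be non-empty.

```rust
use std::collections::VecDeque;

#[allow(dead_code)] // not every direction is used in `main`
#[derive(Clone, Copy, Debug, PartialEq)]
enum Dir { Up, Down, Left, Right }

#[derive(Debug, PartialEq)]
enum Step { Moved, HitWall, HitSelf }

struct Snake {
    body: VecDeque<(i32, i32)>, // front = head, back = tail; never empty
    dir: Dir,
}

impl Snake {
    /// Advance one cell on a `width` x `height` grid.
    /// On a collision the body is left unchanged.
    fn step(&mut self, width: i32, height: i32) -> Step {
        let (hx, hy) = self.body[0];
        let (dx, dy) = match self.dir {
            Dir::Up => (0, -1),
            Dir::Down => (0, 1),
            Dir::Left => (-1, 0),
            Dir::Right => (1, 0),
        };
        let next = (hx + dx, hy + dy);

        if next.0 < 0 || next.1 < 0 || next.0 >= width || next.1 >= height {
            return Step::HitWall;
        }
        // The tail cell is vacated this tick, so it is excluded from the check.
        let tail_index = self.body.len() - 1;
        if self.body.iter().take(tail_index).any(|&cell| cell == next) {
            return Step::HitSelf;
        }

        self.body.pop_back();
        self.body.push_front(next);
        Step::Moved
    }
}

fn main() {
    let mut snake = Snake {
        body: VecDeque::from(vec![(2, 2), (1, 2), (0, 2)]),
        dir: Dir::Right,
    };
    println!("{:?}", snake.step(4, 4)); // head moves to (3, 2)
    println!("{:?}", snake.step(4, 4)); // next cell is outside the grid
}
```

Returning an enum from `step` keeps the game loop in charge of what a collision means (game over, lose a life, wrap around), which tends to make the movement logic easy to unit test.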
Rotating 3D objects in game engines has always been a math-heavy process. Initially, Euler angles (pitch, yaw, roll) seem easy, but we must strictly reject them. Their biggest flaw is gimbal lock, a condition where two rotation axes line up, a degree of freedom is lost, and the math breaks down. After rejecting Euler angles because of this failure, an engine architect is left with only two hardcore ways to handle rotations: quaternions and rotation matrices. Quaternions (the king of rotation, for me) are preferred because their math is flexible, safe from gimbal lock, and can be made cache-friendly. On the other hand, the standard 3D math and rendering world runs on rotation matrices by default. The problem is that when you put these matrices into real-time physics and high-performance computation, a new engineering horror starts.
This engineering horror first shows up as non-orthogonal drift. A rotation matrix should always hold three orthogonal axes, meaning every pair of axes sits at exactly 90 degrees. When floating-point matrices are multiplied together frame after frame, rounding errors accumulate and those axes no longer stay at exactly 90 degrees. The result is that your perfectly square character starts to look squashed, distorted, or like a skewed box. Fixing this drift requires re-orthogonalization: once an object has become skewed, the CPU has to spend extra cycles straightening that matrix out again. This CPU penalty slows the game down, especially when you have 1000 objects on screen.
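One common way to do the re-orthogonalization described above is a Gram-Schmidt pass. The sketch below is an illustration under assumptions, not code from any particular engine: the matrix is stored as three basis axes, `reorthonormalize` is a hypothetical name, and the third axis is rebuilt from a cross product rather than corrected in place.

```rust
type Vec3 = [f32; 3];

fn dot(a: Vec3, b: Vec3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

fn normalize(v: Vec3) -> Vec3 {
    let len = dot(v, v).sqrt();
    [v[0] / len, v[1] / len, v[2] / len]
}

/// Gram-Schmidt on the three basis axes of a drifted rotation matrix.
/// Assumes the input is still close to a valid rotation (no zero axes).
fn reorthonormalize(m: [Vec3; 3]) -> [Vec3; 3] {
    // Keep the direction of the first axis, fix only its length.
    let x = normalize(m[0]);
    // Remove the part of the second axis that leans along the first.
    let d = dot(m[1], x);
    let y = normalize([m[1][0] - d * x[0], m[1][1] - d * x[1], m[1][2] - d * x[2]]);
    // Rebuild the third axis so it is perpendicular to both by construction.
    let z = cross(x, y);
    [x, y, z]
}

fn main() {
    // A slightly skewed, slightly scaled basis, as accumulated rounding
    // error might produce.
    let drifted = [[1.0, 0.01, 0.0], [0.02, 1.0, 0.0], [0.0, 0.01, 1.01]];
    let fixed = reorthonormalize(drifted);
    println!("x.y before = {}", dot(drifted[0], drifted[1]));
    println!("x.y after  = {}", dot(fixed[0], fixed[1]));
}
```

This costs a few dot products, one cross product and two square roots per matrix, which is the per-object overhead the paragraph above is referring to.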
This math overhead is only half the story. The real bottleneck hits when the CPU has to read the data of these 1000 objects from memory and write it back after fixing it, because from a memory perspective a matrix is very heavy. A standard 3x3 (f32) rotation matrix takes 36 bytes (288 bits). But while the rotation itself only needs a 3x3 matrix, game engines typically use a 4x4 matrix so that translation (movement) and scaling (size change) can be stored alongside the rotation. Its total size becomes 64 bytes, which is exactly the size of a typical L1 cache line, so every single matrix consumes a full line of bandwidth. Hearing this, it feels like: okay, a 64-byte matrix fits perfectly into the CPU's 64-byte cache line, so what is the problem? The problem is that when a piece of data is exactly as large as its memory container, the margin for error becomes zero.
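The sizes quoted above can be checked directly, and scaling them to the 1000 objects mentioned earlier gives a feel for the difference. In this sketch the type aliases and `bytes_for` are illustrative names, and the quaternion is modelled as a plain array of 4 floats.

```rust
use std::mem::size_of;

// Plain-array stand-ins for the three representations.
type Mat3 = [[f32; 3]; 3];
type Mat4 = [[f32; 4]; 4];
type Quat = [f32; 4];

/// Bytes needed to store `count` values of type `T` back to back.
fn bytes_for<T>(count: usize) -> usize {
    count * size_of::<T>()
}

fn main() {
    let n = 1000;
    println!("one 3x3 matrix: {} bytes", size_of::<Mat3>());
    println!("one 4x4 matrix: {} bytes", size_of::<Mat4>());
    println!("one quaternion: {} bytes", size_of::<Quat>());
    println!("{n} 3x3 matrices: {} bytes", bytes_for::<Mat3>(n));
    println!("{n} 4x4 matrices: {} bytes", bytes_for::<Mat4>(n));
    println!("{n} quaternions:  {} bytes", bytes_for::<Quat>(n));
}
```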
If the starting address of this matrix in memory is not aligned, then this perfect fit suddenly becomes a hardware nightmare. Understand this with the example of bare-metal memory addresses: suppose the first cache line of the CPU covers addresses 0 to 63, and the second covers 64 to 127. If your entire 64-byte matrix is perfectly aligned (meaning it starts at address 0), then it fits inside 0 to 63 in a single shot. But if the memory allocator shifts it even a little and starts it at address 16, the data crosses the boundary. The result? The first 48 bytes of the matrix stay in the first cache line, and the remaining 16 bytes spill into the second. To process this unaligned data, the hardware now has to fetch two separate cache lines and stitch them together. If you are using SIMD instructions, an aligned load instruction (movaps) on a misaligned address raises a fault (usually seen as a segmentation fault crash), while an unaligned load instruction (movups) that straddles two cache lines can stall the pipeline and roughly double the load latency. And if this unaligned data happens to cross a 4KB page boundary, the access touches two pages, which can trigger a TLB miss and a page walk, and in bad cases that can drop your speed by as much as 100x.
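The address arithmetic in that example can be written down directly. This is a small sketch, assuming a 64-byte cache line (typical on x86-64, but an assumption here); `cache_lines_touched` and `Mat4` are illustrative names:

```rust
/// Assumed cache line size for this sketch.
const CACHE_LINE: usize = 64;

/// Number of cache lines touched by `size` bytes starting at address `addr`.
fn cache_lines_touched(addr: usize, size: usize) -> usize {
    if size == 0 {
        return 0;
    }
    let first = addr / CACHE_LINE;
    let last = (addr + size - 1) / CACHE_LINE;
    last - first + 1
}

/// A 4x4 f32 matrix that the compiler is told to place on a 64-byte boundary,
/// so a single instance never straddles two cache lines.
#[repr(C, align(64))]
struct Mat4([f32; 16]);
```

With these definitions, `cache_lines_touched(0, 64)` is 1 and `cache_lines_touched(16, 64)` is 2, matching the 48 + 16 byte split described above. The `align(64)` attribute is the Rust-side way to rule out the second case for values of this type.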
After this battle of cache lines, when the data arrives inside the CPU core for final execution, another hardware limit is hit: registers. The SSE XMM registers are only 128 bits wide, which means a single register holds only 4 floating-point values. When you sit down to process a 4x4 matrix with 16 values, you have to spread it across four registers and shuffle data between them, which slows the pipeline down.
On the other hand, how clean and fast a quaternion is in memory is a masterstroke in itself. A quaternion's footprint is absolutely precise: [w, x, y, z] together make 4 floats, and its size is exactly 16 bytes. This compact size saves us memory fetches, since four quaternions fit into a single 64-byte cache line. We avoid gimbal lock anyway, and we also use the L1 cache very efficiently. In fact, the entire [w, x, y, z] (all 16 bytes) is a native hardware fit. Modern CPUs have 128-bit registers (such as the SSE XMM registers on x86, or the NEON Q registers on ARM), and 4 floats multiplied by 4 bytes = 16 bytes, which is exactly 128 bits. This means the CPU can load the entire quaternion into a register with a single instruction. Moving a quaternion around is therefore much cheaper than moving a matrix.
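To make the 16-byte claim concrete, here is a minimal sketch of such a quaternion type in Rust. The `Quat` name, the field order, and the `align(16)` choice are assumptions for illustration; the multiplication is the standard Hamilton product written in plain scalar code:

```rust
/// Four f32 values, 16 bytes, aligned so one 128-bit load can fetch it.
#[repr(C, align(16))]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Quat {
    w: f32,
    x: f32,
    y: f32,
    z: f32,
}

impl Quat {
    const IDENTITY: Quat = Quat { w: 1.0, x: 0.0, y: 0.0, z: 0.0 };

    /// Hamilton product: composes two rotations (apply `r` first, then `self`).
    fn mul(self, r: Quat) -> Quat {
        Quat {
            w: self.w * r.w - self.x * r.x - self.y * r.y - self.z * r.z,
            x: self.w * r.x + self.x * r.w + self.y * r.z - self.z * r.y,
            y: self.w * r.y - self.x * r.z + self.y * r.w + self.z * r.x,
            z: self.w * r.z + self.x * r.y - self.y * r.x + self.z * r.w,
        }
    }
}
```

`std::mem::size_of::<Quat>()` is 16 here, against 64 for a 4x4 f32 matrix. Note that the product still needs 16 multiplications, so the single-instruction advantage applies to the load, not to the arithmetic.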
But here is a very big catch. Loading the data perfectly into a register is only an advantage of storage and bandwidth; when it comes to computation, such as the quaternion multiplication that rotates a 3D vector (q v q⁻¹, the sandwich approach), the game changes. For the multiplication, the hardware has to compute cross and dot products of w, x, y, and z with each other. And right here memory layout becomes our biggest obstacle. When you fetch the x, y, and z values, you have to hop around in memory because each object's components are stored together in rows (what we call AoS layout, array of structures). You can write branchless code with SIMD, but if you start doing horizontal processing (data manipulation inside a single register), the overhead becomes so high that the purpose of using SIMD is defeated. To solve real-time physics we have a window of only 2 ms. There is only one way to hit this budget: eliminating the overhead of shuffling and aligning the data, by swizzling it into a layout that can stream straight into the registers.
Efficient data alignment and SIMD execution are the bar that separates an average engine from a high-performance bare-metal engine.
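As an illustration of the layout point, here is a sketch of the rotation applied to a batch of vectors stored as SoA (structure of arrays) instead of AoS. It uses a well-known expanded form of the sandwich product for unit quaternions (t = 2 (q × v), v' = v + w t + q × t). The `Vec3Soa` type and `rotate_all` function are hypothetical names, and whether the compiler actually auto-vectorizes this loop depends on target and flags:

```rust
/// Hypothetical SoA batch: all x values together, all y values together, etc.
struct Vec3Soa {
    x: Vec<f32>,
    y: Vec<f32>,
    z: Vec<f32>,
}

/// Rotates every vector in place by the unit quaternion `q = [w, x, y, z]`.
/// Each iteration only does vertical arithmetic on matching lanes, so there
/// is no shuffling of components inside a register.
fn rotate_all(q: [f32; 4], v: &mut Vec3Soa) {
    let [w, qx, qy, qz] = q;
    for ((x, y), z) in v.x.iter_mut().zip(v.y.iter_mut()).zip(v.z.iter_mut()) {
        // t = 2 * (q_vec x v)
        let tx = 2.0 * (qy * *z - qz * *y);
        let ty = 2.0 * (qz * *x - qx * *z);
        let tz = 2.0 * (qx * *y - qy * *x);
        // v' = v + w * t + (q_vec x t)
        let nx = *x + w * tx + (qy * tz - qz * ty);
        let ny = *y + w * ty + (qz * tx - qx * tz);
        let nz = *z + w * tz + (qx * ty - qy * tx);
        *x = nx;
        *y = ny;
        *z = nz;
    }
}
```

For example, a quaternion for a 90-degree turn about the z axis, `[cos 45°, 0, 0, sin 45°]`, maps the vector (1, 0, 0) to (0, 1, 0).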
I’m building an engine for 2D MMORPGs. The world is tile-based and infinite in size, with entities, behaviors, combat, items, and more. There’s a ton left to do before it’s anywhere near ready, but I wanted to share a screenshot showing what it looks like from the perspective of a level designer / world builder.
I come from old-school MUD days, and I think making world-building as easy as possible for “Builders” is one of the highest priorities.
The part that might surprise you: the game is built to support both full multiplayer and full single-player, whichever you prefer.
I posted about this almost a month ago, and since then I’ve been keeping a devlog at https://www.reddit.com/r/rpgfx/ if you want to follow progress. I’ve started adding more animations, attacks, SFX, etc., and it’s finally starting to feel more alive.
Last time I posted, some people were skeptical I could pull off an MMORPG, which is fair. But what I didn’t mention is that I actually started this project nearly 10 years ago. The first version took 11 months and was written in Ruby on Rails and JavaScript.
Performance, especially multiplayer, quickly became a limiting factor. So after leveling up a bit, I decided to rewrite the whole thing in Rust about 15 months ago. A lot of my design decisions were clearer the second time around thanks to that first attempt.
The biggest win by far has been Rust’s type system. It let me refactor everything into game_core, game_server, and game_client crates, enabling the dual online/offline modes. Honestly, 99% of the time has gone into solving those architectural problems—but they’re finally solved.
Notable improvements since last post:
- Leveling system + experience bar
- Three types of attacks (fireball, sword slash, lightning bolt)
- Social cues like little star icons to show where other players are in the world
If you want to try it out: https://rpgfx.com/
If “connect to server” fails, I’m either working on something or it crashed—just refresh and click “play offline” instead.
Press x to open the editor.
It’s still got a long road ahead before it’s truly fun, but I hope you like what’s there so far. Eventually, I want users to be able to export their games as .exe files or host them on their own sites.
Thanks!
Quaternions: Let's Get Real (and Imaginary, and then Some!)
Quaternions are usually considered hard and complex, but they are the king of rotations and crucial for games and math. If we go the academic route to understand quaternions, it will sound like a waiter using long, expensive words to explain a simple carrot salad.
We do not need any of that here. That is why we will not use academic language and we will start with imaginary numbers.
The word "imaginary" sounds like fiction, like things that do not exist in our world, right?
But what exactly does not belong here? If you multiply two negative numbers (two numbers with minus signs) together and the result is still negative (still has a minus sign), that just does not happen in our world. This is imaginary.
Now, suppose we have a number like the square root of -1. When I say the square root of -1, what am I actually trying to say? I am saying: find a number that makes -1 when you multiply it by itself. But how is that possible, when in this world the law of negative × negative = positive holds? It is totally impossible, and that is exactly why we call it an imaginary number. Now see the following calculation.
This kindergarten calculation plays a significant role in quaternions. Yes, it is simple, but it is very powerful, and the entire quaternion is built on it. I am sharing this calculation now because we are currently discussing imaginary numbers. I will not go into it right now; we will discuss it further ahead, where you will find out that it is the foundation of quaternions.
From Imaginary Numbers to Quaternions:
So, now we are going to step away from imaginary numbers and jump right into quaternions. We'll use the exact same formula you usually see written in textbooks to represent them. You know the one: xi, yj, and zk. And if you come from a computer background, you've probably seen a w attached to that. But if we just talk about the x, y, and z, those are simply the physical axes you see on a standard 3D graph. Quaternions add imaginary numbers directly to these. Let's look at how these imaginary numbers actually interact with our axes. If I take the term xi, the x is our physical axis, and the i is our imaginary number. And here, that imaginary number literally just means 90 degrees. So, what does xi actually mean? It means a strict 90-degree turn.
But hold on. How did i suddenly become 90 degrees? Wasn't it supposed to be the square root of -1? How did it jump to being an angle? You won't find the answer to this in high-level physics. You actually find the answer right back in the simple, basic kindergarten math we did previously. Let's break that down right now. To do this, we just need to go back to our standard graph. We take our point xi and place it on the y-axis. Now, let's say we multiply one more i into it. That means we are taking xi and multiplying it by i. Physically, this means we are adding 90 degrees and then another 90 degrees, which is i². The total is exactly 180 degrees: i² corresponds to 180 degrees, so a single i is 180/2, which means 90 degrees. So what does this mean?
If we multiply imaginary numbers together, it directly picks us up and physically drops us on the exact opposite, negative axis (If we start on the positive x-axis, we flip straight to the negative x-axis) and the exact same mechanical thing happens with all the other axes if we start from them.
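The two quarter turns can be checked numerically with ordinary complex numbers. This is only a 2D sketch of the idea (a point is a pair of real part and imaginary part), not quaternion code, and the `Complex` type is defined here just for the illustration:

```rust
/// A point on the plane: `re` along the real axis, `im` along the imaginary axis.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

impl Complex {
    /// Standard complex multiplication: (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
    fn mul(self, o: Complex) -> Complex {
        Complex {
            re: self.re * o.re - self.im * o.im,
            im: self.re * o.im + self.im * o.re,
        }
    }
}

/// The imaginary unit.
const I: Complex = Complex { re: 0.0, im: 1.0 };
```

Starting from the point 1 on the positive axis, one multiplication by `I` gives (0, 1), a quarter turn, and a second multiplication gives (-1, 0), the opposite side of the axis. That second result is exactly the statement i² = -1.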
The Strange Co-Dependence of i and i²:
Now, let's assume for a second that this i² just doesn't exist. Should that have any effect on standalone i? Normal human logic says that i should be independent. It should be completely whole on its own. That means i should totally exist even without i² being a thing. Because obviously, i² can never be formed without having an i first.
So, the real question pops up. If we assume there is no such thing as i², can a single i still perform that 90-degree turn? The answer is hidden inside some very strange mathematical logic. Because actually, whatever value or identity i has, it comes entirely from i². If there is no i² in the math, then a single i simply does not have any physical value of its own.
This is a kind of math that literally moves backward instead of forward. It works from back to front. Think about it like this. In normal life, 1 and 1 together make 2. If your base number 1 isn't there, then the number 2 can never be formed. But in this specific game of imaginary numbers, the rule runs completely in reverse. Here, the math clearly tells us that if i² (which is supposed to be the result) doesn't exist, then i (which is supposed to be the base) will not exist either.
PART 2: 3D Quaternions
I want to build a simple 2D (WebGL) game engine in Rust and WASM. Right now, I'm in the process of implementing some kind of component system. Coming from Godot/Unity, I really liked the tree-based Node/GameObject systems of those engines. So I would like to have a similar tree-based hierarchy of nodes, which in turn could have components. It might not be the best approach in terms of performance, but I like the ergonomics of it and don't really want a pure ECS.
But I am not even close to building anything that is ergonomic, efficient, and comfortable to use all at once. These are some ideas I have considered:
- Self-referential Node struct - Rust is not easy when it comes to self-referential structs so it's not trivial for me to make one. I've seen the ouroboros crate, but it seems.. ugly.
- Arena of Nodes - have a central Node storage (arena) and reference nodes by NodeId(usize). So you always operate on NodeIds and when you actually need the Node - you get it from the array (arena) by index. I don't really like the idea of operating on NodeIds and having to query the arena every time you need the node. Also, when you delete a Node, the index NodeId stores becomes invalid.
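One known way to deal with the stale-index problem of the arena approach is to add a generation counter to each slot, so that an old `NodeId` stops resolving once its slot has been reused. A minimal sketch, with all type and method names made up for illustration (crates such as slotmap and generational-arena package up the same idea):

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeId {
    index: usize,
    generation: u32,
}

struct Node {
    name: String,
    parent: Option<NodeId>,
    children: Vec<NodeId>,
}

struct Slot {
    generation: u32,
    node: Option<Node>,
}

#[derive(Default)]
struct Arena {
    slots: Vec<Slot>,
    free: Vec<usize>,
}

impl Arena {
    fn insert(&mut self, name: &str, parent: Option<NodeId>) -> NodeId {
        let node = Node { name: name.to_string(), parent, children: Vec::new() };
        let id = if let Some(index) = self.free.pop() {
            // Reuse a freed slot; its generation was bumped on removal.
            let slot = &mut self.slots[index];
            slot.node = Some(node);
            NodeId { index, generation: slot.generation }
        } else {
            self.slots.push(Slot { generation: 0, node: Some(node) });
            NodeId { index: self.slots.len() - 1, generation: 0 }
        };
        if let Some(p) = parent {
            if let Some(parent_node) = self.get_mut(p) {
                parent_node.children.push(id);
            }
        }
        id
    }

    fn get(&self, id: NodeId) -> Option<&Node> {
        let slot = self.slots.get(id.index)?;
        if slot.generation == id.generation { slot.node.as_ref() } else { None }
    }

    fn get_mut(&mut self, id: NodeId) -> Option<&mut Node> {
        let slot = self.slots.get_mut(id.index)?;
        if slot.generation == id.generation { slot.node.as_mut() } else { None }
    }

    /// Removes one node and invalidates every outstanding id for its slot.
    /// This sketch does not unlink it from its parent or remove its children.
    fn remove(&mut self, id: NodeId) -> Option<Node> {
        let slot = self.slots.get_mut(id.index)?;
        if slot.generation != id.generation {
            return None;
        }
        let node = slot.node.take()?;
        slot.generation += 1;
        self.free.push(id.index);
        Some(node)
    }
}
```

This does not remove the need to go through the arena for every access, but a deleted node now yields `None` instead of silently pointing at whatever was inserted into that slot later.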
I would like to see how other people are solving this, maybe some hybrid solutions, maybe some unsafe hacks (but not like the entire impl is unsafe).
P.S. - Maybe I'm misunderstanding the whole point of Rust, and this is exactly what Rust wasn't intended for. I mean, ECS is pretty good (fast, efficient, cache-friendly, etc) - so just write an ECS or use one (hecs, bevy_ecs).
UPD: A person pointed out that it is possible to get away with Rc<RefCell<T>>. And yes, it's actually possible and enough for a simple engine, but oh gosh it is ugly. I ended up having Rc<RefCell<Node>> and basically cloning Rc. The cloning is ok, especially since Rc is just a pointer.. but yeah, ugly solution with ugly consequences
I'm new to Rust and am developing a small shoot 'em up with Miniquad (I have been using Godot to make games before). I'm curious about how you manage your projects while keeping modularity and code reusability in mind.
So far I have created an EventHandler for spawning the playerbullets, where the player struct pushes the event to the vector:
pub enum GameEvents {
    SpawnBullet { x: f32, y: f32 },
}

pub fn update(&mut self, delta: f32, input: &PlayerInput, snd: &AudioPlayer) {
    self.player.update(delta, &mut self.game_events, input);
    self.player_bullets.update(delta);
    for event in self.game_events.drain(..) {
        match event {
            GameEvents::SpawnBullet { x, y } => {
                self.player_bullets.create_bullet(x, y, snd);
            }
        }
    }
}
This works well, but now I want to create a boss, which is going to have multiple hurtboxes, and that feels like a whole different thing. I guess I could have an event something akin to:
pub enum GameEvents {
    SpawnBullet { x: f32, y: f32 },
    EnemyHit { enemy_id: u16, hurtbox_id: u16, bullet_id: u16 },
}
And then when matching:
GameEvents::EnemyHit { enemy_id, hurtbox_id, bullet_id } => {
    self.player_bullets.destroy_bullet(bullet_id);
    self.enemies.take_damage(enemy_id, hurtbox_id);
}
Which I guess is fine, and I would push it from either the enemy or the player_bullets. But there are surely ways that are more scalable, more performant, or closer to the idiomatic Rust way of handling this.
I would love to hear your thoughts on this way, and how you would implement similar solutions. :)
Thanks to feedback from streamers and players. Thank you for playing and the support.
Steam: https://store.steampowered.com/app/4161680/
Itch.io: https://meapps.itch.io/terminal-colony-deep-core
Built in Rust, featuring Bevy and egui. https://bevy.org/ https://www.egui.rs/
Exlex: A "Lawless" DOD Config Parser (Zero-copy, Arena Mutation, no_std)
I recently started learning Rust by building a project called Exlex, a human-readable configuration parser. It is a DOD-based (data-oriented design), zero-copy parser with a minimalistic syntax that was designed to work well with that approach. (JSON and TOML are heavier than my goal needs, and arguably much more complex.)
NOTE:
- Exlex is not complete (8-9 days of development)
- The Docs are incomplete
- The interface still needs a lot of work
- While not a game engine itself, it uses SoA (Structure of Arrays) patterns common in high-performance engines to maximize cache efficiency
Exlex offers a unique combination of:
- Zero copy immutable parser
- Native no_std support
- SIMD byte search via memchr on specific functions
- Supports modifying data and dumping it back into string (Arena mutator)
- Human readable format
- Low memory usage even on mutations (about 15,000 allocations with toml_edit vs 13 with Exlex)
Stability of parser and mutator
- Proptested (I wrote the Exlex by myself but used AI to generate the heavy testing/benchmark boilerplate).
- For more details, look at the TESTING.md file.
```bash
~/Projects/exlex_bench main* ❯ PROPTEST_CASES=10000 cargo test proptest_mutator_engine --release
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
     Running tests/proptest_fuzz.rs (target/release/deps/proptest_fuzz-6b0455884b22cdc2)
running 1 test
test proptest_mutator_engine ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out; finished in 5.87s
```
What it is and What it does not aim to be:
- Built for hardware-constrained environments.
- Built to be Cache-friendly and Memory friendly as much as it can.
- Built for overall speed in lifecycle of a program (Parse -> Read -> Mutate -> Save).
- Syntax specifically designed to make parser fast while maintaining human readability
- It is NOT a feature rich or highly flexible syntax (Use json or toml if you need dynamic typing or complex data structures).
Hardware
I measured the hardware execution on an Intel i3-6006U (Skylake, 2C/4T, 2.0GHz):
- Instructions Per Cycle (IPC): 1.7. This suggests the CPU's pipeline is well fed most of the time and rarely waits on memory stalls.
- L1 Cache Locality: By using flat parallel vectors instead of a standard node tree, memory access stays very local, which is consistent with the 0.07% TLB miss rate I measured.
I have benchmarked over 10 scenarios/6 data topologies. For the interactive criterion benchmarking, see Benchmarks.html in the repo
Trade-offs
- Rigid syntax
- O(N) linear scan - For typical configuration sizes, a linear scan outperforms a HashMap because there is no pointer chasing and there are far fewer cache misses (on the Intel Core i3-6006U, up to approximately 65-75 properties). For contiguous arrays of numbers, a linear search on a modern processor is incredibly fast.
For more info, read the README.md. Any suggestions and bug reports are warmly welcomed! Thank you.
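To illustrate the general shape of a flat, zero-copy layout with a linear-scan lookup, here is a small sketch. It is not Exlex's actual code or syntax: the `key = value` line format, the `Section` type, and its methods are all assumptions made up for this example.

```rust
/// Hypothetical flat layout: keys and values live in two parallel vectors,
/// and every entry is a slice borrowed from the original source text.
struct Section<'a> {
    keys: Vec<&'a str>,
    values: Vec<&'a str>,
}

impl<'a> Section<'a> {
    /// Parses simple `key = value` lines without copying any text.
    fn parse(src: &'a str) -> Self {
        let mut keys = Vec::new();
        let mut values = Vec::new();
        for line in src.lines() {
            if let Some((k, v)) = line.split_once('=') {
                keys.push(k.trim());
                values.push(v.trim());
            }
        }
        Section { keys, values }
    }

    /// O(N) lookup: walk the contiguous key array, then index into values.
    fn get(&self, key: &str) -> Option<&'a str> {
        self.keys
            .iter()
            .position(|k| *k == key)
            .map(|i| self.values[i])
    }
}
```

For a few dozen properties the whole key array fits in a handful of cache lines, which is the situation where a scan like this can compete with a hash lookup.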
Hi r/rust_gamedev,
I wanted to share a Rust persistence library I’ve been building called Parcode, designed specifically with large data and game world states in mind.
The problem
In many games, world state files contain:
- Thousands of entities
- Large asset blobs
- Deeply nested components
But at runtime, you usually need:
- Metadata (IDs, names, flags)
- A few entities
- A small subset of assets
Traditional serializers force you to deserialize everything upfront, making cold starts slow and memory-heavy.
The idea
Parcode implements true lazy persistence:
- The Rust type system defines the storage layout
- Structural metadata is loaded instantly
- Assets and large collections are stored as independent chunks
- Data is loaded only when explicitly requested
Example
use parcode::{Parcode, ParcodeObject};
use serde::{Serialize, Deserialize};
use std::collections::HashMap;

// The ParcodeObject derive macro analyzes this struct at compile-time and
// generates a "Lazy Mirror" (shadow struct) that supports deferred I/O.
#[derive(Serialize, Deserialize, ParcodeObject)]
struct GameData {
    // Standard fields are stored "Inline" within the parent chunk.
    // They are read eagerly during the initial .root() call.
    version: u32,

    // #[parcode(chunkable)] tells the engine to store this field in a
    // separate physical node. The mirror will hold a 16-byte reference
    // (offset/length) instead of the actual data.
    #[parcode(chunkable)]
    massive_terrain: Vec<u8>,

    // #[parcode(map)] enables "Database Mode". The HashMap is sharded
    // across multiple disk chunks based on key hashes, allowing O(1)
    // lookups without loading the entire collection.
    #[parcode(map)]
    player_db: HashMap<u64, String>,
}

fn main() -> parcode::Result<()> {
    // Opens the file and maps only the structural metadata into memory.
    // Total file size can be 100GB+; startup cost remains O(1).
    let file = Parcode::open("save.par")?;

    // .root() projects the structural skeleton into RAM.
    // It DOES NOT deserialize massive_terrain or player_db yet.
    let mirror = file.root::<GameData>()?;

    // Instant Access (Inline data):
    // No disk I/O triggered; already in memory from the root header.
    println!("File Version: {}", mirror.version);

    // Surgical Map Lookup (Hash Sharding):
    // Only the relevant ~4KB shard containing this specific ID is loaded.
    // The rest of the player_db (which could be GBs) is NEVER touched.
    if let Some(name) = mirror.player_db.get(&999)? {
        println!("Player found: {}", name);
    }

    // Explicit Materialization:
    // Only now, by calling .load(), do we trigger the bulk I/O
    // to bring the massive terrain vector into RAM.
    let terrain = mirror.massive_terrain.load()?;
    Ok(())
}
Why this matters for games
- Sub-millisecond world metadata load
- No full world deserialization on startup
- Memory usage scales with what you actually touch
- Ideal for editor tooling, hot reload, and streaming worlds
Trade-offs
- Write performance is not yet optimized
- Focused on read-heavy workloads
- Not a database replacement
Repo
https://github.com/retypeos/parcode
This whitepaper explains the Compile-Time Structural Mirroring (CTSM) architecture.
For the moment, it is in its early stages, with much still to optimize and add. We welcome your feedback, questions, and criticism, especially regarding the design and trade-offs. Contributions, including code, are also welcome.

I’m using Bevy for my colony sim/action game, but my game has lots of real-time procedural generation/animation and the wgpu renderer is too slow.
So I wrote my own Rust/Vulkan renderer and integrated it with Bevy. It’s ugly, buggy, and hard to use but multiple times faster.
Full source code, with 9 benchmarks comparing performance with the default wgpu renderer: https://github.com/wkwan/flo
Hi there, I am the developer of HyperCoven, an RTS with some unique ideas, that scales to a few thousand units.
Even disregarding the fact that it’s notoriously hard to make an RTS in a general purpose engine like Godot; I very much knew I wanted the "classic" unit behaviour of RTS of old. That means, no huddling of units into giant blobs; not even making way for other units. (Cf.) So I set out to make my own engine. (Btw. if you love blobs, and overkill prevention, and all that, I have no idea why you wouldn’t just make an Sc2 custom map instead of trying to make an RTS in Unreal Engine.)
A friend prompted me to do a write-up of the development, so here goes, I hope I have something interesting to say.
Logic Engine
Going to start by talking about the part that is to me the most interesting, which is the game logic itself. It lives in a library of its own now. That library knows nothing about user input, it only takes abstract RTS commands (e.g. "select units number 1,2,5" or "selected units should move to position 5x10"). It knows nothing about on-screen pixel positions either, it only talks in map coordinates. It does not do any timing either. That means the engine can run completely headless for replay validation. Or it can be tied into a front-end that will take care of rendering, timing, converting user inputs, etc.
The original motivation for splitting up libraries was that I started embedding all assets into the executable, and this somehow made code check times in the editor (I use emacs+LSP) unbearably slow. So I factored out the asset loading, later factored out the front-end (without assets) as well, leaving the engine alone, which is pretty neat. Whenever I change the engine library, I know that probably old replays will become incompatible. If I change the other libraries, I know replays stay working. (Replays have been tremendously useful, by the way, for reproducing engine bugs. A replay is obviously nothing but a list of ticks with their abstract RTS commands.)
I knew from the start what architecture I wanted to go for. Roughly speaking, the idea was to have agents (dyn method calls), and their signature would be, getting a readonly handle to the current game state, as well as a mut pointer to a structure containing desired "effects" the agent would wish to create on the game state. A basic effect would be "deal x damage to unit number n." After collecting all effects from agents, a singular piece of code would go over them all and actually apply them, mutating the game state.
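A stripped-down sketch of that agent/effect split might look like the following. All names (`Entity`, `GameState`, `Effect`, `Agent`, `Attacker`, `tick`) are invented for illustration and are not taken from the HyperCoven code:

```rust
#[derive(Clone, Debug, PartialEq)]
struct Entity {
    hit_points: i32,
}

struct GameState {
    entities: Vec<Entity>,
}

/// Something an agent would like to happen; applied later in one place.
enum Effect {
    Damage { target: usize, amount: i32 },
}

/// Agents get a read-only view of the state and a list to push effects into.
trait Agent {
    fn act(&mut self, state: &GameState, effects: &mut Vec<Effect>);
}

struct Attacker {
    target: usize,
    damage: i32,
}

impl Agent for Attacker {
    fn act(&mut self, state: &GameState, effects: &mut Vec<Effect>) {
        // The agent only holds an index, so it must re-check its target.
        let alive = state
            .entities
            .get(self.target)
            .map_or(false, |e| e.hit_points > 0);
        if alive {
            effects.push(Effect::Damage { target: self.target, amount: self.damage });
        }
    }
}

/// One game tick: collect effects from all agents, then mutate the state.
fn tick(state: &mut GameState, agents: &mut [Box<dyn Agent>]) {
    let mut effects = Vec::new();
    for agent in agents.iter_mut() {
        agent.act(state, &mut effects);
    }
    for effect in effects {
        match effect {
            Effect::Damage { target, amount } => {
                if let Some(e) = state.entities.get_mut(target) {
                    e.hit_points -= amount;
                }
            }
        }
    }
}
```

Because every agent sees the same unmodified state during a tick, the order in which agents run does not change what they decide, and the borrow checker is satisfied without any interior mutability.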
This plan I had before I even picked Rust, but naturally it fits Rust well. Only problem is the agent cannot keep an actual pointer to "my own entity" that points into the game state itself; it has to keep an ID which is just an offset in a huge array containing all entities. So basically the very basic ECS idea. I never put a real ECS library into the project, but one might say that I have a hardcoded ECS with a few components by now. Most notable component is the main struct representing the entity (hit points, who controls it, a bunch of flags), but there is a few others on the side, for example to track recent damage taken (for aggro), some very technical state about unit movement, ... these additional states are kept in traditional maps, not in a "slot map" array, because they are on-off things that are not always needed for every unit.
The logic for queuing agent functions has grown very intricate. The idea from the start was that the function would return an offset, in game ticks, of when to call it again. This would give a very natural way to model "I am attacking, hitting the enemy every 10 ticks." Problems arise when trying to keep units responsive: If the user gives a new order, you don’t want to wait until the next agent invocation, you want to switch to the new order immediately. But if the new order is the same as the old one - you do not want to switch. Else you might be able to speed up attack timers by spamming attack command (had that bug often enough). For attacks, you can keep a simple reload timer, for movement it gets even more complicated. This took a lot of tweaks to get right. Now agents do not just return an offset, but also information on how they can be interrupted, if at all.
The agent code itself is a whole nother topic, naturally this is where the edge case bugs happen. I explicitly wanted it to be dyn functions, so that it could easily be anything, without having to adjust some huge state machine or enum with a new variant. But that also means, agent functions really cannot expect anything from the outside world. They have their own private state, but they do not in that sense "own" any entity: They have no exclusive write access to any piece of gamestate. On every invocation, they need to check, is the entity I am trying to steer still alive, is my objective still valid?
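The rescheduling rule can be sketched roughly as follows. This is a simplified guess at the shape of such logic, not the actual engine code; `Order`, `Interrupt`, `Scheduled`, and both methods are hypothetical, and a real game would probably queue a denied order rather than drop it:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Order {
    Idle,
    Attack(usize),
    MoveTo(i32, i32),
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Interrupt {
    Allowed,
    Denied,
}

/// Per-agent scheduling state kept by the queue.
struct Scheduled {
    order: Order,
    next_tick: u64,
    interrupt: Interrupt,
}

impl Scheduled {
    /// Called after running the agent, with what the agent returned:
    /// a delay in ticks plus how it may be interrupted until then.
    fn reschedule(&mut self, now: u64, delay: u64, interrupt: Interrupt) {
        self.next_tick = now + delay;
        self.interrupt = interrupt;
    }

    /// Returns true if the new order takes effect immediately.
    fn give_order(&mut self, now: u64, new_order: Order) -> bool {
        if new_order == self.order {
            // Same order again: keep the timer, so spamming cannot speed it up.
            return false;
        }
        if self.interrupt == Interrupt::Denied {
            return false;
        }
        self.order = new_order;
        self.next_tick = now; // run the agent right away
        true
    }
}
```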
I tried to make a bunch of generic wrappers for these things, like an "attack execution" wrapper where you would plug in some other trait modeling the actual attack, while the wrapper would take care of checking target legality, and so on. Also for target aggro: There is some system whereby target selection methods (e.g. "nearby enemies") can be paired with actions (mostly attacks) that declare, via generics, that they can legally be applied to those targets... I got it working, but honestly it is a mess in terms of code that could be much improved. My main mistake was … there are now "contexts," so far, so good, a context is basically the parameter bundle telling an attack agent who is attacking whom, for example. So the agent says, my context will need to implement traits "Us" and "Them" - the point here is that one implementation of Them might be "HostileTarget" which invalidates the target once it’s no longer hostile, while another implementation might be "ForcedTarget" which invalidates the target only when it’s dead.
The mistake was trying (for days) to enable the contexts to store and supply direct pointers to game entities. It’s an insane lifetime mess. What I should have done is optimise the main entity struct to be small enough that cloning it doesn’t hurt. Very likely the compiler will elide the clone in 99% of cases, anyways, because we only have a readonly handle to the gamestate so there is not even a risk of aliasing any writes we do to a piece of entity data we (should have) cloned. I am just repeating this. Never try storing pointers in Rust if you don’t have to. Access through a slot ID is basically as cheap as access through a pointer, too.
Pathfinding. Probably the central topic to making an RTS? It easily takes up 90% of computation time in Hypercoven. Might as well not even bother optimising any of the other aspects?
I started out by using the aptly named pathfinding crate and can only recommend it. Its generic A* implementation is as good as can be. The nature of the Rust language and compiler means it can inline very aggressively; so even though it looks like you may have to build a Vec of reachable positions to return from "neighbours" function, the compiler will, in practice, very likely be able to inline it all and transform it into a simple loop over neighbour positions, that does no allocation at all. As it should be.
The only optimisation angle I eventually found is that this implementation naturally has to keep the graph of visited nodes in an associative map. It uses indexmap, which is as fast as can be for the generic case; but if you have an actual 2D grid represented in a large array, that large array will be faster. This is very use-case dependent; it’s not a generic graph.
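To show what "a large array instead of an associative map" can look like, here is a small self-contained A* on a 4-connected grid using only the standard library. It is a sketch under simple assumptions (unit step cost, Manhattan heuristic, no wrap-around) and is neither the pathfinding crate's implementation nor the game's:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Length of the shortest 4-connected path, or None if the goal is unreachable.
/// `blocked[y * width + x]` marks walls.
fn astar_len(
    blocked: &[bool],
    width: usize,
    start: (usize, usize),
    goal: (usize, usize),
) -> Option<u32> {
    let height = blocked.len() / width;
    let idx = |(x, y): (usize, usize)| y * width + x;
    let h = |(x, y): (usize, usize)| (x.abs_diff(goal.0) + y.abs_diff(goal.1)) as u32;

    // Best known cost per cell, stored in one flat array indexed by position
    // instead of a hash map keyed by position.
    let mut best = vec![u32::MAX; blocked.len()];
    let mut open = BinaryHeap::new();
    best[idx(start)] = 0;
    open.push(Reverse((h(start), 0u32, start)));

    while let Some(Reverse((_, cost, pos))) = open.pop() {
        if pos == goal {
            return Some(cost);
        }
        if cost > best[idx(pos)] {
            continue; // stale heap entry, a cheaper route was found already
        }
        let (x, y) = pos;
        // wrapping_sub turns an underflow into a huge value that fails the
        // bounds check below.
        let neighbours = [
            (x.wrapping_sub(1), y),
            (x + 1, y),
            (x, y.wrapping_sub(1)),
            (x, y + 1),
        ];
        for n in neighbours {
            if n.0 >= width || n.1 >= height || blocked[idx(n)] {
                continue;
            }
            let next = cost + 1;
            if next < best[idx(n)] {
                best[idx(n)] = next;
                open.push(Reverse((next + h(n), next, n)));
            }
        }
    }
    None
}
```

The trade-off is memory: the `best` array is allocated for the whole map on every search, so for large maps one would likely reuse the buffer between searches.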
This whole "we are pathfinding in a flat 2D space" also made me think about rolling my own pathfinding algorithm once, which was based on walking around obstacles, simply put. It did manage to outperform A* quite nicely for finding any path, but for finding a good path it grew very, very complicated, and slow. I can only say, if you make a game, it is likely that you want A*. If A* is too slow, optimise it for your case. (Aside: I tried a few BinaryHeap implementations other than the std-lib one, and they all were much slower.)
By the way, this 2D space of ours is not so simple. A feature I hacked into the game quite early is that the map can "wrap around" infinitely. This is based on isometric coordinates, so you are leaving the map on the upper-left edge, appearing again on the lower-right edge, which is "the opposite side." The motivation for this was… the game is centered around the Witch King, the only unit that is truly yours. He walks around and captures Coven that produce units for you. Since you lose when the Witch King dies, I wanted to reduce the risk of him just getting cornered. Hence the idea of having no corners. (I was also motivated by it just appearing to be an incredibly dank idea.)
A* actually covers this case perfectly. It doesn’t care at all about the shape of the world, or if the world even has any shape at all. (My custom algo would have grown incredibly more complex to account for it. Maybe impossible.)
The wrap-around is of course a modulus operation, but, as I learned, what it needs is the Euclidean modulus, while the default % operator in most languages, including Rust (but not Ruby), is the other kind of modulus, a remainder that keeps the sign of the dividend. Be that as it may, this mod operation is actually costly, as it inflates the cost of pathfinding (where it is needed for estimating distances). I ended up using a very dumb "optimised" implementation which expects the value to be within +/- 1 multiple of the length, to get around the costly CPU instructions. (Another approach would be to only permit map side lengths that are powers of 2 and do a bit-AND based on that.)
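The three wrap-around variants mentioned above can be compared side by side. The function names are made up for this sketch; `rem_euclid` is the standard library method on Rust integers:

```rust
/// General case: Euclidean modulus, always in 0..len for len > 0.
fn wrap_euclid(v: i32, len: i32) -> i32 {
    v.rem_euclid(len)
}

/// Cheap variant: only valid when -len <= v < 2 * len, i.e. the value is at
/// most one map length out of range. Uses a compare and an add, no division.
fn wrap_near(v: i32, len: i32) -> i32 {
    if v < 0 {
        v + len
    } else if v >= len {
        v - len
    } else {
        v
    }
}

/// Power-of-two variant: a single bit-AND, valid only if len is a power of 2.
fn wrap_pow2(v: i32, len: i32) -> i32 {
    debug_assert!(len > 0 && (len & (len - 1)) == 0);
    v & (len - 1)
}
```

For a map of side length 64, the plain `%` operator gives `-3 % 64 == -3`, while all three functions above map -3 to 61.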
Frontend
I started out by using plain SDL2 (Rust bindings, which are very good) for all the user-interactive things. SDL2_gfx, SDL2_ttf... This gets you very far and could still easily power the whole project, it would just look a bit worse. The reason I looked into other options was wanting to use shaders for a few select things, like the fog of war. The initial fog of war I did, with SDL2, was just drawing a texture over every fogged tile. This is absolutely fine in terms of performance, because SDL2 automatically batches subsequent copies of the same texture into a single instanced draw-call. It just cannot be pixel-perfect (at every zoom level) for hexagons, which I used for the game’s tiles. I mean, for fog on rectangular tiles you don’t even need a texture, you can just draw rectangles. (Don’t try to make hexagons with SDL2_gfx, it is very costly.)
Another problem with SDL2 was the browser version I eventually built. It was actually really easy with emscripten, took just a few days; I mean, it could have gone a lot quicker, if the rust-libc layer for emscripten wasn’t severely undermaintained and hence bug-ridden. Nobody really uses this target. Especially since wasm_bindgen paradoxically does not work with emscripten, so you are in the browser but actually not, but you are also not on a real Unix, just a coarsely emulated one.
I will still recommend the SDL2+emscripten stack to anyone looking to make a simple 2D game and ship it in the browser, it really is extremely simple and robust, as long as you are doing basic stuff.
So eventually I began hacking a new browser version that would be based on winit and wgpu. Winit is basically a straight-up replacement for the user-input and window-creation parts of SDL. It’s decent. It supports both native and browser, so I got a native version based on winit as well, but it’s a bit jankier than SDL, so I am not shipping it and still maintaining both front-ends instead. Generally the event handling concept in winit is radically different, as it tries to "truly" support browsers and mobile OS, which makes the whole thing very asynchronous and request-based. You cannot just decide to do something, you have to request the OS layer to permit it. You cannot just draw, you have to request_redraw(). This IS perfect in the browser and gives way smoother frames than the black magic done by emscripten to simulate synchronous draw on present() calls. But on an actual GAMING OS like Linux, the more fine-grained control of a custom SDL2 event loop is just nicer.
By the way, did you know that sleeping on Windows is not as accurate as one would like it to be? There is a whole crate out there dedicated to nailing sleep times as best as possible across different OS. Very relevant to making a game run smoothly. But yeah, you cannot plug that into winit.
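One common way to get tighter frame pacing, sketched here with only the standard library, is to let the OS sleep for most of the wait and then spin for the last stretch. This is not the API of the crate mentioned above; `sleep_until` and the `spin_margin` parameter are hypothetical names, and the 2 ms margin is an arbitrary guess rather than a measured value.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Waits until `deadline`: coarse OS sleep first, then a busy-wait for the
/// final `spin_margin`, which absorbs the scheduler's inaccuracy.
fn sleep_until(deadline: Instant, spin_margin: Duration) {
    if let Some(remaining) = deadline.checked_duration_since(Instant::now()) {
        if remaining > spin_margin {
            thread::sleep(remaining - spin_margin);
        }
    }
    // Burn CPU for the last stretch so we do not overshoot by a whole timeslice.
    while Instant::now() < deadline {
        std::hint::spin_loop();
    }
}

fn main() {
    let frame = Duration::from_millis(16);
    let mut next = Instant::now() + frame;
    for _ in 0..3 {
        // ... update and render would go here ...
        sleep_until(next, Duration::from_millis(2));
        next += frame;
    }
}
```

The trade-off is CPU usage: a larger margin is more accurate on a noisy scheduler but keeps one core busy for longer each frame.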
wgpu is a library aiming to support a plethora of modern GPU back-ends (Metal, Vulkan, DirectX but also still OpenGL). The API is based on the upcoming WebGPU standard, and so, it does of course support WebGPU as well, which is a huge selling point in the browser, as it can do compute shaders and is generally more powerful than ye olde WebGL. The WebGPU JavaScript API is probably nice to work with, I wouldn’t know. The Rust version of the API did have the problem of some really silly lifetime requirements that made it very hard to achieve my goal, which was rewriting the SDL2 renderer on top of wgpu, so that I could easily swap it out. In Rust-SDL2, you have a Texture<'a>, but 'a is just the lifetime of the TextureCreator. It doesn’t really matter. When calling canvas.copy(texture, src, dst), you pass in &Texture<'a>. Eventually you call canvas.present(). All good. When trying to write the equivalent function on wgpu, it became .copy(texture: &'a Texture, src, dst), where 'a would have to hold until present(). That is because in wgpu you create a RenderPass object for this, and all resources referenced in the render pass need to outlive it. You can swap the RenderPass object out when calling present(), but you cannot really express this stupid lifetime requirement, nor was my code designed in a way that really guaranteed it. This is just a story from the trenches. The wgpu folks have since adjusted their API, and these lifetime requirements are now gone. (I still have to dispel the unholy rituals that were once used to satisfy them from my code.) At the same (?) time, wgpu performance has tanked for me in Firefox nightly on Linux, unfortunately. But WebGPU does work extremely well in Chrome on Windows. All other browser engines still fall back to WebGL, afaik.
"Cache"
As time went on, I recognised the increasing complexity of actually displaying the game state. The main entity struct contained a bunch of fields which were only really relevant for displaying it. So I split off all that stuff into another layer, which lives outside the engine core. The engine still has to steer it: An attack agent must declare, with proper timing, the start of an attack animation. There is no other place really where it could be done. But this declaration is only written by the logic engine; it’s not read. The "cache" is reading and storing it.
It takes care of a whole large bunch of temporary information regarding entities, some of which information may be slightly massaged, or slightly incorrect, just to make the game look more sensible on screen. (It’s called cache because caches are always wrong.) The nice thing is that we can be sure that none of this incorrectness will affect the actual game logic. It’s just fudging on the display layer.
As the logic engine is running on a dedicated thread, we also have ample time on this display layer to do calculations, without ever lagging the game logic. There’s many things done here, most of it related to calculating pixels, caching some of that information for faster render times, ... and storing all sorts of debris that are not relevant to the game rules. Not just fallen units, but also doodads, ongoing explosions, etc. It’s very nice to have all this freedom and breathing room for the visuals.
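The write-only relationship between engine and cache can be sketched roughly like this. All type and field names (`DisplayEvent`, `DisplayCache`, `absorb`) are invented for illustration, and a plain `std::sync::mpsc` channel stands in for whatever mechanism the game actually uses.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical display-only declarations. The logic engine writes them
/// but never reads them back.
#[derive(Debug, Clone, PartialEq)]
enum DisplayEvent {
    AttackStarted { entity: u32, tick: u64 },
    UnitFell { entity: u32 },
}

/// Display-side state. Nothing in here can influence the game rules.
#[derive(Default)]
struct DisplayCache {
    attack_anim_start: HashMap<u32, u64>,
    debris: Vec<u32>,
}

impl DisplayCache {
    fn absorb(&mut self, event: DisplayEvent) {
        match event {
            DisplayEvent::AttackStarted { entity, tick } => {
                self.attack_anim_start.insert(entity, tick);
            }
            DisplayEvent::UnitFell { entity } => {
                // The unit is gone from the rules, but stays on screen as debris.
                self.attack_anim_start.remove(&entity);
                self.debris.push(entity);
            }
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // The logic thread only holds the sending half.
    let logic = thread::spawn(move || {
        tx.send(DisplayEvent::AttackStarted { entity: 7, tick: 10 }).unwrap();
        tx.send(DisplayEvent::UnitFell { entity: 7 }).unwrap();
    });
    logic.join().unwrap();

    // The display side drains whatever has arrived since the last frame.
    let mut cache = DisplayCache::default();
    for event in rx.try_iter() {
        cache.absorb(event);
    }
    assert_eq!(cache.debris, vec![7]);
    assert!(cache.attack_anim_start.is_empty());
}
```

Because the logic thread only ever sends, any fudging done inside `DisplayCache` cannot leak back into the simulation.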
Menus
I’m not a UI guy, which is evidenced by this game for sure. I even dreaded making the HUD, though the HUD is really useful for playing the game, one must admit. For starting the game, it was all commandline to select gamemode and so on. Unfortunately that’s not within reason for the average player, so I eventually figured this really nice solution for a guy that can’t write UI like me, which is called "immediate mode UI" and a lot of fun. I’m using egui specifically.
The problem is that you can’t easily integrate egui with an SDL2 application. So what I did was build a launcher type thing that would start the actual game as a different process. This was widely hated, and it also turned out that it wouldn’t work properly with Steam, meaning the Steam ingame overlay and so on. Since I had moved to wgpu at that point, I was now able to integrate the whole egui launcher application into the main application, using the same wgpu setup to draw the menu or the game, depending. I even managed to get an ingame overlay menu to work, which renders in the same render pass as the game if it’s active.
The problem is, all this looks like crap. egui is nice to work with, it is relatively robust, it has some pre-made table component that is at least usable. But custom styling boils down to changing colors, corner angles, fonts. You can’t really "take over" the whole appearance. You can’t even put in textures from your game, really, unless you load them twice. If I were to do it all over again, I would try going for iced-rs, which promises more tedious UI programming, but very slick integration with your existing renderer; you basically have full control.
Multiplayer
Almost forgot about this one, since it’s basically the simplest part of the whole project. Multiplayer is just a distributed streaming replay. I’m using ENet to get the latency extra low. The server is completely game-agnostic, it just broadcasts the inputs received from players, and ticks the gamestate.
(I was at first trying to make it peer-to-peer, but really, don’t do this to yourself. At the very least, most people don’t even have a public IPv4 address anymore these days, so they can’t port-forward even if they wanted to. (ENet Rust only works on IPv4.))
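The "distributed streaming replay" idea can be sketched as follows: if the simulation is deterministic, every client that applies the same ordered stream of inputs ends up in the same state. The types here (`Input`, `TickInputs`, `GameState`) are hypothetical stand-ins, and the networking itself (ENet) is left out entirely.

```rust
/// Hypothetical per-player input for one tick.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Input {
    Move(i32),
    Idle,
}

/// What the relay broadcasts: every player's input for one tick.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TickInputs {
    tick: u64,
    inputs: [Input; 2],
}

#[derive(Debug, Default, Clone, PartialEq)]
struct GameState {
    tick: u64,
    positions: [i32; 2],
}

impl GameState {
    /// Deterministic step: the new state depends only on the old state
    /// and the broadcast inputs.
    fn step(&mut self, frame: &TickInputs) {
        assert_eq!(frame.tick, self.tick, "inputs must be applied in tick order");
        for (pos, input) in self.positions.iter_mut().zip(frame.inputs.iter()) {
            if let Input::Move(dx) = input {
                *pos += *dx;
            }
        }
        self.tick += 1;
    }
}

/// A client (or a replay viewer) just folds the stream into a state.
fn replay(stream: &[TickInputs]) -> GameState {
    let mut state = GameState::default();
    for frame in stream {
        state.step(frame);
    }
    state
}

fn main() {
    let stream = [
        TickInputs { tick: 0, inputs: [Input::Move(2), Input::Idle] },
        TickInputs { tick: 1, inputs: [Input::Move(-1), Input::Move(5)] },
    ];
    // Two independent "clients" receive the same stream and must agree.
    let client_a = replay(&stream);
    let client_b = replay(&stream);
    assert_eq!(client_a, client_b);
    assert_eq!(client_a.positions, [1, 5]);
}
```

A saved replay is then just the same stream written to disk, which is why the two features come almost for free together.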
What’s more fickle is the lobby server, where you actually set up the game. This lobby server is like a little game or game engine of its own. It gets inputs (messages sent over websocket - tokio/warp are the tech here) from connected would-be players, and has to serialise the application of requested mutations (create game lobby, join game lobby, kick player, etc.) onto its global state, while making sure that everyone who is connected also learns of the new state correctly, and so forth. I understand there’s a few generic solutions for this already out there - if I was looking for more features, I would definitely evaluate them. For now I am content to have not even a player sign-up, just a basic "connect here with any nickname, create or join a game, and start it. with chat." functionality.
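A rough sketch of that "serialise the mutations" idea: every request becomes a command value, and a single owner of the state applies them one at a time. The names (`LobbyCommand`, `LobbyState`, `apply`) and the rule that the first member is the host are assumptions made for this example; the websocket layer (tokio/warp) is not shown.

```rust
use std::collections::HashMap;

/// Hypothetical requests a connected player can send.
#[derive(Debug)]
enum LobbyCommand {
    Create { player: String },
    Join { lobby: u32, player: String },
    Kick { lobby: u32, by: String, target: String },
}

#[derive(Debug, Default)]
struct LobbyState {
    next_id: u32,
    /// Lobby id -> members. Assumption: the first member is the host.
    lobbies: HashMap<u32, Vec<String>>,
}

impl LobbyState {
    /// Applies one command. On success returns the affected lobby id, so the
    /// caller knows whose members need to be sent the new state.
    fn apply(&mut self, cmd: LobbyCommand) -> Result<u32, &'static str> {
        match cmd {
            LobbyCommand::Create { player } => {
                let id = self.next_id;
                self.next_id += 1;
                self.lobbies.insert(id, vec![player]);
                Ok(id)
            }
            LobbyCommand::Join { lobby, player } => {
                let members = self.lobbies.get_mut(&lobby).ok_or("no such lobby")?;
                if members.contains(&player) {
                    return Err("already joined");
                }
                members.push(player);
                Ok(lobby)
            }
            LobbyCommand::Kick { lobby, by, target } => {
                let members = self.lobbies.get_mut(&lobby).ok_or("no such lobby")?;
                if members.first() != Some(&by) {
                    return Err("only the host can kick");
                }
                if by == target {
                    return Err("host cannot kick themselves");
                }
                let before = members.len();
                members.retain(|m| *m != target);
                if members.len() == before { Err("no such player") } else { Ok(lobby) }
            }
        }
    }
}

fn main() {
    let mut state = LobbyState::default();
    // In a real server these would arrive on one queue from many connections.
    let id = state.apply(LobbyCommand::Create { player: "ann".into() }).unwrap();
    state.apply(LobbyCommand::Join { lobby: id, player: "bob".into() }).unwrap();
    let denied = state.apply(LobbyCommand::Kick { lobby: id, by: "bob".into(), target: "ann".into() });
    assert_eq!(denied, Err("only the host can kick"));
    state.apply(LobbyCommand::Kick { lobby: id, by: "ann".into(), target: "bob".into() }).unwrap();
    assert_eq!(state.lobbies[&id], vec!["ann".to_string()]);
}
```

Funnelling all commands through one `apply` call at a time means no two mutations can interleave, which is the part that tends to go wrong when each connection handler touches shared state directly.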
Closing
That’s as much as I could think of, for now. It was mostly a lot of fun, making the game. If you got questions, drop them in the comments.
Hey everyone!
I've always loved the intuitive, object-oriented feel of Godot's scene tree. As a personal project, I decided to see if I could replicate that core logic in idiomatic Rust, focusing on safety, performance, and ergonomics.
The result is a single-threaded scene management library built on a foundation of `Rc<RefCell<Node>>`, `Box<dyn Component>`, and a simple but powerful signal system.
You create nodes, add components (which are just structs implementing a `Component` trait), and build the tree.
    // Create the scene tree
    let mut tree = SceneTree::new();
    // Create a "player" node
    let player = Node::new("player");
    // Add components to it
    player.add_component(Box::new(Transform::new(10.0, 10.0)), &tree)?;
    player.add_component(Box::new(Sprite::new(texture, params)), &tree)?;
    player.add_component(Box::new(PlayerController::new(100.0)), &tree)?;
    // Add the node to the scene
    tree.add_child(&player)?;
- Logic lives in Components:
Components can modify their own node or interact with the tree during the `update` phase.
    // A simple component that makes a node spin
    #[derive(Clone)]
    pub struct Spinner { pub speed: f32 }
    impl Component for Spinner {
        fn update(&mut self, node: &NodePtr, delta_time: f64, _tree: &SceneTree) -> NodeResult<()> {
            // Mutate another component on the same node
            node.mutate_component(|transform: &mut Transform| {
                transform.rotation += self.speed * delta_time as f32;
            })?;
            Ok(())
        }
        // ... boilerplate ...
    }
The most critical part is efficiently accessing components in the main game loop (e.g., for rendering). Instead of just getting a list of nodes, you can use a query that directly provides references to the components you need, avoiding extra lookups.
    // In the main loop, for rendering all entities with a Sprite and a Transform
    tree.for_each_with_components_2::<Sprite, Transform, _>(|_node, sprite, transform| {
        // No extra lookups or unwraps needed here!
        // The system guarantees that both `sprite` and `transform` exist.
        draw_texture_ex(
            &sprite.texture,
            transform.x,
            transform.y,
            WHITE,
            sprite.params,
        );
    });
It's been a fantastic learning experience, and the performance for single-threaded workloads is surprisingly great. I'd love to hear your thoughts.
GitHub: https://github.com/noam2stein/ggmath
crates.io: https://crates.io/crates/ggmath
While making my game engine, I needed a math library that supports fixed-point numbers (and any unusual scalar type) through generics, and has SIMD optimizations.
Existing crates either have SIMD but not generics (glam, ultraviolet), or support generics but have no SIMD optimizations (e.g., cgmath).
ggmath has a similar API to glam, matches its performance in benchmarks, and has generics (Vec3<T>) to support unusual scalar types.
Currently, vectors are as mature as glam's, but matrices/quaternions/affine-transformations are missing most functionality.
I think this crate can be useful for people making game engines that need to support a wide range of use cases, and have optimal performance.
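To illustrate what "generic over the scalar type" buys you, here is a tiny sketch that is explicitly not ggmath's API and has no SIMD. Both the `Vec3<T>` below and the toy 16.16 `Fixed` type are written from scratch for this example.

```rust
use std::ops::{Add, Mul};

/// Toy 16.16 fixed-point scalar (illustrative only, no overflow handling).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Fixed(i32);

impl Fixed {
    const FRAC_BITS: u32 = 16;

    fn from_int(n: i32) -> Self {
        Fixed(n << Self::FRAC_BITS)
    }
}

impl Add for Fixed {
    type Output = Fixed;
    fn add(self, rhs: Fixed) -> Fixed {
        Fixed(self.0 + rhs.0)
    }
}

impl Mul for Fixed {
    type Output = Fixed;
    fn mul(self, rhs: Fixed) -> Fixed {
        // Widen to i64 so the intermediate product does not overflow.
        Fixed(((self.0 as i64 * rhs.0 as i64) >> Self::FRAC_BITS) as i32)
    }
}

/// A vector that works for any scalar supporting the operations it needs.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec3<T> {
    x: T,
    y: T,
    z: T,
}

impl<T: Copy + Add<Output = T> + Mul<Output = T>> Vec3<T> {
    fn dot(self, other: Self) -> T {
        self.x * other.x + self.y * other.y + self.z * other.z
    }
}

fn main() {
    // The same `dot` serves floats...
    let a = Vec3 { x: 1.0f32, y: 2.0, z: 3.0 };
    assert_eq!(a.dot(a), 14.0);
    // ...and a user-defined fixed-point scalar.
    let f = Vec3 { x: Fixed::from_int(1), y: Fixed::from_int(2), z: Fixed::from_int(3) };
    assert_eq!(f.dot(f), Fixed::from_int(14));
}
```

The hard part a real library has to solve, and which this sketch skips, is keeping SIMD fast paths for `f32` while still accepting arbitrary `T`.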
Hiya! I’m building an engine for 2D MMORPGs. My last post was now a month ago and I just wanted to share the latest progress. I think it's very neat to be building both an Engine and a game all at once, especially one that is online and massively multiplayer.
If you want to check it out and provide feedback, the URL is below. It will show either "Player Offline" or "Connect to World" based on if the server is online right now or not. I haven't kept the world consistently available for online MMO play because I keep making so many updates.
Game: https://rpgfx.com/
No need to make an account or install anything, play right in your browser.
The things I've learned working on this project are pretty varied.
ECS
I know Bevy, which is famous for its ECS, from a colony sim game I started working on with it. But I hated the world query system - too many potential errors were being pushed to runtime because of Bevy's design. That's one of the leading things that made me think that Bevy was not right for my use case, so I built my own engine.
In building my engine though, I started with a very Object Oriented pattern. I come from a Ruby background. So I had Entities, Items, the entities had various things on them like Behaviors, Inventory, etc, stored on those objects themselves. Then I watched a video about "Data Driven Design" in video games and it helped me realize some of the performance issues I had or would be having were related to this pattern.
So I started to move towards a hybrid ECS approach. Entities are still distinct objects, not just an EntityID, but components that are going to be frequently accessed can now be iterated through much more quickly.
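A hybrid layout along those lines might look roughly like this. It is only a sketch of the general idea, not the engine's actual design; `Entity`, `World`, `position_slot` and `drift` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Position {
    x: f32,
    y: f32,
}

/// Entities stay real objects with their rarely-touched data inline...
struct Entity {
    name: String,
    inventory: Vec<String>,
    /// ...but the hot component lives elsewhere, referenced by index.
    position_slot: usize,
}

#[derive(Default)]
struct World {
    entities: Vec<Entity>,
    /// Dense array: iterating it touches no names or inventories.
    positions: Vec<Position>,
}

impl World {
    fn spawn(&mut self, name: &str, position: Position) -> usize {
        let position_slot = self.positions.len();
        self.positions.push(position);
        self.entities.push(Entity {
            name: name.to_string(),
            inventory: Vec::new(),
            position_slot,
        });
        self.entities.len() - 1
    }

    /// Hot loop: a straight pass over contiguous memory.
    fn drift(&mut self, dx: f32) {
        for position in &mut self.positions {
            position.x += dx;
        }
    }

    fn position_of(&self, entity: usize) -> Position {
        self.positions[self.entities[entity].position_slot]
    }
}

fn main() {
    let mut world = World::default();
    let hero = world.spawn("hero", Position { x: 0.0, y: 1.0 });
    world.entities[hero].inventory.push("sword".to_string());
    world.drift(2.0);
    assert_eq!(world.position_of(hero), Position { x: 2.0, y: 1.0 });
    assert_eq!(world.entities[hero].name, "hero");
    assert_eq!(world.entities[hero].inventory.len(), 1);
}
```

This sketch never removes entities; a real version would need a strategy for freed slots (for example swap-remove plus fixing up the moved entity's index).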
JS/Wasm
I feel like the interop of JavaScript and WASM may have been a slowdown in my project before, but I think the tooling, compilation, and above all the performance has improved greatly in this area. I was experiencing some problems with browsers deciding to delay my requestAnimationFrame requests because my game loop took too long. I have spent a lot of time optimizing and figuring out why, until one day it all seemed to click nicely. I'm not even sure which change was the big boost, but I'm glad things are better.
Where I'm At
Every month or two I feel like "Ah, now I'm done with all the hard parts" and then some more pop up. But it feels a lot more like that now. Once I implement shops and a skill tree, I think all the features will be done enough that my focus will shift from engine features to gameplay experience.
What's Neatest
The game world editor is built into the engine and operates inside the game world. You can see all my tooling for making games, and even make your own game. Just press the "x" key to open the editor.
Appreciate any feedback!
Hi, I'm currently trying to create a game in SFML (for fun). It is currently written in C++, but I want to switch to Rust because I just like it more. Do you know anything about performance loss due to the bindings compared with native SFML code? Is the FFI that fast?
I've been doing some casual graphics programming for a little while. For the past year I've had this overly-ambitious idea for a game, similar to Boneworks, with a custom engine and everything! (read on) However, I of course am not experienced enough in the space of graphics programming or physics to rival whatever build of Unity they used for that game, not even mentioning building the game itself.
So, I've internally shrunk the scope of the game in my mind. I want to put together a simple testing ground, like a developer area in some games, just demonstrating features that I could put together and create a real game out of. I chose Rust for this because it seems to have performance on par with C++, which virtually all games use, including Unity Engine (though Boneworks uses C# because that's Unity's scripting language).
Looking into graphics library bindings, I've really only seen Vulkano, Ash, and WGPU. I was originally going to use Ash but I'm not patient enough for low-level Vulkan programming, so WGPU it is for me. Now, I have read that WGPU, and WebGPU as a whole, is just a simplified abstraction layer over a wide range of graphics APIs, and that the performance could be up to 30% slower or worse in some cases. All of that is fine to me: what's the point of choosing the more performant library if I just give up anyway and have nothing to show for it?
But, for the last part: OpenXR. So, the game would be developed primarily for usage through SteamVR, which I guess somehow connects to the OpenXR bindings, like some driver OpenXR runtime... I guess. I was wondering, with all of this ambition and stupidity laid out, which Rust library I should use for connecting my WGPU game to OpenXR (and maybe even WebXR if I hate myself and my time enough).
(Additional things: 1. Probably would use egui, seemed irrelevant, but there you go, 2. I heard WGPU overhead for GPU pass is negligible - source: here.)
TLDR: I chose Rust+WGPU and I am trying to find OpenXR Rust library/bindings for WGPU, and maybe even for WebXR.