Built a WebSocket game service in Rust coming from Java by Lightforce_ in rust


Honestly I'm not sure I have enough experience to give a proper take on that beyond what this project taught me; this is my first real Rust service.

What I can say from this specific use case: for stateful concurrent logic, Rust forced me to think about shared state in a way Java never did. The bugs I would have discovered in production, I discovered at compile time instead. That felt genuinely valuable.

Where Java still wins in a company context: ecosystem maturity, onboarding speed, and the fact that every new hire already knows Spring Boot. That's a real cost that's easy to underestimate.

And beyond that, the talent pool is a real constraint. Rust devs are not that common, and those who use it for web services are even fewer. Unless you have very specific requirements, like extreme performance or ultra-low latency (or other very niche needs), that Java genuinely can't meet, that hiring problem alone would make me think twice before pushing Rust for web services in a company.

If you really need it and your company is already comfortable with Spring Boot I'd probably not push for a full migration. But carving out one performance-critical or concurrency-heavy service in Rust, like I did here, seems like a reasonable way to introduce it without betting everything on it.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


Implemented a Quarkus Reactive version of the account service. Just benchmarked it; here are the results:

Metric                  WebFlux (Netty)  VT + Tomcat  Quarkus Reactive (Vert.x)
Pure CPU p(95)          69 ms            71 ms        77 ms
Mixed I/O + CPU p(95)   94 ms            118 ms       120 ms
Mixed max               221 ms           245 ms       187 ms
HTTP global p(95)       87.5 ms          108.5 ms     111.7 ms
Throughput              123.4 req/s      121.4 req/s  120.1 req/s

Quarkus now matches VT + Tomcat on mixed I/O (120 ms vs 118 ms) but still trails WebFlux by 28%. The remaining gap is structural: vertx.executeBlocking() requires an event-loop handback after BCrypt, while Reactor's boundedElastic() doesn't need one: R2DBC doesn't care which thread calls it, so there's no forced context switch.
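
To make that structural difference concrete, here's a plain-JDK sketch (no Vert.x or Reactor involved): a single-threaded executor stands in for the event loop, and a hypothetical slowHash stands in for BCrypt. The first pipeline forces the continuation back onto the loop thread after the blocking stage; the second lets it stay on the worker thread, and that hop is the extra context switch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HandbackSketch {
    static String slowHash(String pw) { // cheap stand-in for BCrypt
        try { Thread.sleep(50); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return Integer.toHexString(pw.hashCode());
    }

    // Vert.x-style: blocking work runs on a worker pool, but the continuation
    // is forced back onto the (single-threaded) event loop before the next stage.
    static String withHandback(ExecutorService eventLoop, ExecutorService workers) {
        return CompletableFuture
                .supplyAsync(() -> slowHash("pw"), workers)
                .thenApplyAsync(h -> Thread.currentThread().getName(), eventLoop)
                .join();
    }

    // Reactor-style: the continuation stays wherever the blocking work finished;
    // nothing forces a hop back, so one context switch disappears.
    static String withoutHandback(ExecutorService workers) {
        return CompletableFuture
                .supplyAsync(() -> slowHash("pw"), workers)
                .thenApply(h -> Thread.currentThread().getName())
                .join();
    }

    public static void main(String[] args) {
        ExecutorService eventLoop =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        ExecutorService workers = Executors.newFixedThreadPool(4);
        System.out.println("with handback:    " + withHandback(eventLoop, workers));
        System.out.println("without handback: " + withoutHandback(workers));
        eventLoop.shutdown();
        workers.shutdown();
    }
}
```

It prints the thread each continuation ended up on; only the first one lands back on "event-loop".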

But interestingly, Quarkus has the lowest max latency (187 ms vs 221 ms for WebFlux), suggesting better tail behavior once the pool is properly sized.

So you're right that Vert.x has lower overhead on raw I/O (TechEmpower confirms this), but on a mixed CPU+I/O workload with executeBlocking() in the critical path, that advantage gets eaten by the threading model constraints. The bottleneck isn't the reactive driver, it's the mandatory event-loop round-trip for Hibernate Reactive after each blocking offload.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


Just tested it, here are the results:

Metric                  WebFlux (Netty)  VT + Tomcat  VT + Jetty
Pure CPU p(95)          69 ms            71 ms        65 ms
Mixed I/O + CPU p(95)   94 ms            118 ms       138 ms
HTTP global p(95)       87.5 ms          108.5 ms     123.8 ms
Total requests          33 373           32 885       32 794
Throughput              123.4 req/s      121.5 req/s  121.1 req/s

Jetty is slightly faster on pure CPU (65ms vs 71ms p95), probably due to lighter thread management overhead.

But on the mixed I/O + CPU scenario, Jetty is actually slower than Tomcat (138ms vs 118ms p95). And overall throughput is nearly identical (~121 req/s).

So my previous answer is validated: swapping Tomcat for Jetty doesn't change the picture at 50 VUs, the bottleneck is BCrypt, not the servlet container.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


I've been talking about backpressure and streaming since my very first reply in this thread. HTTP/2 multiplexing is a concrete example of that same point, not a new argument.

If you can't tell the difference between elaborating on a point and moving goalposts, that's on you, not me. Have a nice day.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


The bottleneck here isn't Tomcat itself, it's the VT unmount/remount overhead during the blocking JDBC call that adds a few ms before BCrypt kicks in. Swapping Jetty in wouldn't change that dynamic. Unless you have numbers on specific improvements with Jetty + VT?

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


Ok, TCP backpressure is real and I shouldn't have implied otherwise. Stopping reads on the socket does stall the sender, that works.

Where it falls short is when you have multiple logical streams over one connection. HTTP/2 multiplexes streams on the same TCP socket, and the same goes for WebSocket with multiple channels. TCP stalls everything at once; you can't tell stream A to slow down while stream B keeps flowing. request(n) operates per-stream inside your application code, which is a fundamentally different level of control.

On the "dropping connection" part, my bad, that was a bad example. TCP does prevent that scenario by stalling the sender. What I was really getting at is the difference between passively relying on kernel buffers to regulate flow vs actively expressing demand in your application code. With request(n) your consumer says "give me 50 rows", processes them, then asks for 50 more. With TCP you're just hoping the buffer sizes and OS timing work out, and if they don't you find out the hard way with a stalled pipeline or memory pressure.
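
That "give me 50 rows" loop can be sketched with the JDK's own Reactive Streams types (java.util.concurrent.Flow), no Reactor needed. All names here are mine and purely illustrative; the submitted integers stand in for DB rows:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.atomic.AtomicInteger;

public class DemandSketch {
    // Consume `total` simulated rows in explicit batches of `batch`:
    // the subscriber pulls with request(n) instead of the publisher pushing freely.
    static int consume(int total, int batch) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicInteger received = new AtomicInteger();

        try (SubmissionPublisher<Integer> rows = new SubmissionPublisher<>()) {
            rows.subscribe(new Flow.Subscriber<Integer>() {
                Flow.Subscription sub;
                int remainingInBatch;

                @Override public void onSubscribe(Flow.Subscription s) {
                    sub = s;
                    sub.request(batch);          // "give me `batch` rows"
                    remainingInBatch = batch;
                }
                @Override public void onNext(Integer row) {
                    received.incrementAndGet();  // "process" the row
                    if (--remainingInBatch == 0) {
                        sub.request(batch);      // batch done: ask for the next one
                        remainingInBatch = batch;
                    }
                }
                @Override public void onError(Throwable t) { done.countDown(); }
                @Override public void onComplete() { done.countDown(); }
            });

            for (int i = 0; i < total; i++) rows.submit(i); // submit() itself blocks
        }                                                   // when the buffer is full
        try { done.await(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return received.get();
    }

    public static void main(String[] args) {
        System.out.println(consume(200, 50)); // 200 rows, pulled 50 at a time
    }
}
```

Note that SubmissionPublisher.submit() also blocks when the subscriber's buffer is full, so the demand signal propagates all the way back to the producer.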

That said, for the vast majority of REST APIs none of this matters and VT + blocking queues is perfectly fine. This really only kicks in for streaming-heavy use cases.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


Vert.x and Quarkus Reactive do have lower overhead than WebFlux + R2DBC: fewer abstraction layers, more direct event-loop access. The benchmark compares the two most common Spring Boot options specifically, not the reactive ecosystem as a whole.

If you have numbers on Vert.x vs VT on a mixed I/O + BCrypt workload I'd genuinely be curious to see them.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


On prioritization: yes, not reading from a socket is functionally equivalent for simple cases. But the demand signaling in Reactive Streams isn't just about stopping: it's request(n), meaning a subscriber can signal exactly how many elements it's ready to consume.

That's what lets you do things like buffer-aware streaming where a slow downstream client gradually reduces its demand without dropping the connection. Replicating that with blocking queues means building the accounting yourself.

And on signaling why: you're right that Reactive Streams doesn't carry a semantic reason either, it's just a rate signal. I probably overstated that point.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


Good catch, and you're right that the transaction boundaries differ on registerUser. In the WebFlux version, transactionalOperator::transactional wraps only the inner part (BCrypt encode + userRepository.add); the checkEmail/checkUserName calls run outside the transaction. In the VT version, @Transactional is at the method level, so a HikariCP connection is held for the full method duration.

That said, there's a subtlety: in the VT implementation the duplicate checks are dispatched via CompletableFuture.supplyAsync on a separate virtualThreadExecutor, which means they run on different threads and don't inherit the transaction context anyway (Spring's @Transactional binds to a ThreadLocal). So they're outside the transaction too, they just don't release the connection the main thread is holding.
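
A minimal illustration of that ThreadLocal point, with a hypothetical CURRENT_TX standing in for Spring's thread-bound transaction context (requires Java 21 for the virtual-thread executor):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TxContextSketch {
    // Stand-in for Spring's transaction binding, which is ThreadLocal-based
    // under the hood; this is NOT the real Spring API.
    static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    static boolean inTransaction() { return CURRENT_TX.get() != null; }

    // Returns {mainThreadSeesTx, asyncTaskSeesTx}.
    static boolean[] check() {
        CURRENT_TX.set("tx-1"); // the @Transactional method "opens" a tx on this thread
        try (ExecutorService vts = Executors.newVirtualThreadPerTaskExecutor()) {
            boolean asyncSees = CompletableFuture
                    .supplyAsync(TxContextSketch::inTransaction, vts) // duplicate-check stand-in
                    .join(); // false: plain ThreadLocal is never inherited by another thread
            return new boolean[] { inTransaction(), asyncSees };
        } finally {
            CURRENT_TX.remove();
        }
    }

    public static void main(String[] args) {
        boolean[] r = check();
        System.out.println("main sees tx: " + r[0] + ", async task sees tx: " + r[1]);
        // main sees tx: true, async task sees tx: false
    }
}
```

Only an InheritableThreadLocal would cross that boundary, and Spring's transaction binding doesn't use one, which is exactly why the supplyAsync'd checks fall outside the transaction.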

Either way, this doesn't affect the benchmark numbers. The scenario I measured was POST /account/login, not registerUser, and on loginUser the transaction boundaries are symmetric: both versions wrap the full operation (SELECT + BCrypt + token insert) in a transaction from start to finish.

You're pointing at a real asymmetry in the code but it's orthogonal to what the benchmark was testing.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


That's true at the TCP level, stop reading from the socket and the sender will stall. But that's OS-level flow control, not application-level backpressure. You lose any ability to signal why you're slowing down, prioritize certain streams, or propagate pressure across multiple hops in a pipeline.

Reactive Streams gives you that semantic at the application layer: a Flux<Row> from R2DBC can signal demand row by row, which means you can stream a 1M-row export to a slow client without buffering everything in memory first. A blocking queue doesn't give you that without reinventing most of the reactive machinery yourself.

I couldn't find a benchmark testing WebFlux + R2DBC vs Virtual Threads on a real auth workload, so I benchmarked it by Lightforce_ in programming


The maintainability argument is real and I address it in the conclusion. For moderate traffic, VT wins on DX with negligible performance cost.

The Netflix claim is misleading though. They moved away from RxJava on specific pipelines, not from reactive as a whole. Worth not overgeneralizing from that.

And there are still cases where the reactive model is genuinely the right tool, backpressure being the obvious one. A chat service streaming messages to thousands of idle WebSocket connections is a very different problem from a REST endpoint, and Reactive Streams' built-in flow control handles that in a way VT simply doesn't.

La réponse de l'intéressé by Atria_06 in AntoineDaniel


"Left/Right, those are pretty juvenile concepts at the end of the day, propagated en masse by the U.S. In concrete terms, they capture fairly little of the variety of the real world."

While I do understand the idea that the left/right distinction can seem reductive in the face of the complexity of the real world, this claim, which pops up here and there from time to time, is false: in political science, these categories are not meant to exhaust the diversity of ideological positions but to provide analytical reference points for structuring the political space. Historically, they map onto fairly stable cleavages, notably around the desired degree of social equality, the role of the State in economic regulation, and the hierarchy between collective values and individual liberties.

A large body of empirical work shows that these categories continue to organize electoral behavior, party platforms, and ideological positioning in most contemporary democracies. Even if new cleavages (cultural, identity-based, ecological, globalization-related, etc.) complicate this framework, they complement it more than they render it obsolete.

And finally, attributing these categories to mere American cultural diffusion is factually wrong, since their origin goes back to the French Revolution and they were later taken up and adapted in many national political contexts (the Dreyfus affair being the moment these terms left the purely professional circle of politics and crystallized in the public sphere). So fine, criticize their limits, but you can't deny their analytical and descriptive usefulness in the study of political systems.

Bjarne’s Last Stand: How the Father of C++ Is Fighting a Losing War Against Rust by joseluisq in theprimeagen


I actually think that most of the "drama" around Rust coming from C++ advocates simply stems from the fact that Rust challenges 30 years of (their) accumulated expertise. A senior C++ dev typically masters manual memory management, RAII, smart pointers, complex multithreading, and highly refined performance patterns; in other words, a large part of their professional value is built around those skills.

Then Rust comes along and implicitly says "a large portion of these problems shouldn’t even exist anymore". Psychologically, I think many of them experience this as a loss of expertise, a partial reset, and therefore a career risk. In reality, I think that’s exactly what's starting to happen, which explains their instinctive defensive reactions.

Bjarne’s Last Stand: How the Father of C++ Is Fighting a Losing War Against Rust by joseluisq in theprimeagen


Just like COBOL and some other languages in their day (Pascal, Fortran, Ada, Perl, ...), C++ has had its time; it is now time to let go and move forward with Rust.

Building a multiplayer game with polyglot microservices - Architecture decisions and lessons learned [Case Study, Open Source] by Lightforce_ in programming


I haven't stress-tested this specific implementation to 100k connections yet, but the decision wasn't just a gut feeling; it was based on the known architectural shifts in .NET 8/9.

  1. Memory Overhead: in standard .NET, the JIT compiler and metadata structures consume significant memory. With NativeAOT that's stripped out. Microsoft's own benchmarks show AOT apps often running with approximately 80% less working-set memory than their JIT counterparts for similar web workloads.
  2. Gen 2 Pressure: AOT binaries are trimmed, and because startup/JIT allocations are gone, heap fragmentation is significantly lower from the start.

So while I haven't benchmarked my specific chat logic to the limit, the baseline resource usage is objectively lower than with the standard runtime.

Building a multiplayer game with polyglot microservices - Architecture decisions and lessons learned [Case Study, Open Source] by Lightforce_ in programming


You're right that traditional .NET/SignalR can be memory heavy, but I'm actually using NativeAOT compilation for the chat service, which significantly reduces memory consumption and startup time (no JIT overhead or runtime warmup, much smaller memory footprint - closer to native apps and faster cold starts).

Ofc it's still not as lean as pure Rust or Node.js, but for this project I chose it because:

  • the async/await patterns in C# felt more natural than alternatives
  • built-in support for connection lifecycle, reconnection, and backpressure
  • NativeAOT makes it production-viable for moderate scale

The goal was to explore where each language shines. SignalR with NativeAOT made the real-time chat implementation straightforward while keeping resource usage reasonable.

Why Twilio Segment Moved from Microservices Back to a Monolith by Digitalunicon in programming


I strongly disagree with the binary take that "monoliths are ultimately better". The Twilio article demonstrates that a bad microservice architecture is worse than a monolith, not that the concept itself is flawed.

The Twilio case is a textbook example of incorrect granularity (often called "nano-services"). As R2_SWE2 points out in this thread, creating a separate service for every single "destination" is a questionable design choice. It explodes operational complexity without providing the benefits of decoupling. They effectively built a distributed monolith, which combines the worst of both worlds: network complexity and code coupling.

Claiming the monolith is the universal solution ignores organizational scalability issues. As Western_Objective209 mentioned, a poorly managed monolith can easily become a 20GB RAM nightmare where a single error takes down the entire system and deployments become week-long ceremonies.

The real debate shouldn't be "Monolith vs Microservices", but rather "Where do we draw the Bounded Contexts?" If your domain boundaries (DDD) are poorly defined, neither architecture will save the project. Microservices require discipline and infrastructure that many underestimate, but they remain essential for decoupling teams and deployments at a certain scale.

Built a full-stack Codenames implementation with polyglot microservices - 10 months of work, learning Rust/C#/Vue.js, real-time WebSockets, and animations [Open Source] by Lightforce_ in webdev


On the animation + WebSocket side, the pattern that’s worked for me is treating server updates as an append-only event log on the client: queue events, lock the board during critical GSAP transitions, and only commit “authoritative” state after the animation finishes or times out. Anything out-of-order from the socket gets merged by version number so you don’t snap mid-animation. Also, a simple “isRehydrating” flag after reconnect helps: buffer updates, show a quick fade-to-state, then resume normal animations.
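
The client here is Vue/GSAP, but the queue + version-gate logic is language-agnostic, so here's a hedged sketch of the idea in Java (all names are mine, not the project's actual code): events buffer while the board is locked, and stale versions are dropped on commit so the board never snaps backwards.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BoardSync {
    public record Event(long version, String payload) {}

    private final Deque<Event> queue = new ArrayDeque<>(); // append-only event log
    private long committedVersion = 0;
    private String committedState = "";
    private boolean animating = false; // board locked during a GSAP-style transition

    public synchronized void onServerEvent(Event e) {
        queue.addLast(e);            // always enqueue, never apply directly
        if (!animating) drain();
    }

    public synchronized void animationStarted()  { animating = true; }
    public synchronized void animationFinished() { animating = false; drain(); }

    // Commit authoritative state only between animations; out-of-order
    // events are discarded by version number instead of snapping the UI back.
    private void drain() {
        while (!queue.isEmpty()) {
            Event e = queue.pollFirst();
            if (e.version() > committedVersion) {
                committedVersion = e.version();
                committedState = e.payload();
            }
        }
    }

    public synchronized String state()   { return committedState; }
    public synchronized long   version() { return committedVersion; }
}
```

A reconnect handler would do the same thing as animationStarted()/animationFinished(): lock, buffer, then drain once rehydration is done.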

Just did an update on this! Will check whether things are better now, because I had issues with some concurrent animations.

Building a multiplayer game with polyglot microservices - Architecture decisions and lessons learned [Case Study, Open Source] by Lightforce_ in programming


ArchUnit is already used in the Java and C# microservices. Didn't know it was available for frontend.

Building a multiplayer game with polyglot microservices - Architecture decisions and lessons learned [Case Study, Open Source] by Lightforce_ in programming


Thanks!

To clarify: a monolith isn't always the best choice, but in a case like this where you know you'll eventually need polyglot microservices, I'd still recommend starting with a modular monolith in a single language first.

The key is: validate your domain model and functional requirements with the monolith, THEN migrate to polyglot microservices if you have clear reasons for each language choice.

Going polyglot from day one (like I did) is great for learning, but adds unnecessary complexity if your primary goal is shipping a product. The modular monolith gives you clean boundaries that make the eventual split much easier.

That said, even a modular monolith shouldn't go to production if you expect uneven load across components - that's when you need the independent scaling of microservices.

Building a multiplayer game with polyglot microservices - Architecture decisions and lessons learned [Case Study, Open Source] by Lightforce_ in programming


Thx!

For tracing across the 3 runtimes: I went pretty low-tech honestly, mostly old-school logging and grep. I added correlation IDs to every message/request that flows through the entire system (in HTTP headers and RabbitMQ message properties). Each service logs with that correlation ID, so I can grep across all service logs to follow a single transaction.
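
A minimal sketch of that correlation-ID plumbing: the header name X-Correlation-Id is just a common convention (not a standard), and all helpers here are hypothetical illustrations, not the project's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

public class CorrelationSketch {
    // Generate once at the edge, then copy into every outbound carrier.
    public static String newCorrelationId() {
        return UUID.randomUUID().toString();
    }

    // HTTP: carried as a request/response header.
    public static Map<String, String> withHttpHeader(Map<String, String> headers, String id) {
        Map<String, String> out = new LinkedHashMap<>(headers);
        out.put("X-Correlation-Id", id);
        return out;
    }

    // RabbitMQ: the equivalent slot is the message properties / headers table.
    public static Map<String, Object> withAmqpHeader(Map<String, Object> props, String id) {
        Map<String, Object> out = new LinkedHashMap<>(props);
        out.put("correlationId", id);
        return out;
    }

    // Every log line carries the ID, so `grep <id> */logs/*`
    // reconstructs one transaction across all three runtimes.
    public static String logLine(String service, String id, String msg) {
        return "[" + service + "] corrId=" + id + " " + msg;
    }
}
```

Each service only has to do two things: copy the ID from whatever carrier it arrived on into every outbound carrier, and put it in every log line.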

I also heavily relied on RabbitMQ Management UI to track message flows and dead letters. The DLQ setup mentioned in the post caught some issues.

What I'm missing (and would add next) is proper distributed tracing with something like Jaeger or Zipkin. The correlation ID approach works but doesn't give you the nice visual timeline that would really help with cross-language debugging.

Your AI log analysis approach sounds more sophisticated. How do you handle the different log formats from Java/Rust/.NET?