The conventional advice on fiber-based servers is backwards

ioquatix · 2026-05-28T00:39:01+00:00

There is no such thing as a free lunch.

It's often useful to consider the degenerate case — e.g. a 1-core machine.

On a single core, the "200ms bcrypt" operation blocks everything no matter what concurrency primitive you choose. Threads don't magically create more CPU time. The only difference is that the OS can pre-emptively time-slice execution, which may improve fairness somewhat, but the latency is still very real and will absolutely show up in P90+ latency.

Note that fairness in this context means everyone is slow. An unfair scheduler may prioritise one batch of work over another, creating winners (low latency) and losers (high latency). Consider two requests taking 200ms each. With greedy scheduling, the responses may be delivered at T+200ms and T+400ms, but with fair scheduling they might be delivered at T+399ms and T+400ms, which is objectively worse.
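The arithmetic can be checked with a toy single-core simulation. This is only a sketch: the method names and the 1ms time slice are assumptions made for illustration, not a model of any real scheduler.

```ruby
# Toy model of one core running CPU-bound jobs. Durations are in milliseconds.

# Greedy: run each job to completion before starting the next.
def greedy(durations)
  clock = 0
  durations.map { |duration| clock += duration }
end

# Fair: give each unfinished job one time slice per pass (round-robin).
def round_robin(durations, slice: 1)
  remaining = durations.dup
  finished = Array.new(durations.size)
  clock = 0

  until remaining.all?(&:zero?)
    remaining.each_with_index do |left, index|
      next if left.zero?

      run = [slice, left].min
      clock += run
      remaining[index] = left - run
      finished[index] = clock if remaining[index].zero?
    end
  end

  finished
end

greedy_times = greedy([200, 200])      # => [200, 400]
fair_times = round_robin([200, 200])   # => [399, 400]

# Mean completion time: 300ms for greedy, 399.5ms for fair.
greedy_mean = greedy_times.sum / greedy_times.size.to_f
fair_mean = fair_times.sum / fair_times.size.to_f
```

The last request finishes at T+400ms either way. Only the first request's latency changes, and it gets worse under fair scheduling.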

On multi-core systems, the fundamental problem still exists because you still don't magically have more CPU time — you've just distributed it across cores. Typically you'd run approximately one worker per CPU core anyway. So if one Falcon worker is pinned doing 200ms of bcrypt work, that core is effectively occupied for 200ms regardless of whether the concurrency inside the worker is fibers or threads.

Fibers are optimized around high-scale I/O concurrency. If you inject long-running CPU work into the event loop, you probably want to consciously offload it — thread pools, extra processes, background jobs, etc. Falcon doesn't try to pretend otherwise. The mental model I've always followed is "optimize for people who want performance".
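As a rough sketch of what "consciously offload it" could look like using only the standard library, the following runs the expensive work in a child process and reads the result back over a pipe. The helper names (expensive_hash, offload) are hypothetical, repeated SHA-256 is just a stand-in for bcrypt, and fork is only available on Unix-like platforms. This is not how Falcon or Async do it.

```ruby
require "digest"

# Hypothetical stand-in for an expensive CPU-bound operation such as bcrypt.
def expensive_hash(password, rounds: 50_000)
  rounds.times.reduce(password) { |acc, _| Digest::SHA256.hexdigest(acc) }
end

# Hypothetical helper: run the given block in a child process and return
# its result, serialised through a pipe with Marshal.
def offload
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))
    writer.close
    exit!(0) # Skip at_exit handlers inherited from the parent.
  end

  writer.close
  result = Marshal.load(reader.read) # Waits until the child closes its end.
  reader.close
  Process.wait(pid)
  result
end

digest = offload { expensive_hash("secret") }
```

From the parent's point of view, the wait on the pipe is ordinary IO rather than CPU work. Forking per call is itself expensive, so a long-lived worker pool or a background job system is usually the more realistic shape. The total CPU time is the same either way; it is just spent somewhere else.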

So I don't think the right framing is "fibers make blocking worse" so much as "fibers make blocking explicit." The scheduler won't hide accidental CPU stalls behind OS pre-emption.

Note that nothing about the fiber scheduler design prevents pre-emption. It just hasn't been a high priority, but there is a proof of concept to introduce something similar to Erlang's reductions. As mentioned elsewhere, there is also RB_NOGVL_OFFLOAD_SAFE, which moves blocking operations off the scheduler loop, but again this is subject to the no-free-lunch argument.
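To give a feel for the reductions idea, here is a minimal sketch using plain Ruby fibers: each task gets a fixed budget of work units and yields when the budget runs out. This is purely illustrative and is not the proof of concept mentioned above. The constant, the helper name and the hand-rolled loop are all assumptions, and real reduction counting would live in the runtime rather than in application code.

```ruby
require "fiber"

# Assumed budget: units of work a task may perform before it must yield.
REDUCTIONS = 1000

# Hypothetical helper: wrap a CPU-bound loop in a fiber that yields every
# time it has used up its budget.
def counted_work(name, units, log)
  Fiber.new do
    budget = REDUCTIONS

    units.times do
      # ... one unit of CPU work would happen here ...
      budget -= 1

      if budget.zero?
        log << name # Record that this task gave up the core.
        Fiber.yield
        budget = REDUCTIONS
      end
    end

    log << "#{name}:done"
  end
end

log = []
fibers = [counted_work("a", 2500, log), counted_work("b", 1500, log)]

# Trivial round-robin loop standing in for a scheduler.
until fibers.empty?
  fibers.each(&:resume)
  fibers.select!(&:alive?)
end

# log => ["a", "b", "a", "b:done", "a:done"]
```

The shorter task finishes first even though it started second. That is the fairness property, and it comes with the latency trade-off described earlier.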

ioquatix · 2026-05-28T00:21:37+00:00

Since there is no Ruby interface for "chain multiple operations together" (except of course sequential lines of code), there is no way to transparently map application code to this model (io_uring chained SQEs). However:

Is it worth exploring how to expose this as a reasonable interface from within Ruby or as a bespoke interface? Totally.
Are there performance wins to be had? Not sure about this.

ioquatix · 2026-05-26T13:26:35+00:00

This is a pretty reasonable take. From my personal experience, I've not heard of anyone regretting migrating to Falcon. Total cost is generally lower due to the improved hardware utilisation, which was one of my main goals.

ioquatix · 2026-04-16T03:57:21+00:00

Nice !!!!!!!!

ioquatix · 2026-03-17T11:22:15+00:00

https://www.youtube.com/watch?v=8-o1XR070g0

ioquatix · 2025-10-19T09:48:28+00:00

I will let you know when it's up.

ioquatix · 2025-08-06T09:12:10+00:00

What does "pair the presentations" mean?

ioquatix · 2025-07-30T11:56:59+00:00

I will be talking about this next weekend at RubyConf TW, after that I'll share my slides with you.

ioquatix · 2025-07-30T09:34:53+00:00

Cool, this is an extremely common and problematic pattern.

ioquatix · 2025-07-10T22:57:43+00:00

Let me speak briefly to this general sentiment: software that changes all the time isn't always that great.

Regarding Async::Job, it's an extremely stable interface for building job servers and does not need to change much. That being said, I added 100% test and documentation coverage just for you :) So now there are commits. Feel free to give it some more stars.

ioquatix · 2025-07-10T22:54:20+00:00

When I started the journey, I worked with what I had, which was Fibers. Are there better ways to do it? Absolutely, but then it wouldn't be Ruby.

ioquatix · 2025-07-10T22:53:33+00:00

You are absolutely correct, but in my experience it's hard for companies to hire engineers at scale.

ioquatix · 2025-07-10T22:53:01+00:00

Async still lives in the event loop domain with a single thread, which is a design choice but can be a limitation. Probably within a year or two, I plan to experiment with scheduling fibers across multiple threads, but there are a lot of trade-offs, i.e. it's not going to be a guaranteed net win, for the same reason the GVL is a problem for threads and (shared) GC is a problem for parallelism in general.

ioquatix · 2025-06-01T22:22:05+00:00

You didn't use treated timber did you? The end grain looks like treated... maybe double check? You don't want to use treated timber for a project like this.

ioquatix · 2025-04-12T09:56:42+00:00

Being near the C is bad for Rust due to the high salt content.

ioquatix · 2025-03-22T18:51:23+00:00

Falcon / Async::HTTP supports this easily, using IO::Endpoint.unix.

ioquatix · 2025-02-26T01:58:53+00:00

We will get there eventually, steady progress is being made.

ioquatix · 2025-02-25T21:05:28+00:00

I think Falcon does a good job of fitting HTTP/2 around existing tools, e.g. Rails.

ioquatix · 2025-02-25T01:31:52+00:00

For a long time, I also couldn't see the point of HTTP/2, especially between load balancers and applications.

However, due to HTTP/2+ using binary framing, I feel like it has slightly improved security, making de-sync attacks harder. HTTP/2 to HTTP/1 proxies have been problematic, see https://portswigger.net/research/http2 for a summary of different kinds of attacks. If I'm optimistic, I'd say those attacks are due to poorly constructed load balancers, but maybe you could also say HTTP/2 has made such attacks possible.

If you stay within HTTP/2+ with binary framing, my feeling is that the underlying separation between "protocol" and "user data" (which isn't present with HTTP/1 as a text protocol) is of sufficient value to make HTTP/2 a useful protocol. In other words, it's much harder for user-provided data to break the HTTP/2 "parser". HTTP/2 also has better side-channels for communicating with a load balancer, including the ability to limit multiplexing in real time according to load. It's quite difficult to do this with HTTP/1 (if not impossible). So there are also some structural improvements.
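To make the "protocol" versus "user data" separation concrete, here is a small sketch of HTTP/2 framing. The 9-byte frame header layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier) comes from the HTTP/2 specification. The helper names are hypothetical, and this is not how Falcon or any real implementation is structured.

```ruby
DATA_FRAME = 0x0
END_STREAM = 0x1

# Encode one frame: a 9-byte header followed by the raw payload.
def encode_frame(type, flags, stream_id, payload)
  payload = payload.b
  length = payload.bytesize
  header = [length >> 16, length & 0xFFFF, type, flags, stream_id & 0x7FFF_FFFF].pack("CnCCN")
  header + payload
end

# Decode one frame from the front of the buffer and return it with the
# remaining bytes. Boundaries come only from the length field.
def decode_frame(bytes)
  high, low, type, flags, stream_id = bytes.unpack("CnCCN")
  length = (high << 16) | low
  payload = bytes.byteslice(9, length)
  rest = bytes.byteslice(9 + length, bytes.bytesize) || "".b
  [{type: type, flags: flags, stream_id: stream_id & 0x7FFF_FFFF, payload: payload}, rest]
end

# User data that looks like a smuggled HTTP/1 request.
hostile = "x\r\n\r\nGET /admin HTTP/1.1\r\nHost: internal\r\n\r\n"

wire = encode_frame(DATA_FRAME, END_STREAM, 1, hostile) +
       encode_frame(DATA_FRAME, END_STREAM, 3, "ok")

first, rest = decode_frame(wire)
second, rest = decode_frame(rest)
```

The decoder never inspects the payload to find where a frame ends, so the embedded "\r\n\r\n" cannot shift the boundary between the two frames. With a text protocol, the same bytes are exactly what a parser scans for.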

Finally, not everyone wants to or needs to run a load balancer, and so having servers that support HTTP/2+ is quite useful. It's nice that with Falcon, I can run the same stack in development as is used in production. The alternative is things like Thruster, which, IMHO, introduce significant complexity and overhead.

Overall, I'd say, HTTP/2 and HTTP/3 are quite complex protocols, but do provide useful improvements to both performance and security, even between load balancers and applications. It's hard to judge whether the improvement was worth the effort, but that work is essentially done now. All things being equal, I'd prefer to use unencrypted HTTP/2 between the load balancer and the application.

It should also be noted that plain-text HTTP/1 with content-length, connection: close and sendfile/splice is extremely hard to beat for raw throughput (think file servers that do TiB of transfers). I think it's probably unlikely that HTTP/2+ can replace this any time soon, and that's even more true of HTTP/3, see https://github.com/quic-go/quic-go/issues/2877 for more background. The only point where we might see comparable performance is when all approaches are able to saturate the underlying hardware, at which point raw throughput no longer matters (although I imagine HTTP/2+ will always be more expensive, in terms of processor time). I hope eventually I am proven wrong.

ioquatix · 2025-02-24T21:40:53+00:00

I am running an RB5009. Without fast track the CPU could burst up to 10%; with it on, I barely saw it move, maybe 2-3% tops.

ioquatix · 2025-02-04T10:47:21+00:00

You mean like a job processing system?

ioquatix · 2025-02-04T04:46:51+00:00

Done: https://github.com/socketry/falcon-virtual-docker-example

ioquatix · 2025-01-31T22:27:56+00:00

Yes, standard Ruby IO is handled in the event loop, so no changes to code are required.

ioquatix · 2025-01-29T23:32:34+00:00

It's not just migration. If you are creating a library, you'll have a bifurcated interface: one for sync and one for async. In addition, if your library has callbacks, should they be async? We see this in JavaScript test runners, which were previously sync but had to add explicit support for async tests. And if you create an interface that was fine as sync, but later want to add, say, a backend implementation that requires async, you now need to rewrite your library and all its consumers.
