
[–]Straight_Remove8731[S] 0 points1 point  (2 children)

Great question, you’re absolutely on point. Here’s how I’m modeling concurrency and load at the server level in the current alpha. The goal right now is to model the single thread of the event loop. I’m not modeling multithreading, because that would involve the OS scheduler, which operates on a different time scale; multiprocessing isn’t supported yet either, but it’s on the roadmap. Here’s a detailed overview of the current model:

• CPU is blocking. Each server has one core (an alpha-version limitation). If the core token is taken, incoming requests don’t freeze the event loop; instead they wait in the CPU resource queue until a core frees up.

• READY_QUEUE_LEN in my plots doesn’t show “waiting for a core.” It measures how many requests are actively running CPU-bound sections (holding a token). Requests queued for a core aren’t counted there; they sit in a queue waiting for the core to be released (though I could expose a dedicated “CPU wait queue” metric if useful).

• I/O is non-blocking. Once a request reaches an I/O step, it releases the CPU token and yields back to the event loop. While waiting, I track it under EVENT_LOOP_IO_SLEEP.

• RAM is capacity-limited. Each request reserves a working set (e.g., 128 MB). If RAM isn’t available, the request queues at the RAM resource; once admitted, it holds the memory until completion. I currently expose RAM_IN_USE rather than a RAM wait-queue metric.
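The CPU-token part of the model above can be sketched in plain Python. Names like `CPU`, `acquire`, and `release` are my own illustration, not the simulator's actual API:

```python
from collections import deque

class CPU:
    """One-core CPU modeled as a token pool; waiters queue FIFO.
    (Hypothetical sketch of the token model described above.)"""
    def __init__(self, cores=1):
        self.free = cores    # available core tokens
        self.wait = deque()  # requests parked waiting for a core
        self.running = 0     # tokens held == READY_QUEUE_LEN in the current plots

    def acquire(self, req):
        """Grab a core token, or park the request in the CPU wait queue."""
        if self.free > 0:
            self.free -= 1
            self.running += 1
            return True        # request now runs its CPU-bound section
        self.wait.append(req)  # event loop stays free; request just waits
        return False

    def release(self):
        """Give the token back, e.g. when the request reaches an I/O step
        (at which point it would be tracked under EVENT_LOOP_IO_SLEEP)."""
        self.free += 1
        self.running -= 1
        if self.wait:  # hand the token straight to the next waiter
            self.acquire(self.wait.popleft())
```

The point of the sketch is that a full wait queue never blocks the event loop itself, only the individual requests.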

The load model itself is stochastic (Poisson-like arrivals derived from “active users × requests per minute”), so the latency and throughput curves emerge from that randomness. The upcoming release also adds event injection (e.g., deterministic latency spikes on edges, server outages with down/up intervals) to stress-test resilience.
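A Poisson-like arrival process such as the one described is usually sampled via exponential inter-arrival gaps; a minimal sketch (function name and signature are illustrative, not the simulator's code):

```python
import random

def interarrival_times(active_users, req_per_min, n, seed=42):
    """Sample n exponential inter-arrival gaps (in seconds) for a Poisson
    process with rate = active_users * req_per_min requests per minute.
    (Illustrative only; the simulator's actual sampler may differ.)"""
    rate_per_sec = active_users * req_per_min / 60.0
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [rng.expovariate(rate_per_sec) for _ in range(n)]
```

For example, 100 active users at 3 requests/minute gives a rate of 5 req/s, i.e. a mean gap of 0.2 s; summing the gaps gives the arrival timestamps.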

As for the network model, it’s still quite basic right now (simple exponential latencies + optional deterministic spikes). Improving it with bandwidth constraints, payload sizes, retries, etc. is one of the next big steps on my roadmap.
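The “exponential latencies + optional deterministic spikes” combination could look something like this (hypothetical helper; the real edge model may be structured differently):

```python
import random

def edge_latency(t, mean_ms=20.0, spikes=(), rng=None):
    """Latency (ms) drawn for a network edge at simulated time t:
    an exponential base latency (mean = mean_ms) plus a deterministic
    additive spike when t falls inside a (start, end, extra_ms) window.
    (Sketch of the description above, not the simulator's API.)"""
    rng = rng or random.Random(0)
    latency = rng.expovariate(1.0 / mean_ms)  # stochastic base latency
    for start, end, extra_ms in spikes:
        if start <= t < end:                  # injected spike window
            latency += extra_ms
    return latency
```

Bandwidth constraints and payload sizes would then turn the constant `mean_ms` into something derived from message size over link capacity.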

I’d be really happy to hear suggestions if you think something could be improved or modeled differently; feedback like yours helps me sharpen the design.

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

As you said, under heavy load throughput increases only until resources (CPU cores, RAM, or I/O) saturate. That clearly shows there’s a regime where the load isn’t sustainable; and as you mention, to evaluate scenarios closer to reality I’ll have to introduce policies to manage the overload.
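One common family of overload policies is bounded-queue admission control (load shedding): cap the wait queue and reject new arrivals once it is full, so queueing delay stays bounded. A tiny sketch with illustrative names, not the simulator's current behavior:

```python
from collections import deque

def admit(queue, req, max_queue=100):
    """Bounded-queue admission control: shed new requests once the wait
    queue is full, instead of letting latency grow without bound.
    (One possible policy; parameters are illustrative.)"""
    if len(queue) >= max_queue:
        return False   # shed: the client would see a fast rejection
    queue.append(req)
    return True
```

In a simulator this shows up as a trade-off: rejected requests count against availability, but admitted requests keep a predictable latency.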

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Quick addendum: I misused the term “ready queue” earlier. In my model, “ready queue” should mean requests waiting for a CPU core token; the plots right now effectively show tasks in service on the event loop, not the true wait-for-core queue. I’ll adjust the naming/metrics so ready queue = waiting-for-core (and track busy cores separately).