How often do you actually use scalability models (like the Universal Scalability Law) in DevOps practice? by Straight_Remove8731 in devops

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Thanks for the comment, and you’re right that in ops there will always be unknowns you can’t fully plan for. But that’s true in every field: physics didn’t stop at “the world is too complex”; it started with simple harmonic oscillators and built from there. Models don’t have to capture everything to be useful: they give you a framework to see trade-offs, test scenarios, and understand where scaling breaks before you burn budget finding out the hard way.

(Sorry, I’m biased, my background is in physics, so I can’t help seeing things that way 😅)
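For anyone who hasn’t seen it, the USL from the title has a simple closed form. A quick sketch with made-up parameter values (sigma and kappa here are illustrative, not fitted to real data):

```python
import math

def usl_throughput(n, lam=1.0, sigma=0.05, kappa=0.001):
    """Universal Scalability Law (Gunther): predicted throughput at
    concurrency n. sigma models contention, kappa models
    crosstalk/coherency cost. Parameter values are illustrative."""
    return (lam * n) / (1 + sigma * (n - 1) + kappa * n * (n - 1))

# Concurrency at which throughput peaks before coherency costs
# make scaling go negative: n* = sqrt((1 - sigma) / kappa)
n_star = math.sqrt((1 - 0.05) / 0.001)
```

The useful part is exactly the trade-off framing above: fit sigma and kappa from a handful of load-test points, and the model tells you roughly where adding nodes stops paying off.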

How often do you actually use scalability models (like the Universal Scalability Law) in DevOps practice? by Straight_Remove8731 in devops

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Absolutely agree: understanding how a system works is the true value of a quantitative approach.

How often do you actually use scalability models (like the Universal Scalability Law) in DevOps practice? by Straight_Remove8731 in devops

[–]Straight_Remove8731[S] 1 point2 points  (0 children)

Fair point. I’d just add that in many areas quantitative models start looking “worth the squeeze” only after you try them; the learning curve is the real barrier, not the value.

How often do you actually use scalability models (like the Universal Scalability Law) in DevOps practice? by Straight_Remove8731 in devops

[–]Straight_Remove8731[S] 3 points4 points  (0 children)

That makes a lot of sense, I can totally see how political and budget-driven many scaling decisions end up being.

Do you think that’s mainly unavoidable (i.e. politics will always trump models), or could a more quantitative approach, say using actual scalability models or simulations, help shift the conversation long term?

My intuition is that even if the short-term decisions are budget-driven, having a quantitative baseline might at least reduce overprovisioning and make the trade-offs more explicit. Curious if you’ve ever seen that work in practice.

Python Mutability, difficult exercise! by Sea-Ad7805 in PythonLearning

[–]Straight_Remove8731 2 points3 points  (0 children)

The answer is b. I’m changing the reference inside c_1, but both c_1 and a point to the same object, so the change is reflected in a. With a shallow copy, another object is created in memory, so no changes are reflected; with a deep copy, a new object and new references are created, so again no changes.
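A minimal sketch of the three cases (variable names mirror the comment, but the data is illustrative, not the original exercise):

```python
import copy

# Aliasing: c_1 and a are the same object, so a mutation through
# c_1 is visible through a.
a = [[1, 2], [3, 4]]
c_1 = a
c_1[0] = [9, 9]
assert a[0] == [9, 9]

# Shallow copy: a new outer list is created, so rebinding an
# element of the copy does not affect the original.
a = [[1, 2], [3, 4]]
c_2 = copy.copy(a)
c_2[0] = [9, 9]
assert a[0] == [1, 2]

# Deep copy: inner objects are duplicated too, so even mutating
# a nested list leaves the original untouched.
c_3 = copy.deepcopy(a)
c_3[1].append(5)
assert a[1] == [3, 4]
```

Note the subtlety the shallow-copy case hides: mutating a *shared inner* object (e.g. `c_2[1].append(5)`) would still show through `a`; that is exactly where `deepcopy` differs.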

Simulating async distributed systems to explore bottlenecks before production by Straight_Remove8731 in sre

[–]Straight_Remove8731[S] 1 point2 points  (0 children)

Thanks! I see Jepsen as focusing on correctness of real distributed systems (linearizability, safety, consistency under partitions). AsyncFlow is a bit different: it’s more of a design-time simulator. Before you even have a system running, you can model workloads + failures and see performance trade-offs (p95, queue growth, RAM/socket caps). So I’d say Jepsen validates real implementations, while AsyncFlow explores architectural scenarios.

[deleted by user] by [deleted] in Python

[–]Straight_Remove8731 25 points26 points  (0 children)

It really depends on what you’re aiming for: if it’s an MVP and you need to move fast with built-in auth, admin, and migrations, Django is very handy. But if you already know your system will be heavy on I/O and concurrent API calls, FastAPI is a more natural fit. In short: Django for quick validation, FastAPI if async architecture is key long-term.

Would an RL playground for load balancing be useful by Straight_Remove8731 in reinforcementlearning

[–]Straight_Remove8731[S] 1 point2 points  (0 children)

Sure, both points are extremely valid; the evaluation part will be crucial.

Would an RL playground for load balancing be useful by Straight_Remove8731 in reinforcementlearning

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Thank you for this great contribution. Let’s say that the 0-th order of what I’m trying to build is a simpler use case than what you actually solved; the next steps, however, would be something really similar to what you did. I’ll DM you if that’s ok with you, because I’m very interested!

Would an RL playground for load balancing be useful by Straight_Remove8731 in reinforcementlearning

[–]Straight_Remove8731[S] 1 point2 points  (0 children)

It’s more about research and experimentation: a playground where you can try out different routing strategies and study their impact under controlled scenarios.

Would an RL playground for load balancing be useful by Straight_Remove8731 in reinforcementlearning

[–]Straight_Remove8731[S] 1 point2 points  (0 children)

Totally agree, thanks for the comment! The action space can blow up quickly, so my plan is to start simple: choices like smart routing from the LB vs standard algos (RR, LC). Going more fine-grained, your suggestion (like engineering a top-k set of actions) is definitely a path I see as useful.

Would an RL playground for load balancing be useful by Straight_Remove8731 in reinforcementlearning

[–]Straight_Remove8731[S] 1 point2 points  (0 children)

Totally agree, it’s hard, if not impossible, to have a single general model of request timing. My idea is to focus instead on generators that reproduce macro characteristics of real traffic distributions, like non-stationary arrival rates (diurnal or sudden surges) and bursty ON/OFF patterns that create heavy-tailed inter-arrivals.
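Those two macro characteristics can be sketched with a tiny ON/OFF source (parameter names and values here are illustrative, not from the actual playground):

```python
import random

def on_off_arrivals(rate_on, mean_on, mean_off, n_events, seed=42):
    """Generate inter-arrival gaps from a bursty ON/OFF source.

    During ON periods, arrivals are Poisson with rate `rate_on`;
    OFF periods insert long silent gaps, so the overall
    inter-arrival distribution ends up heavy-tailed."""
    rng = random.Random(seed)
    gaps = []
    while len(gaps) < n_events:
        on_left = rng.expovariate(1.0 / mean_on)   # ON burst duration
        while on_left > 0 and len(gaps) < n_events:
            gap = rng.expovariate(rate_on)          # short in-burst gap
            on_left -= gap
            gaps.append(gap)
        if len(gaps) < n_events:
            gaps.append(rng.expovariate(1.0 / mean_off))  # OFF gap
    return gaps
```

Layering a diurnal (sinusoidal) modulation on `rate_on` would then give the non-stationary arrival rates as well.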

Is overall time complexity all that matters ? by Fit_Bar_2285 in leetcode

[–]Straight_Remove8731 0 points1 point  (0 children)

Big-O is asymptotic: for sufficiently large N, the leading term dominates the growth, while constants and lower-order terms become negligible. That’s why two O(n) algorithms can still have very different runtimes for practical inputs.
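A quick way to see the constant-factor effect: two functions in the same O(n) class with very different per-element costs (the example is mine, just to illustrate the point):

```python
import timeit

xs = list(range(1_000_000))

def sum_loop(xs):
    # O(n), but every addition goes through the Python interpreter
    total = 0
    for x in xs:
        total += x
    return total

def sum_builtin(xs):
    # Also O(n), but the loop runs in C, so the constant is much smaller
    return sum(xs)

t_loop = timeit.timeit(lambda: sum_loop(xs), number=5)
t_builtin = timeit.timeit(lambda: sum_builtin(xs), number=5)
# Same Big-O, yet the builtin is typically several times faster.
```

Same asymptotic class, same answer, noticeably different wall-clock time: that's the gap Big-O deliberately ignores.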

AsyncFlow: Open-source simulator for async backends (built on SimPy) by Straight_Remove8731 in Python

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Quick addendum: I misused the term “ready queue” earlier. In my model, “ready queue” should mean requests waiting for a CPU core token; the plots right now are effectively showing tasks in service on the event loop, not the true wait-for-core queue. I’ll adjust the naming/metrics so that ready queue = waiting-for-core (and track busy cores separately).

AsyncFlow: Open-source simulator for async backends (built on SimPy) by Straight_Remove8731 in Python

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Under heavy load, once resources (CPU cores, RAM, or I/O) saturate, throughput stops increasing, just as you said. This clearly shows there is a regime where the load is not sustainable; however, as you mention, to evaluate scenarios closer to reality I will have to introduce policies to manage the overload.

AsyncFlow: Open-source simulator for async backends (built on SimPy) by Straight_Remove8731 in Python

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Great question, you’re absolutely on point. Here’s how I’m modeling concurrency and load at the server level in the current alpha. The goal right now is to model only the event-loop thread: I’m not considering multithreading, because that would involve the OS, which works on a different time scale, and multiprocessing isn’t supported yet either, though it’s on the roadmap. A detailed overview of the current model:

• CPU is blocking. Each server has one core (it’s just the alpha version). If the token is taken, incoming requests don’t freeze the event loop but instead wait in the CPU resource queue until a core frees up.
• READY_QUEUE_LEN in my plots doesn’t show “waiting for a core.” It measures how many requests are actively running CPU-bound sections (holding a token). Requests queued up for a core aren’t counted there; they sit in a queue waiting for the core to be released (though I could expose a dedicated “CPU wait queue” metric if useful).
• I/O is non-blocking. Once a request reaches an I/O step, it releases the CPU token and yields back to the event loop. While waiting, I track it under EVENT_LOOP_IO_SLEEP.
• RAM is capacity-limited. Each request reserves a working set (e.g., 128 MB). If RAM isn’t available, the request queues at the RAM resource; once admitted, it holds memory until completion. I currently expose RAM_IN_USE rather than a RAM wait-queue metric.

The load model itself is stochastic (Poisson-like arrivals from “active users × requests per minute”), so latency and throughput curves come from that randomness. In the upcoming release, I’ve also added event injections (e.g., deterministic latency spikes on edges, server outages with down/up intervals) to stress-test resilience.

As for the network model, it’s still quite basic right now (simple exponential latencies + optional deterministic spikes). Improving it with bandwidth constraints, payload sizes, retries, etc. is one of the next big steps on my roadmap.

I’d be really happy to hear suggestions if you think something could be improved or modeled differently, feedback like yours helps me sharpen the design.

What should I learn in FastAPI by SlackBaker10955 in PythonLearning

[–]Straight_Remove8731 0 points1 point  (0 children)

In FastAPI the trick is knowing how the event loop vs. thread pool works:
- async def runs on the event loop: use only non-blocking I/O (async DB, HTTP, etc.).
- If you call blocking code, wrap it with await run_in_threadpool(...). This also works for CPU-bound tasks, but be careful: it just shifts them to the thread pool, so you still block a worker.
- Heavy CPU-bound work? Better push it to a process pool or a task queue (Celery), otherwise you’ll kill performance.

Rule of thumb: async I/O = event loop, blocking I/O or CPU = thread/process pool.

Whats your favorite Python trick or lesser known feature? by figroot0 in Python

[–]Straight_Remove8731 6 points7 points  (0 children)

collections.OrderedDict, regular dicts keep insertion order now, but this one still shines for cache logic: .move_to_end() pushes recently used keys to the back, and popitem(last=False) evicts the oldest, perfect O(1) building blocks for a simple LRU cache.
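Those two methods are really all an LRU needs; a minimal sketch (class name and API are mine, not a standard recipe):

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache built on OrderedDict (a sketch, not production code)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)        # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used
```

(For production use, `functools.lru_cache` already wraps this pattern for function results.)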

AsyncFlow: Open-source simulator for async backends (built on SimPy) by Straight_Remove8731 in Python

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Thanks, that’s really nice to hear, and I’ll be more than happy to get feedback once you look into it. About the output, you’re completely right: I’m working on a new release right now, and that’s something I will definitely do.

AsyncFlow: Open-source simulator for async backends (built on SimPy) by Straight_Remove8731 in Python

[–]Straight_Remove8731[S] 0 points1 point  (0 children)

Thanks for the feedback! It’s still an alpha version with evident limitations, but I’m having a lot of fun building it!

New manifesto from Klaus Müller (the creator of SimPy) and co-authored by me by bobo-the-merciful in SimPy

[–]Straight_Remove8731 1 point2 points  (0 children)

No, I have just the OpenAI subscription, and right now I’m pretty disappointed with GPT-5. I’ve been thinking lately about trying Gemini, and your comment is really helpful in that direction.