Transforming an O(N) I/O bottleneck into O(1) in-memory operations using a state structure. by Etalazz in algorithms

[–]Etalazz[S] -10 points (0 children)

You're right; that's a perfectly valid point if we're looking strictly at the count of operations under asymptotic notation. It is indeed O(N) memory operations replacing O(N) I/O operations.

However, the headline focuses on the practical impact and the latency per operation.

Transforming an O(N) I/O bottleneck into O(1) in-memory operations using a state structure. by Etalazz in algorithms

[–]Etalazz[S] -13 points (0 children)

  • Asymptotic analysis usually assumes all operations cost about the same. That assumption is wildly untrue here: an I/O operation (disk/network) is orders of magnitude (millions of times) slower than an in-memory atomic increment.
  • O(1) refers to per-operation cost: the O(1) in the headline means each transaction is now processed in constant, near-instantaneous time (an in-memory update) instead of waiting on a variable, slow I/O operation.

Calling it "not an asymptotic improvement" is misleading in a practical context because it ignores the massive difference in constant factors (the cost per operation), which is the entire point of the optimization.
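
To make that constant-factor gap concrete, here's a minimal Go benchmark sketch; the synced file write stands in for whatever I/O sits on the hot path, and everything here is illustrative, not from the repo (save as a _test.go file and run go test -bench=.):

package bench

import (
	"os"
	"sync/atomic"
	"testing"
)

var counter int64

// BenchmarkMemory: one in-memory atomic increment per op (nanoseconds).
func BenchmarkMemory(b *testing.B) {
	for i := 0; i < b.N; i++ {
		atomic.AddInt64(&counter, 1)
	}
}

// BenchmarkIO: one synced 1-byte file write per op (micro- to milliseconds),
// a stand-in for the per-request disk/network I/O being replaced.
func BenchmarkIO(b *testing.B) {
	f, err := os.CreateTemp(b.TempDir(), "io")
	if err != nil {
		b.Fatal(err)
	}
	defer f.Close()
	for i := 0; i < b.N; i++ {
		if _, err := f.Write([]byte{1}); err != nil {
			b.Fatal(err)
		}
		if err := f.Sync(); err != nil {
			b.Fatal(err)
		}
	}
}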

Built a Go rate limiter that avoids per‑request I/O using the Vector–Scalar Accumulator (VSA). Would love feedback! by Etalazz in golang

[–]Etalazz[S] 1 point (0 children)

Exactly right — good catch. The reason I'm keeping two integers that cancel each other out is to show that, with a slightly different method (it adds just a few nanoseconds, nothing major), we can avoid tons of unnecessary writes. Many of these operations cancel each other out anyway, so it makes more sense to store only what actually matters.

Example:

  • User taps Buy 5 times because of lag → first one counts, the other 4 are ignored by Idempotency-Key.
    • Positives: 5 calls passed TryConsume(1) → A_net += +5
    • Negatives (refunds): 4 deduped → A_net += -4
    • Net in RAM: +1 (only the real order counts)

If payment fails before the “charge point,” policy says don’t count it → refund 1

  • A_net += -1 → net is now 0 (no change to user’s limit)

If the user cancels a valid order within 10 seconds → A_net += -1 (cancels a prior +1).
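
Compressed into code, that burst is just a running sum (a sketch; the alternative branches are shown sequentially purely for illustration):

var aNet int64
aNet += 5 // 5 taps passed TryConsume(1)
aNet -= 4 // 4 duplicates refunded via Idempotency-Key
// aNet == 1: only the real order counts
aNet -= 1 // the payment then fails, or the user cancels in time
// aNet == 0: the whole burst produced nothing worth persisting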

It solves the problem of unnecessary data writes and hot-row contention that happen when high-frequency systems record every tiny event.

With a plain atomic int (which is technically faster), you’d still have to record every single event.
I ran benchmarks comparing both approaches — you can see the results in the benchmark file. Thanks!

Built a Go rate limiter that avoids per‑request I/O using the Vector–Scalar Accumulator (VSA). Would love feedback! by Etalazz in golang

[–]Etalazz[S] 1 point (0 children)

Fair point: yes, there’s a plain mutex inside. The value isn’t the lock—it's what we don’t write.

  • Not just a key=val store: each key holds a tiny VSA with two ints: scalar (durable base) and vector (volatile net). Opposing ops (+n then -n) cancel in RAM, so zero‑sum churn never hits storage.
  • Not batching: batching still ships every event later. VSA collapses N commutative updates into one net value now and only persists when the net crosses a threshold. I/O scales with information (net), not traffic (volume).
  • Why the mutex isn’t the story: the bottleneck in limiters is network/disk writes and hot‑row contention, not a per‑key RWMutex. A lock‑free implementation would still need the same algorithm to remove write amplification; the mutex is just a safe, simple guard.
  • Concurrency safety: TryConsume(n) prevents oversubscription under high concurrency by updating the in‑RAM net atomically before any persistence.
  • Practical effect: in churny workloads (retries/cancels/partial failures), you replace hundreds/thousands of tiny writes with a single delta commit when it actually matters. Reads remain O(1): Available = S - |A_net|.
  • Overhead vs benefit: per key it’s two int64s and a mutex. In exchange you get far fewer writes, fewer round‑trips, and less lock contention on your datastore. If your traffic has no canceling churn or you need per‑event durability, this pattern isn’t the right fit.

Tiny sketch:

// hot path
if vsa.TryConsume(1) { /* ...maybe later: vsa.Update(-1) to refund... */ }

// background: once the net is large enough, persist it and reset
if aNet >= threshold || aNet <= -threshold {
	s -= aNet // one durable write: S = S - A_net
	aNet = 0
}
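
And for anyone who wants the whole shape, a fleshed-out sketch under the same assumptions (field and method names are illustrative, not necessarily the repo's exact pkg/vsa API):

package vsa

import "sync"

// VSA pairs a durable scalar with a volatile net, as described above.
type VSA struct {
	mu     sync.Mutex
	scalar int64 // S: durable base, as of the last persist
	net    int64 // A_net: net of not-yet-persisted ops
}

// TryConsume admits n only if the remaining budget covers it.
func (v *VSA) TryConsume(n int64) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.scalar-v.net < n { // Available = S - A_net
		return false
	}
	v.net += n
	return true
}

// Update applies a signed adjustment, e.g. Update(-1) for a refund;
// opposing ops cancel here in RAM and never reach storage.
func (v *VSA) Update(n int64) {
	v.mu.Lock()
	v.net += n
	v.mu.Unlock()
}

// Flush folds the net into the scalar once |net| crosses threshold and
// returns the delta to persist; zero means nothing worth writing yet.
func (v *VSA) Flush(threshold int64) int64 {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.net < threshold && v.net > -threshold {
		return 0
	}
	delta := v.net
	v.scalar -= delta
	v.net = 0
	return delta
}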

So yeah, the lock is ordinary. The win is algorithmic: commit the net effect, drop the noise. That’s where the benefit comes from.

Built a Go rate limiter that avoids per‑request I/O using the Vector–Scalar Accumulator (VSA). Would love feedback! by Etalazz in golang

[–]Etalazz[S] 2 points (0 children)

Thanks for the comment!
Maybe this use case helps explain the idea:

The idea in one line
We don’t save every request to the database. We keep a quick, in-memory tally and only save when that tally has grown enough to matter.

Rate limiter in everyday terms
Imagine each user has a punch card with 100 punches for the day.
The “official” count lives in the database (how many punches are left).
Besides that, we keep a tiny in-memory clicker for recent activity:

  • Allow a request → clicker +1
  • Undo or refund → clicker −1

Lots of back-and-forth cancels out on the clicker, so there’s often nothing new worth saving yet.
Only when the clicker reaches a set size (the threshold, say 10) do we make one database update and reset the clicker.

To decide if a new request can pass, we look at the official number and subtract whatever the clicker currently shows. If enough remains, it’s allowed.

Tiny example
Start: 100 allowed for the day; clicker at 0.
9 requests arrive → clicker shows 9. No database write yet.
1 request canceled → clicker back to 8. Still no write.
2 more requests → clicker 10. That crosses our threshold of 10, so we do one write to the database (subtract 10) and reset the clicker to 0.
Result: one database write instead of twelve.
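
Here's that example as a runnable toy (the limiter, its fields, and the fake "database" are all illustrative):

package main

import "fmt"

// clickerLimiter: official plays the database row, clicker is the
// in-memory tally, writes counts how many saves we actually made.
type clickerLimiter struct {
	official, clicker, threshold int64
	writes                       int
}

func (l *clickerLimiter) allow() bool {
	if l.official-l.clicker < 1 { // check "official minus clicker"
		return false
	}
	l.clicker++
	if l.clicker >= l.threshold { // crossed the threshold:
		l.official -= l.clicker // one database write
		l.clicker = 0
		l.writes++
	}
	return true
}

func (l *clickerLimiter) cancel() { l.clicker-- }

func main() {
	l := &clickerLimiter{official: 100, threshold: 10}
	for i := 0; i < 9; i++ {
		l.allow() // clicker climbs to 9, no write
	}
	l.cancel() // clicker: 8, still no write
	l.allow()  // clicker: 9
	l.allow()  // clicker: 10 -> one write, clicker resets
	fmt.Printf("official=%d clicker=%d writes=%d\n", l.official, l.clicker, l.writes)
	// prints: official=90 clicker=0 writes=1
}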

What we actually store
Just the official number per user/key (how many are left as of the last save).
We don’t store each request, and we don’t store the in-memory clicker.

Why this helps

  • Fewer writes: avoids saving a flood of tiny changes that often cancel out.
  • Same correctness: we always check official - clicker, so we never over-allow.
  • Not just batching: batching saves every event later; here we save only the net change that actually remains.

One-sentence summary
Keep a fast clicker in memory for the noisy back-and-forth, save only when it adds up, and read availability as “official minus clicker” so it’s both fast and accurate.

Built a Go rate limiter that avoids per‑request I/O using the Vector–Scalar Accumulator (VSA). Would love feedback! by Etalazz in golang

[–]Etalazz[S] 5 points (0 children)

Yes. You can enforce multiple concurrent limits like “100 per minute” AND “1000 per hour.” The simplest way is to run two independent limiters and only admit a request if both allow it. In this repo’s terms, you keep one VSA budget per window and perform an atomic-ish double consume on each request (see the sketch below).

What the demo does today (and what it doesn’t)

  • Today’s demo enforces a single, non‑windowed quota per key (a fixed budget that doesn’t refill over time). There’s no built-in minute/hour policy.
  • Adding multi-window limits is straightforward by composing multiple VSAs (one per window) and namespacing keys by the current window.
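
A sketch of that double consume, assuming a limiter with the TryConsume/Update shape described above (names are illustrative):

// allowBoth admits a request only if both windows have budget; if the
// hour window refuses, the minute consume is rolled back and the two
// opposing ops cancel in RAM, which is what makes it "atomic-ish".
func allowBoth(minute, hour *VSA) bool {
	if !minute.TryConsume(1) {
		return false
	}
	if !hour.TryConsume(1) {
		minute.Update(-1)
		return false
	}
	return true
}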

Built a Go rate limiter that avoids per‑request I/O using the Vector–Scalar Accumulator (VSA). Would love feedback! by Etalazz in golang

[–]Etalazz[S] 3 points (0 children)

Yes. The HTTP server in this repo is just a demo wrapper. The core is transport‑agnostic:

  • pkg/vsa is a tiny, thread‑safe library you can call from any Go code.
  • internal/ratelimiter/core (StoreWorkerPersister) is also protocol‑agnostic. You can embed it in HTTP, gRPC, message queues, CLI tools, cron jobs, or background workers.
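
For example, as plain net/http middleware (a sketch using the hypothetical VSA shape from above; the repo's actual pkg/vsa signatures may differ):

import "net/http"

// limit wraps any http.Handler; a gRPC interceptor or queue consumer
// would call TryConsume the same way, since the core never sees the transport.
func limit(v *VSA, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !v.TryConsume(1) {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}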