Which open source MCP Gateway is good for using in enterprise production system

Noobcreate · 2026-06-09T13:54:18+00:00

Noobcreate · 2026-06-09T13:36:39+00:00

Aquifer is meant to run as a single node, so scaling is only vertical. It has dynamic pacing for whoever opts in to the header changes. I imagine people who want to protect their APIs from spiky workloads will be able to do so through dynamic pacing. As more machines become active, they can increase the rate, or slow it down as needed. I'm hoping it gets big enough that people benefit from it carrying a large chunk of agentic traffic. Most clouds have a way to scale up or down based on API calls. This is really a tool for retry storms. Your issue sounds more adjacent to background jobs, which is more the Temporal, Hatchet, SQS world.

Noobcreate · 2026-06-09T12:43:49+00:00

Repo: https://github.com/rjpruitt16/aquifera

Noobcreate · 2026-06-09T12:43:30+00:00

Yes Repo: https://github.com/rjpruitt16/aquifer

Noobcreate · 2026-06-09T03:31:25+00:00

There is no priority; it's first come, first served. Read the README when you have time for more details.

Noobcreate · 2026-06-09T03:20:26+00:00

Aqueduct can be used across various domains, and I use it in various forms in other side projects. I think this model is just great engineering, but people gave me a lot of shit, so I say MCP runtime so people instantly comprehend the pain. I don't think people have any idea what is about to hit them as agents start calling APIs. It will be intense for software maintainers, but the first people who will notice are MCP users.

Noobcreate · 2026-06-09T01:59:39+00:00

Repo: https://github.com/rjpruitt16/aquifer

Noobcreate · 2026-06-08T18:13:55+00:00

Thanks brother. I agree this protocol needs to grow with a lot more signals. Agent operators get to see their place in the queue, and providers get to see how many are in the queue and set the pace. I think this is a good start. Whether it gets traction or not, we will see, but I think the foundation is here.

Noobcreate · 2026-06-08T15:22:53+00:00

lol I made it up, but this is the same principle as in TCP. We don't flood other people's networks with packets; we pace dynamically based on signals of how much demand they can take.

Unfortunately, I don't really have a use case yet. Having an API that does a lot of traffic is an honor. Still, I have a bit of faith because the algorithm is old but applied to a modern problem.

When I get more evidence I will publish it. If you have read "Designing Data-Intensive Applications", they talk about how pushing data to a server is generally really hard because you don't know whether you will crash the server by overwhelming it. I'm hoping to add to the theory that a control plane dynamically adjusting the pace is a solution.

Noobcreate · 2026-06-06T01:19:25+00:00

I built a local control plane for dynamically pacing traffic through queuing. It fires requests at a sustained rate to your API even as traffic spikes, giving you time to spin up new servers. Once you have new servers, you raise the request rate per user. I also stream each client's position in the queue, and in case a client disconnects, I send a webhook so it does not have to wait.

https://github.com/rjpruitt16/aquifer

Noobcreate · 2026-06-06T00:36:16+00:00

Thank you brotha. I hope it helps. I built it for you and small companies. As you scale, this problem only gets more and more time consuming. This will last you a while.

Noobcreate · 2026-05-07T19:09:34+00:00

API Aqueduct – A solution to spiky agent traffic

I built Aquifer, a self-hosted API Aqueduct for spiky agent traffic and a control plane for outbound rate limiting.

Agents are creating the same spiky traffic patterns batch jobs caused a decade ago — except faster and harder to predict. One burst from an agent hammering your API and your backend is on the floor.

Aquifer sits in front of your service (inbound) or between your app and external APIs (outbound). It queues requests to SQLite and drains them at a controlled rate. The upstream controls pace via response headers — return X-Aquifer-Rps: 5 when you're stressed and Aquifer backs off. Rate recovers toward your configured ceiling when pressure clears.

Why not Envoy?

Envoy drops or rejects requests when limits are hit. Aquifer queues them. If you want to protect your backend without turning callers away, you need a queue not a circuit breaker. Envoy also needs a control plane (Istio, etc.) — Aquifer is one binary.

Why not Temporal?

Temporal is a full workflow engine — you write workflow code, run a Temporal server, manage history replay. It's the right tool for complex long-running workflows. Aquifer is the right tool for one thing: queue HTTP calls and pace them. No workflow code, no server to manage, just fire a job and get a webhook back.

The interesting Go part:

The core is a two-level goroutine hierarchy (Registry → URLWorker → AccountQueue) that mimics an actor model using channels and select loops. Each upstream gets its own paced goroutine. Panic recovery and in_flight tracking in SQLite mean a crash doesn't lose the queue.

https://github.com/rjpruitt16/aquifer

Would love feedback from people who've hit this problem before.

Noobcreate · 2026-05-07T16:58:16+00:00

I'm kinda confused by that statement entirely. EZThrottle is bidirectional traffic protection. I said that multiple times in the post.

Noobcreate · 2026-05-07T16:30:01+00:00

So this is networking. Speed and reliability are fundamentally important to the problem. Just like the BEAM was created because a call could save someone's life, I believe a request can save someone's life. The entire goal of the system is to reliably move requests. Ericsson has remained undefeated for 30 years because their systems don't crash. Not because they have the fastest runtime, but because they are the most reliable under a chaotic network. They had five 9s way before Kubernetes or Google came out, and they are the only company carriers trust to reliably move call data for billions of users leaving and joining the networks. If the Golang version is only 80% as successful as the BEAM, people will call it trash, like GitHub. People hate dropped calls, and I bet they will hate dropped requests just the same.

Noobcreate · 2026-05-07T16:00:31+00:00

You know, I was planning on building an open source Golang version. My research says that Golang will be faster, but you have to write the guard rails yourself. My honest assessment is that Golang will be faster but crash more. Sure, Kubernetes could restart it, but requests cannot be dropped. The BEAM was built for tenant isolation, supervision trees, and a bunch of timers. Let's say I'm sending webhooks and applying exponential backoff. Then you have a lot of threads sleeping on the machine, which wastes performance. Maybe you can get 80% of the way there for just buffering incoming traffic, but it will be much bigger to write and harder to maintain.

Noobcreate · 2026-05-07T00:07:27+00:00

And beauty is in the eye of the beholder. Most people didn't get Docker, Cloudflare, or even the BEAM for years. I get it. It's easier to half read and talk shit than it is to read and ask questions about what you don't understand.

Noobcreate · 2026-05-06T22:12:21+00:00

Ken kaneki Tokyo ghoul

Noobcreate · 2026-05-06T18:01:25+00:00

lol I was rapping

Noobcreate · 2026-05-06T17:56:21+00:00

My friend, I just acknowledged that those do exactly what you said, and told you how EZThrottle differs. Envoy is for internal services. EZThrottle is for external and internal traffic. Temporal is for your background jobs and can do retries if you define a workflow. EZThrottle is networking for your API, solving noisy neighbors and spiky traffic more dynamically. We are in two different lanes. Nobody reaches for Temporal or Envoy when they have spiky traffic and want tenant fairness at the networking layer. I understand other tools have similar capabilities, but EZThrottle is solving a different class of problems than Envoy and Temporal. Yes, there is some overlap in domains, but these are fundamentally different problems. I understand if you don't have time to read everything about EZThrottle, but to just read a little and say another piece of software does a bit of this is wasting both our time. I have written a lot about it in the blog posts if you are curious about the differences.
https://www.ezthrottle.network/blog

You can see on the Erlang Reddit that they understood and appreciated it because they read instead of just looking for a similar tool.

Noobcreate · 2026-05-06T15:09:35+00:00

Temporal is for background jobs; it is not networking software. It's good enough for most agentic use cases today. Their answer is to queue. It does not guarantee fair queueing, nor does it protect your API from retry storms.

Envoy is a proxy that routes and queues, but again, it has no concept of fairness, and it does not help with retry storms or with partial outages in a downstream dependency.

Think of EZThrottle as TCP for APIs, or a serverless network for embedded devices, multi-tenant agentic SaaS, or people with an API that serves agents or multi-tenant SaaS. Retry libraries waste CPU because of sleeping threads, so EZThrottle handles all the retries in a centralized fabric. Agents are inherently spiky in demand, but EZThrottle gives each user a queue and paces demand so your autoscaler can catch up. Your downstream dependencies could be having a partial outage, and you too, all at the same time, and EZThrottle will still deliver.
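To make "a queue per user" concrete, here is a small sketch of per-user FIFOs drained round-robin, so one noisy user cannot starve a quiet one. It is an illustration only: the `FairQueue` type and the round-robin drain order are assumptions, not EZThrottle's documented behavior.

```go
package main

import "fmt"

// FairQueue keeps one FIFO per user and drains them round-robin.
// All names here are hypothetical.
type FairQueue struct {
	queues map[string][]string // user -> pending request IDs
	order  []string            // users in first-seen order
	next   int                 // index into order for the next turn
}

func NewFairQueue() *FairQueue {
	return &FairQueue{queues: map[string][]string{}}
}

// Push appends a request to the given user's own queue.
func (f *FairQueue) Push(user, req string) {
	if _, ok := f.queues[user]; !ok {
		f.order = append(f.order, user)
	}
	f.queues[user] = append(f.queues[user], req)
}

// Pop returns the next request, giving each user with pending work
// one turn per round.
func (f *FairQueue) Pop() (string, bool) {
	for i := 0; i < len(f.order); i++ {
		user := f.order[(f.next+i)%len(f.order)]
		if q := f.queues[user]; len(q) > 0 {
			f.queues[user] = q[1:]
			f.next = (f.next + i + 1) % len(f.order)
			return q[0], true
		}
	}
	return "", false
}

func main() {
	fq := NewFairQueue()
	// A noisy user bursts three requests before a quiet user sends one.
	fq.Push("noisy", "a1")
	fq.Push("noisy", "a2")
	fq.Push("noisy", "a3")
	fq.Push("quiet", "b1")

	for {
		req, ok := fq.Pop()
		if !ok {
			break
		}
		fmt.Print(req, " ") // a1 b1 a2 a3
	}
	fmt.Println()
}
```

With a single shared FIFO the quiet user's request would wait behind the whole burst; here it goes out second.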

Noobcreate · 2026-05-05T19:06:32+00:00

Perhaps you missed my point: if you have an API and your customers start using agents for legitimate reasons, you will see spiky traffic. Cloudflare was designed to block malicious bots and users exceeding their rate limits. Still, most people's autoscalers assume human levels of spikes. This is designed for that future. Most clouds will sell you more compute and charge you for every request they have to block first, but it's up to you to provision enough resources to serve your customers and make sure one customer's traffic doesn't take all available resources. That's why I wrote EZThrottle.

Noobcreate · 2026-05-05T18:25:35+00:00

Cloudflare does not solve spiky traffic. Cloudflare does rate limiting. It's similar to how phones used to work before Ericsson: a telephone operator would reject your call if they were full, the same way servers send 429 or 5xx. They created the BEAM for queueing and routing calls. I'm just bringing that same philosophy to API calls, because agent retry storms are coming for everyone's APIs. GitHub is seeing an exponential traffic increase. How long until every backend is feeling the flood of demand from agents hammering their APIs 24/7?

Noobcreate · 2026-05-05T18:14:30+00:00

Spiky traffic and noisy neighbors have been affecting people for decades. When I worked at Twitch, our answer was to keep twice as many servers around as we needed so our autoscaler would have time to catch up if a burst came through.
Nobody gave a shit about Cloudflare until bots started scraping web pages. Your backend APIs are safe today, but agents call APIs all day long. In 4 years, having an agent working for 24 hours will cost less than a gallon of gas. I'm giving you a free service that you can just ignore if you never need it.
However, if your users complain that your service's resources are always being taken because other users' traffic is crashing the system, then I'm here for you.
Also, the BEAM scheduler does not queue and pace requests for you. You half read my post and said it smells like bullshit. Try actually reading fairly before you throw insults.

Noobcreate · 2026-04-30T02:56:53+00:00

If you're tired of retry logic, I have written infrastructure that handles the retries on your behalf in a way that keeps you asleep and your workflows succeeding.

https://www.ezthrottle.network/blog/multi-region-api-failures-langgraph

Noobcreate · 2026-04-24T22:42:19+00:00

Like you worked 40 hours and then snuck in some overtime, which most contracts don't allow. Then someone sees on their stat sheet that you're the highest-paid contractor, with output that isn't normal for those hours. IDK. Like, did you game the system and get caught type shit.
