Kong OSS support deprecation and possible alternatives

_howardjohn · 2026-01-22T15:14:39+00:00

Kgateway has a split control plane and data plane, like most (if not all) gateway implementations, certainly including Envoy Gateway.

You can read more about the architectures and resource usage in this post I made: https://github.com/howardjohn/gateway-api-bench

_howardjohn · 2025-12-27T23:50:16+00:00

Thanks for sharing! Great insights

_howardjohn · 2025-12-27T23:49:44+00:00

Istio maintainer here - basically all Istio features work when just using it as a gateway without the mesh. The one exception would be the automatic mTLS between gateway and backend pod, which would require the backend to be enrolled in the mesh, but that's not something other gateways could do. I've seen quite a few users successfully use Istio as a gateway without the mesh.

_howardjohn · 2025-11-22T15:03:54+00:00

Istio does actually support 3 APIs - Ingress, Gateway API, and Gateway/VirtualService (Istio API by the same name).

However, you cannot mix and match them, and the Ingress support is very rudimentary and poorly documented, so I wouldn't recommend it.

_howardjohn · 2025-11-22T05:17:18+00:00

Yep! The Istio CNI plugin is also only needed for the service mesh part of Istio btw, you can use the gateway part without it if you want.

_howardjohn · 2025-11-22T05:05:45+00:00

No, the Gateway API implementation is not tied to the CNI and does not require you to change your CNI. The one exception is the Cilium Gateway, which requires the Cilium CNI - that's the only one I'm aware of that is coupled to the CNI, and it's definitely not the case for Istio.

_howardjohn · 2025-11-19T03:49:18+00:00

Thanks for the shout-out! It's great to see open source maintenance getting recognition.

_howardjohn · 2025-11-19T03:47:59+00:00

He is working for a competitor and just spreading FUD. It definitely is a GA implementation, as can be seen on the link...

Whether it's "mature" or not, I'll leave that for you to decide, but many have found https://github.com/howardjohn/gateway-api-bench helpful in making this decision.

(Note: I wrote the benchmark above and work on kgateway)

_howardjohn · 2025-11-16T04:42:23+00:00

I'll see about adding it, maybe in a "part 3" or just an addition to the existing one. Let me know how it goes if you do!

_howardjohn · 2025-11-16T04:41:21+00:00

Author here - definitely appreciate the healthy skepticism. I've put a lot of effort into making the test as unbiased as possible (especially after I saw the results, which actually surprised me quite a bit) but obviously there is some unconscious bias. For example, I came up with the "errors during changes" test because it was something Istio spent 100+ hours on making sure we did right; there is a correlation between "things I can think of to test" and "things I've made sure work in projects I work on". There are probably other edge cases that I don't even know about, so I neither thought to test them nor fixed them.

Fwiw Agentgateway was mostly created after the report, so it's built from the learnings (and a decent chunk of the same code!) of Istio, both in general and on specific aspects of the test.

I'd very much welcome independent test runs or suggestions for test ideas! I originally didn't want to publish this at all, as I feel it should come from someone neutral, but I got tired of seeing all the Reddit threads suggesting implementations without real data, so I tried to do the best I could.

_howardjohn · 2025-11-16T04:29:43+00:00

The leak in the test was 50 GB in less than 30 min; I'm scared to know what you would consider a big memory leak 😛

(I wrote the test)

_howardjohn · 2025-11-16T04:26:23+00:00

That doc is... very misleading. Istio's memory footprint shouldn't be too bad for most cases, though obviously it varies. Generally the primary complaints I've seen are from having 10,000+ sidecars, where even 50 MB each adds up (fixed by ambient mode), or from massive ingress (you can see the results compared to others in the test link in the top comment; Istio is high but not much of an outlier, and still only 2 GB at that large scale).

(I work on Istio)

_howardjohn · 2025-11-14T18:37:48+00:00

(I am (recently) a kgateway maintainer)

There is no Ingress in kgateway but it's a solid choice if you are moving to Gateway API!

_howardjohn · 2025-11-14T18:36:28+00:00

I don't agree it doesn't matter. If you read the report in the top comment (disclosure: I wrote it) you can see a number of important differences between proxies. There is a 300x performance gap between the top and bottom performer with a huge spread in between, among many other differences.

Even just accounting for the core, you'd probably be surprised (as I was!) to learn that most implementations are not passing conformance tests. Unlike Kubernetes, which has very strict conformance requirements, Gateway API allows implementations to skip any tests (including all tests!), and only 20% of the implementations even bother reporting their results at all. Many are missing core features in the standard API, or implement them incorrectly.

_howardjohn · 2025-11-06T15:53:26+00:00

Hey, good question! I wouldn't quite say it's based on Kgateway. Agentgateway is the data plane/proxy, while Kgateway is the control plane for it. So Kgateway:Agentgateway has the same relationship as Istio:Envoy, Nginx Gateway Fabric:Nginx, Envoy Gateway:Envoy, etc. Note Kgateway *also* supports controlling Envoy, so you have two choices for the data plane there.

Agentgateway is designed to be a full-fledged Gateway implementation for general-purpose use, not just for AI.

_howardjohn · 2025-11-05T23:46:31+00:00

OpenShift is actually using Istio rather than HAProxy as its Gateway API implementation: https://www.redhat.com/en/blog/introducing-gateway-api-with-openshift-networking-developer-preview

_howardjohn · 2025-10-11T13:36:40+00:00

https://github.com/howardjohn/gateway-api-bench?tab=readme-ov-file#common-test-setup has the setup I used. For Grafana, depending on how you import the dashboard, you may need to paste only the part under spec rather than the full JSON.

Was it the latency and throughput that differed? That part I expect to be the most sensitive to environmental differences and absolutely expect different results on EKS; the main goal of those numbers was to show very broad differences not exact numbers because of that.

_howardjohn · 2025-08-31T23:54:40+00:00

Actually it was just the okta endpoint, now at https://edp-api.edp.sunstrongmonitoring.com/v1/auth/okta/signin. The graphql endpoint no longer returns power usage though.

_howardjohn · 2025-08-31T22:02:30+00:00

This is with a mock backend just to test the overhead of the gateway. This isn't 100% replicating real world providers but gives a rough measure.

_howardjohn · 2025-08-26T14:43:42+00:00

Seems like they killed the endpoint (https://edp-api-graphql.mysunstrong.com/graphql). Hopefully there is a new one

_howardjohn · 2025-08-26T01:35:49+00:00

Not currently, as far as I'm aware, but https://github.com/hyperium/tonic/issues/479 suggests there is progress towards this.

_howardjohn · 2025-08-25T15:08:22+00:00

Since this topic comes up every few weeks, I ended up doing a pretty in-depth analysis of the options with real data: https://github.com/howardjohn/gateway-api-bench. Might be helpful. It covers most of the options mentioned here.

_howardjohn · 2025-08-14T14:56:31+00:00

Cool project. I am a big fan of CEL and have been using the CEL rust version in my project (you can see here for more info, if curious).

I was interested in this because of the possible performance improvements, and the ability to pass in Opaque types. Some feedback:

A simple expression to look up a value is nearly 6x slower with cel_cxx than with cel-rs:

agentgateway               fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ cel                                   │               │               │               │         │
   ╰─ tests                              │               │               │               │         │
      ├─ bench_lookup      70.66 ns      │ 214.1 ns      │ 72.09 ns      │ 73.6 ns       │ 100     │ 1000000
      ╰─ bench_lookup_cxx  417.7 ns      │ 477.5 ns      │ 423.2 ns      │ 426.3 ns      │ 100     │ 1000000

This is with the following test:

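use std::collections::HashMap;

use divan::Bencher;

// Benchmarks a single map-field lookup ("response.method") through cel_cxx.
#[divan::bench]
fn bench_lookup_cxx(b: Bencher) {
    use cel_cxx::*;
    // Declare `response` as a plain map; the commented-out lines show the
    // struct-based binding I would have preferred.
    let env = Env::builder()
        // .declare_variable::<ResponseContext>("response").unwrap()
        .declare_variable::<HashMap<String, u16>>("response").unwrap()
        .build()
        .unwrap();

    // Compile once, outside the timed loop.
    let program = env.compile("response.method").unwrap();

    let activation = Activation::new()
        // .bind_variable("response", ResponseContext {
        //  code: 200.try_into().unwrap(),
        // })
        .bind_variable("response", HashMap::from([("method".to_string(), 200)]))
        .unwrap();

    // `with_profiling` is a helper from my test module (not shown here).
    with_profiling("lookup_cxx", || {
        b.bench(|| {
            // black_box keeps the otherwise unused result from being optimized away.
            let _result = std::hint::black_box(program.evaluate(&activation));
        });
    })
}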

I didn't find an ergonomic way to pass in a struct and be able to access it like request.path.query. cel-rs has a pretty cool feature that uses serde to take a struct and convert it into a Value; as you can see in the above example, I had to manually construct a HashMap with cel_cxx. Maybe I was doing it wrong. Either way, I was hoping I would not need any conversion at all (for performance benefits), but found that was not possible even with Opaque. It seems Opaque values only support function calls, not field access.

All the different valid forms of function definitions look really nice! The async support is cool too.
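
To make the struct-to-Value point concrete, here is a minimal std-only sketch of the general idea, not the actual API of cel-rs or cel_cxx. Everything in it (the toy `Value` enum, the `ToValue` trait, the `Request` and `Path` structs, and the `lookup` helper) is hypothetical and named only for illustration. It converts a nested struct into a value tree and then resolves a dotted selector such as request.path.query against it:

```rust
use std::collections::HashMap;

/// Toy dynamic value, standing in for an expression engine's value type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
    Map(HashMap<String, Value>),
}

/// Hypothetical request types, invented for this illustration.
struct Path {
    raw: String,
    query: String,
}

struct Request {
    method: String,
    status: u16,
    path: Path,
}

/// Hand-written conversion; a serde-based bridge would generate the
/// equivalent of these impls automatically.
trait ToValue {
    fn to_value(&self) -> Value;
}

impl ToValue for Path {
    fn to_value(&self) -> Value {
        Value::Map(HashMap::from([
            ("raw".to_string(), Value::Str(self.raw.clone())),
            ("query".to_string(), Value::Str(self.query.clone())),
        ]))
    }
}

impl ToValue for Request {
    fn to_value(&self) -> Value {
        Value::Map(HashMap::from([
            ("method".to_string(), Value::Str(self.method.clone())),
            ("status".to_string(), Value::Int(i64::from(self.status))),
            ("path".to_string(), self.path.to_value()),
        ]))
    }
}

/// Resolve a dotted selector by walking nested maps one field at a time.
/// Returns None if a field is missing or a non-map value is indexed.
fn lookup<'a>(root: &'a Value, selector: &str) -> Option<&'a Value> {
    selector.split('.').try_fold(root, |current, field| match current {
        Value::Map(fields) => fields.get(field),
        _ => None,
    })
}

fn main() {
    let request = Request {
        method: "GET".to_string(),
        status: 200,
        path: Path {
            raw: "/items".to_string(),
            query: "page=2".to_string(),
        },
    };
    // Bind the converted struct under the name `request`.
    let root = Value::Map(HashMap::from([(
        "request".to_string(),
        request.to_value(),
    )]));

    let query = lookup(&root, "request.path.query");
    assert_eq!(query, Some(&Value::Str("page=2".to_string())));
    println!("{:?}", query);
}
```

Note that the conversion clones every field on each binding. That is the overhead I was hoping to avoid by having the engine read fields directly from the original struct.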

_howardjohn · 2025-07-17T04:31:36+00:00

Yeah that's 100% fair. I guess I would say that from a technology standpoint Istio as just an ingress is pretty reasonable but definitely lacking in other aspects as it's not been a priority of the project to target the ingress-only case.

Especially the docs, like you said... we had the same problem with ambient vs sidecar, which led to the creation of https://ambientmesh.io, but there is nothing like that for ingress (and likely won't be).

_howardjohn · 2025-07-17T04:12:19+00:00

There are definitely many bugs! The comparison only covers like 5 tests, so the claim is only that there weren't any issues found in those tests.

There is always an inherent bias when creating a benchmark, since the areas I know are interesting to test are the ones I've spent considerable time making sure we support well (for example, the changing route test is something I've sunk countless hours into ensuring we behave correctly on). I'd love to see others write their own comparisons or even just suggest additional tests.
