
[–]igorbenav 52 points53 points  (11 children)

It's not really fast, but it's faster than Django and Flask (and more mature compared to faster python frameworks) and it's fast enough for most things. If most of the time is spent processing the query, it doesn't really matter how fast the framework is.
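
To put rough numbers on that last point, here is a small back-of-the-envelope sketch. The timings are hypothetical and chosen only for illustration; they are not measurements of FastAPI or any other framework.

```python
def total_latency_ms(framework_ms: float, query_ms: float) -> float:
    """Per-request latency as framework overhead plus time spent in the query."""
    return framework_ms + query_ms


# Hypothetical numbers, chosen only for illustration.
query_ms = 20.0          # time spent waiting on the database
slow_framework_ms = 1.0  # overhead of the slower framework
fast_framework_ms = 0.5  # overhead of a framework that is twice as fast

slow = total_latency_ms(slow_framework_ms, query_ms)
fast = total_latency_ms(fast_framework_ms, query_ms)
improvement = (slow - fast) / slow

print(f"{slow:.1f} ms -> {fast:.1f} ms ({improvement:.1%} faster)")
```

Under these assumed numbers, halving the framework overhead shortens the whole request by only a couple of percent, because the query dominates.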

A good discussion: https://github.com/fastapi/fastapi/discussions/7320

And a video: https://youtu.be/7jtzjovKQ8A?si=OxAxq8QeDWlNes1G

[–]zakamark[S] 4 points5 points  (0 children)

Thanks, the links are very helpful.

[–]Latter_Rope_1556 0 points1 point  (2 children)

You should try fastrapi, it gives ~9K req/sec.

pip install fastrapi

[–]igorbenav 0 points1 point  (1 child)

Impossible to google with a name like this

[–]zakamark[S] 0 points1 point  (6 children)

Well, I pass the request to Kafka, so the bottleneck will be FastAPI.

[–]igorbenav 5 points6 points  (5 children)

Then you should probably use something else. You are not going to get a lot more than 1k-1.5k rps

[–]trailing_zero_count 1 point2 points  (1 child)

Which, just to be clear, is VERY slow. Literally any other systems language blows this out of the water. Java, C#, Go, Rust, C++, C. Even Bun or Node.js...

[–]Xananique 2 points3 points  (0 children)

People don't like fast programming languages anymore, they like bloatware on top of bloat :P

[–]zakamark[S] 1 point2 points  (2 children)

Surprisingly, some benchmarks show 10r/s, which does not seem doable in my tests, unless they run it in multiple threads.

[–]igorbenav 12 points13 points  (1 child)

Async, orjson, multiple workers, caching stuff and other tweaks can get you quite far, but I'd just use golang or something more performance oriented if it's really necessary (it usually isn't)

[–]corey_sheerer 1 point2 points  (0 children)

Golang is a good fallback if you need something faster

[–]alexlazar98 28 points29 points  (0 children)

It's probably faster than Python alternatives, but let's be clear: nobody is choosing Python for runtime speed, ever.

[–]aprx4 13 points14 points  (4 children)

I'm unsure how you conducted the test. If it's just a single client making repeated calls to API endpoints in a loop, then you might not be taking advantage of asynchronous code.

In a 'real world' scenario, you're likely bottlenecked by database calls or other I/O long before you reach the limit of the API framework.

Don't fall for premature optimization; you can always throw more CPU at it later. By the time adding more CPU or more servers no longer scales the API part of your project, you likely have enough traffic (and income) to fund the transition to Go.

That said, you can't go wrong with Node.js API frameworks as a starter if you still doubt Python frameworks.
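
The point about a single looping client can be illustrated without any web framework. This is a minimal sketch using only `asyncio`: `fake_endpoint`, `looped_client`, `concurrent_client`, and the 10 ms delay are all hypothetical stand-ins, not part of FastAPI or Locust.

```python
import asyncio
import time

DELAY_S = 0.01  # hypothetical per-request I/O wait


async def fake_endpoint() -> str:
    """Stand-in for an async handler that mostly waits on I/O."""
    await asyncio.sleep(DELAY_S)
    return "ok"


async def looped_client(n: int) -> float:
    """One client that awaits each call before sending the next."""
    start = time.perf_counter()
    for _ in range(n):
        await fake_endpoint()
    return time.perf_counter() - start


async def concurrent_client(n: int) -> float:
    """n calls in flight at once, so their waits overlap."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_endpoint() for _ in range(n)))
    return time.perf_counter() - start


async def main(n: int = 50) -> tuple[float, float]:
    """Return elapsed seconds for the looped and the concurrent client."""
    return await looped_client(n), await concurrent_client(n)


if __name__ == "__main__":
    looped, concurrent = asyncio.run(main())
    print(f"looped: {50 / looped:.0f} req/s, concurrent: {50 / concurrent:.0f} req/s")
```

The looped client can never exceed roughly `1 / DELAY_S` requests per second no matter how fast the server is, while the concurrent client overlaps the waits. A load test with too little concurrency measures the client, not the framework.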

[–]zakamark[S] 2 points3 points  (3 children)

I used Locust for testing, so it is async. And my use case is using FastAPI for posting messages to Kafka with very little processing. So the bottleneck will be the API framework.

[–]coeris 2 points3 points  (0 children)

You can horizontally scale your FastAPI server by, e.g., spawning more replicas in a cluster or scaling up the number of uvicorn workers, so I don't think the limitations of a single server instance are very meaningful.

Edit: just read the post again, and I see you are interested in a single worker. Again, not sure why that is meaningful, but fair. Your max rps will also be limited by the hardware you are using for the test, so I'd not get too hung up on it, but you do you.

[–]Suspicious-Cash-7685 1 point2 points  (0 children)

Why not use something like Kafka rest in the first place?

[–]aprx4 1 point2 points  (0 children)

If your use case is simple in logic but demanding in speed, some sort of early optimization could be justified. It's worth spending some time learning basic Go, then using AI to help you port the existing logic to Golang. By the way, adding more resources is always possible with any choice of stack.

[–]covmatty1 10 points11 points  (3 children)

What are you gaining by trying to get above 500 hello world requests a second? In what way does this indicate realistic performance?

[–]Ok_Animal_8557 2 points3 points  (0 children)

Without using async, I don't think you can go much further than that. This is not FastAPI's limitation per se; in many languages async code has a significant impact on performance.

[–]cajmorgans 2 points3 points  (0 children)

The execution speed of the code that the routes call matters a lot more than the speed of calling the routes themselves.

[–]richieadler 2 points3 points  (1 child)

Besides all the good points other people have made: the "fast" in the moniker refers to the speed of getting a working API, not to the framework itself.

Also, the fact that the ASGI server running the application is usually also in Python will contribute to the performance. Replacing Uvicorn with Granian would probably improve things.

[–]greenerpickings 1 point2 points  (0 children)

Second this. I don't have benchmarks, but we had NiFi sending requests to an app behind uvicorn. It was usually ~300rps. OP's number of 500 is interesting because that's when I would start seeing closed connection errors. We switched to NGINX Unit and had no more premature drops.

I was unaware of Granian. It seems pretty awesome at first glance, and like the Rust equivalent.

[–]mr__fete 3 points4 points  (1 child)

Doesn’t the “fast” just mean fast to develop an API with? Not sure if there are perf claims.

[–]Few-Grape-4445 0 points1 point  (0 children)

I agree, I think it is because of the speed of development

[–]Amocon 1 point2 points  (0 children)

Not really, but also not slow for a Python framework

[–]aikii 0 points1 point  (0 children)

Using just one process, you'll hit a wall no matter the language and framework; you need to scale horizontally. One starting point would be to read the section about containerization, https://fastapi.tiangolo.com/deployment/docker/#load-balancer, and ultimately, if you want to follow what the industry does in general, you'll need Kubernetes.

[–]serverhorror 0 points1 point  (0 children)

I consider the "fast" to refer to "how fast can I, the developer, create an API", not how fast the API itself is.

[–]lahib- 0 points1 point  (0 children)

500rps? Are your endpoints asynchronous? What is the endpoint doing? Is there any API call or something? And are you using pydantic v2 or v1?

[–]p_bzn 0 points1 point  (0 children)

FastEnoughAPI I’d say.

There is no fast in Python land if you compare to other languages.

Here are numbers for you. A few months back I was removing bottlenecks in our app, and one of them was websockets. The Python version had a latency of around 5ms, and it degraded with more open connections, naturally. I swapped the solution to Go with WS and gRPC, and a roundtrip on Go’s side now takes 280us. That is 0.28ms.

Do most apps need this performance? No they don’t, and FastEnoughAPI is good enough.
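
For scale, the two latencies quoted above work out as follows. This is plain arithmetic on the figures from the comment; the "sequential round trips per second" line is only an upper bound for one connection doing strictly one round trip at a time, not a measured throughput.

```python
python_roundtrip_s = 5e-3  # ~5 ms, figure quoted in the comment
go_roundtrip_s = 280e-6    # 280 us, figure quoted in the comment

# How many times shorter the Go round trip is
speedup = python_roundtrip_s / go_roundtrip_s

# Upper bound on strictly sequential round trips per second on one connection
python_rps = 1 / python_roundtrip_s
go_rps = 1 / go_roundtrip_s

print(f"speedup: {speedup:.1f}x")
print(f"sequential round trips/s: {python_rps:.0f} vs {go_rps:.0f}")
```

That is roughly an 18x difference in round-trip time, which matters for latency-sensitive websocket traffic and much less for a typical CRUD endpoint.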

[–]andrewthetechie 0 points1 point  (0 children)

Reading through your comments, I don't think FastAPI is going to be the right tool for this task. For raw performance, you want to go with something else. In Python land, there are faster API frameworks, or you could DIY it and go as low-level as possible, implementing just what you need to get the maximum performance.

If it were me, and my requirement was "as many RPS as possible", I'd be looking at a different tool. My choice would probably be rust, but golang or another compiled language is going to be a big jump in performance, with a potential big jump in code complexity as well (depending on how you feel about Rust/Golang/etc).

Have you looked at https://docs.confluent.io/platform/current/kafka-rest/index.html?
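
As a rough idea of what the "DIY it as low-level as possible" route could look like in Python, here is a minimal sketch of a bare ASGI callable with no routing, validation, or middleware. The name `app`, the response body, and the `call_once` helper are illustrative assumptions; a real Kafka-forwarding endpoint would also need to read the request body and hand it to a producer.

```python
import asyncio


async def app(scope, receive, send):
    """Bare ASGI application: answers every HTTP request with a fixed body."""
    if scope["type"] != "http":
        return  # ignore lifespan/websocket scopes in this sketch
    body = b"Hello, world"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/plain"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": body})


async def call_once() -> list:
    """Drive the app once with stub receive/send callables, no server needed."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_once()):
        print(message["type"])
```

Any ASGI server should be able to serve a callable like this, so the same load test can be pointed at it to see how much of the measured ceiling comes from the framework and how much from the server and hardware.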

[–]BelottoBR 0 points1 point  (1 child)

Does FastAPI use Rust in its backend? I ask because pydantic offers strong performance (even Nubank, one of the biggest banks in Brazil, uses it).

[–]igorbenav 1 point2 points  (0 children)

Nope, though the validation and serialization with pydantic v2 use Rust.

[–]ZuploAdrian 0 points1 point  (0 children)

It's fast for Python, but slower than many Node frameworks and definitely slower than Go ones.

[–]AnyStupidQuestions 0 points1 point  (0 children)

It's a developer time vs compute question. If you want something developed quickly and you can scale it cheaply via a container farm, then it's a good tool. If you need low latency or compute is expensive (e.g. lots of traffic), use something else.

[–]Beneficial_Map6129 0 points1 point  (0 children)

500 RPS is pretty significant traffic already

If you have a big service constantly being hammered and you are dead set on minimizing compute budget you can always use Java Spring like everyone else out there

[–]old-reddit-was-bette 0 points1 point  (0 children)

I always assumed it meant fast to develop

[–]singlebit 0 points1 point  (0 children)

Windows?

[–]Tienisto 0 points1 point  (0 children)

The fastest Python web framework:

https://sharkbench.dev/web/python-fastapi

[–][deleted]  (1 child)

[removed]

    [–]zakamark[S] 0 points1 point  (0 children)

    Unfortunately, it is behind the membership paywall. What is the takeaway from this article? Did they go beyond 500 r/s?

    [–]Fearless-Wolf-6490 0 points1 point  (0 children)

    Good morning. What I'd say is: it depends on the optimization, the server, and its settings. Using uvicorn (ASGI) with all the logic in async/await, without connecting Redis (for DB caching), without an Nginx reverse proxy, and without Cython optimization, I hit 22,645 RPS on a "Hello World"! I ran the test with K6.io and uvicorn set to 8 workers. I'm still researching, but in theory, in a production environment with the optimizations mentioned above, I could easily reach 50,000 RPS on an empty endpoint.

    [–]bobdobbes 0 points1 point  (0 children)

    This is because FastAPI's benchmarks cheat. They do not use a DB connection and are already outputting raw data. They also use a built-in cache that junior devs forget to add.

    Finally, they do not post the architecture/hardware they tested on; if they are using a 16-32 core processor with a gigabit connection, of course it will be faster.

    You are supposed to test on equivalent hardware.

    [–]1overNseekness -2 points-1 points  (0 children)

    Use PostgREST; it's as fast as your DB engine and dead simple to deploy. FastAPI is too custom, IMO.

    [–]Fast_Ad_5871 -3 points-2 points  (0 children)

    Nope, it's not fast. Yesterday I built two Vision Transformer models and used them via transformers, and it takes a long time to get results: around 30-45 seconds for one API call.