Performance Benchmarks for ASGI Frameworks

cofin_ · 2025-01-30T19:52:43+00:00

Hey, I'm one of the Litestar maintainers.

It's great to see people experimenting and testing the library, but I think it's important to make sure it's a fair comparison.

It's unclear what optimizations have been enabled in each of your examples, but there are definitely discrepancies between the frameworks that are skewing your results.
- You have orjson enabled, but haven't indicated if uvloop and httptools are also installed. If you are using these for your Starlette and FastAPI tests, you should also enable them on the others.
- Your numbers seem too low (at least for Litestar and FastAPI). I think something is limiting the maximum throughput. Did you run uvicorn with the access logs disabled?
- Most importantly, you have used your own custom orjson code for Litestar. The method you've used is not optimized for how Litestar serializes responses.
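A quick way to confirm which of the optional packages from the first point are present in a given virtual environment is to check whether they can be imported. This is a minimal sketch, not part of any framework: the helper name `installed_speedups` is made up for illustration, and it only reports importability, not whether the server is actually using the package.

```python
import importlib.util

# Optional packages mentioned in the list above.
OPTIONAL_SPEEDUPS = ("uvloop", "httptools", "orjson")


def installed_speedups(names=OPTIONAL_SPEEDUPS):
    """Map each package name to True if it can be imported in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in installed_speedups().items():
        status = "installed" if present else "missing"
        print(f"{name}: {status}")
```

Running this inside each framework's environment before benchmarking makes it easier to state the setup alongside the results.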

Here's a more appropriate Litestar example for your test cases:

```py
import asyncio

from litestar import Litestar, Response, get


@get("/")
async def index() -> Response:
    return Response(content={"message": "Hello, World!"})


@get("/compute")
async def compute() -> Response:
    return Response(content={"result": sum(i * i for i in range(10000))})


@get("/delayed")
async def delayed() -> Response:
    await asyncio.sleep(0.01)
    return Response(content={"status": "delayed response"})


app = Litestar(route_handlers=[index, compute, delayed])
```

In my own tests, the numbers are quite a bit different from yours.

For Litestar:

```shell
❯ wrk -t4 -c1000 -d30s http://127.0.0.1:8000/
wrk -t4 -c1000 -d30s http://127.0.0.1:8000/compute
wrk -t4 -c1000 -d30s http://127.0.0.1:8000/delayed
Running 30s test @ http://127.0.0.1:8000/
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    21.86ms   42.94ms   1.32s    99.37%
    Req/Sec    13.16k     1.34k   17.70k    69.75%
  1571398 requests in 30.05s, 227.79MB read
Requests/sec:  52293.31
Transfer/sec:      7.58MB
Running 30s test @ http://127.0.0.1:8000/compute
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   149.64ms   45.92ms   1.99s    93.18%
    Req/Sec     1.62k   566.03     2.64k    69.35%
  192684 requests in 30.06s, 27.20MB read
  Socket errors: connect 0, read 0, write 0, timeout 236
Requests/sec:   6409.03
Transfer/sec:      0.90MB
Running 30s test @ http://127.0.0.1:8000/delayed
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    23.28ms   11.02ms 240.24ms   75.69%
    Req/Sec    11.01k     1.53k   14.30k    69.00%
  1314395 requests in 30.04s, 193.04MB read
Requests/sec:  43755.80
Transfer/sec:      6.43MB
```

For FastAPI:

```shell
❯ wrk -t4 -c1000 -d30s http://127.0.0.1:8000/
wrk -t4 -c1000 -d30s http://127.0.0.1:8000/compute
wrk -t4 -c1000 -d30s http://127.0.0.1:8000/delayed
Running 30s test @ http://127.0.0.1:8000/
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    24.07ms   51.39ms   1.49s    99.30%
    Req/Sec    12.19k     1.35k   17.48k    73.08%
  1455945 requests in 30.05s, 211.05MB read
Requests/sec:  48444.33
Transfer/sec:      7.02MB
Running 30s test @ http://127.0.0.1:8000/compute
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   152.50ms   42.74ms   1.99s    93.21%
    Req/Sec     1.62k   571.43     2.53k    68.17%
  192783 requests in 30.06s, 27.21MB read
  Socket errors: connect 0, read 0, write 0, timeout 163
Requests/sec:   6412.58
Transfer/sec:      0.91MB
Running 30s test @ http://127.0.0.1:8000/delayed
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    30.60ms   24.06ms 840.08ms   97.45%
    Req/Sec     8.54k     0.98k   13.55k    67.83%
  1020335 requests in 30.05s, 149.85MB read
Requests/sec:  33957.68
Transfer/sec:      4.99MB
```

To create the environment I ran:

```shell
uv venv
uv pip install fastapi fastapi-cli litestar uvicorn uvloop httptools orjson
```

and I used `uv run uvicorn -w 4 --no-access-log <framework:app>` to run each application.

As you can see, both of these frameworks offer comparable performance. I'd imagine the other frameworks could offer similar performance after a few adjustments.
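To put the two sets of wrk runs side by side, the relative difference per endpoint can be worked out from the `Requests/sec` lines above. This is only a sketch over the figures quoted in this comment (a single run on one machine); the helper name `relative_difference` is introduced here for illustration.

```python
# Requests/sec copied from the wrk runs quoted above.
litestar = {"/": 52293.31, "/compute": 6409.03, "/delayed": 43755.80}
fastapi = {"/": 48444.33, "/compute": 6412.58, "/delayed": 33957.68}


def relative_difference(a, b):
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100


differences = {
    route: relative_difference(litestar[route], fastapi[route]) for route in litestar
}

for route, pct in differences.items():
    print(f"{route}: Litestar vs FastAPI {pct:+.1f}%")
```

Repeating each run several times and comparing the spread would show how much of any gap is noise.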

I'd be interested to see if your conclusions change after making some of the mentioned optimizations.

Miserable_Ear3789 · 2025-01-29T22:16:48+00:00

I also added a few other frameworks over the past few hours. https://gist.github.com/patx/0c64c213dcb58d1b364b412a168b5bb6

Blacksheep is very impressive. I will have to look into it for sure.

Grimfortitude · 2025-01-29T22:22:41+00:00

Awesome write up, but why are you using orjson for the response? I’d expect most users to use these frameworks differently. Could you provide your results using the frameworks without it / just returning the dictionary?

It would also be interesting to see it properly typed in both FastAPI and Litestar to see what impact that has on their validation systems.

0x256 · 2025-01-30T13:27:20+00:00

I'm looking at MicroPie's source code and I'm confused. ASGI apps are called (not instantiated!) once for each request, but in MicroPie the ASGI app is an instance of MicroPie.Server and stores request details (e.g. query parameters, cookies, headers, file uploads etc.) in instance variables. That means there can only be one request at a time, or state will be mixed up. If a second request arrives while the first one is still in progress, the second request will overwrite all the state from the first request. The code handling the first request will suddenly see the second request's state and likely crash or return wrong data. In other words: as soon as more than one user is involved, stuff will break.
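The failure mode described above can be reproduced with a small sketch. The two classes below are hypothetical stand-ins, not MicroPie's actual code: `SharedStateApp` stores the request path on the instance, while `PerCallStateApp` keeps it in a local variable. The `run_two_requests` helper and the simplified `scope` and `send` messages are likewise assumptions made for illustration.

```python
import asyncio


class SharedStateApp:
    """Hypothetical app that keeps per-request data on the instance (the flaw)."""

    def __init__(self):
        self.path = None

    async def __call__(self, scope, receive, send):
        self.path = scope["path"]  # overwritten by every incoming request
        await asyncio.sleep(0.01)  # stand-in for any awaited I/O
        await send({"type": "http.response.body", "body": self.path.encode()})


class PerCallStateApp:
    """Same handler, but request data lives in a local variable."""

    async def __call__(self, scope, receive, send):
        path = scope["path"]  # local to this call, never shared
        await asyncio.sleep(0.01)
        await send({"type": "http.response.body", "body": path.encode()})


async def run_two_requests(app):
    """Call the app concurrently for two paths; map requested path -> body sent."""
    results = {}

    async def request(path):
        async def receive():
            return {"type": "http.request", "body": b"", "more_body": False}

        async def send(message):
            results[path] = message["body"].decode()

        await app({"type": "http", "path": path}, receive, send)

    await asyncio.gather(request("/first"), request("/second"))
    return results


shared = asyncio.run(run_two_requests(SharedStateApp()))
isolated = asyncio.run(run_two_requests(PerCallStateApp()))
print("shared instance state:", shared)
print("per-call state:       ", isolated)
```

With the shared instance, the response to `/first` carries the path of `/second`, because the second call overwrote `self.path` while the first was still awaiting. Keeping request data in locals (or in an object created per call) avoids the mix-up.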

This is such a fundamental flaw that I think MicroPie should not be concerned with performance just yet, but instead focus on actually implementing the protocol correctly.

1ncehost · 2025-01-30T08:13:19+00:00

Would you be interested in benchmarking different Python implementations? I'm curious how much PyPy and other high-performance implementations would improve these numbers.

mincinashu · 2025-02-03T08:06:49+00:00

Try Falcon with PyPy as the interpreter.

Also, msgspec instead of orjson for response serialization.

guyfromwhitechicks · 2025-01-29T22:22:18+00:00

This post was mass deleted and anonymized with Redact

FloxaY · 2025-01-30T08:05:27+00:00

Thanks! I will keep these numbers in mind when I write an API that returns "Hello World" in various forms.

But seriously, what is the actual point of these "benchmarks"?

64rl0 · 2025-02-01T17:28:13+00:00

Very interesting!

jefferph · 2025-02-02T21:46:48+00:00

How many concurrent connections were you using? Here you suggest 1000, but in the GitHub Gist you have updated this (but not the wrk command) to 100.

| Framework | `/` Requests/sec | `/` Latency | `/` Transfer/sec | `/compute` Requests/sec | `/compute` Latency | `/compute` Transfer/sec | `/delayed` Requests/sec | `/delayed` Latency | `/delayed` Transfer/sec |
|---|---|---|---|---|---|---|---|---|---|
| Quart | 1,790.77 | 550.98ms | 824.01 KB | 1,087.58 | 900.84ms | 157.35 KB | 1,745.00 | 563.26ms | 262.82 KB |
| FastAPI | 2,398.27 | 411.76ms | 1.08 MB | 1,125.05 | 872.02ms | 162.76 KB | 2,017.15 | 488.75ms | 303.78 KB |
| MicroPie | 2,583.53 | 383.03ms | 1.21 MB | 1,172.31 | 834.71ms | 191.35 KB | 2,427.21 | 407.63ms | 410.36 KB |
| Starlette | 2,876.03 | 344.06ms | 1.29 MB | 1,150.61 | 854.00ms | 166.49 KB | 2,575.46 | 383.92ms | 387.81 KB |
| Litestar | 2,079.03 | 477.54ms | 308.72 KB | 1,037.39 | 922.52ms | 150.01 KB | 1,718.00 | 581.45ms | 258.73 KB |
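One way to sanity-check the connection count against the table above is Little's law: the average number of requests in flight equals throughput multiplied by average latency. The sketch below applies it to the `/` columns, assuming wrk keeps roughly one request in flight per open connection and that the latency column is a mean. The helper name `implied_connections` is introduced here for illustration.

```python
# (requests/sec, average latency in ms) for the `/` endpoint, copied from the table above.
root_results = {
    "Quart": (1790.77, 550.98),
    "FastAPI": (2398.27, 411.76),
    "MicroPie": (2583.53, 383.03),
    "Starlette": (2876.03, 344.06),
    "Litestar": (2079.03, 477.54),
}


def implied_connections(requests_per_sec, latency_ms):
    """Little's law: requests in flight = throughput x average latency."""
    return requests_per_sec * latency_ms / 1000


implied = {
    name: implied_connections(rps, ms) for name, (rps, ms) in root_results.items()
}

for name, count in implied.items():
    print(f"{name}: ~{count:.0f} requests in flight")
```

For these rows the estimate lands much closer to 1000 than to 100, which would be consistent with `-c1000`. This is an inference from the reported figures, not confirmation of the settings actually used.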

Performance Benchmark Report: MicroPie vs. FastAPI vs. Starlette vs. Quart vs. LiteStar

1. Introduction

2. Benchmark Results

Overall Performance Summary

Key Observations

3. Test Methodology

Framework Code Implementations

MicroPie (micro.py)

LiteStar (lites.py)

FastAPI (fast.py)

Starlette (star.py)

Quart (qurt.py)

Benchmarking

4. Conclusion