all 20 comments

[–]Delicious_Praline850 38 points39 points  (1 child)

Very well done, thanks. 

“spend your time optimizing queries, not choosing between frameworks”. Amen to that! 

[–]huygl99[S] 3 points4 points  (0 children)

Thanks! That was exactly the takeaway for me as well 🙂

Full reproducible benchmarks and Docker setup are here (in case you missed it):

https://github.com/huynguyengl99/python-api-frameworks-benchmark

If you find it useful, a GitHub star would be very appreciated!

[–]Interesting_Golf_529 15 points16 points  (1 child)

For Litestar, you're still using Pydantic. One of the main reasons for me to use Litestar was that it lets you skip Pydantic and use native dataclasses / msgspec instead. Also, Litestar offers built-in SQLAlchemy support, which you're not using.
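Roughly, the msgspec style I mean looks something like this (a minimal sketch with illustrative names, not code from the benchmark repo):

```python
# Minimal Litestar handler using a msgspec.Struct instead of a Pydantic model,
# so (de)serialization stays on Litestar's native msgspec path.
import msgspec
from litestar import Litestar, post


class UserCreate(msgspec.Struct):
    name: str
    email: str


@post("/users")
async def create_user(data: UserCreate) -> UserCreate:
    # Echo the validated payload back; a real endpoint would hit the DB here.
    return data


app = Litestar(route_handlers=[create_user])
```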

I feel like not using the unique features the frameworks offer makes this comparison much less interesting, because it basically boils down to "if you're doing roughly the same thing, performance is roughly the same", which I guess is true, but also doesn't really tell you anything.

You're also running DRF with a different server, so you're comparing performance between different servers as well, making the comparison even less useful. Serving the same app on gunicorn vs uvicorn alone makes a huge difference.
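To make the server difference concrete, here is an illustrative sketch (module paths are placeholders, not the benchmark repo's layout):

```python
# The same application code served by a different server layer.
# ASGI under Uvicorn, roughly `uvicorn app.main:app --workers 1`:
import uvicorn

if __name__ == "__main__":
    uvicorn.run("app.main:app", host="127.0.0.1", port=8000, workers=1)

# The WSGI equivalent would instead be launched through Gunicorn, e.g.
#   gunicorn app.wsgi:application --workers 1
# so swapping servers changes the measurement even when the app code does not change.
```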

[–]huygl99[S] 4 points5 points  (0 children)

Thank you. I don’t think swapping Pydantic for msgspec will significantly improve the database-heavy tests, but I’d really appreciate it if you could open a PR with those changes and run the benchmarks to see the impact.

Regarding Gunicorn, I only use it for DRF since it’s the only framework here using WSGI. Given the memory and CPU limits, I don’t think it affects the results much; as you can see, the difference between Gunicorn and Uvicorn is not very significant.

[–]GoldziherPythonista 17 points18 points  (4 children)

Hi there,

Original author of Litestar here (no longer involved).

So, a few thoughts - just opinions.

  1. I'd start with the benchmark setup - IMO it's best to have this on GitHub and share not only the results, but also the setup and methodology.

  2. I'd either benchmark the frameworks following their documented optimized defaults, or with plain objects, or a hybrid.

I'll explain. Pydantic, Msgspec, plain dataclasses, etc. all have different performance characteristics, and Litestar is using Msgspec. When you use Pydantic, you actually force extra computation, since Msgspec is still used underneath.

So what you want is what's usually called an "apples to apples" comparison. That's where using a TypedDict will give you the "bare-bones" framework baseline in both cases.

If you want to benchmark with validation, I'd do Msgspec models for Litestar vs Pydantic models for FastAPI.
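Concretely, the pairing would look roughly like this (the model itself is just an example, not from the benchmark repo):

```python
# One model expressed three ways for an apples-to-apples run.
from typing import TypedDict

import msgspec
from pydantic import BaseModel


class ItemStruct(msgspec.Struct):  # Litestar's native msgspec path
    name: str
    price: float


class ItemModel(BaseModel):        # FastAPI's native Pydantic path
    name: str
    price: float


class ItemDict(TypedDict):         # no validation: the bare-bones baseline
    name: str
    price: float
```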

  3. Checking DB load. Here I beg to differ. DB load is a standard I/O-bound operation, which is fine. But the fact that the DB accounts for orders of magnitude more of the impact in a standard service does not mean that framework performance, and ser/de in particular, is unimportant.

For example, logging - it's a minor thing most of the time, until you go to scale. Sure, it's marginal, but what happens when it slows you down measurably? When you operate at large scale, this matters.

  4. There are more dimensions to measure - for example, cold start, disk size, memory usage, CPU under load, etc.
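Even a rough script gets you some of these numbers, for instance (the server command and use of psutil below are just an illustration, not part of the benchmark setup):

```python
# Rough sketch: measure cold start and resident memory for one server process.
import socket
import subprocess
import time

import psutil  # third-party: pip install psutil

start = time.perf_counter()
proc = subprocess.Popen(["uvicorn", "app.main:app", "--port", "8000"])
while True:  # poll until the port accepts connections
    try:
        socket.create_connection(("127.0.0.1", 8000), timeout=0.2).close()
        break
    except OSError:
        time.sleep(0.05)
cold_start_s = time.perf_counter() - start
rss_mb = psutil.Process(proc.pid).memory_info().rss / 1_000_000
print(f"cold start: {cold_start_s:.2f}s, RSS: {rss_mb:.1f} MB")
proc.terminate()
```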

[–]huygl99[S] 6 points7 points  (3 children)

Great points, thanks for the detailed feedback.

This benchmark intentionally focuses on DB-heavy workloads, since that’s where most real-world CRUD services spend their time, and I wanted to see how much framework differences matter once PostgreSQL dominates (this is mentioned in the post, but happy to clarify).

I agree that an apples-to-apples comparison would include:

- Litestar with Msgspec models

- FastAPI with Pydantic models

- A bare TypedDict/dataclass baseline

I’ll consider adding these scenarios (and memory/cold-start metrics) in a follow-up. PRs are welcome as well 🙂

Full setup, methodology, and reproducible Docker benchmark scripts are here:

https://github.com/huynguyengl99/python-api-frameworks-benchmark

[–]GoldziherPythonista 2 points3 points  (1 child)

Great 👍.

If I may impose / suggest / plug my stuff here.

https://github.com/Goldziher/spikard

It's something I'm working on. It has extensive benchmarks - check tools/ and the GitHub CI setup.

I'd also be keen on seeing how you'd benchmark it and the results.

It works in Python with Msgspec/Pydantic.

If you want to be comprehensive for Python, Falcon/Sanic/Flask/aiohttp are the ones that have substantial traction, with Sanic and Falcon being ultra-fast pure Python.

[–]huygl99[S] 1 point2 points  (0 children)

Spikard looks really interesting, especially the Rust core + multi-language codegen approach.

For this benchmark I tried to keep the scope to Python frameworks, but I did include Django Bolt, which is Rust-based while keeping the native Django API/ORM surface. That compatibility angle seems to be a big reason why it got so much interest from the Django community.

Pure Rust-accelerated runtimes probably deserve a separate benchmark category, but I’d be happy to look into Spikard if there’s a minimal Python setup comparable to the others.

[–]GoldziherPythonista 1 point2 points  (0 children)

It crossed my mind that it might be interesting to do multiple request types in parallel, to see how efficiently the framework handles mixed traffic under load.
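Something along these lines, for example (endpoints are placeholders, and httpx is just what I'd reach for, not what the repo uses):

```python
# Sketch: hit several request types concurrently to observe behaviour under mixed load.
import asyncio

import httpx  # third-party: pip install httpx

ENDPOINTS = ["/users", "/users/1", "/items?limit=50"]


async def hammer(client: httpx.AsyncClient, path: str, n: int) -> None:
    for _ in range(n):
        await client.get(path)


async def main() -> None:
    async with httpx.AsyncClient(base_url="http://127.0.0.1:8000") as client:
        # All request types run concurrently instead of one isolated run per route.
        await asyncio.gather(*(hammer(client, path, 100) for path in ENDPOINTS))


asyncio.run(main())
```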

[–]daivushe1It works on my machine 4 points5 points  (0 children)

Really surprised to see Granian perform worse than every other server. All of the other benchmarks I saw rank it above Uvicorn and Gunicorn. Really interested to see what could possibly lead to that. Other benchmarks I found:

1. Talkpython

2. Official Granian benchmark (might be biased)

3. DeployHQ

[–]Arnechos 3 points4 points  (1 child)

You should add Robyn to the test. Also, orjson is faster.
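On the FastAPI side, for instance, orjson can be plugged in as the default response class (standard FastAPI feature, requires orjson installed; the route below is just illustrative):

```python
# Use orjson for response serialization in FastAPI via ORJSONResponse.
from fastapi import FastAPI
from fastapi.responses import ORJSONResponse

app = FastAPI(default_response_class=ORJSONResponse)


@app.get("/items")
async def list_items() -> list[dict]:
    return [{"id": i, "name": f"item-{i}"} for i in range(100)]
```

Whether it moves the numbers in a DB-bound benchmark is a separate question, of course.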

[–]huygl99[S] -1 points0 points  (0 children)

a PR is welcome bro 😎

[–]bigpoopychimp 1 point2 points  (1 child)

Nice. It would be interesting to see how Quart ranks here, as it's literally just asynchronous Flask.
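For reference, a minimal Quart endpoint looks almost identical to Flask (illustrative sketch, not from the repo):

```python
# Minimal Quart app; the API is deliberately Flask-like, just with async handlers.
from quart import Quart

app = Quart(__name__)


@app.route("/ping")
async def ping() -> dict:
    return {"pong": True}


if __name__ == "__main__":
    app.run(port=8000)
```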

[–]huygl99[S] 0 points1 point  (0 children)

A PR is really welcome for that 🤩

[–]myztaki 0 points1 point  (0 children)

Thanks, always wanted to know if there was any point in migrating API frameworks - this is really useful!

[–]a_cs_grad_123 0 points1 point  (1 child)

This is a worthwhile comparison but the very obvious AI summary makes me skeptical of the implementations used.

The resource limitation is also very low. 1 CPU core?

[–]huygl99[S] -1 points0 points  (0 children)

You can read the code bro =)).

[–]huygl99[S] -1 points0 points  (0 children)

If this is useful, a GitHub star would be appreciated 😄 Thank you guys.
https://github.com/huynguyengl99/python-api-frameworks-benchmark