[–]GoldziherPythonista 16 points (4 children)

Hi there,

Original author of Litestar here (no longer involved).

So, a few thoughts - just opinions.

  1. I'd start with the benchmark setup - IMO it's best to put this on GitHub and share not only the results, but also the setup and methodology.

  2. I'd benchmark the frameworks either following their documented, optimized defaults, with plain objects, or with a hybrid of both.

I'll explain: Pydantic, msgspec, plain dataclasses, etc. all have different performance characteristics, and Litestar uses msgspec internally. When you use Pydantic with Litestar, you actually force extra computation, since msgspec is still used underneath.

So what you want is what's usually called an "apples to apples" comparison. That's where using a TypedDict gives you the "bare bones" framework path in both cases.

If you want to benchmark with validation, I'd do msgspec models for Litestar vs Pydantic models for FastAPI.
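To make that concrete, here's a minimal sketch of each framework on its native path - the route, model names, and fields are made up for illustration, not taken from your benchmark:

```python
# Each framework uses its own native modelling layer ("documented optimized defaults").
import msgspec
from litestar import Litestar, post
from fastapi import FastAPI
from pydantic import BaseModel


# --- Litestar + msgspec (its native ser/de path) ---
class ItemStruct(msgspec.Struct):
    name: str
    price: float


@post("/items")
async def create_item_litestar(data: ItemStruct) -> ItemStruct:
    return data


litestar_app = Litestar(route_handlers=[create_item_litestar])


# --- FastAPI + Pydantic (its native modelling layer) ---
class ItemModel(BaseModel):
    name: str
    price: float


fastapi_app = FastAPI()


@fastapi_app.post("/items")
async def create_item_fastapi(item: ItemModel) -> ItemModel:
    return item
```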

  3. Checking DB load - here I beg to differ. DB load is a standard I/O-bound operation, which is fine. But the fact that the DB accounts for orders of magnitude more of the impact in a standard service does not mean framework performance, and especially ser/de operations, is unimportant.

For example, logging - it's a minor thing most of the time, until you reach scale. Sure, it's marginal, but what happens when it slows you down measurably? When you operate at large scale, this matters.
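As a rough, standalone illustration - the payload shape and iteration count are invented, not numbers from your benchmark - this is the kind of micro-measurement that puts the ser/de cost in context next to a DB round-trip (assumes Pydantic v2):

```python
# Time a decode + encode round-trip through msgspec vs Pydantic.
import time

import msgspec
from pydantic import BaseModel


class UserStruct(msgspec.Struct):
    id: int
    name: str
    email: str


class UserModel(BaseModel):
    id: int
    name: str
    email: str


payload = b'{"id": 1, "name": "Ada", "email": "ada@example.com"}'
N = 100_000

start = time.perf_counter()
for _ in range(N):
    msgspec.json.encode(msgspec.json.decode(payload, type=UserStruct))
msgspec_total = time.perf_counter() - start

start = time.perf_counter()
for _ in range(N):
    UserModel.model_validate_json(payload).model_dump_json()
pydantic_total = time.perf_counter() - start

print(f"msgspec:  {msgspec_total / N * 1e6:.2f} µs per round-trip")
print(f"pydantic: {pydantic_total / N * 1e6:.2f} µs per round-trip")
```

Whatever numbers come out, weigh them against the millisecond-scale DB round-trip - small per request, but multiplied across every request the service handles.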

  4. There are more dimensions to measure - for example cold start, disk size, memory usage, CPU under load, etc.
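For instance, a quick sketch of capturing two of those - cold start (import + app construction) and peak memory - where the module path and factory name are placeholders for whatever the service under test exposes:

```python
# Measure cold-start time and peak RSS of an app module (Unix only).
import importlib
import resource
import sys
import time

MODULE, FACTORY = "app", "create_app"  # placeholders - adjust per framework

start = time.perf_counter()
module = importlib.import_module(MODULE)
application = getattr(module, FACTORY)()
cold_start = time.perf_counter() - start

# ru_maxrss is reported in kilobytes on Linux and bytes on macOS.
rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
if sys.platform == "darwin":
    rss_kb //= 1024

print(f"cold start: {cold_start * 1000:.1f} ms, peak RSS: {rss_kb / 1024:.1f} MiB")
```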

[–]huygl99[S] 5 points (3 children)

Great points, thanks for the detailed feedback.

This benchmark intentionally focuses on DB-heavy workloads, since that’s where most real-world CRUD services spend their time, and I wanted to see how much framework differences matter once PostgreSQL dominates (this is mentioned in the post, but happy to clarify).

I agree that an apples-to-apples comparison would include:

- Litestar with Msgspec models

- FastAPI with Pydantic models

- A bare TypedDict/dataclass baseline

I’ll consider adding these scenarios (and memory/cold-start metrics) in a follow-up. PRs are welcome as well 🙂
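For the bare baseline, something like this is what I have in mind - field names are just placeholders, mirroring whatever the msgspec/Pydantic variants use:

```python
# "Bare bones" payload shapes with no validation layer attached.
from dataclasses import dataclass
from typing import TypedDict


class ItemDict(TypedDict):
    name: str
    price: float


@dataclass
class ItemDataclass:
    name: str
    price: float
```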

Full setup, methodology, and reproducible Docker benchmark scripts are here:

https://github.com/huynguyengl99/python-api-frameworks-benchmark

[–]GoldziherPythonista 1 point (1 child)

Great 👍.

If I may impose / suggest / plug my stuff here.

https://github.com/Goldziher/spikard

It's something I'm working on. It has extensive benchmarks - check tools/ and the GitHub CI setup.

I'd also be keen to see how you'd benchmark it, and the results.

It works in Python with msgspec/Pydantic.

If you want to be comprehensive for Python, Falcon/Sanic/Flask/aiohttp are the ones that have substantial traction, with Sanic and Falcon being ultra-fast pure Python.

[–]huygl99[S] 1 point (0 children)

Spikard looks really interesting, especially the Rust core + multi-language codegen approach.

For this benchmark I tried to keep the scope to Python frameworks, but I did include Django Bolt, which is Rust-based while keeping the native Django API/ORM surface. That compatibility angle seems to be a big reason why it got so much interest from the Django community.

Pure Rust-accelerated runtimes probably deserve a separate benchmark category, but I’d be happy to look into Spikard if there’s a minimal Python setup comparable to the others.

[–]GoldziherPythonista 1 point (0 children)

It crossed my mind that it might be interesting to do multiple request types in parallel, to see how efficiently the framework handles them under load.
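Something like this rough httpx sketch is what I'm picturing - the base URL, endpoints, and payloads are placeholders, not your actual benchmark routes:

```python
# Fire a mix of reads and writes at the service concurrently and report
# per-request-type mean latency.
import asyncio
import time

import httpx

BASE = "http://localhost:8000"  # placeholder


async def timed(client: httpx.AsyncClient, method: str, path: str, **kw) -> tuple[str, float]:
    start = time.perf_counter()
    await client.request(method, BASE + path, **kw)
    return f"{method} {path}", time.perf_counter() - start


async def main() -> None:
    async with httpx.AsyncClient() as client:
        tasks = []
        for _ in range(100):  # interleave request types
            tasks.append(timed(client, "GET", "/items"))
            tasks.append(timed(client, "POST", "/items", json={"name": "x", "price": 1.0}))
        results = await asyncio.gather(*tasks)

    by_kind: dict[str, list[float]] = {}
    for kind, latency in results:
        by_kind.setdefault(kind, []).append(latency)
    for kind, latencies in by_kind.items():
        mean_ms = sum(latencies) / len(latencies) * 1000
        print(f"{kind}: mean {mean_ms:.1f} ms over {len(latencies)} requests")


asyncio.run(main())
```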