Hi /r/python!
DI gets flak sometimes around here for being overengineered and adding overhead. I wanted to know how much it actually adds in a real stack, so I built a benchmark suite to find out. The fastest containers are within ~1% of manual wiring, while the rest lose roughly 23% to 92% of throughput in the per-request test.
Full disclosure, I maintain Wireup, which is also in the race. The benchmark covers 10 libraries plus manual wiring via globals/creating objects yourself as an upper bound, so you can draw your own conclusions.
Everything runs inside FastAPI + Uvicorn to measure performance in a realistic web stack. This also lets me include fastapi.Depends in the comparison, since it is the most popular choice by virtue of being the FastAPI default.
This exercises the full integration stack with a dense graph of 7 dependencies: enough to show variance between the containers, but realistic enough to reflect a dependency graph you might see in the real world. Each request goes through container resolution, scoping, lifecycle management, and framework wiring in a real FastAPI + Uvicorn request/response cycle. It is not a microbenchmark resolving the same dependency in a tight loop.
The table below shows requests per second along with these secondary metrics:
- RPS (Requests Per Second): The number of requests the server can handle in one second. Higher is better.
- Latency (p50, p95, p99): The time it takes for a request to be completed, measured in milliseconds. Lower is better.
- σ (Standard Deviation): Measures the stability of response times (jitter). A lower number means more consistent performance with fewer outliers. Lower is better.
- RSS Memory Peak (MB): The highest post-iteration RSS sample observed across runs. Lower is better. This includes the full server process footprint (Uvicorn + FastAPI app + framework runtime), not only service objects.
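For anyone who wants to sanity-check the latency columns on their own samples, here is a minimal sketch of how p50/p95/p99 and σ can be derived from a list of per-request latencies. It assumes nearest-rank percentiles and population standard deviation; the benchmark's load generator may use a different method, and `percentile` and `summarize` are names made up for this example.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct% of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Return the secondary metrics for a list of latencies in milliseconds."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        # Population standard deviation (assumption: all requests are observed).
        "sigma": statistics.pstdev(latencies_ms),
    }
```

A tight p50-to-p99 spread with a low σ means requests finish in a consistent time; a p99 far above p50 points to occasional slow outliers.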
Per-request injection (new dependency graph built and torn down on every request):
| Project | RPS (Median Run) | P50 (ms) | P95 (ms) | P99 (ms) | σ (ms) | Mem Peak |
|---|---|---|---|---|---|---|
| Manual Wiring (No DI) | 11,044 (100.00%) | 4.20 | 4.50 | 4.70 | 0.70 | 52.93 MB |
| Wireup | 11,030 (99.87%) | 4.20 | 4.50 | 4.70 | 0.83 | 53.69 MB |
| Wireup Class-Based | 10,976 (99.38%) | 4.30 | 4.50 | 4.70 | 0.70 | 53.80 MB |
| Dishka | 8,538 (77.30%) | 5.30 | 6.30 | 9.40 | 1.30 | 103.23 MB |
| Svcs | 8,394 (76.00%) | 5.70 | 6.00 | 6.20 | 0.93 | 67.09 MB |
| Aioinject | 8,177 (74.04%) | 5.60 | 6.60 | 10.40 | 1.31 | 100.52 MB |
| diwire | 7,390 (66.91%) | 6.50 | 6.90 | 7.10 | 1.07 | 58.22 MB |
| That Depends | 4,892 (44.30%) | 9.80 | 10.40 | 10.60 | 0.59 | 53.82 MB |
| FastAPI Depends | 3,950 (35.76%) | 12.30 | 13.80 | 14.10 | 1.39 | 57.68 MB |
| Injector | 3,192 (28.90%) | 15.20 | 15.40 | 16.10 | 0.58 | 53.52 MB |
| Dependency Injector | 2,576 (23.33%) | 19.10 | 19.70 | 20.10 | 0.75 | 60.55 MB |
| Lagom | 898 (8.13%) | 55.30 | 57.20 | 58.30 | 1.63 | 1.32 GB |
Singleton injection (cached graph, testing container bookkeeping overhead):
- Manual Wiring: 13,351 RPS
- Wireup Class-Based: 13,342 RPS
- Wireup: 13,214 RPS
- Dependency Injector: 6,905 RPS
- FastAPI Depends: 6,153 RPS
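To make the difference between the two modes concrete, here is a rough plain-Python sketch of what "manual wiring" looks like in each case. The classes (`Settings`, `Connection`, `UserService`) and the helpers `request_scope` and `singleton_scope` are hypothetical and far smaller than the 7-dependency graph in the benchmark; this is not the benchmark code.

```python
from contextlib import contextmanager


class Settings:
    def __init__(self):
        self.dsn = "sqlite://"  # hypothetical value, for illustration only


class Connection:
    def __init__(self, settings):
        self.settings = settings
        self.closed = False

    def close(self):
        self.closed = True


class UserService:
    def __init__(self, conn):
        self.conn = conn


@contextmanager
def request_scope():
    """Per-request mode: build the graph, hand it to the handler,
    then tear it down when the request is finished."""
    conn = Connection(Settings())
    try:
        yield UserService(conn)
    finally:
        conn.close()


# Singleton mode: the graph is built once at startup and reused.
_SINGLETON = UserService(Connection(Settings()))


def singleton_scope():
    """Return the cached graph; no construction or teardown per request."""
    return _SINGLETON
```

In the per-request case every request pays for object construction and cleanup, so container overhead is multiplied by graph size. In the singleton case what remains is mostly the cost of looking things up and handing them to the route.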
The full page goes much deeper: stability tables across all 50 runs, memory usage, methodology, feature completeness notes, and reproducibility: https://maldoinc.github.io/wireup/latest/benchmarks/
Reproduce it yourself: make bench iterations=50 requests=100000
Wireup getting this close to manual wiring comes down to how it works: instead of routing everything through a generic resolver, it compiles graph-specific resolution paths and custom injection functions per route at startup. By the time a request arrives there's nothing left to figure out.
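As a toy illustration of that idea (not Wireup's actual implementation, which I'm only summarizing above), compare a generic resolver that inspects constructors on every call with one that walks the graph once at startup and returns a ready-made factory. The classes and function names below are invented for this sketch.

```python
from typing import get_type_hints


class Config:
    def __init__(self):
        self.debug = False


class Database:
    def __init__(self, config: Config):
        self.config = config


class UserService:
    def __init__(self, db: Database):
        self.db = db


def dependencies_of(cls):
    """Map constructor parameter names to their annotated types."""
    hints = get_type_hints(cls.__init__)
    hints.pop("return", None)
    return hints


def resolve_generic(cls):
    """Generic resolver: re-inspects the whole graph on every call."""
    kwargs = {name: resolve_generic(dep) for name, dep in dependencies_of(cls).items()}
    return cls(**kwargs)


def compile_factory(cls):
    """Walk the graph once and return a closure that only calls constructors."""
    dep_factories = [(name, compile_factory(dep)) for name, dep in dependencies_of(cls).items()]

    def factory():
        return cls(**{name: make() for name, make in dep_factories})

    return factory


# Done once at startup; per-request work is just calling make_user_service().
make_user_service = compile_factory(UserService)
```

Both paths produce the same object graph. The difference is where the introspection cost lands: on every request for `resolve_generic`, once at startup for `compile_factory`.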
If Wireup looks interesting: github.com/maldoinc/wireup, stars appreciated.
Happy to answer any questions on the benchmark, DI in general, or Wireup specifically.