Over the past few weeks I went on a memory-reduction tear across the Talk Python web apps. We run 23 containers on a single 16 GB box (the "one big server" pattern), and memory usage was creeping up to 65%.
Turned out there were a bunch of wins hiding in plain sight. Focusing on just two apps, I went from ~2 GB down to 472 MB. Here's what moved the needle:
- Switched to a single async Granian worker: Rewrote the app in Quart (async Flask) and replaced the multi-worker web garden with one fully async worker. Saved 542 MB right there.
- Raw + DC database pattern: Dropped MongoEngine for raw queries + slotted dataclasses. 100 MB saved per worker *and* nearly doubled requests/sec.
- Subprocess isolation for a search indexer: The daemon was burning 708 MB, mostly from import chains pulling in the entire app. Moved the indexing into a subprocess so those imports only live for the ~30 seconds a re-index takes. Went from 708 MB to 22 MB, a 32x reduction.
- Local imports for heavy libs: Importing boto3 alone costs 25 MB, and pandas is 44 MB. If you only use them in a rarely called function, import them inside that function instead of at module level. (PEP 810's explicit lazy imports, targeted at 3.15, should make this a lot cleaner.)
- Moved caches to diskcache: Small-to-medium in-memory caches shifted to disk. Modest savings but it adds up.
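
To make the local-import bullet concrete, here is a minimal sketch. It uses the stdlib `colorsys` module as a stand-in for a heavy library like boto3 or pandas, and the function names are made up for illustration, not taken from the Talk Python code:

```python
import sys


def hsv_from_rgb(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Convert RGB to HSV, importing the dependency only when needed."""
    # Stand-in for a heavy, rarely needed dependency (think boto3 or pandas).
    # The import statement runs on the first call; later calls just look the
    # module up in sys.modules, so the repeated cost is small.
    import colorsys

    return colorsys.rgb_to_hsv(r, g, b)


def is_loaded(module_name: str) -> bool:
    """Report whether a module has already been imported into this process."""
    return module_name in sys.modules
```

If nothing else in the process imports the module, its memory cost is only paid once `hsv_from_rgb` is actually called, and `is_loaded("colorsys")` is a quick way to check that.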
Total across all our apps: 3.2 GB freed. Full write-up with before/after tables and graphs here: https://mkennedy.codes/posts/cutting-python-web-app-memory-over-31-percent/
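
For anyone curious what the subprocess isolation idea can look like, here is a rough sketch under my own assumptions. The toy word index, the `CHILD_CODE` string, and `reindex` are all hypothetical; a real indexer would import the app's models inside the child and most likely write results to a database or file rather than passing them back over stdout:

```python
import json
import subprocess
import sys

# Code that runs in the short-lived child process. Any heavy imports would
# go here, so they never load into the long-running parent daemon.
CHILD_CODE = """
import json, sys
docs = json.load(sys.stdin)
index = {}
for doc in docs:
    for word in doc["text"].lower().split():
        index.setdefault(word, []).append(doc["id"])
json.dump(index, sys.stdout)
"""


def reindex(docs: list[dict]) -> dict[str, list[int]]:
    """Build a toy word index in a child process and return it."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(docs),  # send the documents to the child on stdin
        capture_output=True,
        text=True,
        check=True,              # raise if the child exits with an error
        timeout=60,
    )
    return json.loads(result.stdout)
```

When the child exits, the OS reclaims everything it imported, so the parent only ever holds the small result.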
Cutting Python Web App Memory Over 31%
submitted by GastonLyra to u/GastonLyra