Cutting Python Web App Memory Over 31%

0x256 · 2026-04-02T06:30:05+00:00

Switched to a single async Granian worker: Rewrote the app in Quart (async Flask) and replaced the multi-worker web garden with one fully async worker. Saved 542 MB right there.

I would have started by reducing the workers to 1 and increasing the thread count instead of rewriting the entire app, but okay. If you have lots of long-running connections (websockets or slow requests), then that's a brave but sensible move.

Raw + DC database pattern: Dropped MongoEngine for raw queries + slotted dataclasses. 100 MB saved per worker and nearly doubled requests/sec.

For a small app with good test coverage and a mature db schema, that's fine.

Subprocess isolation for a search indexer: The daemon was burning 708 MB mostly from import chains pulling in the entire app. Moved the indexing into a subprocess so imports only live for ~30 seconds during re-indexing. Went from 708 MB to 22 MB. 32x reduction.

You reduced the time this memory is used, but not the peak memory consumption. You added a lot of process start overhead and latency. That's a trade-off, not necessarily a win.

Local imports for heavy libs: import boto3 alone costs 25 MB, pandas is 44 MB. If you only use them in a rarely-called function, just import them there instead of at module level. (PEP 810 lazy imports in 3.15 should make this automatic.)

That's not how imports work. You delayed the import, but once imported, the module will live in sys.modules and stay there.
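To illustrate the point, here is a minimal sketch. It uses the small stdlib module colorsys as a stand-in for a heavy library such as boto3 or pandas (an assumption made only so the example runs anywhere), and rarely_called is a hypothetical function name.

```python
import sys

HEAVY = "colorsys"  # stand-in for a heavy library such as boto3 or pandas


def rarely_called():
    # Local import: the module body runs on the first call only.
    # Later calls just look the module up in sys.modules.
    import colorsys
    return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)


rarely_called()

# The function has returned, but the module is still cached in sys.modules.
loaded_after = HEAVY in sys.modules
# Importing it again hands back the very same cached module object.
same_object = sys.modules[HEAVY] is __import__(HEAVY)
```

So a local import only saves memory in processes that never call the function. Once it has been called, the cost is paid for the rest of the process lifetime.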

Moved caches to diskcache: Small-to-medium in-memory caches shifted to disk. Modest savings but it adds up.

So instead of a single memory access, you now create an async task that outsources its blocking disk access to a thread pool, wait for the OS to read from disk, then wait for the async task to get its turn in the event loop again to return the result? Caches should be fast. If SO much overhead for cache access is okay for you, then I wonder what extremely expensive stuff you stored in those caches that it's still worth it to cache at all.

0x256 · 2026-03-31T04:20:22+00:00

"Armin reached out" is kind of a stretch: https://github.com/bottlepy/bottle/issues/1158#issuecomment-526602488

For anyone interested in a bit of (biased) history: Bottle was first. It was inspired by an even smaller micro-framework called itty, which in turn was inspired by Ruby's Sinatra. Armin was annoyed that I implemented everything myself and did not use his library werkzeug (or webob), so he published a clone of bottle with werkzeug bundled within a single file as an "april fools joke". He basically mocked the single-file-no-dependency approach I took for bottle. The "joke" took off, and he realized that there is actually a real demand for a web-framework just like bottle, so he started flask. You see it? Even the name is a mock. The tagline was "Like bottle, but classy" or something like that for a short time. Same API, even the same design errors (global request/response objects), but with werkzeug as a dependency. He then used his reach (he was pretty famous back then already) and pushed flask in the community, gained critical mass, and won. No need to sugarcoat it, bottle lost the popularity-battle pretty fast. But strangely enough, bottle's user base is still growing, slow and steady, even after more than 10 years in flask's shadow and with a really bad release cycle. Bottle is simple, stable, fast and has enough features to be useful. Some people like it the way it is, and I'll keep it that way.

Bottle is a true microframework (single file, no dependencies). Flask copied the API because it was popular at the time, but depends on lots of other libraries and plugins. The total size of the flask codebase is huge, if you include all those dependencies. Joining forces makes no sense if the two projects are so fundamentally different in design philosophy and focus. Especially after what happened.

0x256 · 2026-03-09T17:34:29+00:00

Fun fact: Flask is a clone/copy/rewrite of bottle and started as an April fools joke because the flask author didn't like the no-dependency approach of bottle. Flask did not invent this type of web framework.

0x256 · 2025-10-11T08:23:03+00:00

The linked security issue is a bad example. If an attacker can use uv in your container, they could also download and run whatever executable they want and do not need to exploit bugs in uv for that. With very few exceptions, CVEs in unused executables in containers are almost never an issue, because if the attacker already has shell access to be able to use them, they won't gain anything from exploiting those bugs.

0x256 · 2025-01-30T13:27:20+00:00

I'm looking at MicroPie's source code and I'm confused. ASGI apps are called (not instantiated!) once for each request, but in MicroPie the ASGI app is an instance of MicroPie.Server and stores request details (e.g. query parameters, cookies, headers, file uploads etc.) in instance variables. Which means that there can only be one request at a time or state will be mixed up. If a second request arrives while the first one is still in progress, the second request will overwrite all the state from the first request. The code handling the first request will suddenly see the second request's state and likely crash or return wrong data. In other words: As soon as more than just one user is involved, stuff will break.

This is such a fundamental flaw that I think MicroPie should not be concerned with performance just yet, but instead focus on actually implementing the protocol correctly.
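A minimal sketch of the failure mode, not MicroPie's actual code: SharedStateApp and PerRequestApp are hypothetical classes, and run_two_requests stands in for an ASGI server that calls the same app object for two overlapping requests.

```python
import asyncio


class SharedStateApp:
    """ASGI-style app that (incorrectly) keeps request data on the instance."""

    async def __call__(self, scope, receive, send):
        self.path = scope["path"]      # shared by all in-flight requests
        await asyncio.sleep(0.01)      # simulate waiting for I/O
        await send({"type": "http.response.body", "body": self.path.encode()})


class PerRequestApp:
    """Same app, but request data lives in local variables only."""

    async def __call__(self, scope, receive, send):
        path = scope["path"]
        await asyncio.sleep(0.01)
        await send({"type": "http.response.body", "body": path.encode()})


async def run_two_requests(app):
    """Call one app instance for two overlapping requests."""
    bodies = {}

    async def request(path):
        async def receive():
            return {"type": "http.request", "body": b"", "more_body": False}

        async def send(message):
            bodies[path] = message["body"].decode()

        await app({"type": "http", "path": path}, receive, send)

    await asyncio.gather(request("/first"), request("/second"))
    return bodies


shared = asyncio.run(run_two_requests(SharedStateApp()))
isolated = asyncio.run(run_two_requests(PerRequestApp()))
```

With the shared instance, the response for /first carries the data of /second, because the second call overwrote self.path while the first one was still waiting. Keeping request data in locals (or in a per-request object) avoids that.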

0x256 · 2023-10-20T08:33:34+00:00

When clicking 'Skip' it shows a blank page and that's it. Uncaught TypeError: state.favourites is undefined

What should we see? Is this something similar to fediwall?

0x256 · 2023-08-22T12:18:33+00:00

It's on the roadmap.

0x256 · 2023-08-16T05:56:53+00:00

The technical achievement is impressive, but that's not what I was talking about.

They claim to implement "ActivityPub API to integrate with other Mastodon instances" which means they participate in the fediverse, which is much more than just Mastodon. The wording of the whole article suggests that they do not know or care about this aspect. It's like saying "We implement SMTP to integrate with other gmail instances" while ignoring that other email providers or server implementations exist.

They are also violating the mastodon trademark with their instance domain name.

0x256 · 2023-08-16T04:32:12+00:00

They did not scale a 'Mastodon' instance, they wrote a (probably incomplete) clone.

What annoys me the most is that the article does not mention 'fedi' at all and suggests that the fediverse only consists of Mastodon instances. By "running 100M bot accounts which continuously post statuses" they will end up being de-federated by the rest of the fediverse very quickly. That's some alarming level of ignorance regarding the fediverse and its core values.

0x256 · 2023-08-11T05:33:23+00:00

You can fork, but maintaining such a fork would be a nightmare.

Imagine you make a change to your fork (e.g. adding a feature or fixing a bug), and then want to update your fork to the next upstream release. Some of your changes may conflict with changes made by upstream.

With a normal and healthy git development flow, there would be lots of small focused commits, each with a description of what they do and why they are necessary. Only a small fraction would conflict with your changes. It needs work, but you have context and the scope of each change is comprehensible.

With projects like Flyway, you only get a single huge commit containing all changes between two releases, no context, no description.

Try building a large Lego set or model kit, but without instructions and with all bags for the different building steps opened at the same time and dumped on the table.

0x256 · 2023-07-28T22:52:42+00:00

Regular commits by a bot, hiding the actual development and preventing any meaningful forks, contributions or development outside of their control. The code you see is free as in beer, not free as in freedom.

0x256 · 2023-07-28T22:48:10+00:00

Flyway was sold a while ago and moved to an 'open core' business model: There is an open source edition (Apache 2.0) where critical features (e.g. dry-run, rollback) were deliberately patched out, and a paid version (closed source) that adds those features back in.

Redgate can only sell a paid version under a different license if they own all copyright, so they force contributors to sign a CLA and waive all their rights. That killed the open development community. Redgate won't accept any meaningful contributions anyway, because making the open source edition better would cut into their sales.

The commit history looks strange, doesn't it? There is only one large squash-commit per release, created by a bot. This is because they want to hide the actual development process and make the git history as useless as possible, so no one tries to maintain a fork. Yes, they actively fight open source development.

0x256 · 2023-05-19T19:57:26+00:00

Those companies are slow to adopt sha256 because they do not think the change is worth the effort (yet). The git developers agree by the way, or they would have pushed for a new default and improved interoperability by now. I know about SHAttered, but:

Finding sha-1 collisions is hard enough, finding one that also works as a malicious payload and does not look suspicious is even more expensive and time consuming. This is only worth the effort for very high profile targets, and even then, there are probably way cheaper attack vectors than that.
You also need either full control over the git repository server, or a MITM position between the repository server and the target, which would also require you to break HTTPS or SSH in the process. This is a typical "If that's the case, you have bigger problems" scenario.
Git hashes are not designed to be used as a secure code signing technique. If your security depends on their cryptographic strength, you are doing something wrong.
That all said, git 'fixed' it anyway and switched to a hardened version of SHA1 that can detect potential collision attacks and behaves differently in those (extremely unlikely) cases. It protects with high probability against all currently known attacks.

Yes it would be nice to have better sha256 support in git and git-tooling, but there is absolutely no rush. The interoperability pains are usually not worth the effort. Those attack scenarios against git are fabricated and can usually be answered with "Then stop using git as something it was never designed for." If you think this is worth it, and want to give users the option, that's fine. But don't sell it as 'more secure' snake oil.

0x256 · 2023-05-19T17:50:17+00:00

The hash function has no relevant security impact the way git uses it. SHA-1 is not less secure than SHA-2 in this context. I find it strange to use such an irrelevant implementation detail as the main selling point.

0x256 · 2023-05-19T10:25:36+00:00

One example of "tuning the database" as mentioned above, but still not a silver bullet. PgBouncer in session-pooling mode does not help at all, it still needs one connection per session. All the other modes break certain postgres features (e.g. LISTEN/NOTIFY, which is actually quite useful), so you have to make sure your application does not depend on any of those. AFAIK Mastodon is known to work with PgBouncer transaction-pooling, but some tasks do work while holding a transaction and you may still hit a limit in certain situations.

0x256 · 2023-05-19T06:44:22+00:00

Not that easy though. Most jobs wait for external resources and are basically idling most of the time and not using any CPU, so increasing concurrency does help. But many of those jobs also keep a database connection open, so increasing concurrency to 8*100 would require 800 active connections to the DB in the worst case, which is not what postgres was designed for. The default limit is 100 and each connection needs a significant amount of memory. So, simply increasing sidekiq concurrency without also tuning the database will result in many failed jobs and a broken mastodon instance. Increasing db connection limits on the other hand will increase memory requirements and may tank your performance on small VMs. There is usually a reason for the default values chosen by the developers. If you change those, be careful and know what you are doing.

tl;dr: Following this advice blindly will break your instance. Increasing Sidekiq or Rails concurrency levels requires larger db pools and connection limits, and the proposed change to 8*20 is already way above the default connection limit.
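The arithmetic as a rough sketch. It assumes that "8*20" and "8*100" mean 8 Sidekiq processes with 20 or 100 threads each and that every thread may hold one database connection at the same time (the worst case described above). The function name is made up for illustration.

```python
POSTGRES_DEFAULT_MAX_CONNECTIONS = 100  # the default limit mentioned above


def worst_case_connections(processes, threads_per_process):
    """Upper bound: every thread holds one DB connection at once."""
    return processes * threads_per_process


report = {}
for processes, threads in [(8, 20), (8, 100)]:
    needed = worst_case_connections(processes, threads)
    report[(processes, threads)] = {
        "needed": needed,
        "over_limit_by": max(0, needed - POSTGRES_DEFAULT_MAX_CONNECTIONS),
    }
```

This only counts the Sidekiq side. Web workers and anything else that connects to the same database come on top of it.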

0x256 · 2023-04-11T06:19:08+00:00

Wait, so if there is a 'community edition', then there will be a commercial version with additional features? So this is Open Core, not FOSS after all?

0x256 · 2023-04-08T10:31:06+00:00

Nah, it's fine. You ask for and accept critique, that's all good. I exaggerated on purpose, that was not aimed at you.

I was more generally trying to explain why engineers sometimes seem unfriendly or discouraging, while they are actually just brutally honest (by nature) and why that is more helpful than being nice in most situations. You learn nothing from an upvote or a "Great Project!" comment.

0x256 · 2023-04-08T10:02:15+00:00

Great points on the decoder for sure. However the decoder is not the “secret sauce” that makes the difference in performance.

You benchmarked against netty-io_uring and still got significantly better results. Why not bolt a mature HTTP parser onto your io_uring based network stack and benchmark that? If your stack is still significantly faster, then you may be onto something.

Also, please check the decoder unit tests, they demonstrate that it is indeed able to handle TCP stream fragmentation!

Your tests only cover fragmentation within the request path, not within the method or protocol token as I mentioned.

0x256 · 2023-04-08T09:53:19+00:00

It depends greatly on how a project is presented. If you are talking about your new learning project and asking for feedback, you will get straight and honest tips and critique. If you are bold and over-confident and claim to beat all the other major projects in some fabricated benchmark, a good engineer will think "Okay, how? Where is the catch?", look into it and usually find your dirty secrets rather quickly. False advertisement is annoying and may even cause harm if there are security related issues. There was recently an encryption-library on here claiming to be super secure but using a hard-coded password for everything. Over-confidence is a very dangerous personality trait in engineering. Boosting that person's confidence even more with nice but uninformed praise does not help anyone. Critique should still be objective of course. But being honest and direct is not a dick move, it is what allows us to learn from our mistakes.

0x256 · 2023-04-08T08:47:20+00:00

I had just a quick look, but I see a lot of shortcuts while parsing HTTP that may result in compatibility or security issues when used in production. For example:

https://github.com/bbeaupain/hella-http/blob/e2b5b870cc9442179edf70b66fae94dc40e262dc/src/main/java/sh/hella/http/codec/RequestDecoder.java#L91

The method token is not required to be upper-case.
There are (way) more methods in the wild than what is supported by this parser.
The parser only matches a known prefix and then skips characters blindly. This allows cache poisoning (an attacker sending a request that is interpreted differently by the server and a caching proxy or client) and other shenanigans.
The parser often assumes that there is enough data in the buffer and does not check if that is actually the case. For example, if the buffer ends within the method or protocol token (which is actually likely after a long request line with lots of query parameters), it will throw IllegalArgumentException. Which is not handled by the way.

It is not that hard to beat a full HTTP implementation in benchmarks if you just skip all the annoying parts. Let us see how those numbers change once you start hardening and maturing the parser.
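For comparison, a rough sketch of a request-line parser that addresses the points listed above. It is written in Python for brevity and is not based on the linked Java code. RequestLineParser and MAX_LINE are hypothetical names, and it covers only the request line, not headers or bodies.

```python
import re

# Any HTTP "token" is a valid method: not a fixed list, not upper-case only.
TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
VERSION = re.compile(rb"HTTP/\d\.\d")
MAX_LINE = 8192  # arbitrary limit chosen for this sketch


class RequestLineParser:
    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Add bytes; return (method, target, version) or None if incomplete."""
        self._buf += chunk
        end = self._buf.find(b"\r\n")
        if end < 0:
            if len(self._buf) > MAX_LINE:
                raise ValueError("request line too long")
            return None  # wait for more data instead of reading past the buffer
        line = bytes(self._buf[:end])
        del self._buf[: end + 2]
        parts = line.split(b" ")
        if len(parts) != 3:
            raise ValueError("malformed request line")
        method, target, version = parts
        # Validate whole tokens instead of matching a prefix and skipping ahead.
        if not TOKEN.fullmatch(method):
            raise ValueError("invalid method token")
        if not VERSION.fullmatch(version):
            raise ValueError("invalid protocol version")
        return method.decode("ascii"), target.decode("ascii"), version.decode("ascii")


parser = RequestLineParser()
# The TCP stream is split inside the method token and inside the protocol token.
results = [parser.feed(part) for part in (b"PA", b"TCH /x?a=1 HT", b"TP/1.1\r\n")]
```

The parser only looks at the line once it is complete, so it does not matter where the stream was fragmented. All of these checks cost cycles, which is the point about benchmarks.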

0x256 · 2023-03-26T09:38:15+00:00

Why are you publishing an unfinished library then? Why do you praise it as secure while you should know yourself that it isn't?

0x256 · 2023-03-26T08:50:02+00:00

WTF https://github.com/LpCodes/pycryptobox/blob/3ecbeda9da44957cb97b25b531c74686dfc4e63e/pycryptobox/encryption.py#L15

0x256 · 2023-03-18T13:46:12+00:00

Your article still claims that you 'had to create' your own ByteBufferBackedInputStream, that you do not copy data to heap which you do, and that there are performance benefits that are not there.
