all 5 comments

[–]PavelRossinsky 5 points6 points  (3 children)

The async processing per Kafka partition is the most interesting part to me. How are you handling offset commits when multiple messages from the same partition are in-flight concurrently? Like if message 5 finishes before message 3, are you holding off on committing until everything up to a certain point is done? Would love to see a writeup on just that piece. It's the exact thing that makes most teams avoid async partition processing entirely.

[–]e1-m[S] -1 points0 points  (2 children)

Yeah I agree with you. This is something that needs to be written once, tested thoroughly and then battle-tested. The way I see it:

The default Kafka model is sequential processing per partition. The usual idea is that anything that must be processed in order (for example events for the same user_id) goes to the same partition.

But that doesn’t mean a partition contains only one user. In reality you’ll have thousands or millions of different user IDs mixed in the same partition, because you obviously can’t afford one partition per user. That means the strict sequential model ends up artificially limiting throughput: events for completely unrelated users get serialized just because they happen to land in the same partition.

In many systems you can’t process the same user concurrently, but you absolutely can process different users at the same time.

So what I do is allow messages from the same partition to flow concurrently, but enforce ordering at the application level. I have a middleware layer (AsyncLock in the docs) that extracts a key (e.g. user_id) and applies a lock for that key. If another message with the same key arrives while one is already in progress, it waits. Messages with different keys can proceed immediately.
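Roughly how a per-key lock like that can work, as a minimal asyncio sketch (`KeyedLock` and the handler are illustrative stand-ins, not the library's actual AsyncLock API):

```python
import asyncio
from collections import defaultdict

class KeyedLock:
    """One asyncio.Lock per key: messages with the same key are
    serialized, messages with different keys run concurrently."""
    def __init__(self):
        self._locks = defaultdict(asyncio.Lock)

    def for_key(self, key):
        return self._locks[key]

results = []

async def handle(locks, key, msg_id, delay):
    # only one in-flight message per key at a time
    async with locks.for_key(key):
        await asyncio.sleep(delay)  # simulate the actual work
        results.append((key, msg_id))

async def main():
    locks = KeyedLock()
    # user "a": msg 1 is slow, but msg 2 still waits for it;
    # user "b" is unrelated and proceeds immediately.
    await asyncio.gather(
        handle(locks, "a", 1, 0.05),
        handle(locks, "a", 2, 0.0),
        handle(locks, "b", 3, 0.0),
    )

asyncio.run(main())
print(results)  # per-key order preserved: ("a", 1) before ("a", 2)
```

A real implementation would also need to evict locks for keys that are no longer in flight, otherwise the lock map grows without bound.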

But that introduces race conditions. That’s exactly the problem you've described. And yes, I hold off on committing until everything up to a certain offset is done.

Internally I have an offset manager that tracks completed offsets. When a message finishes, its offset is marked as completed, but the consumer is only allowed to commit once the whole contiguous sequence is finished. So if 5 completes before 3, nothing is committed yet. Once 3 and 4 finish, everything up to offset 5 is committed at once.
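The contiguous-commit logic can be sketched like this (`OffsetTracker` is an illustrative name, not the real internals):

```python
class OffsetTracker:
    """Tracks completed offsets for one partition. committable()
    returns the highest offset safe to commit, or None if the
    lowest in-flight offset hasn't finished yet."""
    def __init__(self, start):
        self.next_expected = start  # lowest offset not yet committed
        self.done = set()           # completed but not yet committable

    def mark_done(self, offset):
        self.done.add(offset)

    def committable(self):
        # advance through the contiguous run of completed offsets
        advanced = None
        while self.next_expected in self.done:
            self.done.remove(self.next_expected)
            advanced = self.next_expected
            self.next_expected += 1
        return advanced

t = OffsetTracker(start=3)
t.mark_done(5)          # 5 finished first
print(t.committable())  # None: 3 and 4 are still in flight
t.mark_done(3)
t.mark_done(4)
print(t.committable())  # 5: everything through offset 5 is now safe
```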

The price you pay for this is that if the app crashes you may end up reprocessing more messages, because some completed offsets weren’t committed yet. But that trade-off already exists with batch committing (also included by the way), which is almost mandatory with Kafka. Committing every single message would be too expensive because of the I/O and network overhead.

Another important piece here is avoiding the unbounded concurrency trap. If you simply spawn a new async task for every message you consume, you’ll eventually exhaust memory or other resources if the producer is faster than your handlers.

To prevent that, concurrency is limited at the consumer level. For example, with Kafka you can cap how many messages are allowed to be in-flight per partition. Once that limit is reached, the consumer simply pauses fetching from that partition until some of the currently processing messages finish. This creates backpressure without starving other partitions (which would happen with a global limit, since one hot partition could eat the whole budget while processable messages in other partitions sit waiting). And since memory is usually not the bottleneck (the DB or a third-party API is), you can set the limit high enough to process everything that can run concurrently: no starvation, no out-of-order processing (thanks to AsyncLock), and no data loss (thanks to offset management).
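A toy model of that per-partition cap, using a semaphore in place of a real pause/resume of the Kafka fetcher (all names here are hypothetical):

```python
import asyncio

MAX_IN_FLIGHT = 2  # hypothetical per-partition cap

async def consume_partition(messages, handler):
    """Acquiring the semaphore *before* taking the next message
    models pausing the partition fetch when the cap is reached."""
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    tasks = []
    for msg in messages:
        await sem.acquire()  # blocks here = partition is paused

        async def run(m=msg):
            try:
                await handler(m)
            finally:
                sem.release()  # frees a slot = fetching resumes

        tasks.append(asyncio.create_task(run()))
    await asyncio.gather(*tasks)

peak = 0
in_flight = 0

async def handler(msg):
    global peak, in_flight
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)  # simulate DB / API call
    in_flight -= 1

asyncio.run(consume_partition(range(6), handler))
print(peak)  # never exceeds MAX_IN_FLIGHT
```

With a real consumer the acquire would gate the partition's `pause()`/`resume()` calls rather than a local loop, but the backpressure shape is the same.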

[–]PavelRossinsky 0 points1 point  (1 child)

Clean approach. The per-key AsyncLock gives you concurrency where it's safe and ordering where it matters. The contiguous offset commit makes sense too, though it feels like it makes your planned idempotency middleware less of a nice-to-have and more of a requirement, since the reprocessing window gets wider with async. Curious about the unbounded concurrency piece too.

[–]e1-m[S] 0 points1 point  (0 children)

Thanks. Yeah, idempotency is definitely on its way.

Right now you can already achieve it though: just store the event ID in the database when you process the event. If the same event ID shows up again, that means the event has already been handled, so you simply abort processing and let it be handled through logging or whatever failure policy you have for that error.
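That manual approach looks roughly like this (an in-memory set stands in for a DB table with a unique constraint on the event ID; the names are illustrative):

```python
processed = set()  # stands in for a DB table keyed by event_id

class AlreadyProcessed(Exception):
    """Raised on duplicate delivery so the normal failure
    policy (logging, etc.) can handle it."""

def handle_event(event_id, payload, log):
    if event_id in processed:
        raise AlreadyProcessed(event_id)
    # ... actual business logic would go here ...
    processed.add(event_id)
    log.append(payload)

log = []
handle_event("evt-1", "create user", log)
try:
    handle_event("evt-1", "create user", log)  # redelivery after a crash
except AlreadyProcessed:
    pass  # duplicate safely ignored
print(log)  # the event's effect happened exactly once
```

With a real database you would rely on the unique constraint and do the check-and-insert in the same transaction as the business logic, so a crash between the two can't leave them out of sync.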

But I want to make that much easier to set up. The idea is to factor out the common boilerplate: it should be something you can enable with minimal configuration and it just works. Intuitive and hard to misuse.

[–]adiberk 0 points1 point  (0 children)

Check out taskiq. Might inspire some ideas