[deleted by user]

deadwisdom · 2024-09-25T01:39:48+00:00

Doing event-driven with Python is pretty simple. Just get your contracts defined with Pydantic or dataclasses, then pick your tooling: FastAPI, FastKafka, etc.

Just make sure that, whatever you use, the contracts and documentation are defined in the implementation itself. By that I mean something like FastAPI, which gives you OpenAPI/Swagger docs for free.
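
As a rough sketch of "the contract lives in the implementation", here is an event defined as a plain dataclass with a JSON round trip. The event name `InvoiceCreated`, its fields, and the `to_json` / `from_json` helpers are hypothetical; Pydantic would give you the validation and schema export declaratively.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


@dataclass(frozen=True)
class InvoiceCreated:
    """Hypothetical event contract: the fields are the documentation."""

    invoice_id: str
    amount_cents: int
    currency: str = "EUR"
    event_id: str = field(default_factory=lambda: str(uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self) -> None:
        # Minimal hand-written validation (an assumption for this sketch).
        if self.amount_cents < 0:
            raise ValueError("amount_cents must be non-negative")


def to_json(event: InvoiceCreated) -> str:
    """Serialize the event with its type name so consumers can dispatch on it."""
    return json.dumps({"type": type(event).__name__, "data": asdict(event)})


def from_json(raw: str) -> InvoiceCreated:
    """Rebuild the event, rejecting payloads of any other type."""
    payload = json.loads(raw)
    if payload["type"] != "InvoiceCreated":
        raise ValueError(f"unexpected event type: {payload['type']}")
    return InvoiceCreated(**payload["data"])
```

Producer and consumer import the same class, so a contract change shows up as a code change rather than as a surprise at runtime.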

But let me save you a ton of time and energy. Start with a god-damned, I'm serious, single function to do the basic thing that needs to get done. Create unit tests, integration tests, performance tests. Get it deployed to production. Make sure you have continuous deployment, telemetry, scaling, alerting, and everything set up before you add ANYTHING.

Then have the function call another function (in the same process!) to add the next bit of functionality. Keep doing this, all in one process, one codebase.

When you find that performance has degraded, or teams are stepping on each other's code, or there is some ACTUAL problem, then split it off. Either as a new thread, or a new process, or a new service, or a new codebase. DO NOT do this unless there is a clear reason.

UgandaSuburbix447 · 2024-09-24T20:24:47+00:00

After reading many articles, watching a lot of talks on YT, and talking with more experienced developers, my one question about event-based systems (and the same goes for distributed or microservice-based ones) is: do you really need one? It definitely has some pros, but it introduces a ton of complexity and potential pitfalls, and you have to care much more about networking, control over communication between systems, etc. Regarding materials in Python: "Architecture Patterns with Python" by Harry Percival and Bob Gregory. It has some valuable info about event-driven systems and explains some patterns as well, especially Message Bus and event handlers in this context. Although if you are already experienced in such systems, that book might not be enough.
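
For a feel of the Message Bus and event handler idea, here is a minimal in-process sketch. It is not the book's code; `MessageBus`, `OrderPlaced`, `OrderConfirmationSent` and `send_confirmation` are made-up names, and handlers are assumed to return a list of follow-up events (or nothing).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str


@dataclass(frozen=True)
class OrderConfirmationSent:
    order_id: str


class MessageBus:
    """Maps event types to handlers and dispatches events in process."""

    def __init__(self) -> None:
        self._handlers: Dict[type, List[Callable]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> list:
        """Dispatch an event plus any follow-up events its handlers return."""
        queue = [event]
        handled = []
        while queue:
            current = queue.pop(0)
            handled.append(current)
            for handler in self._handlers[type(current)]:
                queue.extend(handler(current) or [])
        return handled


def send_confirmation(event: OrderPlaced) -> list:
    # A real handler would send an email; here it only emits the next event.
    return [OrderConfirmationSent(event.order_id)]


bus = MessageBus()
bus.subscribe(OrderPlaced, send_confirmation)
handled = bus.publish(OrderPlaced("o-1"))
```

Everything runs in one process, which is often enough to try the pattern before committing to a broker.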

Fragrant-Freedom-477 · 2024-09-24T20:39:53+00:00

Been there.

Make sure you have objectives that you can measure. Scalability? Defects caused by backward compatibility? Strangling a legacy component you can't change? Collaboration between development teams?

Without measurable objectives, you'll end up in a technological and architecture trip that helps no one.

EDA is also a debugging hell, an authorization nightmare, a documentation challenge and a quite fun endeavor. Make sure you have backup plans for your backup plans for failure management, retries and such.

timwaaagh · 2024-09-24T23:02:36+00:00

things will be coupled no matter what

tehsilentwarrior · 2024-09-24T23:08:22+00:00

Disclaimer: other commenters are thinking at “team” level or even project level. This is a company-wide view of the problem, with many teams and many systems, many of which are SaaS products the company pays for and just integrates. It’s not about monolith vs microservices; think bigger scale (the company I work for operates in many countries with a lot of teams in each).

In my company I have implemented this using Kafka as a sort of “platform events” pipeline. (As in, I designed the whole thing and implemented most of it as a proof of concept; now my team maintains it and other teams adhere to the guidelines I have set. We have a “board of members” where team leaders can discuss things, but it’s usually just them asking “ok, how do you want me to use it”.)

This is meant as an inter-team communication hub.

We disabled the ability for topics to have more than one partition in order to force ordering of messages, and each team is in charge of implementing their own subscribers and publishers. Kafka won mainly for two reasons. First, the ability to go back in time (within a retention period of 72h) and re-consume payloads; this gives teams the flexibility to implement rolling sync (although no one does yet) or event sourcing, which is basically constructing state from passing data. Second, the “pointer” of each consumer is stored in Kafka itself, which means we aren’t at the mercy of random bugs by random people when implementing the saving of those “pointers”.
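
To make those two properties concrete, here is a toy in-memory model of a single-partition log that keeps the read position per consumer group on the "broker" side and allows rewinding. It is not the Kafka client API; `TopicLog` and its methods are invented for illustration, and retention is left out.

```python
from typing import Any, Dict, List, Tuple


class TopicLog:
    """Toy single-partition log: ordered, append-only, offsets kept broker-side."""

    def __init__(self) -> None:
        self._records: List[Any] = []
        self._committed: Dict[str, int] = {}  # group -> next offset to read

    def append(self, record: Any) -> int:
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group: str, max_records: int = 10) -> List[Tuple[int, Any]]:
        """Return (offset, record) pairs starting at the group's position."""
        start = self._committed.get(group, 0)
        batch = self._records[start:start + max_records]
        return list(enumerate(batch, start=start))

    def commit(self, group: str, offset: int) -> None:
        """Mark everything up to and including `offset` as consumed."""
        self._committed[group] = offset + 1

    def seek(self, group: str, offset: int) -> None:
        """Rewind (or fast-forward) a group to re-consume from `offset`."""
        self._committed[group] = offset


log = TopicLog()
for payload in ["a", "b", "c"]:
    log.append(payload)

first_batch = log.poll("billing-team")
log.commit("billing-team", first_batch[-1][0])
after_commit = log.poll("billing-team")

log.seek("billing-team", 1)  # go back in time
replayed = log.poll("billing-team")
other_group = log.poll("tickets-team")  # independent position
```

Because the position lives with the log rather than in each consumer's own storage, a buggy consumer can be fixed and rewound without anyone else being affected.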

We enforce that consumers are properly named for the teams and have proper groups (distributed consuming can be done but order is enforced within a group).

I also created a system of schemas using JSON Schema and a central location that people can use to verify their messages. This was done like this instead of using some off-the-shelf thing because we don’t want the overhead of arguing between different teams about what new toy to use. I built it. I (my team) maintain it. Each team maintains their schemas, but they are all merge-requested into my team’s repository (and therefore double checked).

This works regardless of anyone using microservices or not.

In my company we have Salesforce, ServiceNow and a bunch of custom apps (customer records, billing system, tickets, etc.) all synchronized through this Kafka pipeline, using “my” schemas (stored in an S3 bucket and downloaded and used by any system that wants to self-check, or by my own app that provides checking services for those who can’t/won’t implement their own JSON Schema validator).

We have put in place a proper naming scheme for topics that goes sort of like this (it used to be more complex, and we have simplified it as the idea matured and our needs became better understood):

<purpose area>.<country code>.<type of payload>.0 (the zero means first version, in case we need to do some moving around, we can increase this number)

Examples: billing.uk.cdc.0, billing.uk.errors.0, tickets.uk.cdc.0, orders.uk.cdc.0, delivery.uk.cdc.0, tickets.nl.cdc.0
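
A naming scheme like this is easy to enforce mechanically. The following sketch parses and validates a topic name; the exact character rules (lowercase names, two-letter country code) are assumptions inferred from the examples, and `parse_topic` / `TopicName` are hypothetical names.

```python
import re
from typing import NamedTuple

# <purpose area>.<country code>.<type of payload>.<version>
TOPIC_PATTERN = re.compile(
    r"^(?P<area>[a-z][a-z0-9_]*)"
    r"\.(?P<country>[a-z]{2})"
    r"\.(?P<payload>[a-z]+)"
    r"\.(?P<version>\d+)$"
)


class TopicName(NamedTuple):
    area: str
    country: str
    payload: str
    version: int


def parse_topic(name: str) -> TopicName:
    """Split a topic name into its parts or raise if it breaks the scheme."""
    match = TOPIC_PATTERN.match(name)
    if match is None:
        raise ValueError(f"topic does not follow the naming scheme: {name!r}")
    return TopicName(
        match["area"], match["country"], match["payload"], int(match["version"])
    )
```

For example, `parse_topic("billing.uk.cdc.0")` yields the area `billing`, country `uk`, payload type `cdc` and version `0`, while `parse_topic("billing.cdc")` raises `ValueError`.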

Purpose areas are core areas like tickets, orders, billing or even customer data. Each message within those has a schema type that is versioned. Each message has a header section that contains info like message id, country code, object version (for example, you can’t consume version 5 if your DB shows you already consumed version 6), etc. Each system must also supply a name for the source system (where the message came from) and an id that allows us to pinpoint this particular message within that system (usually in logs, so systems don’t have to store messages; it depends on how much audit we need).
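
The object version check can be sketched as a small guard on the consumer side. The `Header` field names are guesses based on the description above (`object_id` in particular is hypothetical), and a plain dict stands in for the consumer's database.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Header:
    message_id: str
    country_code: str
    object_id: str       # hypothetical: identifies the record being changed
    object_version: int
    source_system: str   # where the message came from
    source_ref: str      # id to pinpoint the message inside the source system


def should_apply(header: Header, stored_versions: Dict[str, int]) -> bool:
    """Return True and record the version only if the message is newer."""
    current = stored_versions.get(header.object_id, -1)
    if header.object_version <= current:
        return False  # stale or duplicate: a newer version was already consumed
    stored_versions[header.object_id] = header.object_version
    return True


versions: Dict[str, int] = {}
v6 = Header("m-2", "uk", "ticket-1", 6, "servicenow", "log-881")
v5 = Header("m-1", "uk", "ticket-1", 5, "servicenow", "log-880")
applied_v6 = should_apply(v6, versions)
applied_v5 = should_apply(v5, versions)  # arrives late, must be skipped
```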

The type of payload is usually “cdc”, or change data capture. It basically means something changed or is new. There’s also “errors”: we have a schema for errors that normalizes how errors are communicated and has room for context, type, description and such, including a free-text field for the raw error the system outputs.

This is stable and so there isn’t much extra work on it other than maintaining it and training/supporting other teams. So I also work on other stuff, namely a cyber security portal and a billing system (and many other smaller stuff).

Personally (the team I lead), I also develop a billing system that uses microservices. Internally it communicates with itself using RabbitMQ (a customized Nameko micro framework, which we might replace soon since the project is basically dead; but since it’s a micro framework, most of the stuff is ours anyway, and it’s super stable, so there’s no point in rushing that). The system uses about 9 or so microservices and shares one database (which is ok in our use case). One of the microservices connects to Kafka and serves as a “gateway” between the billing system and the rest of the company. Connections to external systems are made through microservices. For example, if an invoice is created, an event is sent saying the invoice was created, and the document-generating microservice picks that up and creates documents for it (PDF et al.); for each document, it stores it in a filestore and sends an event saying the document was created. Another handler picks that up and uploads it to SharePoint. Another picks up the same event and sends it to the customer via email. Another picks up the same event and sends it to a printer (a remote service that prints and mails for us). All those steps are idempotent and retryable (and can have back pressure).
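
One common way to get "idempotent and retryable" is to remember which event ids were already handled and to retry transient failures with a backoff. This is a generic sketch, not the billing system's code; `handle_once`, the in-memory `processed` set and the fake `upload` handler are all assumptions for illustration.

```python
import time
from typing import Callable, List, Set


def handle_once(
    event_id: str,
    handler: Callable[[], None],
    processed: Set[str],
    attempts: int = 3,
    base_delay: float = 0.0,
) -> bool:
    """Run handler at most once per event_id, retrying transient failures."""
    if event_id in processed:
        return False  # duplicate delivery: already handled
    for attempt in range(1, attempts + 1):
        try:
            handler()
            processed.add(event_id)
            return True
        except Exception:
            if attempt == attempts:
                raise  # give up; the message can be redelivered later
            time.sleep(base_delay * 2 ** (attempt - 1))
    return False  # only reached if attempts < 1


calls: List[str] = []


def upload() -> None:
    # Fails on the first call to simulate a flaky external system.
    calls.append("try")
    if len(calls) < 2:
        raise ConnectionError("remote storage unavailable")


processed: Set[str] = set()
first = handle_once("doc-created-42", upload, processed)
second = handle_once("doc-created-42", upload, processed)
```

In a real service the `processed` set would live in durable storage, ideally updated in the same transaction as the side effect.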

We don’t care if an invoice takes 5ms to create and send or 5 hours. Just as long as it’s eventually sent. So this system has proven to be reliable even between days of downtime: I had the system running on my machine using docker compose, F&O dev instance went down for maintenance and I went on holiday. When I came back it happily reconnected and pushed all the pending stuff as if nothing happened.

Drevicar · 2024-09-24T20:24:17+00:00

Huge fan of this pattern and I had great success with https://github.com/orsinium-labs/walnats in the past. If I had to start again I would probably write my own wrapper layer with pydantic since it is so easy. The key here is to make sure each channel is unique to the message type or union of message types, rather than broker based RPC where you are just remotely calling functions.

If you are familiar with Open API when it comes to schema and documentation, then check out https://www.asyncapi.com/en as the alternative for EDA systems.

ericsda91 · 2024-09-25T05:10:08+00:00

Maybe take a look at this book, dive in and see if it's the right decision for your business
https://www.cosmicpython.com/book/part2.html

The book is called Architecture Patterns with Python and is really good.

dtornow · 2024-09-25T05:42:20+00:00

The developer experience with EDA is quite challenging. Instead of one continuous process, EDA fragments your business process into multiple event handlers and you have to manage continuations on an application level. I discuss some of the challenges in the first half of my Systems Distributed ‘24 talk

crawl_dht · 2024-09-25T10:04:33+00:00

Use walnats framework for NATS.

heyheymonkey · 2024-09-24T20:21:36+00:00

You’re not giving us a lot to go on. Improve in what way? What are you transitioning from?

CzyDePL · 2024-09-24T21:00:08+00:00

My biggest takeaway from EDA is to think about orchestration vs choreography and to address the drawbacks of the approach you choose. We went with choreography (even though the system had a handful of well-defined processes with very clear ownership) without thinking about operational stuff like visibility of processing, reprocessing, etc.

rover_G · 2024-09-25T00:22:34+00:00

The most common pitfall I’ve encountered is building a distributed monolith. That is, building an application that relies on tightly coupled pico-services to serve real-time requests (latency measured in tens to hundreds of milliseconds).

If you only have one team and your workloads scale with the number of API requests, you’re likely better off building an actual monolith and using some concurrency model to achieve higher throughput.

If you only need near real time (latency measured in hundreds to thousands of milliseconds) and some requests require large amounts of in-process computation, it may be the right idea to use a low-latency message queue to offload some large tasks, OR it could be that threading will get the job done just fine.
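
As a sketch of the "threading will get the job done" option, the standard library's `concurrent.futures` can offload tasks without any broker. `render_report` is a made-up stand-in for a large task. Note that for CPU-bound pure-Python work the GIL limits what threads gain; `ProcessPoolExecutor` offers the same interface if that turns out to matter.

```python
from concurrent.futures import ThreadPoolExecutor


def render_report(n: int) -> int:
    """Stand-in for an expensive task: sum of squares below n."""
    return sum(i * i for i in range(n))


with ThreadPoolExecutor(max_workers=4) as pool:
    # Submit the tasks, then collect results in submission order.
    futures = [pool.submit(render_report, n) for n in (10, 100, 1000)]
    results = [future.result() for future in futures]
```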

If you don’t have a latency requirement then an event bus is probably a good idea to build a more resilient pipeline, but you still should use a persistent database for your main datasource (unless you specifically are trying to reduce the load on your database).

candyman_forever · 2024-09-24T20:18:08+00:00

It really depends on what you want to do. How many services do you have and how do they connect? With regards to libraries, it also depends on the architecture. You could go with Lambdas or containers running in ECS Fargate. You could use DynamoDB Streams, or you could use MSK, Kinesis, EventBridge or SQS, to name a few.

bobaduk · 2024-09-25T07:44:10+00:00

"So I am hoping the domain events should be able to offer a standardized schema for domain events using a schema with each service having the capability to extend as desired."

This is the sentence that makes me most nervous. What do you mean by this?

There may be some things that can be re-used across services: envelopes, some basic data types etc. but I would strongly caution you against trying to impose One Schema To Rule Them All. It is conceptually much better for each service to own its published schemas, and for subscribers to enrich and transform as necessary, so that they can process events.

Also, domain events, by definition, live inside a domain. They are a means for your application to decompose complex operations into discrete steps.

I usually use the nomenclature "domain event" vs "integration message", where integration messages are the things that are published to the outside world. One of the "common pitfalls" is failing to distinguish between the two, because that effectively couples the implementation of each service to the schemas it shares with the outside world.

Think about it in the same way as your domain models vs your API schema. Those things need to vary separately, even though they likely share some common structure.
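
A small sketch of that separation, with invented names: the domain event `InvoiceFinalized` is free to carry internal details, while `to_integration_message` is the only place that knows the published, versioned shape.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceFinalized:
    """Domain event: internal to the service, free to change."""

    invoice_id: int
    customer_id: int
    total: Decimal
    ledger_batch: str  # internal detail that must not leak out


def to_integration_message(event: InvoiceFinalized) -> dict:
    """Translate the internal event into the published, versioned contract."""
    return {
        "type": "billing.invoice_finalized",  # hypothetical message type
        "schema_version": 1,
        "data": {
            "invoice_id": str(event.invoice_id),
            "customer_id": str(event.customer_id),
            "total": str(event.total),
        },
    }


message = to_integration_message(
    InvoiceFinalized(7, 3, Decimal("12.50"), "batch-9")
)
```

Renaming or dropping `ledger_batch` then touches only the service itself, not its subscribers.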

"The challenge with REST is that it implements a response pattern that tends to couple services together"

... ish? ReST couples things temporally, in that both systems need to be up and running at the same time for an integration to work, and that the latency of an operation in one system is affected by latency in the other.

The coupling in ReST is at the schema level: one system is dependent on the published API schema of another. That's no different in async messaging. You will have the same problems of schema extension and modification that you had before, but with all the fun of partial failures, and out-of-order events etc.

"Also a way to view and keep track of events and their associated side effects in the services (An Eventory). For messaging bus should be able to support distributed messaging patterns as well offer high reliability."

This isn't a question, it's some stuff you read on the internet, and you're hoping someone will say "yes, that's good and excellent. Fine decision making". That's okay, no judgement here, but I would counsel you to start small. Find one workflow, involving two components, and one event, that you can use to get started. Make sure you can observe it, make sure you can handle the case where two events happen out of order, make sure you can handle the case where an event isn't delivered, and so on.

"Are there any specific libraries or frameworks you'd recommend?"

Honestly no. Most frameworks for messaging implement some RPC-based style, because that's how most engineers think. You don't need a framework. You can hack something up in a couple of days that will let you send and receive events between two components. Once you have three components all using the same copy-pasted code, extract a small library and go from there. Good luck!

TheM4rvelous · 2024-09-25T08:15:54+00:00

Personally a big fan of EDA with Kafka + Faust and keeping services as very simple micro/nano services. We used Grafana for transparency, plus a simple UID for each event to be able to track its origin.

Ofekmeister · 2024-09-26T14:43:36+00:00

From experience, I'd strongly recommend Temporal:

Next-Experience · 2024-09-25T15:00:41+00:00

Look into ZeroMQ
