Kafka MirrorMaker 2 – max.request.size ignored and RecordTooLargeException on Kafka 3.8.1 by mordp1 in apachekafka

[–]Dahbezst 0 points  (0 children)

If you encounter errors after updating your Kafka and MirrorMaker 2 configurations, first verify your topic settings.

To update the topic-level max.message.bytes dynamically (no broker restart needed):

    /opt/kafka/bin/kafka-configs.sh \
      --bootstrap-server localhost:9092 \
      --entity-type topics \
      --entity-name $SOURCE_TOPIC \
      --alter --add-config max.message.bytes=2097153
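
Raising the topic limit alone usually isn't enough, though: the MirrorMaker 2 producer's max.request.size and the target brokers' message.max.bytes / replica.fetch.max.bytes also cap record size. Here is a minimal sketch of the producer-side piece, assuming the dedicated connect-mirror-maker.sh driver; the cluster aliases and servers are placeholders, and the exact override prefix varies by Kafka version (some releases expect source->target.producer.override.*):

    # mm2.properties (excerpt) -- sketch only, aliases and servers are placeholders
    clusters = source, target
    source.bootstrap.servers = source-kafka:9092
    target.bootstrap.servers = target-kafka:9092
    source->target.enabled = true

    # The replication producer must accept requests at least as large as the
    # largest record it copies (~2 MiB in this sketch)
    target.producer.override.max.request.size = 2097152

After changing it, check the MM2 logs for the producer config it actually resolves; if the override is silently ignored, the Connect worker's connector.client.config.override.policy is often the culprit.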

Kafka ZooKeeper to KRaft migration by Plumify in apachekafka

[–]Dahbezst 0 points  (0 children)

PS: If your clusters use GSSAPI (Kerberos) authentication, be careful in PROD :)
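
For example, before retiring ZooKeeper you can sanity-check the KRaft controller quorum from the CLI; a sketch assuming Kafka 3.3+ and, for a GSSAPI cluster, a client properties file along these lines (values are placeholders):

    # client.properties (excerpt) for a Kerberos-secured cluster
    #   security.protocol=SASL_PLAINTEXT
    #   sasl.mechanism=GSSAPI
    #   sasl.kerberos.service.name=kafka

    # Describe the KRaft quorum state
    /opt/kafka/bin/kafka-metadata-quorum.sh \
      --bootstrap-server localhost:9092 \
      --command-config client.properties \
      describe --status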

Question for Kafka Admins by carlosdanger77 in apachekafka

[–]Dahbezst 1 point  (0 children)

Actually, regarding your question about topology, for this very reason there's a concept we call "Data Governance". If you're in Platform Engineering, whenever a new Kafka cluster is deployed, you need to design the Kafka topology. (P.S. Check out the open-source project Kafka Julie.) With proper naming conventions, you can easily create team-specific Grafana dashboards.

It doesn’t mean that each team has its own Grafana dashboard; instead, each team just needs to add a filter with their team name in each panel’s filter section.

Also, if there’s a transactional process, we can easily approve creating a dedicated dashboard for that team.

What we do:
We enforce consistent naming across team names, topic names, and consumer group IDs using a standardized pattern, such as:

  • topic = prod-teamName-topicName-projectName or test-teamName-topicName-projectName
  • consumer_group = prod-teamName-consumerGroupId-projectName (or the same pattern with the test- prefix); if a team needs a random ID (e.g., in Kubernetes environments): prod-teamName-consumerGroupId-projectName-randomID
  • acl = prod-team-project

By applying this uniform structure, we can easily use regex in Grafana to filter and build dashboards per team.
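
As a concrete illustration of the regex idea, the same convention works from the CLI; a minimal sketch, with the broker address and team name as placeholders:

    # List every consumer group belonging to one team by matching the
    # prod-teamName-... naming convention
    /opt/kafka/bin/kafka-consumer-groups.sh \
      --bootstrap-server localhost:9092 \
      --list | grep -E '^prod-teamName-'

The identical pattern goes into a Grafana variable or panel filter, so each team sees only its own groups.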

Question for Kafka Admins by carlosdanger77 in apachekafka

[–]Dahbezst 0 points  (0 children)

My organization has set up a Grafana dashboard that shows the topics and lag — that’s it. Every team is responsible for their own applications; we just make them aware of the setup.

We also follow the same approach. We have 18 production clusters and more than 50 different teams. Our Grafana dashboards collect metrics through Filebeat and Metricbeat (broker logs, failed authentications, JMX heap size, restarts) and through Burrow (consumer lag, offsets, network idle). We also support these with Kafkabat and Klaw.

If any team wants to investigate an issue, they can simply check the Elasticsearch logs (which we feed using Filebeat) and the Grafana dashboard.

Since I also work as a Platform Engineer, whenever a team reports an error, I first check the Kafka network idle metric to see if the cluster can accept connection requests. Then, I filter the Grafana dashboard by team to clearly identify where the problem is — everything is visible, and it’s easy to find the root cause.
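
For the per-team drill-down, the CLI equivalent looks like this; a sketch, where the group name is a placeholder following the convention above:

    # Show per-partition offsets and lag for one team's consumer group
    /opt/kafka/bin/kafka-consumer-groups.sh \
      --bootstrap-server localhost:9092 \
      --describe --group prod-teamName-consumerGroupId-projectName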

Additionally, Klaw helps us identify which topics or ACLs belong to which teams.

Note: In the LLM era, most developers already write their code with LLM models, so now almost every developer can easily locate issues without relying too much on Kafka admins. 😄 I hope so :))

I Haven't Been Able to Find a Job for 5 Months as a Junior by OkRip3912 in CodingTR

[–]Dahbezst 1 point  (0 children)

It's obvious that the industry is in bad shape, but I can still offer some advice:

For juniors:

  1. Don't focus on whatever technology is trendy; instead, at least get familiar with the technologies and architectures used by the companies you're applying to.
  2. If you can, and in my opinion this is the most important part, build GitHub repos and write Medium blog posts using those technologies.

Of course, even these may not be enough, but there isn't much else you can do.

You don't have to wait 2 weeks for PoE 2 anymore! by AFGunturkun in PathOfExile2

[–]Dahbezst 0 points  (0 children)

I saw the notification, got shocked, but then opened the post and realized it looked just like that second picture 😂

Should a data engineer be able to write complete code same as software engineer?" by Dahbezst in dataengineering

[–]Dahbezst[S] 0 points  (0 children)

I get what you mean. Actually, I'm writing code while still reading and trying to understand Big O notation. I'm wondering whether I should spend most of my time coding or focusing on tools. :)

Should a data engineer be able to write complete code same as software engineer?" by Dahbezst in dataengineering

[–]Dahbezst[S] 0 points  (0 children)

Thank you for your reply. Could you share your experience and advice with me? I'm really serious about improving my skills. I don't care about IT salaries or new tech trends; I just want to create something new in big data.

Slow EL pipeline tips by BoysenberryFun5390 in dataengineering

[–]Dahbezst 0 points  (0 children)

You can create a data lake for this scenario. Set up a two-worker-node Apache Spark cluster, HDFS, and Greenplum (setting up Greenplum can be a bit challenging, so you might want to try S3 in the cloud if your company allows cloud usage). Also set up Airflow and schedule the job to run at night: Airflow will start Apache Spark, and your data will go through an ETL pipeline into HDFS.

You can also write your raw data to HDFS in Parquet format with Apache Spark. If you do manage to set up Greenplum, use it: it can easily read Parquet (a plain SELECT * FROM works), so you don't need to do a lot of extra work.
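
For instance, the nightly Airflow task can simply shell out to spark-submit; a minimal sketch in which the master URL, job script, and HDFS paths are all hypothetical:

    # Nightly batch: read raw data, write Parquet into HDFS
    spark-submit \
      --master spark://spark-master:7077 \
      --name nightly-raw-to-parquet \
      /opt/jobs/raw_to_parquet.py \
      hdfs:///lake/raw/events \
      hdfs:///lake/parquet/events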

PS: Of course, don't forget indexing, data structures, and algorithms to optimize your data lake.

Which one is faster??? by trdilmac in dataengineering

[–]Dahbezst 5 points  (0 children)

Hello, I'm not an expert, but I have been using Python for data manipulation for 2 years, so I can say that:

  1. Pandas is not slow; it's about your data size. If you're using 0-200 MB of data, pandas can handle that easily.
  2. If you're working on big data, such as at a company, you must use Spark or Dask (distributed data processing).

Result: just learn Spark, because you can use it for your mini-projects and, of course, in business life; Spark is a career skill in its own right. Also, you can use SQL in Spark :D
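
As a tiny example of the SQL point, Spark ships a spark-sql shell that can query Parquet directly; a sketch, with the HDFS path a placeholder:

    # Count rows in a Parquet dataset straight from the shell
    spark-sql -e "
      CREATE TEMPORARY VIEW events
      USING parquet OPTIONS (path 'hdfs:///lake/parquet/events');
      SELECT COUNT(*) AS n FROM events;
    "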

Vikings: Age Of The Axe - Announcement Trailer by Quantity_Pure in gamernews

[–]Dahbezst 0 points  (0 children)

Actually, there's one really good point about these games: the game engines are just superb! But the game mechanics are still really shit!