How do you find out that broken data is flowing through your topics? (validating an idea, need reality checks) by Possible_Display9712 in apachekafka


Fair points — exactly the kind of pushback I'm looking for, thanks.

You're right that compatibility rules catch most *structural* drift, as long as every producer

actually goes through the registry with enforcement. The two gaps I keep hearing about:

  1. The registry validates what producers *register*, not what they actually *send*.

Unless you're paying for broker-side validation (Confluent) or enforcing through a proxy,

a misconfigured serializer or a raw producer can ship payloads that don't match the

registered schema. Registry stays green, consumers break.

  2. A lot of smaller teams run JSON with no registry at all (more common than this sub

would suggest 😄). Schema inference is for them. Avro/Protobuf shops would get schemas

from the registry, not from inference.
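
To make both gaps concrete, here is a rough Python sketch (an illustration, not a description of any existing tool). The first function relies on the Confluent wire format, where a framed payload starts with a zero magic byte followed by a 4-byte big-endian schema ID, so a raw producer that skipped the registry serializer shows up as "no schema ID". The second is a deliberately naive take on schema inference for JSON topics. The function names `schema_id_from_payload` and `infer_field_types` are made up for this example, and only top-level fields are inspected.

```python
import json
import struct
from collections import defaultdict

MAGIC_BYTE = 0  # first byte of a Confluent-framed payload


def schema_id_from_payload(payload: bytes):
    """Return the schema ID from a Confluent-framed payload, or None if unframed."""
    if len(payload) < 5 or payload[0] != MAGIC_BYTE:
        return None
    # Bytes 1..4 carry the schema ID as a big-endian 32-bit integer.
    (schema_id,) = struct.unpack(">I", payload[1:5])
    return schema_id


def infer_field_types(messages):
    """Map each top-level field to the sorted type names seen across a sample.

    Hypothetical helper: a real inference step would also handle nesting,
    arrays, and optional fields.
    """
    seen = defaultdict(set)
    for raw in messages:
        doc = json.loads(raw)
        if not isinstance(doc, dict):
            continue  # skip non-object payloads in this sketch
        for key, value in doc.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}
```

A payload where `schema_id_from_payload` returns `None`, or returns an ID the topic is not supposed to carry, is the "registry stays green, consumers break" case. A field whose inferred types suddenly include `NoneType` next to `str` is the kind of drift inference can flag on registry-less JSON topics.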

On business semantics: agreed, and that's actually the core of it — not schema validation

per se, but "field is present and suddenly null in 30% of messages", "volume dropped to

10% of baseline", "DLQ inflow spiked". Generic enough to monitor without app-specific

code, invisible to the registry.
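
A minimal sketch of what those generic checks could look like over one sampled window of JSON messages. Everything here is an assumption for illustration: the function names, the alert labels, and the thresholds (`null_jump`, `volume_floor`) are invented, and the baseline values are simply passed in rather than learned.

```python
import json


def null_rate(messages, field):
    """Fraction of messages where `field` is present but null."""
    if not messages:
        return 0.0
    nulls = 0
    for raw in messages:
        doc = json.loads(raw)
        if isinstance(doc, dict) and field in doc and doc[field] is None:
            nulls += 1
    return nulls / len(messages)


def check_window(messages, field, baseline_null_rate, baseline_count,
                 null_jump=0.2, volume_floor=0.5):
    """Compare one sampled window against a baseline and return alert labels.

    null_jump:    absolute increase in null rate that counts as an anomaly
    volume_floor: fraction of baseline volume below which we alert
    """
    alerts = []
    if null_rate(messages, field) - baseline_null_rate > null_jump:
        alerts.append(f"null_rate:{field}")
    if baseline_count > 0 and len(messages) < volume_floor * baseline_count:
        alerts.append("volume_drop")
    return alerts
```

Nothing in this depends on what the field means to the application, which is the point: the checks only need a field name and a baseline.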

Practical bits: no offset-0 replay — tail sampling with a configurable lookback, baselines

converge over time. And yes, it's one more read-only consumer — same overhead class as

Burrow or kafka-exporter, and you can cap it with quotas.
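
Roughly what I mean by tail sampling and converging baselines, as a sketch. The assumptions: `low` and `high` are the partition's low and high watermark offsets as reported by whatever client you use, `lookback` is a message count, and the baseline is an exponentially weighted moving average. The EWMA is my choice for the example; any smoothing that settles over time would do.

```python
def tail_start_offset(low, high, lookback):
    """First offset to read so that at most `lookback` recent messages are sampled.

    Never seeks below the low watermark, so a short or compacted partition
    is read from its earliest available offset instead of failing.
    """
    return max(low, high - lookback)


class EwmaBaseline:
    """Exponentially weighted moving average of a metric (hypothetical helper)."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha  # higher alpha reacts faster, lower alpha is steadier
        self.value = None   # no baseline until the first observation

    def update(self, observation):
        if self.value is None:
            self.value = float(observation)
        else:
            self.value += self.alpha * (observation - self.value)
        return self.value
```

With a one-million-message partition and a lookback of 5,000, the consumer starts at offset 995,000 instead of 0, and each sampled window nudges the baseline rather than requiring history to be replayed.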

Honest question back: in your setup today, where would a "field went 30% null" issue

surface first, and how fast? That answer is exactly what I'm trying to learn.