Avoid re-consumption with Apache Kafka MirrorMaker 2 in an active-active configuration?

I am running Apache Kafka MirrorMaker 2 (version 2.7) across multiple active Kafka 2.6 clusters (named prod1 and prod2), so a topic named topic on prod1 is replicated by MirrorMaker 2 as prod1.topic on the prod2 cluster.

I have a Kafka consumer service running on both prod1 and prod2, using the same consumer group ID.

I have emit.checkpoints.interval.seconds=1 in my MirrorMaker 2 config, so offsets are translated from topic to prod1.topic every second.
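
To make the setup concrete, here is a minimal sketch of the kind of MirrorMaker 2 properties file this description implies. Only the cluster names, the topic name, and emit.checkpoints.interval.seconds=1 come from the description above; the broker host names, the reverse replication flow, and the sync.group.offsets.* lines are assumptions added for illustration and may not match the actual configuration.

```properties
# Sketch only: broker host names are placeholders (assumption).
clusters = prod1, prod2
prod1.bootstrap.servers = prod1-broker:9092
prod2.bootstrap.servers = prod2-broker:9092

# Replicate "topic" from prod1 to prod2, where it appears as prod1.topic.
prod1->prod2.enabled = true
prod1->prod2.topics = topic

# Assumption: active-active usually means the reverse flow is enabled too.
prod2->prod1.enabled = true
prod2->prod1.topics = topic

# From the description: emit offset checkpoints every second.
emit.checkpoints.interval.seconds = 1

# Assumption: automatic committed-offset sync (available since Kafka 2.7).
sync.group.offsets.enabled = true
sync.group.offsets.interval.seconds = 1
```

Even at a one-second interval, checkpoints are emitted periodically while records are replicated continuously, so there is a window in which a record is visible on prod2 before the matching offset is.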

The problem is that every time my producer produces to topic on prod1, the data is replicated to prod1.topic on prod2 and consumed by my consumer on prod2 BEFORE the latest committed offset from my prod1 consumer is translated to prod2 by MirrorMaker 2.

This does not happen if I start my prod2 consumer a few seconds after the data is produced to prod1, because the latest committed offsets have arrived on prod2 by then. However, the consumers on both prod1 and prod2 need to be running live, as I am using prod1 and prod2 in an active-active deployment.

So how can I ensure that any data produced is consumed only once, by either the prod1 or the prod2 consumer?
