Avoid re-consumption with Apache Kafka MirrorMaker 2 in an active-active configuration?
I am running Apache Kafka MirrorMaker 2 (version 2.7) across multiple active Kafka 2.6 clusters, named prod1 and prod2. A topic named topic on prod1 is therefore replicated by MirrorMaker 2 as prod1.topic on the prod2 cluster.
I have a Kafka consumer service running on both prod1 and prod2, using the same Kafka consumer group ID.
I have emit.checkpoints.interval.seconds=1 in my MirrorMaker 2 config, so offsets are translated every second from topic to prod1.topic.
The problem is that every time my producer produces to topic on prod1, the data is replicated to prod1.topic on prod2 and consumed by my consumer on prod2 BEFORE the latest committed offset from my prod1 consumer is translated over to prod2 by MirrorMaker 2.
This does not happen if I start my prod2 consumer a few seconds after data is produced to prod1, because the latest committed offsets will have arrived on prod2 by then. However, the consumers on both prod1 and prod2 need to be running live, as I am using prod1 and prod2 as an active-active deployment.
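The race described above can be sketched as a toy timeline model. This is only an illustration of the timing, not how MirrorMaker 2 is implemented: the function name, the lag values, and the assumption that a translated offset becomes usable on prod2 exactly at the next checkpoint tick are all hypothetical.

```python
import math


def consumed_twice(produce_at, replication_lag, commit_lag,
                   checkpoint_interval, prod2_start):
    """Toy model: True if the prod2 consumer also processes the record.

    All times are in seconds. Every parameter is a hypothetical value
    chosen for illustration, not something measured on a real cluster.
    """
    # When the record becomes visible in prod1.topic on prod2.
    replicated_at = produce_at + replication_lag
    # When the prod1 consumer commits an offset past the record.
    committed_at = produce_at + commit_lag
    # Assumption: the commit reaches prod2 at the next checkpoint tick.
    translated_at = math.ceil(committed_at / checkpoint_interval) * checkpoint_interval
    # The prod2 consumer reads the record once it is running and the record exists.
    prod2_reads_at = max(replicated_at, prod2_start)
    # Duplicate if prod2 reads before the translated offset has arrived.
    return prod2_reads_at < translated_at


# Both consumers live: prod2 reads at 0.5 s, translated offset arrives at 1.0 s.
print(consumed_twice(0.3, 0.2, 0.05, 1.0, prod2_start=0.0))
# prod2 consumer started a few seconds later: the offset is already there.
print(consumed_twice(0.3, 0.2, 0.05, 1.0, prod2_start=3.0))
```

Under these assumed numbers the first call models the live active-active case, where the record is consumed on both clusters, and the second models the delayed start, where it is not.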
So how can I ensure that any data produced is consumed only once, by either the prod1 or the prod2 consumer?