Migrate from Kafka MirrorMaker to Replicator in Confluent Platform

Kafka MirrorMaker is a stand-alone tool for copying data between two Apache Kafka® clusters. It is little more than a Kafka consumer and producer hooked together. It reads data from topics in the origin cluster and writes it to topics with the same names in the destination cluster.

Important

Confluent Replicator is a more complete solution that copies topic configuration and data, and also integrates with Kafka Connect and Control Center to improve availability, scalability and ease of use.

This topic provides examples of how to migrate an existing datacenter from Apache Kafka® MirrorMaker to Replicator. In these examples, messages are replicated from a specific point in time (the offsets that MirrorMaker last committed), not from the beginning. This is helpful if you have a large number of legacy messages that you do not want to migrate.

Assume there are two datacenters, DC1 (Active) and DC2 (Passive), each running a Kafka cluster. DC1 has a single topic, inventory, which has been replicated to DC2 under the same topic name.

Example 1: Same number of partitions in DC1 and DC2

In this example, you migrate from MirrorMaker to Replicator and keep the same number of partitions for inventory in DC1 and DC2.

Prerequisites:

Confluent Platform 5.0.0 or later is installed.
You must have the same number of partitions for inventory in DC1 and DC2 to use this method.
The src.consumer.group.id in Replicator must match group.id in MirrorMaker.
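
The group ID prerequisite can be checked before you stop MirrorMaker. The following is a minimal sketch, not part of the Replicator tooling: the helper names get_prop and same_group are hypothetical, and it assumes that both settings are stored as plain key=value lines (no spaces around the equals sign) in files that you pass in.

```shell
#!/bin/sh
# Hypothetical helper: print the value of key $2 in properties file $1.
# If the key appears more than once, the last assignment wins.
get_prop() {
  awk -v k="$2=" 'index($0, k) == 1 { v = substr($0, length(k) + 1) } END { print v }' "$1"
}

# Hypothetical helper: succeed only if group.id in the MirrorMaker consumer
# config ($1) is non-empty and equals src.consumer.group.id in the
# Replicator config ($2).
same_group() {
  mm_group="$(get_prop "$1" group.id)"
  rep_group="$(get_prop "$2" src.consumer.group.id)"
  [ -n "$mm_group" ] && [ "$mm_group" = "$rep_group" ]
}
```

For example, `same_group mm_consumer.properties my-replicator.properties && echo "group IDs match"`, where both file names are placeholders for your own files.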

Stop the running MirrorMaker instance in DC1, where <mm pid> is the MirrorMaker process ID:
```
kill <mm pid>
```
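
Plain kill sends SIGTERM, which asks the process to shut down rather than terminating it immediately, so the process can keep running for a short time afterward. If you want to be sure MirrorMaker has exited before you start Replicator, a sketch like the following may help. The function name stop_and_wait and the 30-second default timeout are assumptions for illustration, and the check assumes you run it as the same user that owns the process.

```shell
#!/bin/sh
# Hypothetical helper: send SIGTERM to a process, then poll until it is gone.
# Usage: stop_and_wait <pid> [timeout_seconds]
stop_and_wait() {
  pid="$1"
  timeout="${2:-30}"
  # kill without a signal name sends SIGTERM; ignore the error if the
  # process has already exited.
  kill "$pid" 2>/dev/null || true
  waited=0
  # kill -0 sends no signal; it only tests whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "process $pid still running after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

You would call it as `stop_and_wait <mm pid>` in place of the plain kill command.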

Configure and start Replicator. In this example, Replicator is run as an executable from the command line or from a Docker image.

Add these values to CONFLUENT_HOME/etc/kafka-connect-replicator/replicator_consumer.properties. Replace localhost:9082 with the bootstrap.servers of DC1, the source cluster:
```
bootstrap.servers=localhost:9082
topic.preserve.partitions=true
```
Add this value to CONFLUENT_HOME/etc/kafka-connect-replicator/replicator_producer.properties. Replace localhost:9092 with the bootstrap.servers of DC2, the destination cluster:
```
bootstrap.servers=localhost:9092
```

Ensure the replication factors are set to 2 or 3 for production, if they are not already:

echo "confluent.topic.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
echo "offset.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
echo "config.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
echo "status.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
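
Because >> always appends, running the commands above a second time leaves duplicate keys in the file. If that is a concern, one option is a small helper that replaces an existing key instead of appending. This is a sketch only: set_prop is a hypothetical name, and it assumes plain key=value lines with no spaces around the equals sign.

```shell
#!/bin/sh
# Hypothetical helper: set key $2 to value $3 in properties file $1,
# replacing any existing assignment of that key instead of duplicating it.
set_prop() {
  file="$1"
  key="$2"
  value="$3"
  tmp="$(mktemp)"
  if [ -f "$file" ]; then
    # Keep every line that does not start with "key=" (exact prefix match).
    awk -v k="${key}=" 'index($0, k) != 1' "$file" > "$tmp"
  fi
  echo "${key}=${value}" >> "$tmp"
  # Write back through the original file so its permissions are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_prop ./etc/kafka-connect-replicator/quickstart-replicator.properties offset.storage.replication.factor 3` can be run repeatedly and leaves a single line for that key.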

Start Replicator:

replicator --cluster.id <new-cluster-id> \
--producer.config replicator_producer.properties \
--consumer.config replicator_consumer.properties \
--replication.config ./etc/kafka-connect-replicator/quickstart-replicator.properties
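
The start command refers to three properties files by path, and the producer and consumer files are given as relative paths. As a precaution, you could confirm that each path is readable from the directory where you launch Replicator. The following is an illustrative sketch, and require_files is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report every argument that is not a readable file.
# Returns 0 if all files are readable, 1 otherwise.
require_files() {
  missing=0
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "missing or unreadable: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_files replicator_producer.properties replicator_consumer.properties ./etc/kafka-connect-replicator/quickstart-replicator.properties` checks the same paths that the start command uses.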

Replicator uses the offsets that MirrorMaker committed in DC1 and starts replicating messages from DC1 to DC2 from those offsets.
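
One way to watch the handover is to look at the lag of the shared consumer group on DC1, for example with the kafka-consumer-groups tool and its --describe option. The sketch below sums the LAG column of that output. The function name total_lag is hypothetical, and the script assumes the output has a header row containing a column labeled LAG, which it locates by name because the column order can differ between versions.

```shell
#!/bin/sh
# Hypothetical helper: read consumer group describe output on stdin and
# print the sum of the LAG column.
total_lag() {
  awk '
    # Until the header row is seen, look for the column labeled LAG.
    lag_col == 0 { for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i; next }
    # Data rows: add numeric lag values; non-numeric entries such as "-" are skipped.
    $lag_col ~ /^[0-9]+$/ { sum += $lag_col }
    END { print sum + 0 }
  '
}
```

For example, `kafka-consumer-groups --bootstrap-server localhost:9082 --describe --group <group id> | total_lag`, where <group id> is the group.id that MirrorMaker used. If Replicator has picked up the committed offsets, you would expect this number to shrink over time rather than reset.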

Example 2: Different number of partitions in DC1 and DC2

In this example, you migrate from MirrorMaker to Replicator and have a different number of partitions for inventory in DC1 and DC2.

Prerequisites:

Confluent Platform 5.0.0 or later is installed.
The src.consumer.group.id in Replicator must match group.id in MirrorMaker.

Stop the running MirrorMaker instance in DC1.

Configure and start Replicator. In this example, Replicator is run as an executable from the command line or from a Docker image.

Add these values to CONFLUENT_HOME/etc/kafka-connect-replicator/replicator_consumer.properties. Replace localhost:9082 with the bootstrap.servers of DC1, the source cluster:
```
bootstrap.servers=localhost:9082
topic.preserve.partitions=false
```
Add this value to CONFLUENT_HOME/etc/kafka-connect-replicator/replicator_producer.properties. Replace localhost:9092 with the bootstrap.servers of DC2, the destination cluster:
```
bootstrap.servers=localhost:9092
```
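
The only configuration difference between the two examples is the value of topic.preserve.partitions, which follows from whether inventory has the same number of partitions in DC1 and DC2. That rule can be sketched as follows. The function name preserve_partitions_setting is hypothetical, and the two partition counts are inputs you supply yourself, for example from the output of kafka-topics --describe on each cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the topic.preserve.partitions line to use,
# given the partition count of the topic in DC1 ($1) and in DC2 ($2).
preserve_partitions_setting() {
  if [ "$1" -eq "$2" ]; then
    # Same partition count in both clusters: Example 1.
    echo "topic.preserve.partitions=true"
  else
    # Different partition counts: Example 2.
    echo "topic.preserve.partitions=false"
  fi
}
```

For example, `preserve_partitions_setting 6 12` prints the line used in this example, and `preserve_partitions_setting 6 6` prints the line used in Example 1.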

Ensure the replication factors are set to 2 or 3 for production, if they are not already:

echo "confluent.topic.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
echo "offset.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
echo "config.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
echo "status.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties

Start Replicator:

replicator --cluster.id <new-cluster-id> \
--producer.config replicator_producer.properties \
--consumer.config replicator_consumer.properties \
--replication.config ./etc/kafka-connect-replicator/quickstart-replicator.properties

Replicator uses the offsets that MirrorMaker committed in DC1 and starts replicating messages from DC1 to DC2 from those offsets.

Next steps

Sign up for Confluent Cloud and use the Cloud quick start to get started.
Download Confluent Platform and use the Confluent Platform quick start to get started.