Migrate from MirrorMaker to Replicator

Kafka MirrorMaker is a stand-alone tool for copying data between two Apache Kafka® clusters. It is little more than a Kafka consumer and producer hooked together: data is read from topics in the origin cluster and written to topics with the same names in the destination cluster.

Important

Confluent Replicator is a more complete solution that handles topic configuration as well as data and integrates with Kafka Connect and Control Center to improve availability, scalability and ease of use.

This topic provides examples of how to migrate an existing datacenter deployment from MirrorMaker to Replicator. In these examples, messages are replicated from a specific point in time, not from the beginning. This is helpful if you have a large number of legacy messages that you do not want to migrate.

Assume there are two datacenters, DC1 (Active) and DC2 (Passive), that are each running an Apache Kafka® cluster. There is a single topic in DC1 and it has been replicated to DC2 with the same topic name. The topic name is inventory.

Example 1: Same Number of Partitions in DC1 and DC2

In this example, you migrate from MirrorMaker to Replicator and keep the same number of partitions for inventory in DC1 and DC2.

Prerequisites:
  • Confluent Platform 5.0.0 or later is installed.
  • You must have the same number of partitions for inventory in DC1 and DC2 to use this method.
  • The src.consumer.group.id in Replicator must match group.id in MirrorMaker.

  1. Stop the running MirrorMaker instance in DC1, where <mm pid> is the MirrorMaker process ID:

    kill <mm pid>
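
    If you are not sure of the MirrorMaker process ID or which consumer group it was using, you can find the process with jps and inspect the group's committed offsets in DC1 before stopping it; these are the offsets Replicator resumes from. This optional check assumes MirrorMaker was started with group.id=mirror-maker-group and that DC1 is reachable at localhost:9082:

      # Find the MirrorMaker JVM process ID
      jps -lm | grep MirrorMaker

      # Show the offsets committed in DC1 by MirrorMaker's consumer group
      <path-to-confluent>/bin/kafka-consumer-groups --bootstrap-server localhost:9082 \
        --describe --group mirror-maker-group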
    
  2. Configure and start Replicator. In this example, Replicator is run as an executable from the command line; it can also be run from a Docker image.

    1. Add these values to <path-to-confluent>/etc/kafka-connect-replicator/replicator_consumer.properties. Replace localhost:9082 with the bootstrap.servers of DC1, the source cluster:

      bootstrap.servers=localhost:9082
      topic.preserve.partitions=true
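
    The prerequisite above requires Replicator's src.consumer.group.id to match the group.id that MirrorMaker was using. Because this file configures the consumer that Replicator runs against DC1, one way to satisfy that is to set the group here as well. This is a sketch that assumes MirrorMaker was started with group.id=mirror-maker-group; alternatively, set src.consumer.group.id directly in quickstart-replicator.properties:

      # Assumed MirrorMaker consumer group; replace with your actual group.id
      group.id=mirror-maker-group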
      
    2. Add this value to <path-to-confluent>/etc/kafka-connect-replicator/replicator_producer.properties. Replace localhost:9092 with the bootstrap.servers of DC2, the destination cluster:

      bootstrap.servers=localhost:9092
      
    3. Ensure the replication factors are set to 2 or 3 for production, if they are not already:

      echo "confluent.topic.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "offset.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "config.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "status.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      
    4. Start Replicator:

      <path-to-confluent>/bin/replicator --cluster.id <new-cluster-id> \
      --producer.config <path-to-confluent>/etc/kafka-connect-replicator/replicator_producer.properties \
      --consumer.config <path-to-confluent>/etc/kafka-connect-replicator/replicator_consumer.properties \
      --replication.config <path-to-confluent>/etc/kafka-connect-replicator/quickstart-replicator.properties
      

Replicator will pick up the offsets that MirrorMaker committed in DC1 and start replicating messages from DC1 to DC2 from that point forward.
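
To confirm the cutover, you can produce a test record to inventory in DC1 and watch it appear in DC2. This is an optional check; it assumes the same DC1 (localhost:9082) and DC2 (localhost:9092) addresses used above:

    # Produce a single test record to the origin cluster (DC1)
    echo "post-cutover-test" | <path-to-confluent>/bin/kafka-console-producer \
      --broker-list localhost:9082 --topic inventory

    # Consume new records from the destination cluster (DC2); the test record
    # should arrive once Replicator copies it
    <path-to-confluent>/bin/kafka-console-consumer \
      --bootstrap-server localhost:9092 --topic inventory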

Example 2: Different Number of Partitions in DC1 and DC2

In this example, you migrate from MirrorMaker to Replicator and have a different number of partitions for inventory in DC1 and DC2.

Prerequisites:
  • Confluent Platform 5.0.0 or later is installed.
  • The src.consumer.group.id in Replicator must match group.id in MirrorMaker.

  1. Stop the running MirrorMaker instance in DC1.
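
    As in Example 1, kill the MirrorMaker process, where <mm pid> is the MirrorMaker process ID:

      kill <mm pid>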

  2. Configure and start Replicator. In this example, Replicator is run as an executable from the command line; it can also be run from a Docker image.

    1. Add these values to <path-to-confluent>/etc/kafka-connect-replicator/replicator_consumer.properties. Replace localhost:9082 with the bootstrap.servers of DC1, the source cluster:

      bootstrap.servers=localhost:9082
      topic.preserve.partitions=false
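
    With topic.preserve.partitions=false, records are re-partitioned by the destination producer rather than kept on the same partition number, which is why inventory can have a different partition count in DC2. To compare the partition counts, on Kafka versions whose kafka-topics command supports --bootstrap-server (Apache Kafka 2.2 and later), you can run:

      # Partition count of inventory in DC1 (origin)
      <path-to-confluent>/bin/kafka-topics --bootstrap-server localhost:9082 \
        --describe --topic inventory

      # Partition count of inventory in DC2 (destination)
      <path-to-confluent>/bin/kafka-topics --bootstrap-server localhost:9092 \
        --describe --topic inventory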
      
    2. Add this value to <path-to-confluent>/etc/kafka-connect-replicator/replicator_producer.properties. Replace localhost:9092 with the bootstrap.servers of DC2, the destination cluster:

      bootstrap.servers=localhost:9092
      
    3. Ensure the replication factors are set to 2 or 3 for production, if they are not already:

      echo "confluent.topic.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "offset.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "config.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "status.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      
    4. Start Replicator:

      <path-to-confluent>/bin/replicator --cluster.id <new-cluster-id> \
      --producer.config <path-to-confluent>/etc/kafka-connect-replicator/replicator_producer.properties \
      --consumer.config <path-to-confluent>/etc/kafka-connect-replicator/replicator_consumer.properties \
      --replication.config <path-to-confluent>/etc/kafka-connect-replicator/quickstart-replicator.properties
      

Replicator will pick up the offsets that MirrorMaker committed in DC1 and start replicating messages from DC1 to DC2 from that point forward.
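
As an optional check, if Replicator is configured to commit consumer offsets back to the origin cluster (controlled by the offset.topic.commit setting), it continues to report progress under the same consumer group in DC1, so you can watch the group's lag shrink as replication catches up. This assumes the group is mirror-maker-group and DC1 is reachable at localhost:9082:

    <path-to-confluent>/bin/kafka-consumer-groups --bootstrap-server localhost:9082 \
      --describe --group mirror-maker-group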