.. _schemaregistry_mirroring:

|sr| Multi Datacenter Setup
===============================

Spanning multiple datacenters (DCs) with your |sr-long| synchronizes data across sites, further protects against
data loss, and reduces latency. The recommended multi-datacenter deployment designates one datacenter as "primary"
and all others as "secondary". If the "primary" datacenter fails and is unrecoverable, you must manually designate
a previously "secondary" datacenter as the new "primary", following the steps in the Run Books below.

|ak| Election
--------------

Recommended Deployment
^^^^^^^^^^^^^^^^^^^^^^

.. figure:: ../images/multi-dc-setup-kafka.png
   :align: center

   Multi datacenter with |ak| based primary election

The image above shows two datacenters: DC A and DC B. Either could be on-premises, in `Confluent Cloud `__, or part
of a :ref:`bridge to cloud ` solution. Each of the two datacenters has its own |ak-tm| cluster, |zk| cluster, and
|sr|. The |sr| nodes in both datacenters link to the primary |ak| cluster in DC A, and the secondary datacenter
(DC B) forwards |sr| writes to the primary (DC A). Note that |sr| nodes and hostnames must be addressable and
routable across the two sites to support this configuration. |sr| instances in DC B have ``master.eligibility``
set to false, meaning that none of them can be elected primary during steady-state operation with both datacenters
online.

To protect against complete loss of DC A, |ak| cluster A (the source) is replicated to |ak| cluster B (the target).
This is achieved by running the :ref:`Replicator` local to the target cluster (DC B). In this active-passive setup,
|crep| runs in one direction, copying |ak| data and configurations from the active DC A to the passive DC B.

Producers write data to just the active cluster. Depending on the overall design, consumers can read data from the
active cluster only, leaving the passive cluster for disaster recovery, or from both clusters to optimize reads on
a geo-local cache. In the event of a partial or complete disaster in one datacenter, applications can fail over to
the secondary datacenter.

Important Settings
^^^^^^^^^^^^^^^^^^

``kafkastore.bootstrap.servers``
    This should point to the primary |ak| cluster (DC A in this example).

``schema.registry.group.id``
    Use this setting to override the ``group.id`` of the |ak| group used for primary election. Without this
    configuration, ``group.id`` defaults to "schema-registry". If you want to run more than one |sr| cluster
    against a single |ak| cluster, make this setting unique for each |sr| cluster.

``master.eligibility``
    A |sr| server with ``master.eligibility`` set to false is guaranteed to remain a secondary during any primary
    election. |sr| instances in a "secondary" datacenter should have this set to false, and |sr| instances local
    to the shared (primary) |ak| cluster should have this set to true. Hostnames must be reachable and resolvable
    across datacenters to support forwarding of new schemas from DC B to DC A.

Setup
^^^^^

Assuming you have |sr| running, here are the recommended steps to add |sr| instances in a new "secondary"
datacenter (call it DC B):

#. In DC B, make sure |ak| has ``unclean.leader.election.enable`` set to false.
#. In DC B, run |crep| with |ak| in the "primary" datacenter (DC A) as the source and |ak| in DC B as the target.
#. In |sr| config files in DC B, set ``kafkastore.bootstrap.servers`` to point to the |ak| cluster in DC A and set
   ``master.eligibility`` to false. A sketch of such a config is shown after these steps.
#. Start your new |sr| instances with these configs.
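For reference, a DC B |sr| properties file for this |ak| based setup might look like the following minimal sketch.
The broker hostnames, the listener port, and the group ID shown here are placeholders; substitute the values for
your environment.

.. code-block:: properties

   # Listener for this Schema Registry instance (placeholder port)
   listeners=http://0.0.0.0:8081

   # Point at the primary Kafka cluster in DC A, not the local cluster in DC B (placeholder hostnames)
   kafkastore.bootstrap.servers=PLAINTEXT://kafka-dc-a-1:9092,PLAINTEXT://kafka-dc-a-2:9092

   # Optional: a unique group ID if multiple Schema Registry clusters share this Kafka cluster
   schema.registry.group.id=schema-registry-dc-b

   # Instances in the secondary datacenter must never be elected primary
   master.eligibility=false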
Run Book
^^^^^^^^

Let's say you have |sr| running in multiple datacenters, and you lose your "primary" datacenter; what do you do?

First, note that the remaining |sr| instances running in the "secondary" datacenter can continue to serve any
request that does not result in a write to |ak|. This includes GET requests on existing IDs and POST requests on
schemas already in the registry. They will be unable to register new schemas.

- If possible, revive the "primary" datacenter by starting |ak| and |sr| as before.
- If you must designate a new datacenter (call it DC B) as "primary", reconfigure ``kafkastore.bootstrap.servers``
  in DC B to point to its local |ak| cluster and update the |sr| config files to set ``master.eligibility`` to
  true.
- Restart your |sr| instances with these new configs in a rolling fashion.

|zk| Election
------------------

.. _zookeeper-deployment:

Recommended Deployment
^^^^^^^^^^^^^^^^^^^^^^

.. figure:: ../images/multi-dc-setup.png
   :align: center

   Multi datacenter with |zk| based primary election

The image above shows two datacenters: DC A and DC B. Each of the two datacenters has its own |zk| cluster, |ak|
cluster, and |sr| cluster. Both |sr| clusters link to |ak| and |zk| in DC A, and the secondary datacenter (DC B)
forwards |sr| writes to the primary (DC A). The |sr| nodes and hostnames must be addressable and routable across
the two sites to support this configuration. The |sr| instances in DC B have ``master.eligibility`` set to false,
meaning that none of them can ever be elected primary.

In this active-passive setup, |crep| runs in one direction, copying |ak| data and configurations from the active
DC A to the passive DC B. To protect against complete loss of DC A, |ak| cluster A (the source) is replicated to
|ak| cluster B (the target). This is achieved by running the :ref:`Replicator` local to the target cluster. In the
event of a partial or complete disaster in one datacenter, applications can fail over to the secondary datacenter.

.. _zookeeper-settings:

Important Settings
^^^^^^^^^^^^^^^^^^

``kafkastore.connection.url``
    This setting should be identical across all |sr| nodes. By sharing it, all |sr| instances point to the same
    |zk| cluster.

``schema.registry.zk.namespace``
    Namespace under which |sr| related metadata is stored in |zk|. This setting should be identical across all
    nodes in the same |sr| cluster.

``master.eligibility``
    A |sr| server with ``master.eligibility`` set to false is guaranteed to remain a secondary during any primary
    election. |sr| instances in a "secondary" datacenter should have this set to false, and |sr| instances local
    to the shared |ak| cluster should have this set to true. Hostnames must be reachable and resolvable across
    datacenters to support forwarding of new schemas from DC B to DC A.

.. _zookeeper-setup:

Setup
^^^^^

Assuming you have |sr| running, here are the recommended steps to add |sr| instances in a new "secondary"
datacenter (call it DC B):

#. In DC B, make sure |ak| has ``unclean.leader.election.enable`` set to false.
#. In DC B, run |crep| with |ak| in the "primary" datacenter (DC A) as the source and |ak| in DC B as the target.
#. In |sr| config files in DC B, set ``kafkastore.connection.url`` and ``schema.registry.zk.namespace`` to match
   the instances already running, and set ``master.eligibility`` to false. A sketch of such a config is shown
   after these steps.
#. Start your new |sr| instances with these configs.
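As with the |ak| based setup, here is a minimal sketch of what a DC B |sr| properties file for |zk| based election
might look like. The |zk| hostnames, listener port, and namespace value are placeholders; use the same connection
URL and namespace as your existing |sr| instances.

.. code-block:: properties

   # Listener for this Schema Registry instance (placeholder port)
   listeners=http://0.0.0.0:8081

   # Shared ZooKeeper ensemble in DC A; must match the existing Schema Registry instances (placeholder hostnames)
   kafkastore.connection.url=zk-dc-a-1:2181,zk-dc-a-2:2181,zk-dc-a-3:2181

   # Must match the namespace used by the existing Schema Registry cluster (placeholder value)
   schema.registry.zk.namespace=schema_registry

   # Instances in the secondary datacenter must never be elected primary
   master.eligibility=false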
.. _zookeeper-run-book:

Run Book
^^^^^^^^

Let's say you have |sr| running in multiple datacenters, and you have lost your "primary" datacenter; what do you
do?

First, note that the remaining |sr| instances can continue to serve any request that does not result in a write to
|ak|. This includes GET requests on existing IDs and POST requests on schemas already in the registry.

- If possible, revive the "primary" datacenter by starting |ak| and |sr| as before.
- If you must designate a new datacenter (call it DC B) as "primary", update the |sr| config files so that
  ``kafkastore.connection.url`` points to the local |zk|, and change ``master.eligibility`` to true. Then restart
  your |sr| instances with these new configs in a rolling fashion.

Suggested Reading
-----------------

- For information about multi-cluster and multi-datacenter deployments in general, see :ref:`multi_dc`.
- For a broader explanation of disaster recovery design configurations and use cases, see the whitepaper on
  `Disaster Recovery for Multi-Datacenter Apache Kafka Deployments `_.
- For an overview of schema management in |cp|, including details of the single-primary architecture, see
  :ref:`schemaregistry_intro` and :ref:`sr-high-availability-single-primary`.
- For information on optimizing a single datacenter for high availability, see :ref:`schemaregistry_single-dc`.
- |sr| is also available in |ccloud|; for details on how to lift and shift or extend existing clusters to cloud,
  see :ref:`schemaregistry_migrate`.