|sr| Single and Multi-Datacenter Deployments
============================================

.. _schemaregistry_single-dc:

Single Datacenter Setup
-----------------------

Within a single datacenter or location, a multi-node, multi-broker cluster provides |ak| data replication across the nodes. Producers write data to, and consumers read data from, topic partition leaders. Leaders replicate data to followers so that messages are copied to more than one broker.

You can configure parameters on producers and consumers to optimize your single cluster deployment for various goals, including message durability and high availability. |ak| :ref:`producers can set the acks configuration parameter` to control when a write is considered successful. For example, setting producers to ``acks=all`` requires all in-sync replicas to acknowledge the write before the leader broker responds to the producer. If a leader broker fails, the |ak| cluster recovers when a follower broker is elected leader, and client applications can continue to write and read messages through the new leader.

-------------
|ak| Election
-------------

Recommended Deployment
^^^^^^^^^^^^^^^^^^^^^^

.. figure:: ../images/single-dc-setup.png
   :align: center

   Single datacenter with |ak| intra-cluster replication

The image above shows a single datacenter, DC A. For this example, |ak| is used for primary election, which is recommended.

.. note:: You can also set up a single cluster with |zk|, but this configuration is deprecated in favor of |ak| leader election.

Important Settings
^^^^^^^^^^^^^^^^^^

``kafkastore.bootstrap.servers``
   This should point to the primary |ak| cluster (DC A in this example).

``schema.registry.group.id``
   ``schema.registry.group.id`` is used as the consumer ``group.id``. For a single datacenter setup, make this setting the same for all nodes in the cluster. When set, ``schema.registry.group.id`` overrides ``group.id`` for the |ak| group when |ak| is used for primary election. (Without this configuration, ``group.id`` defaults to "schema-registry".)

``master.eligibility``
   In a single datacenter setup, all |sr| instances are local to the |ak| cluster and should have ``master.eligibility`` set to true.
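For reference, here is a minimal sketch of what these settings might look like in a |sr| properties file. This is an illustration only; the broker hostnames and the group ID value are hypothetical placeholders, not defaults.

.. code-block:: properties

   # Hypothetical single-datacenter setup (DC A); hostnames are placeholders.
   # Every Schema Registry node in the cluster uses the same values.

   # Point at the local (and only) Kafka cluster.
   kafkastore.bootstrap.servers=PLAINTEXT://kafka-a-1:9092,PLAINTEXT://kafka-a-2:9092

   # Shared by all nodes so that they form one Schema Registry cluster;
   # overrides the default Kafka group.id of "schema-registry".
   schema.registry.group.id=schema-registry-dc-a

   # In a single datacenter, every node may be elected primary.
   master.eligibility=true

Because the settings are identical on every node, any instance can win the primary election, and the rest act as followers.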
Run Book
^^^^^^^^

Let's say you have |sr| running in a single datacenter, and the primary node goes down; what do you do? First, note that the remaining |sr| instances can continue to serve requests.

- If one |sr| node goes down, another node is elected primary and the cluster auto-recovers.
- Restart the failed node, and it will come back as a follower (since a new primary was elected in the meantime).

.. _schemaregistry_mirroring:

Multi-Datacenter Setup
----------------------

Spanning multiple datacenters (DCs) with your |sr-long| synchronizes data across sites, further protects against data loss, and reduces latency. The recommended multi-datacenter deployment designates one datacenter as "primary" and all others as "secondary". If the "primary" datacenter fails and is unrecoverable, you must manually designate what was previously a "secondary" datacenter as the new "primary", per the steps in the Run Books below.

-------------
|ak| Election
-------------

Recommended Deployment
^^^^^^^^^^^^^^^^^^^^^^

.. figure:: ../images/multi-dc-setup-kafka.png
   :align: center

   Multi datacenter with |ak| based primary election

The image above shows two datacenters: DC A and DC B. Either could be on-premises, in `Confluent Cloud <https://www.confluent.io/confluent-cloud/>`__, or part of a :ref:`bridge to cloud` solution.

Each of the two datacenters has its own |ak-tm| cluster, |zk| cluster, and |sr|. The |sr| nodes in both datacenters link to the primary |ak| cluster in DC A, and the secondary datacenter (DC B) forwards |sr| writes to the primary (DC A). Note that |sr| nodes and hostnames must be addressable and routable across the two sites to support this configuration. |sr| instances in DC B have ``master.eligibility`` set to false, meaning that none can be elected primary during steady-state operation with both datacenters online.

To protect against complete loss of DC A, |ak| cluster A (the source) is replicated to |ak| cluster B (the target). This is achieved by running the :ref:`Replicator` local to the target cluster (DC B). In this active-passive setup, |crep| runs in one direction, copying |ak| data and configurations from the active DC A to the passive DC B. Since the |sr| instances in both datacenters point to the internal ``_schemas`` topic in DC A, there is no need to replicate the :ref:`internal schemas topic` itself.

Producers write data to just the active cluster. Depending on the overall design, consumers can read data from the active cluster only, leaving the passive cluster for disaster recovery, or from both clusters to optimize reads on a geo-local cache. In the event of a partial or complete disaster in one datacenter, applications can fail over to the secondary datacenter.

Important Settings
^^^^^^^^^^^^^^^^^^

``kafkastore.bootstrap.servers``
   This should point to the primary |ak| cluster (DC A in this example).

``schema.registry.group.id``
   Use this setting to override the ``group.id`` for the |ak| group used when |ak| is used for primary election. Without this configuration, ``group.id`` defaults to "schema-registry". If you want to run more than one |sr| cluster against a single |ak| cluster, you should make this setting unique for each cluster.

``master.eligibility``
   A |sr| server with ``master.eligibility`` set to false is guaranteed to remain a secondary during any primary election. |sr| instances in a "secondary" datacenter should have this set to false, and |sr| instances local to the shared |ak| (primary) cluster should have this set to true. Hostnames must be reachable and resolve across datacenters to support forwarding of new schemas from DC B to DC A.
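To make the split between the two sites concrete, here is a minimal sketch of the configuration for a |sr| node in DC B. The broker hostnames are hypothetical placeholders; the key points are that DC B instances bootstrap against the primary |ak| cluster in DC A and are not eligible for election.

.. code-block:: properties

   # Hypothetical Schema Registry node in DC B; hostnames are placeholders.

   # Point at the primary Kafka cluster in DC A, not the local DC B cluster.
   kafkastore.bootstrap.servers=PLAINTEXT://kafka-a-1:9092,PLAINTEXT://kafka-a-2:9092

   # Same value as on the DC A nodes: the instances in both datacenters
   # form a single Schema Registry cluster.
   schema.registry.group.id=schema-registry

   # Secondary datacenter: never elected primary while DC A is healthy.
   master.eligibility=false

A node in DC A would use the same ``kafkastore.bootstrap.servers`` and ``schema.registry.group.id`` but set ``master.eligibility=true``.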
Setup
^^^^^

Assuming you have |sr| running, here are the recommended steps to add |sr| instances in a new "secondary" datacenter (call it DC B):

#. In DC B, make sure |ak| has ``unclean.leader.election.enable`` set to false.
#. In DC B, run |crep| with |ak| in the "primary" datacenter (DC A) as the source and |ak| in DC B as the target.
#. In |sr| config files in DC B, set ``kafkastore.bootstrap.servers`` to point to the |ak| cluster in DC A and set ``master.eligibility`` to false.
#. Start your new |sr| instances with these configs.

Run Book
^^^^^^^^

Let's say you have |sr| running in multiple datacenters, and you lose your "primary" datacenter; what do you do? First, note that the remaining |sr| instances running in the "secondary" datacenter can continue to serve any request that does not result in a write to |ak|. This includes GET requests on existing IDs and POST requests on schemas already in the registry. They will be unable to register new schemas.

- If possible, revive the "primary" datacenter by starting |ak| and |sr| as before.
- If you must designate a new datacenter (call it DC B) as "primary", reconfigure ``kafkastore.bootstrap.servers`` in DC B to point to its local |ak| cluster and update the |sr| config files to set ``master.eligibility`` to true.
- Restart your |sr| instances with these new configs in a rolling fashion.

-------------
|zk| Election
-------------

.. _zookeeper-deployment:

Alternative Deployment
^^^^^^^^^^^^^^^^^^^^^^

.. important:: |zk| leader election is deprecated. |ak| leader election is recommended for multi-cluster deployments.

As an alternative to |ak| leader election, you can use |zk| leader election. This setup also has two datacenters: DC A and DC B. Each of the two datacenters has its own |zk| cluster, |ak| cluster, and |sr| cluster. Both |sr| clusters link to |ak| and |zk| in DC A, and the secondary datacenter (DC B) forwards |sr| writes to the primary (DC A). The |sr| nodes and hostnames must be addressable and routable across the two sites to support this configuration. The |sr| instances in DC B have ``master.eligibility`` set to false, meaning that none can ever be elected primary.

In this active-passive setup, |crep| runs in one direction, copying |ak| data and configurations from the active DC A to the passive DC B. To protect against complete loss of DC A, |ak| cluster A (the source) is replicated to |ak| cluster B (the target). This is achieved by running the :ref:`Replicator` local to the target cluster. In the event of a partial or complete disaster in one datacenter, applications can fail over to the secondary datacenter.

.. _zookeeper-settings:

Important Settings
^^^^^^^^^^^^^^^^^^

``kafkastore.connection.url``
   ``kafkastore.connection.url`` should be identical across all |sr| nodes. By sharing this setting, all |sr| instances will point to the same |zk| cluster.

``schema.registry.zk.namespace``
   Namespace under which |sr| related metadata is stored in |zk|. This setting should be identical across all nodes in the same |sr| cluster.

``master.eligibility``
   A |sr| server with ``master.eligibility`` set to false is guaranteed to remain a secondary during any primary election. |sr| instances in a "secondary" datacenter should have this set to false, and |sr| instances local to the shared |ak| cluster should have this set to true. Hostnames must be reachable and resolve across datacenters to support forwarding of new schemas from DC B to DC A.
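The corresponding sketch for a DC B node under |zk| leader election looks like this. Again, the |zk| connection string and the namespace value are hypothetical placeholders chosen for this example.

.. code-block:: properties

   # Hypothetical Schema Registry node in DC B with ZooKeeper leader election
   # (deprecated); hostnames and namespace are placeholders.

   # Identical on all nodes: every instance points at the ZooKeeper cluster in DC A.
   kafkastore.connection.url=zk-a-1:2181,zk-a-2:2181,zk-a-3:2181

   # Identical on all nodes in the same Schema Registry cluster.
   schema.registry.zk.namespace=schema_registry

   # Secondary datacenter: never eligible for primary election.
   master.eligibility=false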
.. _zookeeper-setup:

Setup
^^^^^

Assuming you have |sr| running, here are the recommended steps to add |sr| instances in a new "secondary" datacenter (call it DC B):

#. In DC B, make sure |ak| has ``unclean.leader.election.enable`` set to false.
#. In DC B, run |crep| with |ak| in the "primary" datacenter (DC A) as the source and |ak| in DC B as the target.
#. In |sr| config files in DC B, set ``kafkastore.connection.url`` and ``schema.registry.zk.namespace`` to match the instances already running, and set ``master.eligibility`` to false.
#. Start your new |sr| instances with these configs.

.. _zookeeper-run-book:

Run Book
^^^^^^^^

Let's say you have |sr| running in multiple datacenters, and you have lost your "primary" datacenter; what do you do? First, note that the remaining |sr| instances will continue to be able to serve any request that does not result in a write to |ak|. This includes GET requests on existing IDs and POST requests on schemas already in the registry.

- If possible, revive the "primary" datacenter by starting |ak| and |sr| as before.
- If you must designate a new datacenter (call it DC B) as "primary", update the |sr| config files so that ``kafkastore.connection.url`` points to the local |zk|, and change ``master.eligibility`` to true. Then restart your |sr| instances with these new configs in a rolling fashion.

Suggested Reading
-----------------

- For information about multi-cluster and multi-datacenter deployments in general, see :ref:`multi_dc`.
- For a broader explanation of disaster recovery design configurations and use cases, see the whitepaper on `Disaster Recovery for Multi-Datacenter Apache Kafka Deployments <https://www.confluent.io/white-paper/disaster-recovery-for-multi-datacenter-apache-kafka-deployments/>`_.
- For an overview of schema management in |cp|, including details of the single-primary architecture, see :ref:`schemaregistry_intro` and :ref:`sr-high-availability-single-primary`.
- |sr| is also available in |ccloud|; for details on how to lift and shift or extend existing clusters to cloud, see :ref:`schemaregistry_migrate`.