.. _schemaregistry_migrate:

Migrate Schemas
===============

Starting with |cp| 5.2.0, you can use :ref:`connect_replicator` to migrate schemas stored in one |sr-long| to another cluster, which is either self-managed or in `Confluent Cloud `__.

This page primarily covers migrating schemas on self-managed clusters to |ccloud|, but the first short section links to a full demo of how to migrate schemas across self-managed clusters. The "Suggested Reading" links at the end include |sr| configuration options and may be useful for both scenarios.

.. tip:: Currently, when using Confluent |crep| to migrate schemas, |ccloud| is not supported as the source cluster. |ccloud| can only be the destination cluster. As an alternative, you can migrate schemas using the :ref:`REST API for Schema Registry <schemaregistry_api>` to achieve the desired deployments. Specifics regarding |ccloud| limits on schemas and managing storage space are described in the APIs reference under :ref:`sr-in-cloud-manage-space`.

.. include:: ../../multi-dc-deployments/replicator/includes/kafka-replicator-ccloud-compat.rst

Migrate Schemas Across Self-Managed Clusters
--------------------------------------------

You can also set up continuous migration or a one-time, lift-and-shift migration of schemas across self-managed (on-premises) clusters. For a demo showing how to migrate schemas from one self-managed cluster to another, see :ref:`quickstart-demos-replicator-schema-translation`.

.. _migrate_self_managed_schemas_to_cloud:

Migrate Schemas to |ccloud|
---------------------------

`Confluent Cloud `__ is a fully managed streaming data service based on |cp|. Just as you can "lift and shift" or "extend to cloud" your Kafka applications from self-managed Kafka to |ccloud|, you can do the same with |sr-long|.
If you already use |sr| to manage schemas for Kafka applications, and want to move some or all of that data and schema management to the cloud, you can use :ref:`connect_replicator` to migrate your existing schemas to |sr-ccloud|. (See :ref:`connect_replicator` and :ref:`replicator_executable`.) You can set up continuous migration of |sr| to maintain a hybrid deployment (extend to cloud), or lift and shift everything to |ccloud| using a one-time migration.

--------------------
Continuous Migration
--------------------

For continuous migration, you can use your self-managed |sr| as a primary and |sr-ccloud| as a secondary (extend to cloud, also known as bridge to cloud). New schemas are registered directly to the self-managed |sr| (origin), and |crep-full| continuously copies schemas from it to |sr-ccloud| (destination), which is set to IMPORT mode.

.. figure:: ../../images/sr-extend-to-cloud.png
   :align: center

------------------
One-time Migration
------------------

Choose a one-time migration to move all data to a fully managed |ccloud| service (lift and shift). In this case, you migrate your existing self-managed |sr| to |sr-ccloud|, which becomes the primary. All new schemas are registered to |sr-ccloud| and stored in the centralized |ak| cluster. In this scenario, there is no migration from |sr-ccloud| back to the self-managed |sr|.

.. figure:: ../../images/sr-lift-and-shift.png
   :align: center

------------------
Topics and Schemas
------------------

Schemas are associated with |ak| topics, organized under subjects in |sr|. (See :ref:`schema_registry_terminology`.) The quick start below describes how to migrate |sr| and the schemas it contains, but not |ak| topics.

For a continuous migration (extend to cloud), you need only do a schema migration, since your topics continue to live in the primary, self-managed cluster.
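In both scenarios, |sr| modes (IMPORT, READONLY, READWRITE) are switched through the REST API, as shown in the quick start below. As a rough, non-authoritative sketch of that call (the endpoint is a placeholder and the helper name is hypothetical; only the Python standard library is used), the mode-change request can be assembled like this:

.. code:: python

   import json

   # Hypothetical helper: build the URL and JSON body for a Schema Registry
   # mode change (IMPORT, READONLY, or READWRITE). It only assembles the
   # request; sending it (for example with curl or urllib.request) and
   # supplying credentials is up to the caller.
   def mode_change_request(sr_endpoint, mode):
       allowed = ("IMPORT", "READONLY", "READWRITE")
       if mode not in allowed:
           raise ValueError("unsupported mode: %s" % mode)
       url = sr_endpoint.rstrip("/") + "/mode"
       body = json.dumps({"mode": mode})
       return url, body

   url, body = mode_change_request("https://example-sr.confluent.cloud", "IMPORT")
   print(url)
   print(body)

The same request shape applies to a self-managed |sr| and to |sr-ccloud|; only the endpoint and authentication differ.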
For a one-time migration (lift and shift), you must follow schema migration with topic migration, using :ref:`Replicator <connect_replicator>` to migrate your topics to the |ccloud| cluster, as mentioned in :ref:`sr_next_steps_topics` after the quick start.

.. tip::

   .. include:: ../../includes/replicator-topic-rename.rst

   The property ``topic.rename.format`` is described in :ref:`rep-destination-topics` under :ref:`connect_replicator_config_options` for |crep|.

-----------
Quick Start
-----------

The quick start describes how to perform a |sr| migration applicable to any type of deployment (from on-premises servers or data centers to |sr-ccloud|). The examples also serve as a primer you can use to learn the basics of migrating schemas from a local cluster to |ccloud|.

.. include:: ../../includes/install-cli-prereqs.rst

Before You Begin
^^^^^^^^^^^^^^^^

If you are new to |cp|, consider first working through these quick starts and tutorials to get a baseline understanding of the platform (including the role of producers, consumers, and brokers), |ccloud|, and |sr|. Experience with these workflows will give you better context for schema migration.

- :ref:`ce-quickstart`
- `Quick Start for Apache Kafka using Confluent Cloud `__
- :ref:`schema_registry_tutorial`

Before you begin schema migration, verify that you have:

- `Access to Confluent Cloud `_ to serve as the destination |sr|
- A local install of |cp| (for example, from a :ref:`ce-quickstart` download), or another cluster to serve as the origin |sr|

Schema migration requires that you configure and run |crep|. If you need more information than is included in the examples here, refer to the :ref:`replicator tutorial <replicator_quickstart>`.

Migrate Schemas
^^^^^^^^^^^^^^^

To migrate |sr| and associated schemas to |ccloud|, follow these steps:

#. Start the origin cluster.
   If you are running a local cluster (for example, from a :ref:`ce-quickstart` download), start only |sr| for the purposes of this tutorial using the |confluent-cli| :ref:`confluent_local` commands.

   .. code:: bash

      <path-to-confluent>/bin/confluent local start schema-registry

   .. tip:: The examples here show how to use a |crep| worker in *standalone mode* for schema migration. In this mode, you cannot run |kconnect-long| and |crep| at the same time, because |crep| also runs |kconnect|. If you run |crep| in *distributed mode*, the setup is different and you do not have this limitation (you can use ``./bin/confluent local start``). For more about configuring and running |kconnect| workers (including |crep|) in standalone and distributed modes, see :ref:`Running Workers <connect_userguide>` in the Connect guide.

#. Verify that ``schema-registry``, ``kafka``, and ``zookeeper`` are running. For example, run ``<path-to-confluent>/bin/confluent local status``:

   ::

      schema-registry is [UP]
      kafka is [UP]
      zookeeper is [UP]

#. Verify that no subjects exist on the destination |sr| in |ccloud|.

   .. code:: bash

      curl -u <SR_API_KEY>:<SR_API_SECRET> https://<schema-registry-url>/subjects

   If no subjects exist, your output will be empty (``[]``), which is what you want. If subjects exist, delete them. For example:

   .. code:: bash

      curl -X DELETE -u <SR_API_KEY>:<SR_API_SECRET> https://<schema-registry-url>/subjects/my-existing-subject

#. Set the destination |sr| to IMPORT mode. For example:

   .. code:: bash

      curl -u <SR_API_KEY>:<SR_API_SECRET> -X PUT -H "Content-Type: application/json" "https://<schema-registry-url>/mode" --data '{"mode": "IMPORT"}'

   .. tip:: If subjects exist on the destination |sr|, the import will fail with a message similar to this: ``{"error_code":42205,"message":"Cannot import since found existing subjects"}``

#. Configure a |crep| worker to specify the addresses of brokers in the destination cluster, as described in :ref:`config-and-run-replicator`.

   .. tip:: |crep| in |cp| 5.2.0 and newer supports |sr| migration.

   The worker configuration file is in ``<path-to-confluent>/etc/kafka/connect-standalone.properties``.

   ::

      # Connect Standalone Worker configuration
      bootstrap.servers=<cloud-bootstrap-server>:9092

#. Configure :ref:`Replicator <connect_replicator>` with |sr| and destination cluster information. For a standalone |kconnect| instance, configure the following properties in ``etc/kafka-connect-replicator/quickstart-replicator.properties``:

   ::

      # basic connector configuration
      name=replicator-source
      connector.class=io.confluent.connect.replicator.ReplicatorSourceConnector
      key.converter=io.confluent.connect.replicator.util.ByteArrayConverter
      value.converter=io.confluent.connect.replicator.util.ByteArrayConverter
      header.converter=io.confluent.connect.replicator.util.ByteArrayConverter
      tasks.max=4

      # source cluster connection info
      src.kafka.bootstrap.servers=localhost:9092

      # destination cluster connection info
      dest.kafka.ssl.endpoint.identification.algorithm=https
      dest.kafka.sasl.mechanism=PLAIN
      dest.kafka.request.timeout.ms=20000
      dest.kafka.bootstrap.servers=<cloud-bootstrap-server>:9092
      retry.backoff.ms=500
      dest.kafka.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<api-key>" password="<api-secret>";
      dest.kafka.security.protocol=SASL_SSL

      # Schema Registry migration topics to replicate from source to destination
      # topic.whitelist indicates which topics are of interest to replicator
      topic.whitelist=_schemas
      # schema.registry.topic indicates which of the topics in the whitelist contains schemas
      schema.registry.topic=_schemas

      # Connection settings for destination Confluent Cloud Schema Registry
      schema.registry.url=https://<schema-registry-url>
      schema.registry.client.basic.auth.credentials.source=USER_INFO
      schema.registry.client.basic.auth.user.info=<SR_API_KEY>:<SR_API_SECRET>

#. In ``quickstart-replicator.properties``, the replication factor is set to ``1`` for demo purposes on a development cluster with one broker. For this schema migration tutorial, and in production, change this to at least ``3``:

   ::

      confluent.topic.replication.factor=3

   .. seealso:: For an example of a JSON configuration for |crep| in distributed mode, see :devx-examples:`submit_replicator_schema_migration_config.sh|ccloud/connectors/submit_replicator_schema_migration_config.sh` in the GitHub `examples repository `_.

#. Start |crep| so that it can perform the schema migration. For example:

   .. code:: bash

      <path-to-confluent>/bin/connect-standalone <path-to-confluent>/etc/kafka/connect-standalone.properties \
        <path-to-confluent>/etc/kafka-connect-replicator/quickstart-replicator.properties

   The method or commands you use to start |crep| depend on your application setup, and may differ from this example. For more information, see :ref:`replicator_run` and :ref:`config-and-run-replicator`.

#. Stop all producers that are producing to Kafka.

#. Wait until the replication lag is 0. For more information, see :ref:`monitor-replicator-lag`.

#. Stop |crep|.

#. Enable mode changes in the self-managed source |sr| properties file by adding the following to the configuration and restarting:

   ::

      mode.mutability=true

   .. important:: Modes are only supported starting with version 5.2 of |sr|. This step and the one following (setting the source |sr| to READONLY) are precautionary and not strictly necessary. If using version ``5.1`` of |sr| or earlier, you can skip these two steps if you make certain to stop all producers so that no further schemas are registered in the source |sr|.

#. Set the source |sr| to READONLY mode.

   .. code:: bash

      curl -u <SR_API_KEY>:<SR_API_SECRET> -X PUT -H "Content-Type: application/json" "https://<schema-registry-url>/mode" --data '{"mode": "READONLY"}'

#. Set the destination |sr| to READWRITE mode.

   .. code:: bash

      curl -u <SR_API_KEY>:<SR_API_SECRET> -X PUT -H "Content-Type: application/json" "https://<schema-registry-url>/mode" --data '{"mode": "READWRITE"}'

#. Stop all consumers.

#. Configure all consumers to point to the destination |sr| in the cloud and restart them.
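   The client-side change is typically limited to the |sr| connection properties. As a sketch, a consumer properties file pointing at the destination |sr-ccloud| might include entries like the following (the endpoint and credentials are placeholders, not real values):

   ::

      # Destination (Confluent Cloud) Schema Registry connection for clients
      schema.registry.url=https://<schema-registry-url>
      basic.auth.credentials.source=USER_INFO
      basic.auth.user.info=<SR_API_KEY>:<SR_API_SECRET>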
   For example, if you are configuring |sr| in a Java client, change the |sr| URL from source to destination, either in the code or in a properties file that specifies the |sr| URL, the type of authentication (``USER_INFO``), and credentials. For more examples, see :ref:`sr-tutorial-java-consumers`.

#. Configure all producers to point to the destination |sr| in the cloud and restart them. For more examples, see :ref:`sr-tutorial-java-producers`.

#. (Optional) Stop the source |sr|.

.. _sr_next_steps_topics:

Next Steps
^^^^^^^^^^

- If you are extending to cloud (hybrid deployment) with continuous migration from a primary self-managed cluster to the cloud, the migration is complete.
- If this is a one-time migration to |ccloud|, the next step is to use :ref:`Replicator <connect_replicator>` to migrate your topics to the cloud cluster.
- For information on how to manage schemas and storage space in |ccloud| through the REST API, see :ref:`sr-in-cloud-manage-space` in the Developer Guide.
- Looking for a guide on how to configure and use |sr| in |ccloud|? See `Schema Registry and Confluent Cloud `__ and `Quick Start for Apache Kafka using Confluent Cloud `__.

Suggested Reading
-----------------

- `Confluent Cloud `__
- :ref:`quickstart-demos-replicator-schema-translation`
- `Schema Registry and Confluent Cloud `__
- :ref:`schemaregistry_config`
- :ref:`multi_dc`
- :ref:`schemaregistry_api`

See also these sections in :ref:`Replicator Configuration Options <connect_replicator_config_options>`:

- :ref:`rep-source-topics`
- :ref:`rep-destination-topics`
- :ref:`schema_translation`

Finally, :ref:`quickstart-demos-replicator-schema-translation` shows a demo of migrating schemas across self-managed, on-premises clusters.