.. _schemaregistry_migrate:

Use |sr| to Migrate Schemas in |cp|
===================================

Starting with |cp| 7.0.0, :ref:`Schema Linking <schema-linking-cp-overview>` is available. **Schema Linking is the recommended way of migrating schemas** from self-managed clusters, either to another self-managed cluster or to a |ccloud| cluster. To migrate schemas from one |ccloud| cluster to another, use the cloud-specific :cloud:`Schema Linking|sr/schema-linking.html`.

For pre-|cp| 7.0.0 releases, use :ref:`Replicator ` with `Schema Translation `__ to migrate schemas from a self-managed cluster to a target cluster that is either self-managed or in :cloud:`Confluent Cloud|index.html`. (This was first available in |cp| 5.2.0.) |crep| provides Schema Migration, which supports replicating an entire |sr| environment to another (empty) |sr| cluster (both on |cp|), preserving all schema ID and version information. You can also use |crep| to manually migrate individual subjects between |sr| clusters (*subject-level migration*), instead of an entire environment.

All migration methods are described below:

- :ref:`replicator-migrate-self-managed`
- :ref:`migrate-individual-schemas`
- :ref:`migrate_self_managed_schemas_to_cloud`

.. _replicator-migrate-self-managed:

Migrate Schemas Across Self-Managed Clusters
--------------------------------------------

You can set up continuous migration or a one-time, lift-and-shift migration of schemas across self-managed (on-premises) clusters. For a demo showing how to migrate schemas from one self-managed cluster to another, see :ref:`quickstart-demos-replicator-schema-translation`.
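A common stumbling block when registering schemas by hand over the |sr| REST API (as the subject-level procedure below does) is that the schema, itself a JSON document, must be embedded as an escaped string inside the request body. The following is a minimal sketch, not part of the |cp| tooling, that builds such a body programmatically instead of escaping quotes by hand; the subject, ID, version, and Avro schema mirror the example used later in this page and are illustrative only.

```python
# Sketch: build the request body for registering a schema at a fixed ID
# and version, instead of hand-escaping quotes in a curl --data string.
# The ID, version, and schema below are illustrative, not prescriptive.
import json

avro_schema = {
    "type": "record",
    "name": "value_a1",
    "namespace": "com.mycorp.mynamespace",
    "fields": [{"name": "field1", "type": "string"}],
}

# The "schema" field must hold the schema serialized as a string, so the
# schema is JSON-encoded twice: once by itself, and once as part of the
# enclosing request body.
body = json.dumps({
    "schemaType": "AVRO",
    "version": 1,
    "id": 24,
    "schema": json.dumps(avro_schema),
})

# POST `body` to /subjects/<subject>/versions with any HTTP client while
# the subject is in IMPORT mode, as described in the procedure below.
```

Using a JSON library this way guarantees the inner schema string is escaped correctly, which is easy to get wrong in a hand-written ``--data`` argument.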
.. _migrate-individual-schemas:

Migrate an Individual Schema to an Already Populated |sr| (subject-level migration)
-----------------------------------------------------------------------------------

You can use the :ref:`Schema Registry API ` to migrate individual schemas into existing, populated schema registries and preserve the schema ID and version, without doing a full schema migration into an empty destination |sr|. You can also use this method to manually register a schema at a specified ID on a |sr|. This assumes that the schema ID is not already taken, and is useful in recovery operations when a schema was deleted but is backed up on another cluster.

The steps to accomplish this are as follows:

#. Ensure ``mode.mutability=true`` is set in the |sr| properties.

#. Put the subject into IMPORT mode (the subject must be empty or non-existent to do this):

   .. code:: bash

      curl -X PUT -H "Content-Type: application/json" \
           "http://localhost:8081/mode/my-cool-subject" \
           --data '{"mode": "IMPORT"}'

   (See `subject-level IMPORT mode `__ in the API reference.)

#. Register the schema. For example, the following call registers a schema for the subject ``my-cool-subject`` with a specific ID (``24``) and version (``1``):

   .. code:: bash

      curl -X POST -H "Content-Type: application/json" \
           --data '{"schemaType": "AVRO", "version": 1, "id": 24, "schema": "{\"type\":\"record\",\"name\":\"value_a1\",\"namespace\":\"com.mycorp.mynamespace\",\"fields\":[{\"name\":\"field1\",\"type\":\"string\"}]}"}' \
           http://localhost:8081/subjects/my-cool-subject/versions

   .. note::

      The schema ID used for subject-level migration should not already be assigned to a schema in the destination |sr|. Otherwise, registering a schema with the same ID replaces the previous schema, and any messages serialized with the previous schema can no longer be deserialized properly.

#. Return the subject to READWRITE mode.
   .. code:: bash

      curl -X PUT -H "Content-Type: application/json" \
           "http://localhost:8081/mode/my-cool-subject" \
           --data '{"mode": "READWRITE"}'

.. _migrate_self_managed_schemas_to_cloud:

Migrate Schemas to |ccloud|
---------------------------

:cloud:`Confluent Cloud|index.html` is a fully managed streaming data service based on |cp|. Just as you can "lift and shift" or "extend to cloud" your Kafka applications from self-managed Kafka to |ccloud|, you can do the same with |sr-long|. If you already use |sr| to manage schemas for Kafka applications, and want to move some or all of that data and schema management to the cloud, you can use :ref:`Replicator ` to migrate your existing schemas to |sr-ccloud|. (See :ref:`Replicator ` and :ref:`replicator_executable`.)

You can set up continuous migration of |sr| to maintain a hybrid deployment (extend to cloud), or lift and shift everything to |ccloud| using a one-time migration.

--------------------
Continuous Migration
--------------------

For continuous migration, you can use your self-managed |sr| as a primary and |sr-ccloud| as a secondary (extend to cloud, also known as bridge to cloud). New schemas are registered directly to the self-managed |sr| (origin), and |crep-full| continuously copies schemas from it to |sr-ccloud| (destination), which is set to IMPORT mode.

.. figure:: ../../images/sr-extend-to-cloud.png
   :align: center

------------------
One-time Migration
------------------

Choose a one-time migration to move all data to a fully-managed |ccloud| service (lift and shift). In this case, you migrate your existing self-managed |sr| to |sr-ccloud|, which becomes the primary. All new schemas are registered to |sr-ccloud| and stored in the centralized |ak| cluster. In this scenario, there is no migration from |sr-ccloud| back to the self-managed |sr|.
.. figure:: ../../images/sr-lift-and-shift.png
   :align: center

------------------
Topics and Schemas
------------------

Schemas are associated with |ak| topics, organized under subjects in |sr|. (See :ref:`schema_registry_terminology`.) The quick start below describes how to migrate |sr| and the schemas it contains, but not |ak| topics.

For a continuous migration (extend to cloud), you need only do a schema migration, since your topics continue to live in the primary, self-managed cluster. For a one-time migration (lift and shift), you must follow schema migration with topic migration, using :ref:`Replicator ` to migrate your topics to the |ccloud| cluster, as mentioned in :ref:`sr_next_steps_topics` after the quick start.

.. tip::

   .. include:: ../../includes/replicator-topic-rename.rst

   The property ``topic.rename.format`` is described in :ref:`rep-destination-topics` under :ref:`replicator_config_options`.

-----------
Quick Start
-----------

The quick start describes how to perform a |sr| migration applicable to any type of deployment (from on-premises servers or data centers to |sr-ccloud|). The examples also serve as a primer you can use to learn the basics of migrating schemas from a local cluster to |ccloud|.

.. include:: ../../includes/install-cli-prereqs.rst

Before You Begin
^^^^^^^^^^^^^^^^

If you are new to |cp|, consider first working through these quick starts and tutorials to get a baseline understanding of the platform (including the role of producers, consumers, and brokers), |ccloud|, and |sr|. Experience with these workflows will give you better context for schema migration.
- :ref:`quickstart`
- :cloud:`Quick Start for Apache Kafka using Confluent Cloud|get-started/index.html`
- :ref:`schema_registry_onprem_tutorial`

Before you begin schema migration, verify that you have:

- `Access to Confluent Cloud `_ to serve as the destination |sr|
- A local install of |cp| (for example, from a :ref:`quickstart` download), or another cluster, to serve as the origin |sr|

Schema migration requires that you configure and run |crep|. If you need more information than is included in the examples here, refer to the :ref:`replicator tutorial `.

Migrate Schemas
^^^^^^^^^^^^^^^

To migrate |sr| and associated schemas to |ccloud|, follow these steps:

#. Start the origin cluster. If you are running a local cluster (for example, from a :ref:`quickstart` download), start only |sr| for the purposes of this tutorial, using the |confluent-cli| :confluent-cli:`confluent local|command-reference/local/index.html` commands:

   .. code:: bash

      confluent local services schema-registry start

   .. tip::

      The examples here show how to use a |crep| worker in *standalone mode* for schema migration. In this mode, you cannot run |kconnect-long| and |crep| at the same time, because |crep| also runs |kconnect|. If you run |crep| in *distributed mode*, the setup is different and you do not have this limitation (you can use ``./bin/confluent local services start``). For more about configuring and running |kconnect| workers (including |crep|) in standalone and distributed modes, see :connect-common:`Running Workers|userguide.html#configuring-and-running-workers` in the Connect guide.

#. Verify that ``schema-registry``, ``kafka``, and ``zookeeper`` are running. For example, run ``confluent local services status``:

   ::

      Schema Registry is [UP]
      Kafka is [UP]
      Zookeeper is [UP]

#. Verify that no subjects exist on the destination |sr| in |ccloud|:

   .. code:: bash

      curl -u <api-key>:<api-secret> <destination-sr-endpoint>/subjects

   If no subjects exist, your output will be empty (``[]``), which is what you want. If subjects exist, delete them.
   For example:

   .. code:: bash

      curl -X DELETE -u <api-key>:<api-secret> <destination-sr-endpoint>/subjects/my-existing-subject

#. Set the destination |sr| to IMPORT mode. For example:

   .. code:: bash

      curl -u <api-key>:<api-secret> -X PUT -H "Content-Type: application/json" \
           "https://<destination-sr-endpoint>/mode" --data '{"mode": "IMPORT"}'

   .. tip::

      - If subjects exist on the destination |sr|, the import will fail with a message similar to this: ``{"error_code":42205,"message":"Cannot import since found existing subjects"}``
      - Full-scale schema migration of all schemas requires that the destination |sr| be in :ref:`IMPORT mode `. The registry must be empty to transition to this mode. This type of migration produces an exact copy of the source registry (including schema IDs) but does not allow you to "merge" pre-existing schemas in the destination with the source schemas.
      - An alternative approach is to use the |sr| API to set mode mutability and `subject-level IMPORT mode `__, and :ref:`migrate-individual-schemas`.

#. Configure a |crep| worker to specify the addresses of brokers in the destination cluster, as described in :ref:`config-and-run-replicator`.

   .. tip::

      |crep| in |cp| 5.2.0 and newer supports |sr| migration.

   The worker configuration file is ``CONFLUENT_HOME/etc/kafka/connect-standalone.properties``:

   .. codewithvars:: properties

      # Connect Standalone Worker configuration
      bootstrap.servers=<destination-broker-endpoint>:9092

#. Configure :ref:`Replicator ` with |sr| and destination cluster information.

   - For a standalone |kconnect| instance, configure the following properties in ``CONFLUENT_HOME/etc/kafka-connect-replicator/quickstart-replicator.properties``:
     .. codewithvars:: properties

        # basic connector configuration
        name=replicator-source
        connector.class=io.confluent.connect.replicator.ReplicatorSourceConnector
        key.converter=io.confluent.connect.replicator.util.ByteArrayConverter
        value.converter=io.confluent.connect.replicator.util.ByteArrayConverter
        header.converter=io.confluent.connect.replicator.util.ByteArrayConverter
        tasks.max=4

        # source cluster connection info
        src.kafka.bootstrap.servers=localhost:9092

        # destination cluster connection info
        dest.kafka.ssl.endpoint.identification.algorithm=https
        dest.kafka.sasl.mechanism=PLAIN
        dest.kafka.request.timeout.ms=20000
        dest.kafka.bootstrap.servers=<destination-broker-endpoint>:9092
        retry.backoff.ms=500
        dest.kafka.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<api-key>" password="<api-secret>";
        dest.kafka.security.protocol=SASL_SSL

        # Schema Registry migration: topics to replicate from source to destination
        # topic.whitelist indicates which topics are of interest to replicator
        topic.whitelist=_schemas
        # schema.registry.topic indicates which of the topics in the whitelist contains schemas
        schema.registry.topic=_schemas

        # Connection settings for destination Confluent Cloud Schema Registry
        schema.registry.url=https://<destination-sr-endpoint>
        schema.registry.client.basic.auth.credentials.source=USER_INFO
        schema.registry.client.basic.auth.user.info=<sr-api-key>:<sr-api-secret>

   - If your clusters have TLS/SSL enabled, you must set the TLS/SSL configurations as appropriate for |sr| clients:

     .. codewithvars:: properties

        # TLS/SSL configurations for clients to Schema Registry
        schema.registry.client.schema.registry.ssl.truststore.location
        schema.registry.client.schema.registry.ssl.truststore.type
        schema.registry.client.schema.registry.ssl.truststore.password
        schema.registry.client.schema.registry.ssl.keystore.location
        schema.registry.client.schema.registry.ssl.keystore.type
        schema.registry.client.schema.registry.ssl.keystore.password
        schema.registry.client.schema.registry.ssl.key.password
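   If you also script checks against a TLS-enabled |sr| outside of |crep| (for example, from Python), the rough analog of the truststore and keystore settings above is a verifying SSL context. This is an illustrative sketch only; the file paths are placeholder assumptions, not values from this guide.

   ```python
   # Sketch: client-side TLS for a script talking to a TLS-enabled Schema
   # Registry -- the rough Python analog of truststore/keystore settings.
   import ssl

   # A default context verifies the server certificate and hostname,
   # similar to configuring a truststore for the Schema Registry client.
   context = ssl.create_default_context()

   # Placeholder paths (assumptions): uncomment and point at real files if
   # your registry uses a private CA or requires mutual TLS authentication.
   # context.load_verify_locations("ca.pem")                  # ~ ssl.truststore.*
   # context.load_cert_chain("client.pem", "client-key.pem")  # ~ ssl.keystore.*
   ```
   
   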
   .. tip::

      |sr| client configurations require the ``schema.registry.client`` prefix. To learn more, see :ref:`clients-to-sr-security-configs` in :ref:`schemaregistry_security`.

#. In ``quickstart-replicator.properties``, the replication factor is set to ``1`` for demo purposes on a development cluster with one broker. For this schema migration tutorial, and in production, change it to at least ``3``:

   ::

      confluent.topic.replication.factor=3

   .. seealso::

      For an example of a JSON configuration for |crep| in distributed mode, see :devx-examples:`submit_replicator_schema_migration_config.sh|ccloud/connectors/submit_replicator_schema_migration_config.sh` in the GitHub `examples repository `_.

#. Start |crep| so that it can perform the schema migration. For example:

   .. code:: bash

      connect-standalone ${CONFLUENT_HOME}/etc/kafka/connect-standalone.properties \
         ${CONFLUENT_HOME}/etc/kafka-connect-replicator/quickstart-replicator.properties

   The method or commands you use to start |crep| depend on your application setup, and may differ from this example. For more information, see :ref:`replicator_run` and :ref:`config-and-run-replicator`.

#. Stop all producers that are producing to |ak|.

#. Wait until the replication lag is 0. For more information, see :ref:`monitor-replicator-lag`.

#. Stop |crep|.

#. Enable mode changes in the self-managed source |sr| by adding the following to its properties file and restarting it:

   ::

      mode.mutability=true

   .. important::

      Modes are only supported starting with version 5.2 of |sr|. This step and the one following (setting the source |sr| to READONLY) are precautionary and not strictly necessary. If you are using version 5.1 of |sr| or earlier, you can skip these two steps, as long as you stop all producers so that no further schemas are registered in the source |sr|.

#. Set the source |sr| to READONLY mode:

   .. code:: bash

      curl -u <api-key>:<api-secret> -X PUT -H "Content-Type: application/json" \
           "https://<source-sr-endpoint>/mode" --data '{"mode": "READONLY"}'
#. Set the destination |sr| to READWRITE mode:

   .. code:: bash

      curl -u <api-key>:<api-secret> -X PUT -H "Content-Type: application/json" \
           "https://<destination-sr-endpoint>/mode" --data '{"mode": "READWRITE"}'

#. Stop all consumers.

#. Configure all consumers to point to the destination |sr| in the cloud, and restart them. For example, if you are configuring |sr| in a Java client, change the |sr| URL from source to destination, either in the code or in a properties file that specifies the |sr| URL, the authentication type (``USER_INFO``), and credentials. For more examples, see :ref:`sr-tutorial-java-consumers`.

#. Configure all producers to point to the destination |sr| in the cloud, and restart them. For more examples, see :ref:`sr-tutorial-java-producers`.

#. (Optional) Stop the source |sr|.

Next Steps
^^^^^^^^^^

- If you are extending to cloud (a hybrid deployment with continuous migration from a primary self-managed cluster to the cloud), the migration is complete.
- If this is a one-time migration to |ccloud|, the next step is to use :ref:`Replicator` to migrate your topics to the cloud cluster.
- For information on how to manage schemas and storage space in |ccloud| through the REST API, see :cloud:`Manage Schemas in Confluent Cloud|sr/index.html`.
- Looking for a guide on how to configure and use |sr| in |ccloud|? See :cloud:`Quick Start for Schema Management on Confluent Cloud|get-started/schema-registry.html` and :cloud:`Quick Start for Apache Kafka using Confluent Cloud|get-started/index.html`.

Limitations
-----------

- Currently, when using Confluent |crep| to migrate schemas, |ccloud| is not supported as the source cluster; |ccloud| can only be the destination cluster. As an alternative, you can migrate schemas using the :ref:`REST API for Schema Registry ` to achieve the desired deployments. Specifics regarding |ccloud| limits on schemas and managing storage space are described in the API reference in :cloud:`Manage Schemas in Confluent Cloud|sr/index.html`.
- |crep| does not support an "active-active" |sr| setup.
  It only supports migration (either one-time or continuous) from an active |sr| to a passive |sr|.

.. include:: ../../multi-dc-deployments/replicator/includes/kafka-replicator-ccloud-compat.rst

.. _sr_next_steps_topics:

Related Content
---------------

Schema Linking is the recommended way to migrate schemas on |cp| 7.0.0 or newer releases. See :ref:`schema-linking-cp-overview` and :cloud:`Confluent Cloud|index.html`.

These more general topics are helpful for understanding how |sr| and schemas are managed in multi-datacenter deployments:

- :cloud:`Confluent Cloud|index.html`
- :ref:`quickstart-demos-replicator-schema-translation`
- :cloud:`Quick Start for Schema Management on Confluent Cloud|get-started/schema-registry.html`
- :ref:`schemaregistry_config`
- :ref:`multi_dc`
- :ref:`schemaregistry_api`

To learn more, see these sections in :ref:`replicator_config_options`:

- :ref:`rep-source-topics`
- :ref:`rep-destination-topics`
- :ref:`schema_translation`

Finally, :ref:`quickstart-demos-replicator-schema-translation` shows a demo of migrating schemas across self-managed, on-premises clusters, using the legacy |crep| methods.