Delete Schemas and Manage Storage Space on Confluent Cloud
Confluent Cloud Schema Registry limits the number of schema versions supported. You can free up space for new schemas by identifying and deleting unused schemas.
Maximum schemas limits
Confluent Cloud Schema Registry limits the number of schema versions supported in the registry for Basic, Standard, and Dedicated cluster types, as described in Kafka Cluster Types in Confluent Cloud. You can view per-package limits on schemas as described in Stream Governance Packages, Cloud Providers, and Region Support.
When the maximum limit is reached and the registry is full, any requests to register new schemas will generate the following HTTP response:
HTTP/1.1 422 Unprocessable Entity
Content-Type: application/vnd.schemaregistry.v1+json

{
  "error_code": 403,
  "message": "Schema Limit Exceeded"
}
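A client script can check for this response before deciding whether to clean up schemas. The following is a minimal sketch, not part of any Confluent tooling: the function name `is_schema_limit_error` is hypothetical, and it assumes the response body has already been captured as a string and carries the message text shown above.

```shell
# Hypothetical helper: succeeds (exit status 0) when a Schema Registry
# response body contains the "Schema Limit Exceeded" message shown above.
is_schema_limit_error() {
  # $1: response body captured from a schema registration request
  printf '%s' "$1" |
    grep -q '"message"[[:space:]]*:[[:space:]]*"Schema Limit Exceeded"'
}
```

For example, `is_schema_limit_error "$body" && echo "registry is full"` prints a notice only when the captured body matches.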
See the topics below for how to identify unused schemas and free up storage space for new schemas if needed.
Identify and delete unused schemas
You can use the schema deletion tool to find and delete schemas in the Confluent Cloud Schema Registry.
The README provides full instructions on how to build and use the schema deletion tool independently, or as a Confluent CLI plugin.
The basic workflow is:
- You provide subject names, which are based on TopicNameStrategy, to analyze and detect unused schemas.
- The tool then identifies topics using the TopicNameStrategy and analyzes messages to identify unused schemas in those topics.
For example, if you select a subject named stocks-value, the tool analyzes messages from the topic stocks and identifies unused schemas (if any) in the subject stocks-value.
Any schema IDs not present in the topic’s messages are identified as unused schemas, and listed as candidates for deletion.
You can then enter the schemas you want to delete.
To learn more about subject naming strategies, see Subject Name Strategy.
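The subject-to-topic mapping in the example above can be sketched in shell. This is an illustration only, not the deletion tool's own code: the function name `topic_from_subject` is hypothetical, and it assumes subjects follow TopicNameStrategy, where a subject is named after its topic with a -key or -value suffix.

```shell
# Hypothetical helper: derive the topic name from a subject name that follows
# TopicNameStrategy, where subjects are named <topic>-key or <topic>-value.
topic_from_subject() {
  case "$1" in
    *-value) printf '%s\n' "${1%-value}" ;;
    *-key)   printf '%s\n' "${1%-key}" ;;
    *)       echo "subject '$1' does not end in -key or -value" >&2
             return 1 ;;
  esac
}

topic_from_subject stocks-value   # prints "stocks"
```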
Caution
This tool provides you with the ability to Hard delete a schema in the Schema Registry, so use this option with caution.
Free up storage space in the registry for new schemas
A plain delete request does not free up space in the registry, because it performs only a soft delete and schema IDs are not reusable. The schema count tracks IDs, so the used number increases as new schemas are added, regardless of whether you have soft deleted schemas.
Confluent Cloud supports “hard delete” of a schema as a two-step process (soft delete, followed by hard delete) with the use of the query string, ?permanent=true, on the second delete. If your Schema Registry reaches the maximum schemas limit, you can free up space for additional schemas by following the procedures described below to Hard delete a schema.
Delete schemas
The Confluent Cloud Schema Registry subjects APIs support deleting a specific schema version or all versions of a subject. On a soft delete, the API deletes only the version; the underlying schema ID remains available for lookup.
Soft delete a schema
You can perform a soft delete on all schema versions registered under a subject or on a specific version of a subject. A soft delete removes the specified schema versions but does not free up space in the registry because the schema metadata, including schema IDs, is retained and schema IDs are not reusable.
# Deletes all schema versions registered under the subject "Kafka-value"
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET $SCHEMA_REGISTRY_URL/subjects/Kafka-value
[1]
# Deletes version 1 of the schema registered under subject "Kafka-value"
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET $SCHEMA_REGISTRY_URL/subjects/Kafka-value/versions/1
1
# Deletes the most recently registered schema under subject "Kafka-value"
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET $SCHEMA_REGISTRY_URL/subjects/Kafka-value/versions/latest
1
Tip
The number of deleted schemas reported by the Confluent Cloud Console and by the CLI may differ. To learn more, see Hard delete a schema and Looking under the hood at schema deletion, versioning, and compatibility.
Hard delete a schema
Both Confluent Platform and Confluent Cloud Schema Registry support hard delete of a schema with the use of the query string, ?permanent=true. A hard delete removes all metadata, including schema IDs. This can be useful for freeing up space for new schemas on Confluent Cloud, which sets limits on the number of schema versions supported in the registry.
You can perform a hard delete on all schema versions registered under a subject or on a specific version of a subject.
To accomplish a hard delete of a schema (all versions or a specific version), use this two-step process. You must first soft delete the schema, then hard delete it.
1. Perform a soft delete of all versions of the schema.
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET "$SCHEMA_REGISTRY_URL/subjects/<my-existing-subject>"
2. Perform a hard delete of all versions of the schema by appending ?permanent=true to the command.
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET "$SCHEMA_REGISTRY_URL/subjects/<my-existing-subject>?permanent=true"
The following commands hard-delete all versions of the schema registered under the subject “Kafka-value”:
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET $SCHEMA_REGISTRY_URL/subjects/Kafka-value
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET "$SCHEMA_REGISTRY_URL/subjects/Kafka-value?permanent=true"
To hard-delete version 1 of the schema registered under the subject Kafka-value:
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET $SCHEMA_REGISTRY_URL/subjects/Kafka-value/versions/1
curl --silent -X DELETE -u $SR_APIKEY:$SR_APISECRET "$SCHEMA_REGISTRY_URL/subjects/Kafka-value/versions/1?permanent=true"
Tip
For a version-specific delete, a hard delete requires an explicit version number as input. Specifying version:latest instead results in a soft delete only.
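The two-step sequence for a specific version can be wrapped in a small shell function. This is a sketch under stated assumptions rather than an official utility: the name `hard_delete_version` is hypothetical, the environment variables are the same ones used in the commands above, and the check that rejects anything other than a number follows the tip about `latest`.

```shell
# Hypothetical helper: hard delete one version of a subject in two steps.
# Assumes SR_APIKEY, SR_APISECRET, and SCHEMA_REGISTRY_URL are already set,
# as in the curl commands above.
hard_delete_version() {
  subject=$1
  version=$2
  # Require an explicit version number; "latest" and empty input are rejected.
  case "$version" in
    ''|*[!0-9]*)
      echo "version must be a number, got '$version'" >&2
      return 1 ;;
  esac
  url="$SCHEMA_REGISTRY_URL/subjects/$subject/versions/$version"
  # Step 1: soft delete. --fail makes curl exit non-zero on an HTTP error,
  # so step 2 runs only if step 1 succeeded.
  curl --silent --fail -X DELETE -u "$SR_APIKEY:$SR_APISECRET" "$url" || return 1
  # Step 2: hard delete. The URL is quoted so the shell does not treat "?"
  # as a glob character.
  curl --silent --fail -X DELETE -u "$SR_APIKEY:$SR_APISECRET" "$url?permanent=true"
}
```

A call such as `hard_delete_version Kafka-value 1` then issues both requests in order. Because a hard delete is permanent, confirm that no data still depends on the schema before running it.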
Recover a soft-deleted schema
You can recover a soft-deleted schema using the Schema Registry GET /subjects and POST /subjects APIs as follows:
1. Retrieve the subjects, including soft-deleted ones, with GET /subjects?deleted=true, as described in GET /subjects API description. For example:
curl --silent -u $SR_APIKEY:$SR_APISECRET -X GET "$SCHEMA_REGISTRY_URL/subjects?deleted=true" | jq .
2. Use POST /subjects/(string: subject)/versions to re-register the schema.
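The re-registration step can be sketched as a shell function. This is an illustration under assumptions, not a documented Confluent utility: the name `reregister_schema` is hypothetical, the environment variables are the ones used in the earlier commands, and the caller is assumed to supply a JSON payload file of the form `{"schema": "<schema as an escaped string>"}`.

```shell
# Hypothetical helper: re-register a schema under a subject.
# $1: subject name
# $2: path to a JSON payload file, assumed to look like
#     {"schema": "<schema definition as an escaped string>"}
# Assumes SR_APIKEY, SR_APISECRET, and SCHEMA_REGISTRY_URL are already set.
reregister_schema() {
  curl --silent -X POST -u "$SR_APIKEY:$SR_APISECRET" \
    -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data @"$2" \
    "$SCHEMA_REGISTRY_URL/subjects/$1/versions"
}
```

For example, `reregister_schema Kafka-value payload.json` posts the contents of payload.json to the versions endpoint of the Kafka-value subject.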
Guidelines for using these APIs
The above APIs are primarily intended for use in a development environment, where it’s common to go through iterations before finalizing a schema. They are not recommended for production use, but there are a few scenarios where they can be used in production with the utmost care:
- A new schema to be registered has compatibility issues with one of the existing schema versions
- An old version of the schema needs to be registered again for the same subject
- The schemas are used only in real-time streaming systems and the older version(s) are absolutely no longer required
- A topic needs to be recycled
- You want to free up space for new schemas on Confluent Cloud (with a hard delete of schemas)
Note also that any compatibility settings registered for the subject are deleted when you use Delete Subject or when you delete the only available schema version.
View available schemas on Confluent Cloud
You can view available and used schemas, and infer the number of soft deleted schemas.
On the Confluent CLI, you can use confluent schema-registry cluster describe to view the number of available schemas for the currently selected environment. For example:
$ confluent schema-registry cluster describe
+-------------------------+----------------------------------------------------+
| Name                    | Stream Governance Package                          |
| Cluster ID              | lsrc-g2p81                                         |
| Endpoint URL            | https://psrc-0xx5p.us-central1.gcp.confluent.cloud |
| Used Schemas            | 25                                                 |
| Available Schemas       | 975                                                |
| Global Compatibility    | <Requires API Key>                                 |
| Mode                    | <Requires API Key>                                 |
| Cloud                   | gcp                                                |
| Region                  | us-central1                                        |
| Package                 | advanced                                           |
+-------------------------+----------------------------------------------------+
“Used Schemas” reports the total of active and soft deleted schemas. If you know the number of active schemas you have, you can infer the number of soft deleted schemas. If a schema is permanently deleted, the number of used schemas will decrease and available schemas will increase accordingly.
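The inference described above is a simple subtraction. As a minimal sketch, with a hypothetical function name and an active-schema count that you supply from your own records:

```shell
# Hypothetical helper: infer the number of soft-deleted schemas.
# $1: "Used Schemas" value reported by the CLI
# $2: number of active schemas, supplied by you
soft_deleted_count() {
  echo $(( $1 - $2 ))
}

# Assuming 20 of the 25 used schemas in the output above are active:
soft_deleted_count 25 20   # prints 5
```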
Looking under the hood at schema deletion, versioning, and compatibility
In reality, deleting a schema removes only the versioned instance(s) of the schema. The actual schema (with the hashed ID) does not go away. The canonical MD5 hash of the schema still exists in the system. This is why you can perform a lookup on a deleted schema in the REST API.
If you delete the third version of a subject that has three versions and then register the same schema again, it is registered under version 3, and the associated schema has the same ID as before because its hash matches the existing one.
When the Avro serializer serializes Avro data, it stores the ID (not the version, which is specific to the subject). If the version (in a subject) is deleted, the schema and its ID remain in Schema Registry so that the data can still be deserialized using Schema Registry.
This has implications for versioning and compatibility. Suppose, for a given subject, you have two schemas:
- Schema ID: 23, version 1 with fields (a,b)
- Schema ID: 45, version 2 with fields (a,b,c)
If you delete version 2, version 1 becomes the latest version.
If the compatibility level is BACKWARD, the next schema that is registered to the subject need only be backward compatible with version 1, not version 2.