Stream Catalog GraphQL API Usage and Examples on Confluent Cloud

Stream Catalog uses GraphQL internally and also exposes the Stream Catalog GraphQL API for use in your deployments.

Overview

The following sections provide an overview of GraphQL and explain how it is used with the Stream Catalog.

What is it?

GraphQL is a query language for APIs. At its core it enables declarative data fetching, so clients can specify exactly the data they need from an API. It is an API standard that provides a more efficient, powerful, and flexible alternative to REST. The emphasis is on alternative: GraphQL is not a replacement for REST and normally coexists with it.

Why is it important?

The Confluent Stream Catalog provides a centralized metadata repository for customers in cloud environments. The GraphQL API allows users to take advantage of the graph nature of the Stream Catalog, which is modeled as a graph of entities and relationships, and provides them with a more natural, efficient, and productive way of exploring the catalog.

When to use REST API and when to use GraphQL API

The GraphQL API supports only search, so the choice is between the REST /search API and GraphQL. The one capability the REST /search API has over GraphQL search is searching on business metadata attributes, which the GraphQL API does not currently support. In all other cases, GraphQL search is preferred because it can search across relationships.

The blog post How to Find, Share, and Organize Your Data Streams with Stream Catalog explains the power of the Stream Catalog GraphQL API in more detail.

Getting started

The Confluent Stream Catalog provides a centralized repository of schemas and other metadata entities within an environment, as well as the relationships between them. When querying for one or more related metadata entities, GraphQL can be used to return all requested metadata entities within a single response.

The Stream Catalog GraphQL API is read-only: it supports queries, but not mutations or subscriptions.

GraphQL endpoint

The Stream Catalog GraphQL endpoint is https://<SR ENDPOINT>/catalog/graphql. This endpoint supports both POST and GET requests per recommended practices for GraphQL implementations.
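As a rough illustration of the two request styles, the following Python sketch builds (but does not send) a POST and a GET request using only the standard library. It follows the common GraphQL-over-HTTP convention of a JSON body with a "query" key for POST and a query URL parameter for GET. HTTP Basic authentication with an API key and secret is an assumption here, and the function names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def _auth_header(api_key, api_secret):
    # Assumption: the endpoint accepts HTTP Basic auth with an API key and secret.
    token = base64.b64encode(f"{api_key}:{api_secret}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_post_request(endpoint, query, api_key, api_secret):
    """Build a POST request that carries the query in a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": _auth_header(api_key, api_secret),
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


def build_get_request(endpoint, query, api_key, api_secret):
    """Build a GET request that carries the query in the `query` URL parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Authorization": _auth_header(api_key, api_secret)}
    return urllib.request.Request(url, headers=headers, method="GET")
```

Either request object could then be passed to urllib.request.urlopen, with the response body decoded as JSON.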

GraphQL schema

The GraphQL schema can be introspected using any number of GraphQL tools. The schema can also be seen here.
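If no dedicated GraphQL tool is at hand, a small introspection query can be sent like any other query. The sketch below uses the __schema field defined by the GraphQL specification. The helper names are hypothetical, and the response passed to type_names is assumed to have the standard introspection shape.

```python
import json

# Standard introspection query from the GraphQL specification:
# lists the root query type and every type known to the schema.
INTROSPECTION_QUERY = """
query {
  __schema {
    queryType { name }
    types { name kind }
  }
}
"""


def introspection_payload():
    """Return the JSON body for a POST request that introspects the schema."""
    return json.dumps({"query": INTROSPECTION_QUERY})


def type_names(response):
    """Extract the sorted type names from a decoded introspection response."""
    types = response["data"]["__schema"]["types"]
    return sorted(t["name"] for t in types)
```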

Entity queries

Fetch list of entities

You can fetch a single entity or multiple entities of the same type using a simple query.

Example: Fetch a list of fields:

query {
  sr_field {
    name
  }
}

Fetch nested entities using relationships

You can fetch a single entity and its related entities by specifying the desired relationships.

Example: Fetch a list of fields and the name of each field’s record, schema, and subject_version:

query {
  sr_field {
    name
    record {
      name
      schema {
        name
        subject_versions {
          name
        }
      }
    }
  }
}
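A nested response like the one this query returns is often easier to consume once flattened. The Python sketch below assumes the usual GraphQL response envelope (a top-level "data" object keyed by the root field), and assumes that record and schema come back as single objects (possibly null) while subject_versions comes back as a list. The function name is hypothetical.

```python
def flatten_fields(response):
    """Turn a nested sr_field response into one flat row per field."""
    rows = []
    data = response.get("data") or {}
    for field in data.get("sr_field") or []:
        # Assumption: a relationship may be null when no related entity exists.
        record = field.get("record") or {}
        schema = record.get("schema") or {}
        subjects = [sv["name"] for sv in schema.get("subject_versions") or []]
        rows.append({
            "field": field["name"],
            "record": record.get("name"),
            "schema": schema.get("name"),
            "subject_versions": subjects,
        })
    return rows
```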

Example: Fetch a list of subject_version entities and the name of each field in the corresponding schemas.

Tip

This example uses an inline fragment with a type condition of sr_record, since a schema can contain several types besides records.

query {
  sr_subject_version(where: {name: {_starts_with: "my_subject"}}) {
    name
    schema {
      id
      types {
        ... on sr_record {
          name
          fields {
            name
          }
        }
      }
    }
  }
}

Filtering using the “where” argument

You can use the where argument to filter results based on some of an entity’s attributes. You can combine filters using the _and/_or operators.

Example: Fetch the field whose name is “field1”:

query {
  sr_field(where: {name: {_eq: "field1"}}) {
    name
    createTime
  }
}

Example: Fetch the field whose name is “field1” and schema ID is 1:

query {
  sr_field(where: {_and: [{name: {_eq: "field1"}}, {id: {_eq: 1}}]}) {
    name
    createTime
  }
}

The following operators can be used in the where argument:

  • _eq
  • _gt
  • _lt
  • _gte
  • _lte

For string attributes the following operator can additionally be used:

  • _starts_with

For date attributes the following operators can additionally be used:

  • _between
  • _since

Example: Fetch the fields created during a certain period:

query {
  sr_field(where: {createTime: {_between: {start: "2020-01-01T00:00:00", end: "2022-01-01T00:00:00"}}}) {
    name
    createTime
  }
}
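When the period is computed rather than typed by hand, the timestamps need to match the format used in the example above. The following Python sketch builds the where value from two datetime objects. The function name is hypothetical, and because the time zone the server applies to these timestamps is not stated here, the sketch only accepts naive datetimes and leaves that interpretation to the caller.

```python
from datetime import datetime


def between_filter(start, end):
    """Render a createTime `_between` filter as a GraphQL input literal."""
    if start.tzinfo is not None or end.tzinfo is not None:
        # The example timestamps carry no UTC offset, so reject aware datetimes.
        raise ValueError("pass naive datetimes")
    if start > end:
        raise ValueError("start must not be after end")
    # isoformat() on a naive datetime without microseconds gives
    # YYYY-MM-DDTHH:MM:SS, the same layout as the example query.
    s = start.replace(microsecond=0).isoformat()
    e = end.replace(microsecond=0).isoformat()
    return '{createTime: {_between: {start: "%s", end: "%s"}}}' % (s, e)
```

The returned string can be placed directly after where: in a query such as sr_field(where: ...).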

Example: Fetch the fields created within the last seven days:

query {
  sr_field(where: {createTime: {_since: last_7_days}}) {
    name
    createTime
  }
}

Valid values for the _since operator:

  • last_7_days
  • last_30_days
  • last_month
  • this_month
  • today
  • yesterday
  • this_year
  • last_year
  • this_quarter
  • last_quarter
  • last_3_months
  • last_6_months
  • last_12_months
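Filters like the ones in this section can also be assembled programmatically. The Python sketch below renders a nested dict as a GraphQL input literal. All names are hypothetical. The _since values are written unquoted in the example query, so the sketch treats them as GraphQL enum values and validates them against the list above.

```python
import json

SINCE_VALUES = {
    "last_7_days", "last_30_days", "last_month", "this_month", "today",
    "yesterday", "this_year", "last_year", "this_quarter", "last_quarter",
    "last_3_months", "last_6_months", "last_12_months",
}


class EnumValue(str):
    """A string that is rendered unquoted, as a GraphQL enum value."""


def since(value):
    """Return a validated `_since` value ready for rendering."""
    if value not in SINCE_VALUES:
        raise ValueError("unsupported _since value: %r" % (value,))
    return EnumValue(value)


def to_graphql(value):
    """Render a Python value as a GraphQL input literal."""
    if isinstance(value, EnumValue):
        return str(value)
    if value is None:
        return "null"
    if isinstance(value, bool):  # must precede int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, str):
        return json.dumps(value, ensure_ascii=False)  # quoted and escaped
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, dict):
        items = ("%s: %s" % (k, to_graphql(v)) for k, v in value.items())
        return "{" + ", ".join(items) + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(to_graphql(v) for v in value) + "]"
    raise TypeError("cannot render %r" % (value,))
```

For example, to_graphql({"createTime": {"_since": since("last_7_days")}}) produces the where value used in the _since query above.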

Sorting using the “order_by” argument

Results can be sorted by using the order_by argument.

Example: Sort the fields in ascending order of the name:

query {
  sr_field(order_by: {name: asc}) {
    name
    createTime
  }
}

The order_by argument can specify that the sort direction is asc (ascending) or desc (descending).

Pagination with the “limit” and “offset” arguments

Results can be paginated with the limit and offset arguments.

If limit is not set, it defaults to 100. The maximum value for limit is 10000.

Example: Fetch five (5) fields, starting with the sixth (6th) one:

query {
  sr_field(limit: 5, offset: 5) {
    name
    createTime
  }
}
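To walk through a full result set, the same query can be repeated with an increasing offset until a short page comes back. The Python sketch below shows one way to do that. The run_query parameter is a hypothetical callable that sends a query string and returns the decoded JSON response. Combining order_by with limit and offset in a single call is an assumption, made so that the page boundaries stay stable between requests.

```python
def fetch_all_fields(run_query, page_size=100):
    """Yield every sr_field row, fetching one page per request."""
    if not 1 <= page_size <= 10000:
        raise ValueError("page_size must be between 1 and 10000")
    offset = 0
    while True:
        # Assumption: order_by may be combined with limit and offset.
        query = (
            "query { sr_field(limit: %d, offset: %d, order_by: {name: asc}) "
            "{ name createTime } }" % (page_size, offset)
        )
        data = run_query(query).get("data") or {}
        page = data.get("sr_field") or []
        for row in page:
            yield row
        if len(page) < page_size:
            return  # a short (or empty) page means there is nothing left
        offset += page_size
```

When the total number of rows is an exact multiple of page_size, the loop issues one extra request that returns an empty page before stopping.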

Filtering by tag with the “tags” argument

Results can be restricted to entities that carry one or more of the specified tags.

Example: Fetch fields tagged with PII or SECRET:

query {
  sr_field(tags: ["PII", "SECRET"]) {
    name
    createTime
  }
}

Including deleted objects with the “deleted” argument

Normally only active (non-deleted) entities are returned. Deleted entities can additionally be returned by specifying the deleted argument as true.

Example: Fetch all fields, including deleted ones:

query {
  sr_field(deleted: true) {
    name
    createTime
    status
  }
}

API limits

Query limits

The GraphQL API enforces two query limits:

  • Query complexity limit: a limit on the total number of data fields in the query. The maximum query complexity is 200.
  • Query depth limit: a limit on the total depth of the query. The maximum query depth is 20.
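A query can be checked against these limits before it is sent. The Python sketch below is only a rough client-side estimate: it counts field names as complexity and selection-set nesting as depth, which may differ from how the server counts. It ignores arguments, does not count an inline fragment as a nesting level, and does not handle aliases, directives, or named fragment definitions. All names are hypothetical.

```python
import re

MAX_COMPLEXITY = 200
MAX_DEPTH = 20

# Tokens: string literals, spread dots, names, and braces/parentheses.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\.\.\.|[A-Za-z_][A-Za-z0-9_]*|[{}()]')


def estimate_query_size(query):
    """Return (max_depth, field_count) for a simple query string."""
    depth = max_depth = fields = paren = 0
    counted = []   # per open brace: does it count as a nesting level?
    state = None   # None, "spread" (after ...), "type" (after on), "frag_body"
    for tok in _TOKEN.findall(query):
        if tok.startswith('"'):
            continue
        if tok == "(":
            paren += 1
        elif tok == ")":
            paren -= 1
        elif paren:
            continue  # skip everything inside argument lists
        elif tok == "{":
            is_level = state is None  # inline fragment braces add no level
            counted.append(is_level)
            if is_level:
                depth += 1
                max_depth = max(max_depth, depth)
            state = None
        elif tok == "}":
            if counted.pop():
                depth -= 1
        elif tok == "...":
            state = "spread"
        elif state == "spread":
            state = "type" if tok == "on" else None  # skip "on" or fragment name
        elif state == "type":
            state = "frag_body"  # skip the type condition name
        elif depth > 0:
            fields += 1  # names at depth 0 are the operation keyword or name
    return max_depth, fields


def within_limits(query):
    """Check the estimate against the documented depth and complexity limits."""
    d, c = estimate_query_size(query)
    return d <= MAX_DEPTH and c <= MAX_COMPLEXITY
```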

Time limits

The GraphQL API enforces a time limit of 30 seconds on every GraphQL query.

Rate limits

The GraphQL API enforces a rate limit of 25 requests per second.
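A client that issues many queries can pace itself to stay under this rate. The Python sketch below spaces calls at least 1/25 of a second apart, which is stricter than the limit requires because it also rules out short bursts. The class name is hypothetical, and how the server scopes the limit (for example per API key) is not covered here. The clock and sleep functions are injectable so the behavior can be tested without waiting.

```python
import time


class Throttle:
    """Space calls evenly so that at most max_per_second start each second."""

    def __init__(self, max_per_second=25, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed, then reserve its slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self._interval
```

Calling wait() immediately before each request is enough for a single-threaded client. A multi-threaded client would additionally need a lock around wait().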