Information Schema in Confluent Cloud for Apache Flink

An information schema, or data dictionary, is a standard SQL schema with a collection of predefined views that enable accessing metadata about objects in Confluent Cloud for Apache Flink®️. The Confluent INFORMATION_SCHEMA is based on the SQL-92 ANSI Information Schema, with the addition of views and functions that are specific to Confluent Cloud for Apache Flink. The ANSI standard uses “catalog” to refer to a database. In Confluent Cloud, “schema” refers to a database. Conceptually, the terms are equivalent.

The views in the INFORMATION_SCHEMA provide information about database objects, such as tables, columns, and constraints. The views are organized into tables that you can query by using standard SQL statements.

For example, you can use the INFORMATION_SCHEMA.COLUMNS table to get details about the columns in a table, like the column name, data type, and whether it allows null values.

Similarly, you can use the INFORMATION_SCHEMA.TABLES table to get information about the tables in a catalog, like the table name, schema, and number of rows.

Every Flink catalog has a corresponding INFORMATION_SCHEMA, so you can always run a statement like SELECT (...) FROM <catalog-name>.INFORMATION_SCHEMA.TABLES WHERE (...).

Global views are available in every INFORMATION_SCHEMA, which means that you can query for information across catalogs. For example, you can query the global INFORMATION_SCHEMA.CATALOGS view to list all catalogs.

The information schema is a powerful tool for querying metadata about your Flink catalogs and databases, and you can use it for a variety of purposes, such as generating reports, documenting a schema, and troubleshooting performance issues.

The following views are supported in the Confluent INFORMATION_SCHEMA:

Query syntax in INFORMATION_SCHEMA

Metadata queries on the INFORMATION_SCHEMA tables support the following syntax.

Supported data types:

  • INT
  • STRING

Supported operators:

  • SELECT
  • WHERE
  • UNION ALL

Supported expressions:

  • CAST(NULL AS dt), CAST(x as dt)
  • UNION ALL (see this example)
  • AND, OR
  • = , <>, IS NULL, IS NOT NULL
  • AS
  • STRING and INT literals

The following limitations apply to INFORMATION_SCHEMA:

  • You can use INFORMATION_SCHEMA views only in SELECT statements, not in INSERT INTO statements.
  • You can’t use INFORMATION_SCHEMA in joins with real tables.
  • Only the previously listed equality and basic expressions are supported.

Available views

CATALOGS

The global catalogs view.

The rows returned are limited to the schemas that you have permission to interact with.

This view is an extension to the SQL standard.

Column Name Data Type Nullable Standard Description
CATALOG_ID STRING No No The ID of the catalog/environment, for example, env-xmzdkk.
CATALOG_NAME STRING No No The human readable name of the catalog/environment, for example, default.
Example

Run the following code to query for all catalogs across environments.

SELECT
  `CATALOG_ID`,
  `CATALOG_NAME`
FROM `INFORMATION_SCHEMA`.`CATALOGS`;

COLUMNS

Describes columns of tables and virtual tables (views) in the catalog.

Column Name Data Type Nullable Standard Description
COLUMN_NAME STRING No Yes Column reference.
COMMENT STRING Yes No An optional comment that describes the relation.
DATA_TYPE STRING No Yes Type root, for example, VARCHAR or ROW.
DISTRIBUTION_ORDINAL_POSITION INT Yes No If the table IS_DISTRIBUTED, contains the position of the key in a DISTRIBUTED BY clause.
FULL_DATA_TYPE STRING No No Fully qualified data type. for example, VARCHAR(32) or ROW<…>.
GENERATION_EXPRESSION STRING Yes Yes For computed columns.
IS_GENERATED STRING No Yes Indicates whether column is a computed column. Values are YES or NO.
IS_HIDDEN STRING No No Indicates whether a column is a system column. Values are YES or NO.
IS_METADATA STRING No No Indicates whether column is a metadata column. Values are YES or NO.
IS_PERSISTED STRING No No Indicates whether a metadata column is stored during INSERT INTO. Also YES if a physical column. Values are YES or NO.
METADATA_KEY STRING Yes No For metadata columns.
TABLE_CATALOG STRING No Yes The human readable name of the catalog.
TABLE_CATALOG_ID STRING No No The ID of the catalog.
TABLE_NAME STRING No Yes The name of the relation.
TABLE_SCHEMA STRING No Yes The human readable name of the database.
TABLE_SCHEMA_ID STRING No No The ID of the database.

INFORMATION_SCHEMA_CATALOG_NAME

Local catalog view. Returns the name of the current information schema’s catalog.

Column Name Data Type Nullable Standard Description
CATALOG_ID STRING No No The ID of the catalog/environment, for example, env-xmzdkk.
CATALOG_NAME STRING No Yes The human readable name of the catalog/environment, for example, default.
Example

Run the following code to query for the name of this information schema’s catalog.

SELECT
  `CATALOG_ID`,
  `CATALOG_NAME`
FROM `INFORMATION_SCHEMA`.`INFORMATION_SCHEMA_CATALOG_NAME`

KEY_COLUMN_USAGE

Side view of TABLE_CONSTRAINTS for key columns.

Column Name Data Type Nullable Standard Description
COLUMN_NAME STRING No Yes The name of the constrained column.
CONSTRAINT_CATALOG STRING No Yes Catalog name containing the constraint.
CONSTRAINT_CATALOG_ID STRING No No Catalog ID containing the constraint.
CONSTRAINT_SCHEMA STRING No Yes Schema name containing the constraint.
CONSTRAINT_SCHEMA_ID STRING No No Schema ID containing the constraint.
CONSTRAINT_NAME STRING No Yes Name of the constraint.
ORDINAL_POSITION INT No Yes The ordinal position of the column within the constraint key (starting at 1).
TABLE_CATALOG STRING No Yes The human readable name of the catalog.
TABLE_CATALOG_ID STRING No No The ID of the catalog.
TABLE_NAME STRING No Yes The name of the relation.
TABLE_SCHEMA STRING No Yes The human readable name of the database.
TABLE_SCHEMA_ID STRING No No The ID of the database.
Example

Run the following code to query for a side view of TABLE_CONSTRAINTS for key columns.

SELECT *
FROM `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`

SCHEMATA / DATABASES

Describes databases within the catalog.

For convenience, DATABASES is an alias for SCHEMATA.

The rows returned are limited to the schemas that you have permission to interact with.

Column Name Data Type Nullable Standard Description
CATALOG_ID STRING No No The ID of the catalog/environment, for example, env-xmzdkk.
CATALOG_NAME STRING No Yes The human readable name of the catalog/environment, for example, default.
SCHEMA_ID STRING No No The ID of the database/cluster, for example, lkc-kgjwwv.
SCHEMA_NAME STRING No Yes The human readable name of the database/cluster, for example, MyCluster.
Example

Run the following code to list all Flink databases within a catalog, (Kafka clusters within an environment), excluding information schema.

SELECT
  `SCHEMA_ID`,
  `SCHEMA_NAME`
FROM `INFORMATION_SCHEMA`.`SCHEMATA`
WHERE `SCHEMA_NAME` <> 'INFORMATION_SCHEMA';

TABLES

Contains the object level metadata for tables and virtual tables (views) within the catalog.

The rows returned are limited to the schemas that you have permission to interact with.

Column Name Data Type Nullable Standard Description
COMMENT STRING Yes No An optional comment that describes the relation.
DISTRIBUTION_ALGORITHM STRING Yes No Currently, only HASH.
DISTRIBUTION_BUCKETS INT Yes No Number of buckets, if defined.
IS_DISTRIBUTED STRING No No Indicates whether the table is bucketed using the DISTRIBUTED BY clause. Values are YES or NO.
IS_WATERMARKED STRING No No Indicates whether the table has a watermark from the WATERMARK FOR clause. Values are YES or NO.
TABLE_CATALOG STRING No Yes The human readable name of the catalog.
TABLE_CATALOG_ID STRING No No The ID of the catalog.
TABLE_NAME STRING No Yes The name of the relation.
TABLE_SCHEMA STRING No Yes The human readable name of the database.
TABLE_SCHEMA_ID STRING No No The ID of the database.
TABLE_TYPE STRING No Yes Values are BASE TABLE or VIEW.
WATERMARK_COLUMN STRING Yes No Time attribute column for which the watermark is defined.
WATERMARK_EXPRESSION STRING Yes No Watermark expression.
WATERMARK_IS_HIDDEN STRING Yes No Indicates whether the watermark is the default, system-provided one.
Examples

Run the following code to list all tables within a catalog (Kafka topics within an environment), excluding the information schema.

SELECT
  `TABLE_CATALOG`,
  `TABLE_SCHEMA`,
  `TABLE_NAME`
FROM `INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_SCHEMA` <> 'INFORMATION_SCHEMA';

Run the following code to list all tables within a database (Kafka topics within a cluster).

SELECT
  `TABLE_CATALOG`,
  `TABLE_SCHEMA`,
  `TABLE_NAME`
FROM `<current-catalog>`.`INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_SCHEMA` = '<current-database>';

TABLE_CONSTRAINTS

Side view of TABLES for all primary key constraints within the catalog.

Column Name Data Type Nullable Standard Description
CONSTRAINT_CATALOG STRING No Yes Catalog name containing the constraint.
CONSTRAINT_CATALOG_ID STRING No No Catalog ID containing the constraint.
CONSTRAINT_SCHEMA STRING No Yes Schema name containing the constraint.
CONSTRAINT_SCHEMA_ID STRING No No Schema ID containing the constraint.
CONSTRAINT_NAME STRING No Yes Name of the constraint.
CONSTRAINT_TYPE STRING No Yes Currently, only PRIMARY KEY.
ENFORCED STRING No Yes YES if constraint is enforced, otherwise NO.
TABLE_CATALOG STRING No Yes The human readable name of the catalog.
TABLE_CATALOG_ID STRING No No The ID of the catalog.
TABLE_NAME STRING No Yes The name of the relation.
TABLE_SCHEMA STRING No Yes The human readable name of the database.
TABLE_SCHEMA_ID STRING No No The ID of the database.
Examples

Run the following code to query for a side view of TABLES for all primary key constraints within the catalog.

SELECT *
FROM `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`;

TABLE_OPTIONS

Side view of TABLES for WITH.

Extension to the SQL Standard Information Schema.

Column Name Data Type Nullable Description
TABLE_CATALOG STRING No The human readable name of the catalog.
TABLE_CATALOG_ID STRING No The ID of the catalog.
TABLE_NAME STRING No The name of the relation.
TABLE_SCHEMA STRING No The human readable name of the database.
TABLE_SCHEMA_ID STRING No The ID of the database.
OPTION_KEY STRING No Option key.
OPTION_VALUE STRING No Option value.
Examples

Run the following code to query for a side view of TABLES for WITH.

SELECT *
FROM `INFORMATION_SCHEMA`.`TABLE_OPTIONS`;