Data Types in Confluent Cloud for Apache Flink¶
Confluent Cloud for Apache Flink® has a rich set of native data types that you can use in SQL statements and queries.
The query planner supports the following SQL types.
Data type definition¶
A data type describes the logical type of a value in a SQL table. You use data types to declare the input and output types of an operation.
The Flink data types are similar to the SQL standard data type terminology, but for efficient handling of scalar expressions, they also contain information about the nullability of a value.
These are examples of SQL data types:
INT
INT NOT NULL
INTERVAL DAY TO SECOND(3)
ROW<fieldOne ARRAY<BOOLEAN>, fieldTwo TIMESTAMP(3)>
The following sections list all pre-defined data types in Flink SQL.
Character strings¶
CHAR¶
Represents a fixed-length character string.
Declaration
CHAR
CHAR(n)
Formats
The following table shows examples of the CHAR type in different formats.
JSON for data type | {"type":"CHAR","nullable":true,"length":8}
|
CLI/UI format | CHAR(8)
|
JSON for payload | "Example string"
|
CLI/UI format for payload | Example string
|
Declare this type by using CHAR(n)
, where n
is the number of
code points. n
must have a value between 1 and 2,147,483,647
(both inclusive). If no length is specified, n
is equal to 1.
CHAR(0)
is not supported for CAST or persistence in catalogs, but it exists
in protocols.
VARCHAR / STRING¶
Represents a variable-length character string.
Declaration
VARCHAR
VARCHAR(n)
STRING
Formats
The following table shows examples of the VARCHAR type in different formats.
JSON for data type | {"type":"VARCHAR","nullable":true,"length":8}
|
CLI/UI format | VARCHAR(800)
|
JSON for payload | "Example string"
|
CLI/UI format for payload | Example string
|
Declare this type by using VARCHAR(n)
, where n
is the maximum
number of code points. n
must have a value between 1 and
2,147,483,647 (both inclusive). If no length is specified, n
is
equal to 1.
STRING
is equivalent to VARCHAR(2147483647)
.
VARCHAR(0)
is not supported for CAST or persistence in catalogs, but it
exists in protocols.
Binary strings¶
BINARY¶
Represents a fixed-length binary string (=a sequence of bytes).
Declaration
BINARY
BINARY(n)
Formats
The following table shows examples of the BINARY type in different formats.
JSON for data type | {"type":"BINARY","nullable":true,"length":1}
|
CLI/UI format | BINARY(3)
|
JSON for payload | "x'7f0203'"
|
CLI/UI format for payload | x'7f0203'
|
Declare this type by using BINARY(n)
, where n
is the number
of bytes. n
must have a value between 1 and 2,147,483,647
(both inclusive). If no length is specified, n
is equal to 1.
The string representation is hexadecimal format.
BINARY(0)
is not supported for CAST or persistence in catalogs, but it
exists in protocols.
VARBINARY / BYTES¶
Represents a variable-length binary string (=a sequence of bytes).
Declaration
VARBINARY
VARBINARY(n)
BYTES
Formats
The following table shows examples of the VARBINARY type in different formats.
JSON for data type | {"type":"VARBINARY","nullable":true,"length":1}
|
CLI/UI format | VARBINARY(800)
|
JSON for payload | "x'7f0203'"
|
CLI/UI format for payload | x'7f0203'
|
Declare this type by using VARBINARY(n)
where n
is the
maximum number of bytes. n
must have a value between 1 and
2,147,483,647 (both inclusive). If no length is specified, n
is
equal to 1.
BYTES
is equivalent to VARBINARY(2147483647)
.
VARCHAR(0)
is not supported for CAST or persistence in catalogs, but it
exists in protocols.
Exact numerics¶
BIGINT¶
Represents an 8-byte signed integer with values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
Declaration
BIGINT
Formats
The following table shows examples of the BIGINT type in different formats.
JSON for data type | {"type":"BIGINT","nullable":true}
|
CLI/UI format | BIGINT
|
JSON for payload | "23"
|
CLI/UI format for payload | 23
|
DECIMAL¶
Represents a decimal number with fixed precision and scale.
Declaration
DECIMAL
DECIMAL(p)
DECIMAL(p, s)
DEC
DEC(p)
DEC(p, s)
NUMERIC
NUMERIC(p)
NUMERIC(p, s)
Formats
The following table shows examples of the DECIMAL type in different formats.
JSON for data type | {"type":"DECIMAL","nullable":true,"precision":5,"scale":3}
|
CLI/UI format | DECIMAL(5, 3)
|
JSON for payload | "12.123"
|
CLI/UI format for payload | 12.123
|
Declare this type by using DECIMAL(p, s)
where p
is the
number of digits in a number (precision) and s
is the number of
digits to the right of the decimal point in a number (scale).
p
must have a value between 1 and 38 (both inclusive).
The default value for p
is 10.
s
must have a value between 0 and p
(both inclusive).
The default value for s
is 0.
The right side is padded with 0.
The left side must be padded with spaces, like all other values.
NUMERIC(p, s)
and DEC(p, s)
are synonyms for this type.
INT¶
Represents a 4-byte signed integer with values from -2,147,483,648 to 2,147,483,647.
Declaration
INT
INTEGER
Formats
The following table shows examples of the INT type in different formats.
JSON for data type | {"type":"INT","nullable":true}
|
CLI/UI format | INT
|
JSON for payload | "23"
|
CLI/UI format for payload | 23
|
INTEGER
is a synonym for this type.
SMALLINT¶
Represents a 2-byte signed integer with values from -32,768 to 32,767.
Declaration
SMALLINT
Formats
The following table shows examples of the SMALLINT type in different formats.
JSON for data type | {"type":"SMALLINT","nullable":true}
|
CLI/UI format | SMALLINT
|
JSON for payload | "23"
|
CLI/UI format for payload | 23
|
TINYINT¶
Represents a 1-byte signed integer with values from -128 to 127.
Declaration
TINYINT
Formats
The following table shows examples of the TINYINT type in different formats.
JSON for data type | {"type":"TINYINT","nullable":true}
|
CLI/UI format | TINYINT
|
JSON for payload | "23"
|
CLI/UI format for payload | 23
|
Approximate numerics¶
DOUBLE¶
Represents an 8-byte double precision floating point number.
Declaration
DOUBLE
DOUBLE PRECISION
Formats
The following table shows examples of the DOUBLE type in different formats.
JSON for data type | {"type":"DOUBLE","nullable":true}
|
CLI/UI format | DOUBLE
|
JSON for payload | "1.1111112120000001E7"
|
CLI/UI format for payload | 1.1111112120000001E7
|
DOUBLE PRECISION
is a synonym for this type.
FLOAT¶
Represents a 4-byte single precision floating point number.
Declaration
FLOAT
Formats
The following table shows examples of the FLOAT type in different formats.
JSON for data type | {"type":"FLOAT","nullable":true}
|
CLI/UI format | FLOAT
|
JSON for payload | "1.1111112E7"
|
CLI/UI format for payload | 1.1111112E7
|
Compared to the SQL standard, this type doesn’t take parameters.
Date and time¶
DATE¶
Represents a date consisting of year-month-day
with values ranging
from 0000-01-01
to 9999-12-31
.
Declaration
DATE
Formats
The following table shows examples of the DATE type in different formats.
JSON for data type | {"type":"DATE","nullable":true}
|
CLI/UI format | DATE
|
JSON for payload | "2023-04-06"
|
CLI/UI format for payload | 2023-04-06
|
Compared to the SQL standard, the range starts at year 0000
.
INTERVAL DAY TO SECOND¶
Data type for a group of day-time interval types.
Declaration
INTERVAL DAY
INTERVAL DAY(p1)
INTERVAL DAY(p1) TO HOUR
INTERVAL DAY(p1) TO MINUTE
INTERVAL DAY(p1) TO SECOND(p2)
INTERVAL HOUR
INTERVAL HOUR TO MINUTE
INTERVAL HOUR TO SECOND(p2)
INTERVAL MINUTE
INTERVAL MINUTE TO SECOND(p2)
INTERVAL SECOND
INTERVAL SECOND(p2)
Formats
The following table shows examples of the INTERVAL DAY TO SECOND type in different formats.
JSON for data type | {"type":"INTERVAL_DAY_TIME","nullable":true,"precision":1,"fractionalPrecision":3,"resolution":"DAY_TO_SECOND"}
|
CLI/UI format | INTERVAL DAY(1) TO SECOND(3)
|
JSON for payload | "+2 07:33:20.000"
|
CLI/UI format for payload | +2 07:33:20.000
|
Declare this type by using the above combinations, where p1
is
the number of digits of days (day precision) and p2
is the number
of digits of fractional seconds (fractional precision).
p1
must have a value between 1 and 6 (both inclusive).
If no p1
is specified, it is equal to 2 by default.
p2
must have a value between 0 and 9 (both inclusive).
If no p2
is specified, it is equal to 6 by default.
The type must be parameterized to one of these resolutions with up to nanosecond precision:
- Interval of days
- Interval of days to hours
- Interval of days to minutes
- Interval of days to seconds
- Interval of hours
- Interval of hours to minutes
- Interval of hours to seconds
- Interval of minutes
- Interval of minutes to seconds
- Interval of seconds
An interval of day-time consists of
+days hours:months:seconds.fractional
with values ranging from
-999999 23:59:59.999999999
to +999999 23:59:59.999999999
. The
value representation is the same for all types of resolutions. For
example, an interval of seconds of 70 is always represented in an
interval-of-days-to-seconds format (with default precisions):
+00 00:01:10.000000
.
Formatting intervals are tricky, because they have different resolutions:
- DAY
- DAY_TO_HOUR
- DAY_TO_MINUTE
- DAY_TO_SECOND
- HOUR
- HOUR_TO_MINUTE
- HOUR_TO_SECOND
- MINUTE
- MINUTE_TO_SECOND
- SECOND
Depending on the resolution, use:
INTERVAL DAY(1)
INTERVAL DAY(1) TO HOUR
INTERVAL DAY(1) TO MINUTE
INTERVAL DAY(1) TO SECOND(3)
INTERVAL HOUR
INTERVAL HOUR TO MINUTE
INTERVAL HOUR TO SECOND(3)
INTERVAL MINUTE
INTERVAL MINUTE TO SECOND(3)
INTERVAL SECOND(3)
INTERVAL YEAR TO MONTH¶
Data type for a group of year-month interval types.
Declaration
INTERVAL YEAR
INTERVAL YEAR(p)
INTERVAL YEAR(p) TO MONTH
INTERVAL MONTH
Formats
The following table shows examples of the INTERVAL YEAR TO MONTH type in different formats.
JSON for data type | {"type":"INTERVAL_YEAR_MONTH","nullable":true,"precision":4,"resolution":"YEAR_TO_MONTH"}
|
CLI/UI format | INTERVAL YEAR(4) TO MONTH
|
JSON for payload | "+2000-02"
|
CLI/UI format for payload | +2000-02
|
Declare this type by using the above combinations, where p
is the
number of digits of years (year precision).
p
must have a value between 1 and 4 (both inclusive). If no year
precision is specified, p
is equal to 2.
The type must be parameterized to one of these resolutions:
- Interval of years
- Interval of years to months
- Interval of months
An interval of year-month consists of +years-months
with values
ranging from -9999-11
to +9999-11
.
The value representation is the same for all types of resolutions. For
example, an interval of months of 50 is always represented in an
interval-of-years-to-months format (with default year precision):
+04-02
.
Formatting intervals are tricky, because they have different resolutions:
- YEAR
- YEAR_TO_MONTH
- MONTH
Depending on the resolution, use:
INTERVAL YEAR(4)
INTERVAL YEAR(4) TO MONTH
INTERVAL MONTH
TIME¶
Represents a time without timezone consisting of
hour:minute:second[.fractional]
with up to nanosecond precision and
values ranging from 00:00:00.000000000
to 23:59:59.999999999
.
Declaration
TIME
TIME(p)
TIME_WITHOUT_TIME_ZONE
TIME_WITHOUT_TIME_ZONE(p)
Formats
The following table shows examples of the TIME type in different formats.
JSON for data type | {"type":"TIME_WITHOUT_TIME_ZONE","nullable":true,"precision":3}
|
CLI/UI format | TIME(3)
|
JSON for payload | "10:56:22.541"
|
CLI/UI format for payload | 10:56:22.541
|
Declare this type by using TIME(p)
, where p
is the number of
digits of fractional seconds (precision).
p
must have a value between 0 and 9 (both inclusive). If no
precision is specified, p
is equal to 0.
Compared to the SQL standard, leap seconds (23:59:60
and 23:59:61
)
are not supported, as the semantics are closer to java.time.LocalTime
.
A time with timezone is not provided.
TIME
acts like a pure string and isn’t related to a time zone of any kind,
including UTC.
TIME WITHOUT TIME ZONE
is a synonym for this type.
TIMESTAMP¶
Represents a timestamp without timezone consisting of
year-month-day hour:minute:second[.fractional]
with up to nanosecond
precision and values ranging from 0000-01-01 00:00:00.000000000
to
9999-12-31 23:59:59.999999999
.
Declaration
TIMESTAMP
TIMESTAMP(p)
TIMESTAMP WITHOUT TIME ZONE
TIMESTAMP(p) WITHOUT TIME ZONE
Formats
The following table shows examples of the TIMESTAMP type in different formats.
JSON for data type | {"type":"TIMESTAMP_WITHOUT_TIME_ZONE","nullable":true,"precision":3}
|
CLI/UI format | TIMESTAMP(3)
|
JSON for payload | "2023-04-06 10:59:32.628"
|
CLI/UI format for payload | 2023-04-06 10:59:32.628
|
Declare this type by using TIMESTAMP(p)
, where p
is the
number of digits of fractional seconds (precision).
p
must have a value between 0 and 9 (both inclusive). If no
precision is specified, p
is equal to 6.
A space separates the date and time parts.
Compared to the SQL standard, leap seconds (23:59:60
and 23:59:61
)
are not supported, as the semantics are closer to java.time.LocalDateTime
.
A conversion from and to BIGINT
(a JVM long
type) is not
supported, as this would imply a timezone, but this type is time-zone
free. For more java.time.Instant
-like semantics use TIMESTAMP_LTZ
.
TIMESTAMP
acts like a pure string and isn’t related to a time zone of any
kind, including UTC.
TIMESTAMP WITHOUT TIME ZONE
is a synonym for this type.
TIMESTAMP_LTZ¶
Represents a timestamp with the local timezone consisting of
year-month-day hour:minute:second[.fractional] zone
with up to
nanosecond precision and values ranging from
0000-01-01 00:00:00.000000000 +14:59
to
9999-12-31 23:59:59.999999999 -14:59
.
Declaration
TIMESTAMP_LTZ
TIMESTAMP_LTZ(p)
TIMESTAMP WITH LOCAL TIME ZONE
TIMESTAMP(p) WITH LOCAL TIME ZONE
Formats
The following table shows examples of the TIMESTAMP_LTZ type in different formats.
JSON for data type | {"type":"TIMESTAMP_WITH_LOCAL_TIME_ZONE","nullable":true,"precision":3}
|
CLI/UI format | TIMESTAMP(3) WITH LOCAL TIME ZONE
|
JSON for payload | "2023-04-06 11:06:47.224"
|
CLI/UI format for payload | 2023-04-06 11:06:47.224
|
Declare this type by using TIMESTAMP_LTZ(p)
, where p
is the
number of digits of fractional seconds (precision).
p
must have a value between 0 and 9 (both inclusive). If no
precision is specified, p
is equal to 6.
Leap seconds (23:59:60
and 23:59:61
) are not supported, as the
semantics are closer to java.time.OffsetDateTime
.
Compared to TIMESTAMP WITH TIME ZONE
, the timezone offset
information is not stored physically in every datum. Instead, the type
assumes java.time.Instant
semantics in the UTC timezone at the edges
of the table ecosystem. Every datum is interpreted in the local timezone
configured in the current session for computation and visualization.
This type fills the gap between time-zone free and time-zone mandatory timestamp types by allowing the interpretation of UTC timestamps according to the configured session timezone.
TIMESTAMP_LTZ
resembles a TIMESTAMP
without a timezone, but the string
always considers the sessions/query’s timezone. Internally, it is always in the
UTC time zone.
If you require the short format, prefer TIMESTAMP_LTZ(3)
.
TIMESTAMP WITH LOCAL TIME ZONE
is a synonym for this type.
TIMESTAMP and TIMESTAMP_LTZ comparison¶
Although TIMESTAMP and TIMESTAMP_LTZ are similarly named, they represent different concepts.
- TIMESTAMP_LTZ
- TIMESTAMP_LTZ in SQL is similar to the
Instant
class in Java. - TIMESTAMP_LTZ represents a moment, or a specific point in the UTC timeline.
- TIMESTAMP_LTZ stores time as a UTC integer, which can be converted dynamically to every other timezone.
- When printing or casting TIMESTAMP_LTZ as a character string, the
sql.local-time-zone
setting is considered.
- TIMESTAMP_LTZ in SQL is similar to the
- TIMESTAMP
- TIMESTAMP in SQL is similar to
LocalDateTime
in Java. - TIMESTAMP has no time zone or offset from UTC, so it can’t represent a moment.
- TIMESTAMP stores time as character string, not related to any timezone.
- TIMESTAMP in SQL is similar to
Collection data types¶
ARRAY¶
Represents an array of elements with same subtype.
Declaration
ARRAY<t>
t ARRAY
Formats
The following table shows examples of the ARRAY type in different formats.
JSON for data type | {"type":"ARRAY","nullable":true,"elementType":{"type":"INTEGER","nullable":true}}
|
CLI/UI format | ARRAY<INT>
|
JSON for payload | ["1", "2", "3", null]
|
CLI/UI format for payload | [1, 2, 3, NULL]
|
Declare this type by using ARRAY<t>
, where t
is the data type
of the contained elements.
Compared to the SQL standard, the maximum cardinality of an array cannot be specified and is fixed at 2,147,483,647. Also, any valid type is supported as a subtype.
t ARRAY
is a synonym for being closer to the SQL standard. For
example, INT ARRAY
is equivalent to ARRAY<INT>
.
MAP¶
Represents an associative array that maps keys (including NULL
) to
values (including NULL
).
Declaration
MAP<kt, vt>
Formats
The following table shows examples of the MAP type in different formats.
JSON for data type | {"type":"MAP","nullable":true,"keyType":{"type":"INTEGER","nullable":true},"valueType":{"type":"VARCHAR","nullable":true,"length":2147483647}}
|
CLI/UI format | MAP<STRING>
|
JSON for payload | [["1", "a"], ["2", "b"], [null, "c"]]
|
CLI/UI format for payload | {1=a, 2=b, NULL=c}
|
Declare this type by using MAP<kt, vt>
where kt
is the data
type of the key elements and vt
is the data type of the value
elements.
A map can’t contain duplicate keys. Each key can map to at most one value.
There is no restriction of element types. It is the responsibility of the user to ensure uniqueness.
The map type is an extension to the SQL standard.
MULTISET¶
Represents a multiset (=bag).
Declaration
MULTISET<t>
t MULTISET
Formats
The following table shows examples of the MULTISET type in different formats.
JSON for data type | {"type":"MULTISET","nullable":true,"elementType":{"type":"INTEGER","nullable":true}}
|
CLI/UI format | MULTISET<INT>
|
JSON for payload | [["a", "1"], ["b", "2"], [null, "1"]]
|
CLI/UI format for payload | {a=1, b=2, NULL=1}
|
Declare this type by using MULTISET<t>
where t
is the data
type of the contained elements.
Unlike a set, the multiset allows for multiple instances for each of its
elements with a common subtype. Each unique value (including NULL
)
is mapped to some multiplicity.
There is no restriction of element types; it is the responsibility of the user to ensure uniqueness.
t MULTISET
is a synonym for being closer to the SQL standard. For
example, INT MULTISET
is equivalent to MULTISET<INT>
.
ROW¶
Represents a sequence of fields.
Declaration
ROW<name0 type0, name1 type1, ...>
ROW<name0 type0 'description0', name1 type1 'description1', ...>
ROW(name0 type0, name1 type1, ...)
ROW(name0 type0 'description0', name1 type1 'description1', ...)
Formats
The following table shows examples of the ROW type in different formats.
JSON for data type | {"type":"ROW","nullable":true,"fields":[{"name":"a","fieldType":{"type":"INTEGER","nullable":true}},{"name":"b","fieldType":{"type":"VARCHAR","nullable":true,"length":2147483647}}]}
|
CLI/UI format | MULTISET<INT>
|
JSON for payload | [["a", "1"], ["b", "2"], [null, "1"]]
|
CLI/UI format for payload | {a=1, b=2, NULL=1}
|
Declare this type by using ROW<n0 t0 'd0', n1 t1 'd1', ...>
,
where n
is the unique name of a field, t
is the logical type of
a field, d
is the description of a field.
A field consists of a field name, field type, and an optional description. The most specific type of a row of a table is a row type. In this case, each column of the row corresponds to the field of the row type that has the same ordinal position as the column.
Compared to the SQL standard, an optional field description simplifies the handling with complex structures.
A row type is similar to the STRUCT
type known from other
non-standard-compliant frameworks.
ROW(...)
is a synonym for being closer to the SQL standard. For
example, ROW(fieldOne INT, fieldTwo BOOLEAN)
is equivalent to
ROW<fieldOne INT, fieldTwo BOOLEAN>
.
If the fields of the data type contain characters other than [A-Za-z_]
,
use escaping notation. Double backticks escape the backtrick character,
for example:
ROW<`a-b` INT, b STRING, `weird_col``_umn` STRING>
Rows fields can contain comments, for example:
{"type":"ROW","nullable":true,"fields":[{"name":"a","fieldType":{"type":"INTEGER","nullable":true},"description":"hello"}]}
Format using single quotes. Double single quotes escape single quotes, for example:
ROW<a INT 'This field''s content'>
Other data types¶
BOOLEAN¶
Represents a boolean with a (possibly) three-valued logic of TRUE
,
FALSE
, and UNKNOWN
.
Declaration
BOOLEAN
Formats
The following table shows examples of the BOOLEAN type in different formats.
JSON for data type | {"type":"BOOLEAN","nullable":true}
|
CLI/UI format | NULL
|
JSON for payload | null
|
CLI/UI format for payload | NULL
|
NULL¶
Data type for representing untyped NULL
values.
Declaration
NULL
Formats
The following table shows examples of the NULL type in different formats.
JSON for data type | {"type":"NULL"}
|
CLI/UI format | NULL
|
JSON for payload | null
|
CLI/UI format for payload | NULL
|
The NULL type is an extension to the SQL standard. A NULL type has no
other value except NULL
, thus, it can be cast to any nullable type
similar to JVM semantics.
This type helps in representing unknown types in API calls that use a
NULL
literal as well as bridging to formats such as JSON or Avro
that define such a type as well.
This type is not very useful in practice and is described here only for completeness.
Casting¶
Flink SQL can perform casting between a defined input type and target type.
While some casting operations can always succeed regardless of the input
value, others can fail at runtime when there’s no way to create a value for the
target type. For example, it’s always possible to convert INT
to STRING
,
but you can’t always convert a STRING
to INT
.
During the planning stage, the query validator rejects queries for invalid type
pairs with a ValidationException
, for example, when trying to cast a
TIMESTAMP
to an INTERVAL
. Valid type pairs that can fail at runtime are
accepted by the query validator, but this requires you to handle cast failures
correctly.
In Flink SQL, casting can be performed by using one of these two built-in functions:
- CAST: The regular cast function defined by the SQL standard. It can fail the job if the cast operation is fallible and the provided input is not valid. Type inference preserves the nullability of the input type.
- TRY_CAST: An extension to the regular cast function
that returns
NULL
if the cast operation fails. Its return type is always nullable.
For example:
-- returns 42 of type INT NOT NULL
SELECT CAST('42' AS INT);
-- returns NULL of type VARCHAR
SELECT CAST(NULL AS VARCHAR);
-- throws an exception and fails the job
SELECT CAST('non-number' AS INT);
-- returns 42 of type INT
SELECT TRY_CAST('42' AS INT);
-- returns NULL of type VARCHAR
SELECT TRY_CAST(NULL AS VARCHAR);
-- returns NULL of type INT
SELECT TRY_CAST('non-number' AS INT);
-- returns 0 of type INT NOT NULL
SELECT COALESCE(TRY_CAST('non-number' AS INT), 0);
The following matrix shows the supported cast pairs, where “Y” means supported, “!” means fallible, and “N” means unsupported:
Input / Target | CHAR¹ / VARCHAR¹ / STRING | BINARY¹ / VARBINARY¹ / BYTES | BOOLEAN | DECIMAL | TINYINT | SMALLINT | INTEGER | BIGINT | FLOAT | DOUBLE | DATE | TIME | TIMESTAMP | TIMESTAMP_LTZ | INTERVAL | ARRAY | MULTISET | MAP | ROW |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CHAR / VARCHAR / STRING | Y | ! | ! | ! | ! | ! | ! | ! | ! | ! | ! | ! | ! | ! | N | N | N | N | N |
BINARY / VARBINARY / BYTES | Y | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
BOOLEAN | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N |
DECIMAL | Y | N | N | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N |
TINYINT | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N² | N² | N | N | N | N | N |
SMALLINT | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N² | N² | N | N | N | N | N |
INTEGER | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N² | N² | Y⁵ | N | N | N | N |
BIGINT | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N² | N² | Y⁶ | N | N | N | N |
FLOAT | Y | N | N | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N |
DOUBLE | Y | N | N | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N |
DATE | Y | N | N | N | N | N | N | N | N | N | Y | N | Y | Y | N | N | N | N | N |
TIME | Y | N | N | N | N | N | N | N | N | N | N | Y | Y | Y | N | N | N | N | N |
TIMESTAMP | Y | N | N | N | N | N | N | N | N | N | Y | Y | Y | Y | N | N | N | N | N |
TIMESTAMP_LTZ | Y | N | N | N | N | N | N | N | N | N | Y | Y | Y | Y | N | N | N | N | N |
INTERVAL | Y | N | N | N | N | N | Y⁵ | Y⁶ | N | N | N | N | N | N | Y | N | N | N | N |
ARRAY | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | !³ | N | N | N |
MULTISET | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | !³ | N | N |
MAP | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | !³ | N |
ROW | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | !³ |
Notes:
- All the casting to constant length or variable length also trims and pads, according to the type definition.
TO_TIMESTAMP
andTO_TIMESTAMP_LTZ
must be used instead ofCAST
/TRY_CAST
.- Supported iff the children type pairs are supported. Fallible iff the children type pairs are fallible.
- Supported iff the
RAW
class and serializer are equals. - Supported iff
INTERVAL
is aMONTH TO YEAR
range. - Supported iff
INTERVAL
is aDAY TO TIME
range.
Data type extraction¶
In many locations in the API, Flink tries to extract data types automatically from class information by using reflection to avoid repetitive manual schema work. But extracting a data type using reflection is not always successful, because logical information might be missing. In these cases, it may be necessary to add additional information close to a class or field declaration for supporting the extraction logic.
The following table lists classes that map implicitly to a data type without requiring further information. Other JVM bridging classes require the @DataTypeHint annotation.
Class | Data Type |
---|---|
boolean | BOOLEAN NOT NULL |
byte | TINYINT NOT NULL |
byte[] | BYTES |
double | DOUBLE NOT NULL |
float | FLOAT NOT NULL |
int | INT NOT NULL |
java.lang.Boolean | BOOLEAN |
java.lang.Byte | TINYINT |
java.lang.Double | DOUBLE |
java.lang.Float | FLOAT |
java.lang.Integer | INT |
java.lang.Long | BIGINT |
java.lang.Short | SMALLINT |
java.lang.String | STRING |
java.sql.Date | DATE |
java.sql.Time | TIME(0) |
java.sql.Timestamp | TIMESTAMP(9) |
java.time.Duration | INTERVAL SECOND(9) |
java.time.Instant | TIMESTAMP_LTZ(9) |
java.time.LocalDate | DATE |
java.time.LocalTime | TIME(9) |
java.time.LocalDateTime | TIMESTAMP(9) |
java.time.OffsetDateTime | TIMESTAMP(9) WITH TIME ZONE |
java.time.Period | INTERVAL YEAR(4) TO MONTH |
java.util.Map<K, V> | MAP<K, V> |
short | SMALLINT NOT NULL |
structured type T | anonymous structured type T |
long | BIGINT NOT NULL |
T[] | ARRAY<T> |