Important

You are viewing documentation for an older version of Confluent Platform. For the latest, click here.

Flatten

The following provides usage information for the Apache Kafka® SMT org.apache.kafka.connect.transforms.Flatten.

Description

Flatten a nested data structure, generating names for each field by concatenating the field names at each level with a configurable delimiter character. Applies to a Struct when a schema is present, or a Map in the case of schemaless data. The JSON Single Message Transforms (SMT) delimiter is the period . character, which is also the default.

Use the concrete transformation type designed for the record key (org.apache.kafka.connect.transforms.Flatten$Key) or value (org.apache.kafka.connect.transforms.Flatten$Value).

When using an Avro schema, the default SMT delimiter period . cannot be used. Use an underscore _ for the delimiter. Use the following properties to change the delimiter:

"transforms": "flatten",
"transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
"transforms.flatten.delimiter": "_"

JSON Example

The configuration snippet below shows how to use Flatten to concatenate field names with the period . delimiter character.

"transforms": "flatten",
"transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
"transforms.flatten.delimiter": "."

Before:

{
  "content": {
    "id": 42,
    "name": {
      "first": "David",
      "middle": null,
      "last": "Wong"
    }
  }
}

After:

{
  "content.id": 42,
  "content.name.first": "David",
  "content.name.middle": null,
  "content.name.last": "Wong"
}

Avro Example

The Avro schema specification only allows alphanumeric characters and the underscore _ character in field names. The configuration snippet below shows how to use Flatten to concatenate field names with the underscore _ delimiter character.

"transforms": "flatten",
"transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
"transforms.flatten.delimiter": "_"

Before:

{
  "content": {
    "id": 42,
    "name": {
      "first": "David",
      "middle": null,
      "last": "Wong"
    }
  }
}

After:

{
  "content_id": 42,
  "content_name_first": "David",
  "content_name_middle": null,
  "content_name_last": "Wong"
 }

Properties

Name Description Type Default Valid Values Importance
delimiter Delimiter to insert between field names from the input record when generating field names for the output record. string . _ medium