Kafka Connect FromXML SMT Usage Reference for Confluent Cloud
The FromXML single message transform (SMT) reads XML data stored as bytes or a string and converts it to a strongly typed Kafka Connect structure. For example, it allows XML data to be converted and stored as Avro in a topic.
To apply the FromXML SMT, add the following to your connector configuration:
{
"transforms" : "fromXml",
"transforms.fromXml.type" : "io.confluent.connect.cloud.transforms.xml.FromXml$Value",
"transforms.fromXml.schema.path" : "file:src/test/resources/books.xsd"
}
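The property names above follow the Kafka Connect convention that every alias listed in `transforms` has a matching `transforms.<alias>.type` entry. As a rough illustration, the Python sketch below checks a configuration for that pairing before it is submitted. The helper name `missing_transform_types` is hypothetical and is not part of any Confluent tooling.

```python
import json


def missing_transform_types(config: dict) -> list:
    """Return aliases listed in "transforms" that have no "transforms.<alias>.type" key."""
    aliases = [a.strip() for a in config.get("transforms", "").split(",") if a.strip()]
    return [a for a in aliases if f"transforms.{a}.type" not in config]


# The configuration shown above, parsed from JSON.
config = json.loads("""
{
  "transforms": "fromXml",
  "transforms.fromXml.type": "io.confluent.connect.cloud.transforms.xml.FromXml$Value",
  "transforms.fromXml.schema.path": "file:src/test/resources/books.xsd"
}
""")

print(missing_transform_types(config))  # prints []
```

An empty list means every alias has a type. A non-empty list names the aliases that still need one.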
Examples
The example below shows how to use the FromXML SMT.
Before:
{ "topic" : "test", "kafkaPartition" : 1, "valueSchema" : { "type" : "STRING", "isOptional" : false }, "value" : "<?xml version=\"1.0\"?>\n<x:books xmlns:x=\"urn:books\">\n <book id=\"bk001\">\n <author>Writer</author>\n <title>The First Book</title>\n <genre>Fiction</genre>\n <price>44.95</price>\n <pub_date>2000-10-01</pub_date>\n <review>An amazing story of nothing.</review>\n </book>\n\n <book id=\"bk002\">\n <author>Poet</author>\n <title>The Poet's First Poem</title>\n <genre>Poem</genre>\n <price>24.95</price>\n <pub_date>2000-10-01</pub_date>\n <review>Least poetic poems.</review>\n </book>\n</x:books>", "timestampType" : "NO_TIMESTAMP_TYPE", "offset" : 1574310211719, "headers" : [ ] }
Adding the SMT to your connector configuration:
To apply the FromXML SMT, add the following to your connector configuration:
{
"transforms" : "fromXml",
"transforms.fromXml.type" : "io.confluent.connect.cloud.transforms.xml.FromXml$Value",
"transforms.fromXml.schema.path" : "file:src/test/resources/books.xsd"
}
After:
After the FromXML SMT applies, the value transforms as follows:
{ "topic" : "test", "kafkaPartition" : 1, "valueSchema" : { "name" : "io.confluent.connect.cloud.transforms.xml.model.BooksForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "book" : { "type" : "ARRAY", "isOptional" : true, "valueSchema" : { "name" : "io.confluent.connect.cloud.transforms.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } } } } }, "value" : { "schema" : { "name" : "io.confluent.connect.cloud.transforms.xml.model.BooksForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "book" : { "type" : "ARRAY", "isOptional" : true, "valueSchema" : { "name" : "io.confluent.connect.cloud.transforms.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } } } } }, "fieldValues" : [ { "name" : "book", "schema" : { "type" : "ARRAY", "isOptional" : true, "valueSchema" : { "name" : "io.confluent.connect.cloud.transforms.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" 
: { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } } }, "storage" : [ { "schema" : { "name" : "io.confluent.connect.cloud.transforms.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } }, "fieldValues" : [ { "name" : "author", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Writer" }, { "name" : "title", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "The First Book" }, { "name" : "genre", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Fiction" }, { "name" : "price", "schema" : { "type" : "FLOAT32", "isOptional" : true }, "storage" : 44.95 }, { "name" : "pub_date", "schema" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "storage" : 970358400000 }, { "name" : "review", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "An amazing story of nothing." 
}, { "name" : "id", "schema" : { "type" : "STRING", "isOptional" : true }, "storage" : "bk001" } ] }, { "schema" : { "name" : "io.confluent.connect.cloud.transforms.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } }, "fieldValues" : [ { "name" : "author", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Poet" }, { "name" : "title", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "The Poet's First Poem" }, { "name" : "genre", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Poem" }, { "name" : "price", "schema" : { "type" : "FLOAT32", "isOptional" : true }, "storage" : 24.95 }, { "name" : "pub_date", "schema" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "storage" : 970358400000 }, { "name" : "review", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Least poetic poems." }, { "name" : "id", "schema" : { "type" : "STRING", "isOptional" : true }, "storage" : "bk002" } ] } ] } ] }, "timestampType" : "NO_TIMESTAMP_TYPE", "offset" : 1574310211719, "headers" : [ ] }
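To make the before and after records easier to relate, the Python sketch below parses the same books document into typed values using only the standard library. It is an illustration of the mapping, not the SMT's implementation: the real transform derives its types from the XSD, whereas the field types here are hard-coded assumptions taken from the schema shown above. The function name `parse_books` is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime, timezone

XML = """<?xml version="1.0"?>
<x:books xmlns:x="urn:books">
  <book id="bk001">
    <author>Writer</author>
    <title>The First Book</title>
    <genre>Fiction</genre>
    <price>44.95</price>
    <pub_date>2000-10-01</pub_date>
    <review>An amazing story of nothing.</review>
  </book>
  <book id="bk002">
    <author>Poet</author>
    <title>The Poet's First Poem</title>
    <genre>Poem</genre>
    <price>24.95</price>
    <pub_date>2000-10-01</pub_date>
    <review>Least poetic poems.</review>
  </book>
</x:books>"""


def parse_books(xml_text: str) -> list:
    """Turn each <book> element into a dict with typed values."""
    root = ET.fromstring(xml_text)
    books = []
    # <book> carries no namespace: only the root uses the "x" prefix.
    for book in root.findall("book"):
        price = book.findtext("price")
        books.append({
            "author": book.findtext("author"),
            "title": book.findtext("title"),
            "genre": book.findtext("genre"),
            # price is optional in the schema above, so tolerate its absence
            "price": float(price) if price is not None else None,
            "pub_date": date.fromisoformat(book.findtext("pub_date")),
            "review": book.findtext("review"),
            "id": book.get("id"),
        })
    return books


books = parse_books(XML)
d = books[0]["pub_date"]
# Midnight UTC on the publication date, in epoch milliseconds.
millis = int(datetime(d.year, d.month, d.day, tzinfo=timezone.utc).timestamp() * 1000)
print(books[0]["id"], books[0]["price"], millis)  # prints: bk001 44.95 970358400000
```

The computed value 970358400000 matches the `storage` value for `pub_date` in the output above, which is midnight UTC on 2000-10-01 expressed in epoch milliseconds.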
Properties
| Name | Description | Type | Default | Valid Values | Importance |
|---|---|---|---|---|---|
| | A list of URLs that specify the location of the XML schemas the connector must load. Both HTTP and HTTPS paths are supported. | LIST | | | HIGH |
| | The Java package | STRING | | | HIGH |
| | Boolean value that tells the | BOOLEAN | | [true, false] | LOW |
| | Specifies whether the | BOOLEAN | | [true, false] | LOW |
| | If set to | BOOLEAN | | [true, false] | LOW |
Predicates
Transformations can be configured with predicates so that the transformation is applied only to records that satisfy a condition. You can use predicates in a transformation chain and, when combined with the Kafka Connect Filter (Kafka) SMT Usage Reference for Confluent Cloud, predicates can conditionally filter out specific records. For details and examples, see Predicates.
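As a rough sketch of how a predicate could be attached to the FromXML transform, the Python snippet below adds predicate properties to the configuration shown earlier. The property names (`predicates`, `predicates.<alias>.type`, `transforms.<alias>.predicate`) and the `TopicNameMatches` class follow the Apache Kafka Connect predicate convention. Their availability for this SMT in Confluent Cloud is an assumption here, so check the Predicates page before relying on it. The alias `isXmlTopic`, the pattern `xml-.*`, and the helper `with_predicate` are hypothetical.

```python
import json


def with_predicate(config, transform_alias, predicate_alias, predicate_type, **predicate_props):
    """Return a copy of config with one predicate defined and attached to a transform."""
    updated = dict(config)
    updated["predicates"] = predicate_alias
    updated[f"predicates.{predicate_alias}.type"] = predicate_type
    for key, value in predicate_props.items():
        updated[f"predicates.{predicate_alias}.{key}"] = value
    # Tie the transform to the predicate so it only runs on matching records.
    updated[f"transforms.{transform_alias}.predicate"] = predicate_alias
    return updated


base = {
    "transforms": "fromXml",
    "transforms.fromXml.type": "io.confluent.connect.cloud.transforms.xml.FromXml$Value",
    "transforms.fromXml.schema.path": "file:src/test/resources/books.xsd",
}

# Hypothetical: apply FromXML only to records from topics whose names start with "xml-".
guarded = with_predicate(
    base,
    transform_alias="fromXml",
    predicate_alias="isXmlTopic",
    predicate_type="org.apache.kafka.connect.transforms.predicates.TopicNameMatches",
    pattern="xml-.*",
)

print(json.dumps(guarded, indent=2))
```

The helper returns a new dictionary rather than mutating its input, so the unguarded configuration stays available for comparison.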