Kafka Connect FromXML SMT Usage Reference for Confluent Cloud
The FromXML single message transform (SMT) reads XML data stored as bytes or a string and converts it to a strongly typed
Kafka Connect structure. For example, it allows data to be converted from XML and stored as Avro in a topic.
Note
The FromXML SMT is supported on the HTTP V2 Source, HTTP V2 Sink, and IBM MQ Source connectors.
To apply the FromXML SMT, add the following to your connector configuration:
{
"transforms" : "fromXml",
"transforms.fromXml.type" : "com.github.jcustenborder.kafka.connect.transform.xml.FromXml$Value",
"transforms.fromXml.schema.path" : "file:src/test/resources/com/github/jcustenborder/kafka/connect/transform/xml/books.xsd"
}
Examples
The example below shows how to use the FromXML SMT.
Before:
{ "topic" : "test", "kafkaPartition" : 1, "valueSchema" : { "type" : "STRING", "isOptional" : false }, "value" : "<?xml version=\"1.0\"?>\n<x:books xmlns:x=\"urn:books\">\n <book id=\"bk001\">\n <author>Writer</author>\n <title>The First Book</title>\n <genre>Fiction</genre>\n <price>44.95</price>\n <pub_date>2000-10-01</pub_date>\n <review>An amazing story of nothing.</review>\n </book>\n\n <book id=\"bk002\">\n <author>Poet</author>\n <title>The Poet's First Poem</title>\n <genre>Poem</genre>\n <price>24.95</price>\n <pub_date>2000-10-01</pub_date>\n <review>Least poetic poems.</review>\n </book>\n</x:books>", "timestampType" : "NO_TIMESTAMP_TYPE", "offset" : 1574310211719, "headers" : [ ] }
Adding the SMT to your connector configuration:
To apply the FromXML SMT, add the following to your connector configuration:
{
"transforms" : "fromXml",
"transforms.fromXml.type" : "com.github.jcustenborder.kafka.connect.transform.xml.FromXml$Value",
"transforms.fromXml.schema.path" : "file:src/test/resources/com/github/jcustenborder/kafka/connect/transform/xml/books.xsd"
}
After:
After the FromXML SMT applies, the value transforms as follows:
{ "topic" : "test", "kafkaPartition" : 1, "valueSchema" : { "name" : "com.github.jcustenborder.kafka.connect.transform.xml.model.BooksForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "book" : { "type" : "ARRAY", "isOptional" : true, "valueSchema" : { "name" : "com.github.jcustenborder.kafka.connect.transform.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } } } } }, "value" : { "schema" : { "name" : "com.github.jcustenborder.kafka.connect.transform.xml.model.BooksForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "book" : { "type" : "ARRAY", "isOptional" : true, "valueSchema" : { "name" : "com.github.jcustenborder.kafka.connect.transform.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } } } } }, "fieldValues" : [ { "name" : "book", "schema" : { "type" : "ARRAY", "isOptional" : true, "valueSchema" : { "name" : "com.github.jcustenborder.kafka.connect.transform.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" 
: { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } } }, "storage" : [ { "schema" : { "name" : "com.github.jcustenborder.kafka.connect.transform.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } }, "fieldValues" : [ { "name" : "author", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Writer" }, { "name" : "title", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "The First Book" }, { "name" : "genre", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Fiction" }, { "name" : "price", "schema" : { "type" : "FLOAT32", "isOptional" : true }, "storage" : 44.95 }, { "name" : "pub_date", "schema" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "storage" : 970358400000 }, { "name" : "review", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "An amazing story of nothing." 
}, { "name" : "id", "schema" : { "type" : "STRING", "isOptional" : true }, "storage" : "bk001" } ] }, { "schema" : { "name" : "com.github.jcustenborder.kafka.connect.transform.xml.model.BookForm", "type" : "STRUCT", "isOptional" : true, "fieldSchemas" : { "author" : { "type" : "STRING", "isOptional" : false }, "title" : { "type" : "STRING", "isOptional" : false }, "genre" : { "type" : "STRING", "isOptional" : false }, "price" : { "type" : "FLOAT32", "isOptional" : true }, "pub_date" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "review" : { "type" : "STRING", "isOptional" : false }, "id" : { "type" : "STRING", "isOptional" : true } } }, "fieldValues" : [ { "name" : "author", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Poet" }, { "name" : "title", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "The Poet's First Poem" }, { "name" : "genre", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Poem" }, { "name" : "price", "schema" : { "type" : "FLOAT32", "isOptional" : true }, "storage" : 24.95 }, { "name" : "pub_date", "schema" : { "name" : "org.apache.kafka.connect.data.Date", "type" : "INT32", "version" : 1, "isOptional" : false }, "storage" : 970358400000 }, { "name" : "review", "schema" : { "type" : "STRING", "isOptional" : false }, "storage" : "Least poetic poems." }, { "name" : "id", "schema" : { "type" : "STRING", "isOptional" : true }, "storage" : "bk002" } ] } ] } ] }, "timestampType" : "NO_TIMESTAMP_TYPE", "offset" : 1574310211719, "headers" : [ ] }
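To make the typing in the After record more concrete, the following Python sketch parses a shortened copy of the Before value and converts each field to a typed value. It only illustrates the idea of turning XML text into typed fields; it is not how the SMT is implemented, and the names `XML_VALUE`, `parse_books`, and `to_epoch_millis` are hypothetical helpers introduced for this example.

```python
import calendar
import datetime
import xml.etree.ElementTree as ET

# Shortened copy of the "Before" record value (first book only).
XML_VALUE = """<?xml version="1.0"?>
<x:books xmlns:x="urn:books">
  <book id="bk001">
    <author>Writer</author>
    <title>The First Book</title>
    <genre>Fiction</genre>
    <price>44.95</price>
    <pub_date>2000-10-01</pub_date>
    <review>An amazing story of nothing.</review>
  </book>
</x:books>"""


def parse_books(xml_text):
    """Parse the books XML into a list of dicts with typed fields."""
    root = ET.fromstring(xml_text)
    books = []
    # The <book> children are not namespace-qualified in the sample document.
    for book in root.findall("book"):
        books.append({
            "id": book.get("id"),
            "author": book.findtext("author"),
            "title": book.findtext("title"),
            "genre": book.findtext("genre"),
            # Python floats are 64-bit; the record above uses FLOAT32.
            "price": float(book.findtext("price")),
            "pub_date": datetime.date.fromisoformat(book.findtext("pub_date")),
            "review": book.findtext("review"),
        })
    return books


def to_epoch_millis(day):
    """Milliseconds since the Unix epoch for midnight UTC on the given date."""
    return calendar.timegm(day.timetuple()) * 1000


if __name__ == "__main__":
    for record in parse_books(XML_VALUE):
        print(record["id"], record["price"], to_epoch_millis(record["pub_date"]))
```

For the sample book, the publication date 2000-10-01 corresponds to 970358400000 milliseconds since the epoch, which matches the `pub_date` storage value shown in the After record.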
Properties
| Name | Description | Type | Default | Valid Values | Importance |
|---|---|---|---|---|---|
| | A list of URLs that specify the location of the XML schemas the connector must load. Both HTTP and HTTPS paths are supported. | LIST | | | HIGH |
| | The Java package | STRING | | | HIGH |
| | Boolean value that tells the | BOOLEAN | | [true, false] | LOW |
| | Specifies whether the | BOOLEAN | | [true, false] | LOW |
| | If set to | BOOLEAN | | [true, false] | LOW |
Predicates
Transformations can be configured with predicates so that the transformation is applied only to records that satisfy a condition. You can use predicates in a transformation chain and, when combined with the Filter (Kafka) SMT, predicates can conditionally filter out specific records. For details and examples, see Predicates.
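As a rough illustration of attaching a predicate to this SMT, the Python sketch below adds predicate settings to the connector configuration shown earlier and prints the result as JSON. It assumes the Apache Kafka `TopicNameMatches` predicate and the standard `predicates.*` and `transforms.<alias>.predicate` keys; whether a given Confluent Cloud connector accepts them should be checked against the Predicates page. The helper `with_topic_predicate`, the alias `isXmlTopic`, and the pattern `xml-.*` are hypothetical.

```python
import json

# Connector configuration from this page.
BASE_CONFIG = {
    "transforms": "fromXml",
    "transforms.fromXml.type":
        "com.github.jcustenborder.kafka.connect.transform.xml.FromXml$Value",
    "transforms.fromXml.schema.path":
        "file:src/test/resources/com/github/jcustenborder/kafka/connect/transform/xml/books.xsd",
}


def with_topic_predicate(config, transform_alias, predicate_alias, topic_pattern):
    """Return a copy of config in which the transform runs only on matching topics."""
    updated = dict(config)  # leave the caller's dict untouched
    updated["predicates"] = predicate_alias
    updated[f"predicates.{predicate_alias}.type"] = (
        "org.apache.kafka.connect.transforms.predicates.TopicNameMatches"
    )
    updated[f"predicates.{predicate_alias}.pattern"] = topic_pattern
    updated[f"transforms.{transform_alias}.predicate"] = predicate_alias
    return updated


if __name__ == "__main__":
    # Hypothetical alias and topic pattern, chosen only for illustration.
    config = with_topic_predicate(BASE_CONFIG, "fromXml", "isXmlTopic", "xml-.*")
    print(json.dumps(config, indent=2))
```

With this configuration, the intent is that FromXML is applied only to records whose topic name matches the pattern, while records on other topics pass through unchanged.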