Kafka Connect Flatten SMT for Confluent Platform

The following provides usage information for the Apache Kafka® SMT org.apache.kafka.connect.transforms.Flatten.

Description

Flatten a nested data structure, generating names for each field by concatenating the field names at each level with a configurable delimiter character. Applies to a Struct when a schema is present, or a Map in the case of schemaless data. The JSON Single Message Transforms (SMT) delimiter is the period . character, which is also the default.

Use the concrete transformation type designed for the record key (org.apache.kafka.connect.transforms.Flatten$Key) or value (org.apache.kafka.connect.transforms.Flatten$Value).

When using an Avro schema, the default SMT delimiter period . cannot be used. Use an underscore _ for the delimiter. Use the following properties to change the delimiter:

"transforms": "flatten",
"transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
"transforms.flatten.delimiter": "_"

Note

The Flatten SMT does not flatten arrays. Arrays are preserved as is.

JSON Example

The configuration snippet below shows how to use Flatten to concatenate field names with the period . delimiter character.

"transforms": "flatten",
"transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
"transforms.flatten.delimiter": "."

Before:

{
  "content": {
    "id": 42,
    "name": {
      "first": "David",
      "middle": null,
      "last": "Wong"
    }
  }
}

After:

{
  "content.id": 42,
  "content.name.first": "David",
  "content.name.middle": null,
  "content.name.last": "Wong"
}

Avro Example

The Avro schema specification only allows alphanumeric characters and the underscore _ character in field names. The configuration snippet below shows how to use Flatten to concatenate field names with the underscore _ delimiter character.

"transforms": "flatten",
"transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
"transforms.flatten.delimiter": "_"

Before:

{
  "content": {
    "id": 42,
    "name": {
      "first": "David",
      "middle": null,
      "last": "Wong"
    }
  }
}

After:

{
  "content_id": 42,
  "content_name_first": "David",
  "content_name_middle": null,
  "content_name_last": "Wong"
 }

Tip

For additional examples, see Flatten for managed connectors.

Properties

Name Description Type Default Valid Values Importance
delimiter Delimiter to insert between field names from the input record when generating field names for the output record. string . _ medium

Predicates

Transformations can be configured with predicates so that the transformation is applied only to records which satisfy a condition. You can use predicates in a transformation chain and, when combined with the Kafka Connect Filter (Kafka) SMT for Confluent Platform, predicates can conditionally filter out specific records. For details and examples, see Predicates.