Redact Confluent Logs

Modern software that runs in a Java Virtual Machine (JVM) is typically built from hundreds of component libraries, which come from a wide variety of vendors and open source projects. Each component library usually creates log messages to capture errors, warnings, informative messages, and debug information throughout its classes and methods. In rare cases, a log statement inadvertently includes sensitive information without the component author realizing it, and once the component is packaged, end users may encounter scenarios that expose that information in application logs. This is a security concern. At a minimum, report it to the software provider and request a fix.

Regardless of whether such fixes are available for your component libraries, you can use the Confluent Log Redactor plugin for log4j to configure regular expression patterns (redaction rules) that identify and redact specific patterns of sensitive information in your log messages before the messages are delegated to other appenders and emitted. You can configure Log Redactor for a component (such as Kafka or Connect) by updating its log4j properties file.

For example, you might see that an HTTP Authorization header appears in your logs and file a ticket with the component provider requesting a fix. You can also configure Confluent Log Redactor with a simple rule that matches Authorization: Basic [0-9a-zA-Z\+\=\/]+ and replaces it with Authorization: Basic *****.
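
The Authorization rule described above can be tried out with a small script. This is only a sketch in Python, not the plugin itself: the rule text and sample message are hypothetical, and Python's re module stands in for the Java regular expressions the plugin would use. It also shows that each regex backslash is doubled when the pattern is written inside a JSON string.

```python
import json
import re

# Hypothetical rule for the Authorization example. Inside JSON, each regex
# backslash is doubled so the parsed pattern equals the one quoted in the text.
RULE_JSON = r"""
{
  "description": "Mask HTTP Basic credentials",
  "search": "Authorization: Basic [0-9a-zA-Z\\+\\=\\/]+",
  "replace": "Authorization: Basic *****"
}
"""

rule = json.loads(RULE_JSON)


def redact(message: str) -> str:
    # A function replacement inserts the replace text literally, so this
    # sketch does no backreference processing on it.
    return re.sub(rule["search"], lambda _match: rule["replace"], message)


# Made-up log message for illustration only.
sample = "Request headers: {Authorization: Basic dXNlcjpwYXNzd29yZA==, Accept: */*}"
print(redact(sample))
```

Messages that do not contain the header pass through unchanged.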

Install the Confluent Log Redactor plugin

For fresh installs, the Confluent Log Redactor plugin is enabled by default for Connect.

Configure the Log Redactor plugin

To configure the Log Redactor plugin, you need to:

  1. Back up your configuration files before making changes.
  2. Ensure that the JAR file is in the classpath for Connect.
  3. Update the log4j.properties files.
  4. Reference the Log Redactor class.

Configuration steps are covered in the following subsections.

Set up redaction rules

The redaction rules are specified in JSON format. The file content looks like the following example:

{
  "version": "1",
  "rules": [
    {
      "description": "This is the first rule",
      "trigger": "triggerstring 1",
      "search": "regex 1",
      "replace": "replace 1"
    },
    {
      "description": "This is the second rule",
      "trigger": "triggerstring 2",
      "caseSensitive": false,
      "search": "regex 2",
      "replace": "replace 2"
    }
  ]
}

If a log message matches a rule's trigger (or the rule has no trigger), the search field is used to search the message, and all occurrences are replaced with the value of replace.

Field | Description | Required?
trigger | A simple string comparison that can serve as a performance hint, because a plain string comparison may be faster than a regular expression. If trigger is omitted, search is always applied to the message. In future versions, this hint might be ignored. | No
caseSensitive | A boolean indicating whether trigger and search use case-sensitive or case-insensitive matching. Defaults to true (case-sensitive matching). | No
search | A regular expression. Make sure that proper escaping is used. | Yes
replace | A simple string. In practice, it usually looks something like XXXXXXX. If missing, the rule is detect-only and does not redact. | No
description | Intended for self-documentation purposes. | No
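
The field semantics in the table can be sketched as a single function. This is an illustration written in Python under stated assumptions, not the plugin's code: apply_rule is a hypothetical helper, the trigger is treated as a substring check, and Python's re module stands in for Java regular expressions.

```python
import re
from typing import Optional, Tuple


def apply_rule(message: str, rule: dict) -> Tuple[str, bool]:
    """Return (possibly redacted message, whether search matched)."""
    case_sensitive = rule.get("caseSensitive", True)
    trigger: Optional[str] = rule.get("trigger")

    # Cheap substring pre-check; skipped entirely when no trigger is given.
    if trigger is not None:
        if case_sensitive:
            found = trigger in message
        else:
            found = trigger.lower() in message.lower()
        if not found:
            return message, False

    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(rule["search"], flags)
    if pattern.search(message) is None:
        return message, False

    # Without "replace" the rule only detects; the message is left as-is.
    if "replace" not in rule:
        return message, True
    return pattern.sub(lambda _match: rule["replace"], message), True


# Hypothetical rule: case-insensitive trigger and search.
rule = {
    "trigger": "secret",
    "caseSensitive": False,
    "search": r"secret=\S+",
    "replace": "secret=XXXXXXX",
}
print(apply_rule("SECRET=abc123 ok", rule))
```

With caseSensitive left at its default, the same message would not pass the trigger check and would be returned unchanged.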

In the current implementation, the ordering of the rules is significant. The rules are evaluated strictly in the order given. Thus, in theory later rules might process the output of earlier rules (aaa->bbb, bbb->ccc). The use of rules that depend on this behavior is discouraged as Confluent offers no guarantee that this behavior will be maintained in future versions.
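
The chaining effect (aaa->bbb, bbb->ccc) is easy to see in a short simulation. The sketch below is written in Python and assumes only what the paragraph states, namely that rules run strictly in order and each rule sees the output of the previous one. The function name is hypothetical.

```python
import re


def apply_rules_in_order(message: str, rules: list) -> str:
    # Each rule receives the output of the rule before it.
    for rule in rules:
        message = re.sub(
            rule["search"], lambda _match, r=rule: r["replace"], message
        )
    return message


rules = [
    {"search": "aaa", "replace": "bbb"},
    {"search": "bbb", "replace": "ccc"},
]

# Original order: the second rule also rewrites what the first produced.
print(apply_rules_in_order("aaa bbb", rules))  # ccc ccc

# Reversed order gives a different result for the same input.
print(apply_rules_in_order("aaa bbb", list(reversed(rules))))  # bbb ccc
```

Because the outcome depends on rule order, rules whose search patterns cannot match another rule's replace text are easier to reason about.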

Example of a rules.json file

Here is an example of a rules.json file:

{
  "version": 1,
  "rules": [
    {
      "description": "No more vowels",
      "search": "[aeiou]",
      "replace": "x"
    },
    {
      "description": "Passwords",
      "trigger": "password",
      "search": "password=.*",
      "replace": "password=xxxxx"
    }
  ]
}

This example has two rules. The first rule replaces every lowercase vowel in all log messages with x. The second rule looks for messages containing password and replaces each occurrence of password= and everything that follows it on the line with password=xxxxx.
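
Before deploying a rules file like this one, it can be useful to check that it parses and that every rule has the required search field. The following is a rough pre-check sketched in Python, not a Confluent tool: validate_rules is a hypothetical helper, and because Python's re syntax differs from Java's in places, a pattern that compiles here is not guaranteed to behave identically in the plugin.

```python
import json
import re


def validate_rules(text: str) -> list:
    """Return a list of human-readable problems found in a rules document."""
    problems = []
    document = json.loads(text)  # raises ValueError on malformed JSON
    for index, rule in enumerate(document.get("rules", [])):
        search = rule.get("search")
        if not isinstance(search, str):
            problems.append(f"rule {index}: missing required 'search'")
            continue
        try:
            re.compile(search)
        except re.error as exc:
            problems.append(f"rule {index}: invalid regex: {exc}")
    return problems


EXAMPLE = """
{
  "version": 1,
  "rules": [
    {"description": "No more vowels", "search": "[aeiou]", "replace": "x"},
    {"description": "Passwords", "trigger": "password",
     "search": "password=.*", "replace": "password=xxxxx"}
  ]
}
"""

print(validate_rules(EXAMPLE))  # []
```

An empty list means no problems were found by this check.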

Watch for policy rule changes and update at runtime

The Log Redactor can redact log content using redaction rules that change dynamically, read from a file on the filesystem.

To set up the log redactor, use the following configuration in the log4j.properties file:

log4j.appender.redactor=io.confluent.log4j.redactor.RedactorAppender
log4j.appender.redactor.appenderRefs=stdout
log4j.appender.redactor.policy=io.confluent.log4j.redactor.RedactorPolicy
log4j.appender.redactor.policy.rules=/path/to/rules/file
log4j.appender.redactor.policy.refreshInterval=60000

log4j.logger.myLogger=stdout, redactor

The log4j.appender.redactor.policy.refreshInterval property, if present, specifies how often, in milliseconds, the file system is checked for changes. When a change is detected, the redactor automatically adopts the changes. If unspecified, the policy rules are only read once at startup.
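
The refresh behavior can be pictured as a periodic check of the rules file. The sketch below, in Python, is one way such polling could work and is not the plugin's implementation: the RuleFileWatcher class is hypothetical, and comparing the file's modification time is an assumption about how a change might be detected.

```python
import json
import os
import time


class RuleFileWatcher:
    """Reload rules from a JSON file at most once per refresh interval."""

    def __init__(self, path, refresh_interval_ms=None):
        self.path = path
        self.refresh_interval_ms = refresh_interval_ms
        self._last_check = time.monotonic()
        self._mtime = os.stat(path).st_mtime_ns
        self.rules = self._load()

    def _load(self):
        with open(self.path, encoding="utf-8") as handle:
            return json.load(handle)["rules"]

    def current_rules(self):
        # No interval configured: keep the rules read at startup.
        if self.refresh_interval_ms is None:
            return self.rules
        now = time.monotonic()
        if (now - self._last_check) * 1000 >= self.refresh_interval_ms:
            self._last_check = now
            mtime = os.stat(self.path).st_mtime_ns
            # Assumption: a changed modification time means changed rules.
            if mtime != self._mtime:
                self._mtime = mtime
                self.rules = self._load()
        return self.rules
```

A caller would construct the watcher with the path from policy.rules and the value of policy.refreshInterval, then call current_rules() before processing each message.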

Configure Log Redactor for Metadata Service (MDS)

Confluent Log Redactor is included in Confluent Platform and can be configured for Metadata Service (MDS) by updating its log4j properties file.

Here is an example of an updated log4j.properties file:

log4j.rootLogger=INFO, stdout, file, redactor

...

# Configures the Log Redactor rewrite appender which redacts log messages using the specified redaction regex
# rules. The `policy.rules` property specifies the location of the redaction rules file to be used.
# The appender redacts logs before forwarding them to other appenders specified in the `appenderRefs` property.
log4j.appender.redactor=io.confluent.log4j.redactor.RedactorAppender
log4j.appender.redactor.appenderRefs=stdout, file
log4j.appender.redactor.policy=io.confluent.log4j.redactor.RedactorPolicy
log4j.appender.redactor.policy.rules=${log4j.config.dir}/metadata-log-redactor-rules.json