Monitor Confluent Platform with Confluent for Kubernetes

Monitor your Confluent for Kubernetes (CFK) environment using the following tools and resources:

  • Confluent Health+ with Telemetry

  • JMX metrics monitoring integrations

  • Confluent Control Center (Legacy)

Confluent Health+

Confluent Health+ provides ongoing, real-time analysis of performance and configuration data for your Confluent Platform deployment. From this analysis, Health+ sends out notifications to alert users to potential environmental issues before they become critical problems.

For more information, see Confluent Health+.

Telemetry Reporter

The Confluent Telemetry Reporter is a plugin that runs inside each Confluent Platform service to push metadata about the service to Confluent. Telemetry Reporter enables product features based on the metadata, like Health+. Data is sent over HTTP using an encrypted connection.

Telemetry is disabled by default in CFK. You can enable and configure it globally at the CFK level.

When you globally enable Telemetry, you can still disable it for specific Confluent Platform components.

Each Confluent Platform component CR reports whether Telemetry is enabled or disabled in a status condition under status.conditions.
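You can inspect these conditions with kubectl, for example (the resource name and namespace are placeholders, and the exact condition type varies by CFK version):

```shell
kubectl get kafka <name> -n <namespace> -o jsonpath='{.status.conditions}'
```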

For more information and supported settings for Telemetry Reporter, see Confluent Telemetry Reporter.

For a list of the metrics that are collected for Health+, see Telemetry Reporter Metrics.

Globally configure Telemetry

To globally enable Telemetry Reporter for CFK and all Confluent Platform components:

  1. Set the following in the CFK values file:

    telemetry:
      operator:
         enabled: true  --- [1]
      enabled: true     --- [2]
    
    • [1] Enable Telemetry Reporter for CFK.

    • [2] Enable Telemetry Reporter for all Confluent Platform components.

  2. Apply the change with the following command:

    helm upgrade --install confluent-operator \
      confluentinc/confluent-for-kubernetes \
      --values <path-to-values-file> \
      --namespace <namespace>
    

To globally configure the Telemetry Reporter settings for all Confluent Platform components:

  1. Set the following in the CFK values file:

    telemetry:
      secretRef:                 --- [1]
      directoryPathInContainer:  --- [2]
    
    • [1] [2] CFK supports the secretRef and directoryPathInContainer methods to load Telemetry configuration through Helm. Specify only one method.

      The Telemetry configuration should be specified in telemetry.txt, and the file must contain api.key and api.secret.

      If using a proxy, additional properties are required as shown below.

      api.key=<cloud_key>
      api.secret=<cloud_secret>
      proxy.url=<proxy_url>           # Only required if proxy is enabled
      proxy.username=<proxy_username> # Only required if proxy requires credential
      proxy.password=<proxy_password> # Only required if proxy requires credential
      
    • [1] secretRef takes precedence over directoryPathInContainer if both are configured.

      The expected key is telemetry.txt.

      If the referenced Secret cannot be read or its data is not in the expected format, CFK will fail to start.

    • [2] Provide the directory path where telemetry.txt is present.

      If telemetry.txt is not in the expected format, CFK will fail to start.

    See Provide secrets for Confluent Platform application CR for providing the keys/values and required annotations when using Vault.
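    When using the secretRef method, the referenced Secret might be created from a manifest like the following sketch (the Secret name and namespace are illustrative; the expected key is telemetry.txt):

    ```yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: cfk-telemetry    # illustrative name, set in telemetry.secretRef
      namespace: confluent   # illustrative namespace
    type: Opaque
    stringData:
      telemetry.txt: |
        api.key=<cloud_key>
        api.secret=<cloud_secret>
    ```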

  2. To apply changes in Telemetry settings, in the referenced Secret, or in the telemetry.txt file, manually restart CFK and the Confluent Platform components.

Disable Telemetry for a Confluent Platform component

To disable Telemetry for a specific Confluent Platform component, set the following in the component CR and apply the change with the kubectl apply command:

telemetry:
  global: false
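In context, the telemetry block sits in the CR spec alongside the component's other settings; a fuller sketch for a Kafka CR, assuming it nests under spec as in the other CR examples in this section (metadata values are illustrative):

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
spec:
  telemetry:
    global: false
```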

JMX Metrics

CFK can deploy Confluent Platform components with JMX metrics.

Note

For enhanced security, JMX metrics are disabled by default starting with CFK 2.9.9. In earlier versions, JMX metrics were enabled by default.

When enabled, these JMX metrics are made available on all pods at the following endpoints:

  • JMX metrics are available on port 7203 of each pod.

  • Jolokia (a REST interface for JMX metrics) is available on port 7777 of each pod.

  • JMX Prometheus exporter is available on port 7778.

    Authentication and encryption are not supported for the Prometheus exporter in CFK.

Backward compatibility

The default behavior for JMX remote access has changed:

  • Previous behavior: The JMX port was accessible remotely without authentication. Users could read and write through the exposed JMX port.

  • Current behavior: If you do not configure JMX authentication in the CR specification, remote JMX access is disabled by default. The JMX port is only accessible locally within the pod.

Impact on existing deployments:

If you currently use remote JMX access without authentication, you must configure JMX authentication to continue using remote JMX. Internal applications like Jolokia metrics and JMX Prometheus exporter continue to work because they use the in-process JMX API and are not affected by remote connector settings.

Note

  • JMX authentication files do not support hot-reloading because the JVM only loads JMX credentials at startup. Changes to password or access control files require a pod restart to take effect.

  • The JMX password and access control secretRef are not owned by the CR and are not watched. JMX credential updates require a manual pod restart.

Configure JMX authentication

You can configure password-based authentication and access control for JMX connections to secure JMX endpoints.

JMX authentication requires explicit configuration. You must provide either a secretRef or directoryPathInContainer to enable JMX authentication. The operator does not support auto-generated passwords.

Note

Use JMX authentication with TLS/SSL encryption in production environments. See Configure Network Encryption for Confluent Platform Using Confluent for Kubernetes for TLS configuration.

Configure password-based authentication

To enable password-based authentication for JMX:

  1. Create a password file with the following format:

    <username> <password>
    <username> <password>
    

    For example:

    admin secretpassword
    readonly readonlypassword
    

    Set the password file permissions to 0600 (read/write for owner only).

    Note

    When you use secretRef, the operator automatically sets file permissions by specifying items in the Kubernetes SecretVolumeSource. The Kubernetes Kubelet applies these permissions when mounting the secrets into pods, ensuring the JMX password file is readable only by the Kafka process (UID 1001) without requiring manual user intervention.

    When you use directoryPathInContainer with Vault, you must set the correct file permissions before the Vault sidecar injects the password file into the container.

  2. Create a Kubernetes Secret with the password file:

    kubectl -n <namespace> create secret generic my-jmx-password \
      --from-literal=jmx='password: |
      admin secretpassword
      readonly readonlypassword'
    
  3. Configure the component CR:

    kind: Kafka
    spec:
      metrics:
        jmx:
          authentication:
            secretRef: my-jmx-password
    

Alternatively, you can use directoryPathInContainer to provide the path to a JMX password file (jmxremote.password) injected by a sidecar (for example, Vault).
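As a quick local sanity check, the file format and permission requirements above can be verified with standard shell tools. This is an illustrative sketch, not part of CFK; the file name jmxremote.password follows the convention used in this section:

```shell
# Create a sample JMX password file in the "<username> <password>" format
printf 'admin secretpassword\nreadonly readonlypassword\n' > jmxremote.password
chmod 0600 jmxremote.password

# Every non-empty line must have exactly two whitespace-separated fields
awk 'NF != 0 && NF != 2 { bad = 1 } END { exit bad }' jmxremote.password && echo "format ok"

# Permissions must be 0600 (GNU coreutils stat); prints 600
stat -c '%a' jmxremote.password
```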

Configure access control

Use JMX access control to restrict MBean operations with read-only or read-write permissions.

To configure access control:

  1. Create an access control file with the following format:

    <username> <access-level>
    

    Valid access levels are readonly and readwrite.

    For example:

    admin readwrite
    readonly readonly
    

    Set the access control file permissions to 0644 (read for all, write for owner).

    Note

    When you use secretRef, the operator automatically sets file permissions. When you use directoryPathInContainer with Vault, you must set the correct file permissions before the Vault sidecar injects the access control file into the container.

  2. Create a Kubernetes Secret with the access control file:

    kubectl -n <namespace> create secret generic my-jmx-access \
      --from-literal=jmx='access: |
      admin readwrite
      readonly readonly'
    
  3. Configure the component CR:

    kind: Kafka
    spec:
      metrics:
        jmx:
          authentication:
            secretRef: my-jmx-password
          accessControl:
            enabled: true
            secretRef: my-jmx-access
    

If you set accessControl.enabled to true without providing secretRef or directoryPathInContainer, the operator creates a default configuration with read-only access. When using the default access control file, the password file must contain at least one of the roles it defines: monitorRole or controlRole. The default access control file defines only these roles, both with read-only access.

If you set accessControl.enabled to false, the system does not apply access control, and all users can read and write.
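With the default access control file, the password file you provide must use those predefined usernames; a minimal example (the passwords are illustrative):

```
monitorRole monitorpass
controlRole controlpass
```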

Configure with Vault injection

As an alternative to Kubernetes Secrets, you can use directoryPathInContainer to provide JMX password and access control files injected by Vault.

To configure JMX authentication and access control with Vault:

  1. Configure the component CR with Vault annotations:

    apiVersion: platform.confluent.io/v1beta1
    kind: Kafka
    metadata:
      name: kafka
    spec:
      metrics:
        jmx:
          authentication:
            directoryPathInContainer: /vault/secrets/jmx
          accessControl:
            enabled: true
            directoryPathInContainer: /vault/secrets/jmx
      podTemplate:
        annotations:
          # Vault Agent Injection
          vault.hashicorp.com/agent-inject: "true"
          vault.hashicorp.com/agent-inject-status: "update"
          vault.hashicorp.com/preserve-secret-case: "true"
          vault.hashicorp.com/role: "confluent-operator"
          vault.hashicorp.com/agent-run-as-user: "1001"
          vault.hashicorp.com/agent-run-as-group: "1001"
    
          # Inject JMX password file
          vault.hashicorp.com/agent-inject-secret-jmx-password: "secret/kafka/jmx/password"
          vault.hashicorp.com/secret-volume-path-jmx-password: "/vault/secrets/jmx"
          vault.hashicorp.com/agent-inject-file-jmx-password: "jmxremote.password"
          vault.hashicorp.com/agent-inject-template-jmx-password: |
            {{- with secret "secret/kafka/jmx/password" -}}
            {{ .Data.data.content }}
            {{- end }}
          vault.hashicorp.com/agent-inject-perms-jmx-password: "0600"
    
          # Inject JMX access control file
          vault.hashicorp.com/agent-inject-secret-jmx-access: "secret/kafka/jmx/access"
          vault.hashicorp.com/secret-volume-path-jmx-access: "/vault/secrets/jmx"
          vault.hashicorp.com/agent-inject-file-jmx-access: "jmxremote.access"
          vault.hashicorp.com/agent-inject-template-jmx-access: |
            {{- with secret "secret/kafka/jmx/access" -}}
            {{ .Data.data.content }}
            {{- end }}
          vault.hashicorp.com/agent-inject-perms-jmx-access: "0644"
    

    Important

    When using directoryPathInContainer, you must set the correct file permissions before the Vault sidecar injects the files into the container:

    • Password file (jmxremote.password): 0600 (read/write for owner only)

    • Access control file (jmxremote.access): 0644 (read for all, write for owner)

    As shown in the example above, use the vault.hashicorp.com/agent-inject-perms-<file> annotation to set these permissions.

    When using DPIC (Dynamic Pod Identity Credentials), you must also include the following annotations:

    • vault.hashicorp.com/agent-run-as-user: "1001"

    • vault.hashicorp.com/agent-run-as-group: "1001"

    The Vault sidecar injects the JMX password file (jmxremote.password) and access control file (jmxremote.access) into the specified directory path with the required file permissions.

Configure security on JMX metrics endpoints

By default, authentication, encryption, and external access are not provided on JMX/Prometheus metric endpoints. You can configure authentication and TLS for the JMX/Prometheus metric endpoints in the component CR:

spec:
  metrics:
    authentication:
      type:                    --- [1]
    prometheus:                --- [2]
      rules:                   --- [3]
        - attrNameSnakeCase:
          cache:
          help:
          labels:
          name:
          pattern:
          type:
          value:
          valueFactor:
      blacklist:               --- [4]
      whitelist:               --- [5]
    tls:
      enabled:                 --- [6]
  • [1] Set to mtls for mTLS authentication.

    If you set this to mtls, you must set tls.enabled: true ([6]).

    Note that CFK does not currently support mTLS authentication for Prometheus even with this set to mtls.

  • [2] Specify Prometheus configurations to override the default settings.

    See Prometheus for more information about the rules, blacklist, and whitelist properties.

  • [3] A list of rules to apply.

    For example:

    spec:
      metrics:
        prometheus:
          rules:
            - pattern: 'org.apache.kafka.metrics<type=(\w+), name=(\w+)><>Value: (\d+)'
              name: "kafka_$1_$2"
              value: "$3"
              valueFactor: "0.000001"
              labels:
                type: "$1"
                name: "$2"
              help: "Kafka metric $1 $2"
              cache: false
              type: "GAUGE"
              attrNameSnakeCase: false
    
  • [4] An array of patterns (as strings) identifying which MBeans not to query.

    For example:

    spec:
      metrics:
        prometheus:
          blacklist:
          - "org.apache.kafka.metrics:*"
    
  • [5] An array of patterns (as strings) identifying which MBeans to query.

    For example:

    spec:
      metrics:
        prometheus:
          whitelist:
          - "org.apache.kafka.metrics:type=ColumnFamily,*"
    
  • [6] If set to true, metrics are configured with global or component TLS as described in Configure Network Encryption for Confluent Platform Using Confluent for Kubernetes.

    This setting is ignored for Prometheus. CFK currently does not support TLS for Prometheus.

Note

CFK does not support configuring external access to JMX/Prometheus metric endpoints at the component CR level. To enable external access, set up your own Kubernetes external access mechanism that allows incoming requests to an external FQDN to be routed to the pod’s JMX/Prometheus metric endpoint port.
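Combining [1] and [6] above, a component CR fragment that secures the JMX endpoint with mTLS looks like this sketch:

```yaml
spec:
  metrics:
    authentication:
      type: mtls   # requires tls.enabled: true below
    tls:
      enabled: true
```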

Configure Prometheus and Grafana

You can configure Prometheus to capture and aggregate JMX metrics from Confluent components. Then you configure Grafana to visualize those metrics in a dashboard.

For example configuration scenarios, see an example of monitoring with Prometheus and Grafana in jmx-monitoring-stacks and an example of monitoring with Grafana metrics dashboard.
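As a sketch of the Prometheus side, a scrape job that discovers Confluent pods and keeps only the exporter port might look like the following (the job name and namespace are assumptions; see jmx-monitoring-stacks for complete, tested examples):

```yaml
scrape_configs:
  - job_name: confluent-jmx        # illustrative job name
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: [confluent]       # illustrative namespace
    relabel_configs:
      # Keep only the JMX Prometheus exporter port (7778) on each pod
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        regex: "7778"
        action: keep
```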

Confluent Control Center (Legacy)

Confluent Control Center (Legacy) is a web-based tool for managing and monitoring Confluent Platform. Control Center (Legacy) provides a user interface that enables developers and operators to:

  • Get a quick overview of cluster health

  • Observe and control messages, topics, and Schema Registry

  • Develop and run ksqlDB queries

For the metrics available for monitoring, see Metrics available in Control Center.

Configure Control Center (Legacy) to monitor Kafka clusters

The Confluent Metrics Reporter collects various metrics from an Apache Kafka® cluster. Control Center (Legacy) then uses those metrics to provide a detailed monitoring view of the Kafka cluster.

By default, the Confluent Metrics Reporter is enabled and configured to send Kafka metrics to a set of topics on the same Kafka cluster.

To send metrics to a different cluster, or to configure specific authentication settings, configure the Kafka custom resource (CR):

kind: Kafka
spec:
  metricReporter:
    enabled:                       --- [1]
    authentication:
      type:                        --- [2]
      jaasConfigPassThrough:
        secretRef:                 --- [3]
        directoryPathInContainer:  --- [4]
    tls:
      enabled:                     --- [5]
  • [1] Set to true or false to enable or disable metrics reporting.

  • [2] Set to the authentication type to use for Kafka. See Configure authentication to access Kafka for details.

  • [3] Set to the Kubernetes Secret name used to authenticate to Kafka.

  • [4] Set to the directory path in the Kafka container where the Kafka authentication credentials are injected by Vault.

    See Provide secrets for Confluent Platform component CR for providing the credential and required annotations when using Vault.

  • [5] Set to true if the Kafka cluster has TLS network encryption enabled.
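Filled in, the snippet above might look like the following sketch for SASL/PLAIN over TLS (the Secret name is illustrative; to target a different cluster, CFK also accepts a bootstrap endpoint setting under metricReporter, so check the CRD reference for your version):

```yaml
kind: Kafka
spec:
  metricReporter:
    enabled: true
    authentication:
      type: plain
      jaasConfigPassThrough:
        secretRef: metrics-kafka-credentials   # illustrative Secret name
    tls:
      enabled: true
```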

Once Confluent Metrics Reporter is set up for a Kafka cluster, configure Control Center (Legacy) to monitor the cluster.

By default, Control Center (Legacy) is set up to monitor the Kafka cluster it is using to store its own state. This Kafka cluster is defined using spec.dependencies.kafka in the Confluent Control Center (Legacy) CR.

If there is another Kafka cluster to monitor, you can configure that in the Control Center (Legacy) CR as below:

kind: ControlCenter
spec:
  monitoringKafkaClusters:
  - name:                         --- [1]
    bootstrapEndpoint:            --- [2]
    authentication:
      type:                       --- [3]
      jaasConfig:                 --- [4]
      jaasConfigPassThrough:      --- [5]
      oauthbearer:                --- [6]
        secretRef:                --- [7]
        directoryPathInContainer: --- [8]
    tls:
      enabled:                    --- [9]
  • [1] Set to the Kafka cluster name.

  • [2] Set to the Kafka bootstrap endpoint.

  • [3] Set to the Kafka client authentication type.

    When RBAC is not enabled, valid options are plain and mtls.

    When RBAC is enabled, the only valid option is oauthbearer.

  • [4] [5] For authenticating to a Kafka cluster using SASL/PLAIN, see Client-side SASL/PLAIN authentication for Kafka.

  • [6] When Confluent Control Center (Legacy) authorization type is set to RBAC (spec.authorization.type: rbac) and the authentication type is set to oauthbearer in [3], use the OAuth method to authenticate to the Kafka cluster.

  • [7] The username and password are loaded through secretRef. The expected key is bearer.txt, and the value for the key is:

    username=<username>
    password=<password>
    

    An example command to create a secret to use for this property:

    kubectl create secret generic oauth-client \
      --from-file=bearer.txt=/some/path/bearer.txt \
      --namespace confluent
    
  • [8] The directory in the Confluent Control Center (Legacy) container where the expected bearer credentials are injected by Vault. See [7] above for the expected format.

    See Provide secrets for Confluent Platform component CR for providing the credential and required annotations when using Vault.

  • [9] For authenticating to a Kafka cluster using mTLS, see Client-side mTLS authentication for Kafka.
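A filled-in sketch for monitoring a non-RBAC cluster over mTLS (the name and endpoint are illustrative; 9071 is a common internal listener port in CFK deployments):

```yaml
kind: ControlCenter
spec:
  monitoringKafkaClusters:
  - name: kafka-dev
    bootstrapEndpoint: kafka-dev.confluent.svc.cluster.local:9071
    authentication:
      type: mtls
    tls:
      enabled: true
```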

Configure Control Center (Legacy) to monitor remote Kafka clusters

To monitor a Kafka cluster in a different Kubernetes cluster:

  1. Configure the Control Center (Legacy) CR as described in monitoring additional Kafka clusters.

    When RBAC is enabled, you must set authentication.type to oauthbearer, and provide oauthbearer credentials in the Control Center (Legacy) CR.

  2. Configure the replication listener with external access on the remote Kafka as described in Configure Kafka in MRC with external access URLs.

    If RBAC is enabled on the remote Kafka cluster, the Kafka token listener piggybacks on the replication listener.

  3. Configure DNS where Control Center (Legacy) runs to be able to resolve the Kafka replication listener endpoint.

Configure Control Center (Legacy) to monitor ksqlDB, Connect and Schema Registry clusters

You can configure Control Center (Legacy) to provide a detailed monitoring or management view of ksqlDB, Connect, and Schema Registry clusters.

The following is an example of the dependencies section in a Control Center (Legacy) CR. The example connects two Schema Registry clusters, two ksqlDB clusters, and two Connect clusters to Control Center (Legacy):

kind: ControlCenter
spec:
  dependencies:
    schemaRegistry:
      url: https://schemaregistry.confluent.svc.cluster.local:8081
      tls:
        enabled: true
      authentication:
        type: mtls
      clusters:
      - name: schemaregistry-dev
        url: https://schemaregistry-dev.confluent.svc.cluster.local:8081
        tls:
          enabled: true
        authentication:
          type: mtls
    ksqldb:
    - name: ksql-dev
      url: https://ksqldb.confluent.svc.cluster.local:8088
      tls:
        enabled: true
      authentication:
        type: mtls
    - name: ksql-dev1
      url: https://ksqldb-dev.confluent.svc.cluster.local:8088
      tls:
        enabled: true
      authentication:
        type: mtls
    connect:
    - name: connect-dev
      url: https://connect.confluent.svc.cluster.local:8083
      tls:
        enabled: true
      authentication:
        type: mtls
    - name: connect-dev2
      url: https://connect-dev.confluent.svc.cluster.local:8083
      tls:
        enabled: true
      authentication:
        type: mtls

For an example scenario to configure Confluent Control Center (Legacy) to monitor multiple ksqlDB, Connect, and Schema Registry clusters, see Connect Control Center to Multiple Connect, ksqlDB, and Schema Registry Clusters.