Monitor Confluent Platform with Confluent for Kubernetes

Monitor your Confluent for Kubernetes (CFK) environment using the following tools and resources:

  • Confluent Health+ with Telemetry
  • JMX metrics monitoring integrations
  • Confluent Control Center

Confluent Health+

Confluent Health+ provides ongoing, real-time analysis of performance and configuration data for your Confluent Platform deployment. From this analysis, Health+ sends out notifications to alert users to potential environmental issues before they become critical problems.

For more information, see Confluent Health+.

Telemetry Reporter

The Confluent Telemetry Reporter is a plugin that runs inside each Confluent Platform service to push metadata about the service to Confluent. Telemetry Reporter enables product features based on the metadata, like Health+. Data is sent over HTTP using an encrypted connection.

Telemetry is disabled by default in CFK. You can enable and configure it globally at the CFK level.

When you globally enable Telemetry, you can still disable it for specific Confluent Platform components.

Each Confluent Platform component CR reports whether Telemetry is enabled or disabled as a status condition in status.conditions.

For more information and supported settings for Telemetry Reporter, see Confluent Telemetry Reporter.

For a list of the metrics that are collected for Health+, see Telemetry Reporter Metrics.

Globally configure Telemetry

To globally enable Telemetry Reporter for CFK and all Confluent Platform components:

  1. Set the following in the CFK values file:

    telemetry:
      operator:
        enabled: true   --- [1]
      enabled: true     --- [2]
    
    • [1] Enable Telemetry Reporter for CFK.
    • [2] Enable Telemetry Reporter for all Confluent Platform components.
  2. Apply the change with the following command:

    helm upgrade --install confluent-operator \
      confluentinc/confluent-for-kubernetes \
      --values <path-to-values-file> \
      --namespace <namespace>
    

To globally configure the Telemetry Reporter settings for all Confluent Platform components:

  1. Set the following in the CFK values file:

    telemetry:
      secretRef:                 --- [1]
      directoryPathInContainer:  --- [2]
    
    • [1] [2] CFK supports the secretRef and directoryPathInContainer methods to load Telemetry configuration through Helm. Specify only one method.

      The Telemetry configuration should be specified in telemetry.txt, and the file must contain api.key and api.secret.

      If using a proxy, additional properties are required as shown below.

      api.key=<cloud_key>
      api.secret=<cloud_secret>
      proxy.url=<proxy_url>           # Only required if proxy is enabled
      proxy.username=<proxy_username> # Only required if proxy requires credentials
      proxy.password=<proxy_password> # Only required if proxy requires credentials
      
    • [1] secretRef takes precedence over directoryPathInContainer if both are configured.

      The expected key is telemetry.txt.

      If the referenced Secret cannot be read or its data is not in the expected format, CFK will fail to start.

    • [2] Provide the directory path where telemetry.txt is present.

      If telemetry.txt is not in the expected format, CFK will fail to start.

    See Provide secrets for Confluent Platform application CR for providing the keys/values and required annotations when using Vault.

  2. To apply changes in Telemetry settings, in the referenced Secret, or in the telemetry.txt file, manually restart CFK and Confluent Platform.
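
As one possible way to supply telemetry.txt through secretRef, the sketch below defines a Kubernetes Secret whose single key is telemetry.txt. The Secret name telemetry-config and the confluent namespace are assumptions made for illustration, and the sketch assumes the Secret is created in the namespace where CFK is deployed. Replace the placeholders with your own credentials.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: telemetry-config   # hypothetical name, not from the CFK docs
  namespace: confluent     # assumed namespace where CFK runs
type: Opaque
stringData:
  # The expected key is telemetry.txt; the value holds the properties file.
  telemetry.txt: |
    api.key=<cloud_key>
    api.secret=<cloud_secret>
```

With a Secret like this in place, telemetry.secretRef in the CFK values file would presumably be set to the Secret name (telemetry-config in this sketch).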

Disable Telemetry for a Confluent Platform component

To disable Telemetry for a specific Confluent Platform component, set the following in the component CR and apply the change with the kubectl apply command:

telemetry:
  global: false
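
For context, a minimal sketch of where this setting might sit in a full component CR is shown below. It assumes the telemetry block belongs under spec, uses a hypothetical Connect resource named connect in the confluent namespace, and omits the other fields a real CR needs.

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Connect
metadata:
  name: connect          # hypothetical resource name
  namespace: confluent   # assumed namespace
spec:
  # Other required fields (replicas, image, dependencies, ...) omitted for brevity.
  telemetry:
    global: false        # opt this component out of the global Telemetry setting
```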

JMX Metrics

CFK deploys all Confluent components with JMX metrics enabled by default. These JMX metrics are made available on all pods at the following endpoints:

  • JMX metrics are available on port 7203 of each pod.

  • Jolokia (a REST interface for JMX metrics) is available on port 7777 of each pod.

  • JMX Prometheus exporter is available on port 7778.

    Authentication and encryption are not supported for the Prometheus exporter in CFK.

Configure security on JMX metrics endpoints

By default, authentication and encryption are not enabled on JMX/Prometheus metric endpoints, but you have options to configure authentication and TLS for JMX/Prometheus metric endpoints in the component CR:

spec:
  metrics:
    authentication:
      type:                    --- [1]
    prometheus:                --- [2]
      rules:                   --- [3]
        - attrNameSnakeCase:
          cache:
          help:
          labels:
          name:
          pattern:
          type:
          value:
          valueFactor:
      blacklist:               --- [4]
      whitelist:               --- [5]
    tls:
      enabled:                 --- [6]

  • [1] Set to mtls for mTLS authentication.

    If you set this to mtls, you must set tls.enabled: true ([6]).

    Note that CFK does not currently support mTLS authentication for Prometheus even with this set to mtls.

  • [2] Specify Prometheus configurations to override the default settings.

    See Prometheus for more information about the rules, blacklist, and whitelist properties.

  • [3] A list of rules to apply.

    For example:

    spec:
      metrics:
        prometheus:
          rules:
            - pattern: 'org.apache.kafka.metrics<type=(\w+), name=(\w+)><>Value: (\d+)'
              name: "kafka_$1_$2"
              value: "$3"
              valueFactor: "0.000001"
              labels:
                "$1": "$4"
                "$2": "$3"
              help: "Kafka metric $1 $2"
              cache: false
              type: "GAUGE"
              attrNameSnakeCase: false
    
  • [4] An array of patterns (in the string format) to identify what not to query.

    For example:

    spec:
      metrics:
        prometheus:
          blacklist:
          - "org.apache.kafka.metrics:*"
    
  • [5] An array of patterns (in the string format) to identify what to query.

    For example:

    spec:
      metrics:
        prometheus:
          whitelist:
          - "org.apache.kafka.metrics:type=ColumnFamily,*"
    
  • [6] If set to true, metrics are configured with global or component TLS as described in Configure Network Encryption for Confluent Platform Using Confluent for Kubernetes.

    This setting is ignored for Prometheus. CFK currently does not support TLS for Prometheus.
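
Putting callouts [1] and [6] together, a minimal sketch of a Kafka CR that enables mTLS authentication and TLS on the JMX endpoints might look like the following. The resource name and namespace are assumptions, and other required fields are omitted. Per the notes above, these settings would not apply to the Prometheus exporter.

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka            # hypothetical resource name
  namespace: confluent   # assumed namespace
spec:
  # Other required fields omitted for brevity.
  metrics:
    authentication:
      type: mtls         # [1] requires tls.enabled: true
    tls:
      enabled: true      # [6] uses the global or component TLS configuration
```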

Configure external access on JMX metrics

By default, CFK does not configure external access on JMX metric endpoints.

As an advanced configuration option, CFK supports enabling external access on JMX metrics endpoints. You need this setup only if the Jolokia HTTP-based solution is not enough.

To enable external access for a Confluent Platform component at the component CR level, use the annotation:

  1. Set up a Kubernetes external access service that allows incoming requests to an external FQDN to be routed to the pod’s JMX/Prometheus metric endpoint port. See Networking Service.

  2. Apply the annotation on the Confluent Platform component CR using a comma-separated list of the DNS (or IP) and the port you got from the previous step when you created a service.

    "platform.confluent.io/jmx-rmi-server-hostnames": "pod1.external-address:<port1>,pod2.external-address:<port2>, ... ,podn.external-address:<portn>"
    

    For example, using the external IP address of the node, 35.35.35.35, and nodePorts 30001 to 30003, the following CR configures node port external access on the Kafka nodes:

    kind: Kafka
    metadata:
      name: kafka
      namespace: operator
      annotations:
        "platform.confluent.io/jmx-rmi-server-urls": "35.35.35.35:30001,35.35.35.35:30002,35.35.35.35:30003"
    spec:
      replicas: 3
    
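
Step 1 above could be sketched with one NodePort Service per pod, as in the hedged example below for the first Kafka pod. It assumes the pods are managed by a StatefulSet named kafka, so that pod kafka-0 carries the standard statefulset.kubernetes.io/pod-name label; the Service name is hypothetical, and the node port matches the first port in the example annotation.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kafka-0-jmx      # hypothetical Service name
  namespace: operator
spec:
  type: NodePort
  selector:
    # Label set by the StatefulSet controller on each pod (assumes pod kafka-0).
    statefulset.kubernetes.io/pod-name: kafka-0
  ports:
    - name: jmx
      port: 7203         # JMX port exposed on each pod
      targetPort: 7203
      nodePort: 30001    # must fall within the cluster's node port range
```

Similar Services for kafka-1 and kafka-2 would use node ports 30002 and 30003.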

Configure Prometheus and Grafana

You can configure Prometheus to capture and aggregate JMX metrics from Confluent components. Then you configure Grafana to visualize those metrics in a dashboard.

For example configuration scenarios, see an example of monitoring with Prometheus and Grafana in jmx-monitoring-stacks and an example of monitoring with Grafana metrics dashboard.
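
As a rough illustration of the Prometheus side, the following scrape configuration sketch targets the JMX Prometheus exporter on port 7778 of three Kafka pods. The job name and the pod DNS names are assumptions (a Kafka cluster named kafka in the confluent namespace); adjust them to match your deployment, or use Kubernetes service discovery instead of static targets.

```yaml
scrape_configs:
  - job_name: confluent-kafka-jmx   # hypothetical job name
    metrics_path: /metrics
    scheme: http                    # CFK does not support TLS on the Prometheus exporter
    static_configs:
      - targets:
          # Assumed in-cluster pod DNS names; port 7778 is the exporter port.
          - kafka-0.kafka.confluent.svc.cluster.local:7778
          - kafka-1.kafka.confluent.svc.cluster.local:7778
          - kafka-2.kafka.confluent.svc.cluster.local:7778
```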

Confluent Control Center

Confluent Control Center is a web-based tool for managing and monitoring Confluent Platform. Control Center provides a user interface that enables developers and operators to:

  • Get a quick overview of cluster health
  • Observe and control messages, topics, and Schema Registry
  • Develop and run ksqlDB queries

For the metrics available for monitoring, see Metrics available in Control Center.

Configure Control Center to monitor Kafka clusters

The Confluent Metrics Reporter collects various metrics from an Apache Kafka® cluster. Control Center then uses those metrics to provide a detailed monitoring view of the Kafka cluster.

By default, the Confluent Metrics Reporter is enabled and configured to send Kafka metrics to a set of topics on the same Kafka cluster.

To send metrics to a different cluster, or to configure specific authentication settings, configure the Kafka custom resource (CR):

kind: Kafka
spec:
  metricReporter:
    enabled:                       --- [1]
    authentication:
      type:                        --- [2]
      jaasConfigPassThrough:
        secretRef:                 --- [3]
        directoryPathInContainer:  --- [4]
    tls:
      enabled:                     --- [5]

  • [1] Set to true or false to enable or disable the metrics reporting.

  • [2] Set to the authentication type to use for Kafka. See Configure authentication to access Kafka for details.

  • [3] Set to the Kubernetes Secret name used to authenticate to Kafka.

  • [4] Set to the directory path in the Kafka container where the Kafka authentication credentials are injected by Vault.

    See Provide secrets for Confluent Platform component CR for providing the credential and required annotations when using Vault.

  • [5] Set to true if the Kafka cluster has TLS network encryption enabled.
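
A filled-in version of the template above might look like the following sketch. The plain authentication type and the Secret name metric-reporter-credential are assumptions chosen for illustration; use the authentication type and credential format that your Kafka cluster actually requires.

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka            # hypothetical resource name
  namespace: confluent   # assumed namespace
spec:
  # Other required fields omitted for brevity.
  metricReporter:
    enabled: true                               # [1]
    authentication:
      type: plain                               # [2] assumed authentication type
      jaasConfigPassThrough:
        secretRef: metric-reporter-credential   # [3] hypothetical Secret name
    tls:
      enabled: true                             # [5]
```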

Once Confluent Metrics Reporter is set up for a Kafka cluster, configure Control Center to monitor the cluster.

By default, Control Center is set up to monitor the Kafka cluster it is using to store its own state. This Kafka cluster is defined using spec.dependencies.kafka in the Confluent Control Center CR.

If there is another Kafka cluster to monitor, you can configure that in the Control Center CR as below:

kind: ControlCenter
spec:
  monitoringKafkaClusters:
  - name:                         --- [1]
    bootstrapEndpoint:            --- [2]
    authentication:
      type:                       --- [3]
      jaasConfig:                 --- [4]
      jaasConfigPassThrough:      --- [5]
      oauthbearer:                --- [6]
        secretRef:                --- [7]
        directoryPathInContainer: --- [8]
    tls:
      enabled:                    --- [9]

  • [1] Set to the Kafka cluster name.

  • [2] Set to the Kafka bootstrap endpoint.

  • [3] Set to the Kafka client authentication type.

    When RBAC is not enabled, valid options are plain and mtls.

    When RBAC is enabled, the only valid option is oauthbearer.

  • [4] [5] For authenticating to a Kafka cluster using SASL/PLAIN, see Client-side SASL/PLAIN authentication for Kafka.

  • [6] When Confluent Control Center authorization type is set to RBAC (spec.authorization.type: rbac) and the authentication type is set to oauthbearer in [3], use the OAuth method to authenticate to the Kafka cluster.

  • [7] The username and password are loaded through secretRef. The expected key is bearer.txt, and the value for the key is:

    username=<username>
    password=<password>
    

    An example command to create a secret to use for this property:

    kubectl create secret generic oauth-client \
      --from-file=bearer.txt=/some/path/bearer.txt \
      --namespace confluent
    
  • [8] The directory in the Confluent Control Center container where the expected Bearer credentials are injected by Vault. See above ([7]) for the expected format.

    See Provide secrets for Confluent Platform component CR for providing the credential and required annotations when using Vault.

  • [9] For authenticating to a Kafka cluster using mTLS, see Client-side mTLS authentication for Kafka.
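
To make the callouts more concrete, here is a hedged sketch of a Control Center CR that monitors one additional Kafka cluster using oauthbearer authentication. It assumes RBAC is enabled, reuses the oauth-client Secret from the example command above, and uses a hypothetical cluster name and bootstrap endpoint.

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: ControlCenter
metadata:
  name: controlcenter    # hypothetical resource name
  namespace: confluent
spec:
  # Other required fields (including the RBAC settings) omitted for brevity.
  monitoringKafkaClusters:
    - name: kafka-dev                                                # [1] hypothetical name
      bootstrapEndpoint: kafka-dev.confluent.svc.cluster.local:9071  # [2] assumed endpoint
      authentication:
        type: oauthbearer                                            # [3]
        oauthbearer:
          secretRef: oauth-client                                    # [7] Secret with bearer.txt
      tls:
        enabled: true                                                # [9]
```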

Configure Control Center to monitor remote Kafka clusters

To monitor a Kafka cluster in a different Kubernetes cluster:

  1. Configure the Control Center CR as described in monitoring additional Kafka clusters.

    When RBAC is enabled, you must set authentication.type to oauthbearer, and provide oauthbearer credentials in the Control Center CR.

  2. Configure the replication listener with external access on the remote Kafka as described in Configure Kafka in MRC with external access URLs.

    If RBAC is enabled on the remote Kafka cluster, the Kafka token listener piggybacks on the replication listener.

  3. Configure DNS where Control Center runs to be able to resolve the Kafka replication listener endpoint.

Configure Control Center to monitor ksqlDB, Connect and Schema Registry clusters

You can configure Control Center to provide a detailed monitoring or management view of ksqlDB, Connect, and Schema Registry clusters.

The following is an example of the dependencies section in a Control Center CR. The example connects two Schema Registry clusters, two ksqlDB clusters, and two Connect clusters to Control Center:

kind: ControlCenter
spec:
 dependencies:
   schemaRegistry:
     url: https://schemaregistry.confluent.svc.cluster.local:8081
     tls:
       enabled: true
     authentication:
       type: mtls
     clusters:
     - name: schemaregistry-dev
       url: https://schemaregistry-dev.confluent.svc.cluster.local:8081
       tls:
         enabled: true
       authentication:
         type: mtls
   ksqldb:
   - name: ksql-dev
     url: https://ksqldb.confluent.svc.cluster.local:8088
     tls:
       enabled: true
     authentication:
       type: mtls
   - name: ksql-dev1
     url: https://ksqldb-dev.confluent.svc.cluster.local:8088
     tls:
       enabled: true
     authentication:
       type: mtls
   connect:
   - name: connect-dev
     url: https://connect.confluent.svc.cluster.local:8083
     tls:
       enabled: true
     authentication:
       type: mtls
   - name: connect-dev2
     url: https://connect-dev.confluent.svc.cluster.local:8083
     tls:
       enabled: true
     authentication:
       type: mtls

For an example scenario to configure Confluent Control Center to monitor multiple ksqlDB, Connect, and Schema Registry clusters, see Connect Control Center to Multiple Connect, ksqlDB, and Schema Registry Clusters.