Monitor Confluent Platform with Confluent for Kubernetes
Monitor your Confluent for Kubernetes (CFK) environment using the following tools and resources:
Confluent Health+ with Telemetry
JMX metrics monitoring integrations
Confluent Control Center (Legacy)
Confluent Health+
Confluent Health+ provides ongoing, real-time analysis of performance and configuration data for your Confluent Platform deployment. From this analysis, Health+ sends out notifications to alert users to potential environmental issues before they become critical problems.
For more information, see Confluent Health+.
Telemetry Reporter
The Confluent Telemetry Reporter is a plugin that runs inside each Confluent Platform service to push metadata about the service to Confluent. Telemetry Reporter enables product features based on the metadata, like Health+. Data is sent over HTTP using an encrypted connection.
Telemetry is disabled by default in CFK. You can enable and configure it globally at the CFK level.
When you globally enable Telemetry, you can still disable it for specific Confluent Platform components.
Each Confluent Platform component CR reports, in status.conditions, whether Telemetry is enabled or disabled.
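One way to inspect these conditions is with kubectl. The following is a minimal sketch, not an official procedure: the helper name show_cr_conditions is hypothetical, and the kind, CR name, and namespace are placeholders for your deployment.

```shell
# Hypothetical helper: print every status condition of a component CR
# as "type<TAB>status<TAB>reason", one condition per line.
# Arguments: <kind> <name> <namespace>, for example: kafka kafka confluent
show_cr_conditions() {
  kubectl get "$1" "$2" --namespace "$3" \
    --output jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}'
}
```

For example, show_cr_conditions kafka kafka confluent lists all conditions of a Kafka CR named kafka. The helper prints every condition rather than filtering, because the exact condition type used for Telemetry is not stated here.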
For more information and supported settings for Telemetry Reporter, see Confluent Telemetry Reporter.
For a list of the metrics that are collected for Health+, see Telemetry Reporter Metrics.
Globally configure Telemetry
To globally enable Telemetry Reporter for CFK and all Confluent Platform components:
Set the following in the CFK values file:
telemetry:
  operator:
    enabled: true     --- [1]
  enabled: true       --- [2]
[1] Enable Telemetry Reporter for CFK.
[2] Enable Telemetry Reporter for all Confluent Platform components.
Apply the change with the following command:
helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --values <path-to-values-file> \
  --namespace <namespace>
To globally configure the Telemetry Reporter settings for all Confluent Platform components:
Set the following in the CFK values file:
telemetry:
  secretRef:                    --- [1]
  directoryPathInContainer:     --- [2]
[1] [2] CFK supports the secretRef and directoryPathInContainer methods to load Telemetry configuration through Helm. Specify only one method.
The Telemetry configuration should be specified in telemetry.txt, and the file must contain api.key and api.secret.
If using a proxy, additional properties are required as shown below.
api.key=<cloud_key>
api.secret=<cloud_secret>
proxy.url=<proxy_url>            # Only required if proxy is enabled
proxy.username=<proxy_username>  # Only required if proxy requires credential
proxy.password=<proxy_password>  # Only required if proxy requires credential
[1] secretRef takes precedence over directoryPathInContainer if both are configured.
The expected key is telemetry.txt.
If the referenced secretRef cannot be read or its data is not in the expected format, CFK will fail to start.
[2] Provide the directory path where telemetry.txt is present.
If telemetry.txt is not in the expected format, CFK will fail to start.
See Provide secrets for Confluent Platform application CR for providing the keys/values and required annotations when using Vault.
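The secretRef method described above can be sketched as two small shell helpers. This is an illustration under assumptions, not the documented procedure: the helper names, the Secret name, and the namespace are hypothetical, and the Secret is assumed to belong in the namespace where CFK runs. Only the file name telemetry.txt and the property names come from the text above.

```shell
# Hypothetical helper: write telemetry.txt with the two required keys.
# Arguments: <output-dir> <api-key> <api-secret> [proxy-url]
write_telemetry_file() {
  (
    umask 077   # keep the credentials file private to the current user
    {
      printf 'api.key=%s\n' "$2"
      printf 'api.secret=%s\n' "$3"
      # proxy.url is only written when a proxy URL is passed in
      if [ -n "${4:-}" ]; then printf 'proxy.url=%s\n' "$4"; fi
    } > "$1/telemetry.txt"
  )
}

# Hypothetical helper: load the file into a Secret under the key telemetry.txt.
# Arguments: <secret-name> <output-dir> <namespace>
create_telemetry_secret() {
  kubectl create secret generic "$1" \
    --from-file=telemetry.txt="$2/telemetry.txt" \
    --namespace "$3"
}
```

After creating the Secret, set telemetry.secretRef in the CFK values file to the Secret name you chose.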
To apply changes in Telemetry settings, in the referenced Secret, or in the telemetry.txt file, manually restart CFK and Confluent Platform:
Restart CFK:
kubectl rollout restart deployment/confluent-operator
Restart Confluent Platform components as described in Restart Confluent Platform Using Confluent for Kubernetes.
Disable Telemetry for a Confluent Platform component
To disable Telemetry for a specific Confluent Platform component, set the following in the component CR and apply the change with the kubectl apply command:
telemetry:
  global: false
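As an alternative to editing the CR file, the same setting could be applied to a live CR with a merge patch. This is a sketch under assumptions: the helper name is hypothetical, and the field is assumed to live at spec.telemetry.global in the component CR.

```shell
# Hypothetical helper: merge-patch one component CR so that it opts out of
# the global Telemetry setting. Assumes the field path spec.telemetry.global.
# Arguments: <kind> <name> <namespace>, for example: connect connect confluent
disable_component_telemetry() {
  kubectl patch "$1" "$2" --namespace "$3" --type merge \
    --patch '{"spec":{"telemetry":{"global":false}}}'
}
```

A patch changes only the object in the cluster. If you keep CR manifests in files, update them as well so that a later kubectl apply does not revert the change.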
JMX Metrics
CFK can deploy Confluent components with JMX metrics.
Note
For enhanced security, JMX metrics are disabled by default starting with CFK 2.9.9. In earlier versions, JMX metrics were enabled by default.
When enabled, these JMX metrics are made available on all pods at the following endpoints:
JMX metrics are available on port 7203 of each pod.
Jolokia (a REST interface for JMX metrics) is available on port 7777 of each pod.
JMX Prometheus exporter is available on port 7778.
Authentication / encryption is not supported for Prometheus exporter in CFK.
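To spot-check the Prometheus exporter port listed above, you could open a temporary port-forward to a pod and fetch the metrics over HTTP. The sketch below is illustrative only: the helper name, pod name, and namespace are hypothetical, JMX metrics are assumed to be enabled, and the /metrics path is an assumption about how the exporter is served.

```shell
# Hypothetical helper: fetch Prometheus-format metrics from one pod through a
# temporary port-forward to the exporter port (7778).
# Arguments: <pod-name> <namespace>, for example: kafka-0 confluent
scrape_jmx_exporter() {
  kubectl port-forward "pod/$1" 7778:7778 --namespace "$2" >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2   # crude wait for the tunnel to come up
  curl_status=0
  curl --silent --fail http://localhost:7778/metrics || curl_status=$?
  kill "$pf_pid" 2>/dev/null || true   # always tear the tunnel down
  return "$curl_status"
}
```

The fixed two-second wait keeps the example short. A more robust script would poll the local port until it accepts connections.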
Backward compatibility
The default behavior for JMX remote access has changed:
Previous behavior: The JMX port was accessible remotely without authentication. Users could read and write through the exposed JMX port.
Current behavior: If you do not configure JMX authentication in the CR specification, remote JMX access is disabled by default. The JMX port is only accessible locally within the pod.
Impact on existing deployments:
If you currently use remote JMX access without authentication, you must configure JMX authentication to continue using remote JMX. Internal applications like Jolokia metrics and JMX Prometheus exporter continue to work because they use the in-process JMX API and are not affected by remote connector settings.
Note
JMX authentication files do not support hot-reloading because the JVM only loads JMX credentials at startup. Changes to password or access control files require a pod restart to take effect.
The JMX password and access control secretRef are not owned by the CR and are not watched. JMX credential updates require a manual pod restart.
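The manual restart mentioned in the note could look like the following. This is a sketch, not the documented restart procedure: the helper name is hypothetical, and it assumes the component pods are managed by a StatefulSet whose name you have looked up first, for example with kubectl get statefulset.

```shell
# Hypothetical helper: roll the pods of one component so the JVM re-reads the
# JMX password and access control files, then wait for the rollout to finish.
# Assumes the component is backed by a StatefulSet.
# Arguments: <statefulset-name> <namespace>
restart_component_pods() {
  kubectl rollout restart "statefulset/$1" --namespace "$2" &&
    kubectl rollout status "statefulset/$1" --namespace "$2"
}
```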
Configure JMX authentication
You can configure password-based authentication and access control for JMX connections to secure JMX endpoints.
JMX authentication requires explicit configuration. You must provide either a secretRef or directoryPathInContainer to enable JMX authentication. The operator does not support auto-generated passwords.
Note
Use JMX authentication with TLS/SSL encryption in production environments. See Configure Network Encryption for Confluent Platform Using Confluent for Kubernetes for TLS configuration.
Configure password-based authentication
To enable password-based authentication for JMX:
Create a password file with the following format:
<username> <password>
<username> <password>
For example:
admin secretpassword
readonly readonlypassword
Set the password file permissions to 0600 (read/write for owner only).
Note
When you use secretRef, the operator automatically sets file permissions by specifying items in the Kubernetes SecretVolumeSource. The Kubernetes kubelet applies these permissions when mounting the Secrets into pods, ensuring the JMX password file is readable only by the Kafka process (UID 1001) without requiring manual user intervention.
When you use directoryPathInContainer with Vault, you must set the correct file permissions before the Vault sidecar injects the password file into the container.
Create a Kubernetes Secret with the password file:
kubectl -n <namespace> create secret generic my-jmx-password \
  --from-literal=jmx='password: |
    admin secretpassword
    readonly readonlypassword'
Configure the component CR:
kind: Kafka
spec:
  metrics:
    jmx:
      authentication:
        secretRef: my-jmx-password
Alternatively, you can use directoryPathInContainer to provide the path to a JMX password file (jmxremote.password) injected by a sidecar (for example, Vault).
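When you prepare the password file yourself, for example before storing it in Vault, the format and permission steps above can be sketched as a small helper. The helper name and file path are hypothetical. Only the one-pair-per-line format and the 0600 mode come from the steps above.

```shell
# Hypothetical helper: write a JMX password file with one
# "<username> <password>" pair per line, readable and writable by the owner only.
# Arguments: <file> <user1> <pass1> [<user2> <pass2> ...]
write_jmx_password_file() {
  pw_file="$1"; shift
  : > "$pw_file"
  chmod 600 "$pw_file"   # tighten permissions before any secret is written
  while [ "$#" -ge 2 ]; do
    printf '%s %s\n' "$1" "$2" >> "$pw_file"
    shift 2
  done
}
```

For example, write_jmx_password_file ./jmxremote.password admin secretpassword readonly readonlypassword produces the two-line file shown earlier in this section.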
Configure access control
Use JMX access control to restrict MBean operations with read-only or read-write permissions.
To configure access control:
Create an access control file with the following format:
<username> <access-level>
Valid access levels are readonly and readwrite.
For example:
admin readwrite
readonly readonly
Set the access control file permissions to 0644 (read for all, write for owner).
Note
When you use secretRef, the operator automatically sets file permissions. When you use directoryPathInContainer with Vault, you must set the correct file permissions before the Vault sidecar injects the access control file into the container.
Create a Kubernetes Secret with the access control file:
kubectl -n <namespace> create secret generic my-jmx-access \
  --from-literal=jmx='access: |
    admin readwrite
    readonly readonly'
Configure the component CR:
kind: Kafka
spec:
  metrics:
    jmx:
      authentication:
        secretRef: my-jmx-password
      accessControl:
        enabled: true
        secretRef: my-jmx-access
If you set accessControl.enabled to true without providing secretRef or directoryPathInContainer, the operator creates a default configuration with read-only access. When using the default access control file, you must add at least one of the following roles in the password file: monitorRole or controlRole. The default access control file only defines these roles with read-only access levels.
If you set accessControl.enabled to false, the system does not apply access control, and all users can read and write.
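Because the default access control file only knows monitorRole and controlRole, it can help to check the password file for one of those names before relying on the default. The following check is a minimal sketch; the helper name is hypothetical, and the role names are taken from the paragraph above.

```shell
# Hypothetical check: succeed only if the password file defines at least one
# of the roles named by the default access control file.
# Argument: <password-file>
has_default_jmx_role() {
  grep -Eq '^(monitorRole|controlRole)[[:space:]]+[^[:space:]]' "$1"
}
```

The function returns a non-zero exit status for a file that contains only custom user names such as admin, which signals that a custom access control file is needed instead of the default.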
Configure with Vault injection
As an alternative to Kubernetes Secrets, you can use directoryPathInContainer to provide JMX password and access control files injected by Vault.
To configure JMX authentication and access control with Vault:
Configure the component CR with Vault annotations:
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
spec:
  metrics:
    jmx:
      authentication:
        directoryPathInContainer: /vault/secrets/jmx
      accessControl:
        enabled: true
        directoryPathInContainer: /vault/secrets/jmx
  podTemplate:
    annotations:
      # Vault Agent Injection
      vault.hashicorp.com/agent-inject: "true"
      vault.hashicorp.com/agent-inject-status: "update"
      vault.hashicorp.com/preserve-secret-case: "true"
      vault.hashicorp.com/role: "confluent-operator"
      vault.hashicorp.com/agent-run-as-user: "1001"
      vault.hashicorp.com/agent-run-as-group: "1001"
      # Inject JMX password file
      vault.hashicorp.com/agent-inject-secret-jmx-password: "secret/kafka/jmx/password"
      vault.hashicorp.com/secret-volume-path-jmx-password: "/vault/secrets/jmx"
      vault.hashicorp.com/agent-inject-file-jmx-password: "jmxremote.password"
      vault.hashicorp.com/agent-inject-template-jmx-password: |
        {{- with secret "secret/kafka/jmx/password" -}}
        {{ .Data.data.content }}
        {{- end }}
      vault.hashicorp.com/agent-inject-perms-jmx-password: "0600"
      # Inject JMX access control file
      vault.hashicorp.com/agent-inject-secret-jmx-access: "secret/kafka/jmx/access"
      vault.hashicorp.com/secret-volume-path-jmx-access: "/vault/secrets/jmx"
      vault.hashicorp.com/agent-inject-file-jmx-access: "jmxremote.access"
      vault.hashicorp.com/agent-inject-template-jmx-access: |
        {{- with secret "secret/kafka/jmx/access" -}}
        {{ .Data.data.content }}
        {{- end }}
      vault.hashicorp.com/agent-inject-perms-jmx-access: "0644"
Important
When using directoryPathInContainer, you must set the correct file permissions before the Vault sidecar injects the files into the container:
Password file (jmxremote.password): 0600 (read/write for owner only)
Access control file (jmxremote.access): 0644 (read for all, write for owner)
As shown in the example above, use the vault.hashicorp.com/agent-inject-perms-<file> annotation to set these permissions.
When using directoryPathInContainer (DPIC), you must also include the following annotations:
vault.hashicorp.com/agent-run-as-user: "1001"
vault.hashicorp.com/agent-run-as-group: "1001"
The Vault sidecar injects the JMX password file (jmxremote.password) and access control file (jmxremote.access) into the specified directory path with the required file permissions.
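After the pod starts, you may want to confirm that the injected files have the expected modes. The sketch below lists the directory inside a running pod. The helper name, pod name, namespace, and container name are hypothetical placeholders, and the default path matches the example above.

```shell
# Hypothetical helper: list the injected JMX files with their permissions.
# Arguments: <pod> <namespace> <container> [directory]
list_injected_jmx_files() {
  kubectl exec "$1" --namespace "$2" --container "$3" -- \
    ls -l "${4:-/vault/secrets/jmx}"
}
```

With the permissions described above, jmxremote.password should be listed with mode -rw------- and jmxremote.access with mode -rw-r--r--.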
Configure security on JMX metrics endpoints
By default, JMX/Prometheus metric endpoints have no authentication, encryption, or external access. You can configure authentication and TLS for these endpoints in the component CR:
spec:
  metrics:
    authentication:
      type:                     --- [1]
    prometheus:                 --- [2]
      rules:                    --- [3]
        - attrNameSnakeCase:
          cache:
          help:
          labels:
          name:
          pattern:
          type:
          value:
          valueFactor:
      blacklist:                --- [4]
      whitelist:                --- [5]
    tls:
      enabled:                  --- [6]
[1] Set to mtls for mTLS authentication.
If you set this to mtls, you must set tls.enabled: true ([6]).
Note that CFK does not currently support mTLS authentication for Prometheus even with this set to mtls.
[2] Specify Prometheus configurations to override the default settings.
See Prometheus for more information about the rules, blacklist, and whitelist properties.
[3] A list of rules to apply.
For example:
spec:
  metrics:
    prometheus:
      rules:
        # Single quotes keep \w and \d literal; they are not valid escapes in a double-quoted YAML string
        - pattern: 'org.apache.kafka.metrics<type=(\w+), name=(\w+)><>Value: (\d+)'
          name: "kafka_$1_$2"
          value: "$3"
          valueFactor: "0.000001"
          labels:
            "$1": "$4"
            "$2": "$3"
          help: "Kafka metric $1 $2"
          cache: false
          type: "GAUGE"
          attrNameSnakeCase: false
[4] An array of patterns (in the string format) to identify what not to query.
For example:
spec:
  metrics:
    prometheus:
      blacklist:
        - "org.apache.kafka.metrics:*"
[5] An array of patterns (in the string format) to identify what to query.
For example:
spec:
  metrics:
    prometheus:
      whitelist:
        - "org.apache.kafka.metrics:type=ColumnFamily,*"
[6] If set to true, metrics are configured with global or component TLS as described in Configure Network Encryption for Confluent Platform Using Confluent for Kubernetes.
This setting is ignored for Prometheus. CFK currently does not support TLS for Prometheus.
Note
CFK does not support configuring external access to JMX/Prometheus metric endpoints at the component CR level. To enable external access, set up your own Kubernetes external access mechanism that allows incoming requests to an external FQDN to be routed to the pod’s JMX/Prometheus metric endpoint port.
Configure Prometheus and Grafana
You can configure Prometheus to capture and aggregate JMX metrics from Confluent components, and then configure Grafana to visualize those metrics in a dashboard.
For example configuration scenarios, see an example of monitoring with Prometheus and Grafana in jmx-monitoring-stacks and an example of monitoring with Grafana metrics dashboard.
Confluent Control Center (Legacy)
Confluent Control Center (Legacy) is a web-based tool for managing and monitoring Confluent Platform. Control Center (Legacy) provides a user interface that enables developers and operators to:
Get a quick overview of cluster health
Observe and control messages, topics, and Schema Registry
Develop and run ksqlDB queries
For the metrics available for monitoring, see Metrics available in Control Center.
Configure Control Center (Legacy) to monitor Kafka clusters
The Confluent Metrics Reporter collects various metrics from an Apache Kafka® cluster. Control Center (Legacy) then uses those metrics to provide a detailed monitoring view of the Kafka cluster.
By default, the Confluent Metrics Reporter is enabled and configured to send Kafka metrics to a set of topics on the same Kafka cluster.
To send metrics to a different cluster, or to configure specific authentication settings, configure the Kafka custom resource (CR):
kind: Kafka
spec:
  metricReporter:
    enabled:                        --- [1]
    authentication:
      type:                         --- [2]
      jaasConfigPassThrough:
        secretRef:                  --- [3]
        directoryPathInContainer:   --- [4]
    tls:
      enabled:                      --- [5]
[1] Set to true or false to enable or disable the metrics reporting.
[2] Set to the authentication type to use for Kafka. See Configure authentication to access Kafka for details.
[3] Set to the Kubernetes Secret name used to authenticate to Kafka.
[4] Set to the directory path in the Kafka container where the Kafka authentication credentials are injected by Vault.
See Provide secrets for Confluent Platform component CR for providing the credential and required annotations when using Vault.
[5] Set to true if the Kafka cluster has TLS network encryption enabled.
Once Confluent Metrics Reporter is set up for a Kafka cluster, configure Control Center (Legacy) to monitor the cluster.
By default, Control Center (Legacy) is set up to monitor the Kafka cluster it is using to store its own state. This Kafka cluster is defined using spec.dependencies.kafka in the Confluent Control Center (Legacy) CR.
If there is another Kafka cluster to monitor, you can configure that in the Control Center (Legacy) CR as below:
kind: ControlCenter
spec:
  monitoringKafkaClusters:
  - name:                             --- [1]
    bootstrapEndpoint:                --- [2]
    authentication:
      type:                           --- [3]
      jaasConfig:                     --- [4]
      jaasConfigPassThrough:          --- [5]
      oauthbearer:                    --- [6]
        secretRef:                    --- [7]
        directoryPathInContainer:     --- [8]
    tls:
      enabled:                        --- [9]
[1] Set to Kafka cluster name.
[2] Set to the Kafka bootstrap endpoint.
[3] Set to the Kafka client authentication type.
When RBAC is not enabled, valid options are plain and mtls.
When RBAC is enabled, the only valid option is oauthbearer.
[4] [5] For authenticating to a Kafka cluster using SASL/PLAIN, see Client-side SASL/PLAIN authentication for Kafka.
[6] When Confluent Control Center (Legacy) authorization type is set to RBAC (spec.authorization.type: rbac) and the authentication type is set to oauthbearer in [3], use the OAuth method to authenticate to the Kafka cluster.
[7] The username and password are loaded through secretRef. The expected key is bearer.txt, and the value for the key is:
username=<username>
password=<password>
An example command to create a secret to use for this property:
kubectl create secret generic oauth-client \
  --from-file=bearer.txt=/some/path/bearer.txt \
  --namespace confluent
[8] The directory in the Confluent Control Center (Legacy) container where the expected Bearer credentials are injected by Vault. See above ([7]) for the expected format.
See Provide secrets for Confluent Platform component CR for providing the credential and required annotations when using Vault.
[9] For authenticating to a Kafka cluster using mTLS, see Client-side mTLS authentication for Kafka.
Configure Control Center (Legacy) to monitor remote Kafka clusters
To monitor a Kafka cluster in a different Kubernetes cluster:
Configure the Control Center (Legacy) CR as described in monitoring additional Kafka clusters.
When RBAC is enabled, you must set authentication.type to oauthbearer, and provide oauthbearer credentials in the Control Center (Legacy) CR.
Configure the replication listener with external access on the remote Kafka as described in Configure Kafka in MRC with external access URLs.
If RBAC is enabled on the remote Kafka cluster, the Kafka token listener piggybacks on the replication listener.
Configure DNS where Control Center (Legacy) runs so that it can resolve the Kafka replication listener endpoint.
Configure Control Center (Legacy) to monitor ksqlDB, Connect and Schema Registry clusters
You can configure Control Center (Legacy) to provide a detailed monitoring or management view of ksqlDB, Connect, and Schema Registry clusters.
The following is an example of the dependencies section in a Control Center (Legacy) CR. The example connects two Schema Registry clusters, two ksqlDB clusters, and two Connect clusters to Control Center (Legacy):
kind: ControlCenter
spec:
  dependencies:
    schemaRegistry:
      url: https://schemaregistry.confluent.svc.cluster.local:8081
      tls:
        enabled: true
      authentication:
        type: mtls
      clusters:
      - name: schemaregistry-dev
        url: https://schemaregistry-dev.confluent.svc.cluster.local:8081
        tls:
          enabled: true
        authentication:
          type: mtls
    ksqldb:
    - name: ksql-dev
      url: https://ksqldb.confluent.svc.cluster.local:8088
      tls:
        enabled: true
      authentication:
        type: mtls
    - name: ksql-dev1
      url: https://ksqldb-dev.confluent.svc.cluster.local:8088
      tls:
        enabled: true
      authentication:
        type: mtls
    connect:
    - name: connect-dev
      url: https://connect.confluent.svc.cluster.local:8083
      tls:
        enabled: true
      authentication:
        type: mtls
    - name: connect-dev2
      url: https://connect-dev.confluent.svc.cluster.local:8083
      tls:
        enabled: true
      authentication:
        type: mtls
For an example scenario to configure Confluent Control Center (Legacy) to monitor multiple ksqlDB, Connect, and Schema Registry clusters, see Connect Control Center to Multiple Connect, ksqlDB, and Schema Registry Clusters.