Monitor Confluent Platform with Confluent for Kubernetes¶
Monitor your Confluent for Kubernetes (CFK) environment using the following tools and resources:
- Confluent Health+ with Telemetry
- JMX metrics monitoring integrations
- Confluent Control Center
Confluent Health+¶
Confluent Health+ provides ongoing, real-time analysis of performance and configuration data for your Confluent Platform deployment. From this analysis, Health+ sends out notifications to alert users to potential environmental issues before they become critical problems.
For more information, see Confluent Health+.
Telemetry Reporter¶
The Confluent Telemetry Reporter is a plugin that runs inside each Confluent Platform service to push metadata about the service to Confluent. Telemetry Reporter enables product features based on the metadata, like Health+. Data is sent over HTTP using an encrypted connection.
Telemetry is disabled by default in CFK. You can enable and configure it globally at the CFK level.
When you enable Telemetry globally, you can still disable it for specific Confluent Platform components.
Each Confluent Platform component CR reports, in status.conditions, whether Telemetry is enabled or disabled.
For more information and supported settings for Telemetry Reporter, see Confluent Telemetry Reporter.
For a list of the metrics that are collected for Health+, see Telemetry Reporter Metrics.
Globally configure Telemetry¶
To globally enable Telemetry Reporter for all Confluent Platform components:
Set the following in the CFK values file:
    telemetry:
      enabled: true
Apply the change with the following command:
    helm upgrade --install confluent-operator \
      confluentinc/confluent-for-kubernetes \
      --values <path-to-values-file> \
      --namespace <namespace>
To globally configure the Telemetry Reporter settings for all Confluent Platform components:
Set the following in the CFK values file:
    telemetry:
      secretRef:                   --- [1]
      directoryPathInContainer:    --- [2]
- [1] [2] CFK supports the secretRef and directoryPathInContainer methods to load the Telemetry configuration through Helm.
- [1] secretRef takes precedence over directoryPathInContainer if both are configured. The Secret referenced by secretRef must contain the following:
      telemetry.txt: |-
        api.key=<cloud_key>
        api.secret=<cloud_secret>
        proxy.url=<proxy_url>
        proxy.username=<proxy_username>
        proxy.password=<proxy_password>
  If the referenced Secret cannot be read or its data is not in the expected format, CFK fails to start.
- [2] Provide the mount path or directory path where telemetry.txt is present. The telemetry.txt file should contain the following:
      api.key=<cloud_key>
      api.secret=<cloud_secret>
      proxy.url=<proxy_url>
      proxy.username=<proxy_username>
      proxy.password=<proxy_password>
  If telemetry.txt is not in the expected format, CFK fails to start.
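As an illustration of the secretRef method, the Secret could be defined with a manifest along the following lines. This is a minimal sketch rather than the documented procedure: the Secret name telemetry-credentials and the namespace confluent are hypothetical, and the placeholders must be replaced with your own values.

```yaml
# Sketch only: a Kubernetes Secret carrying telemetry.txt for the secretRef method.
apiVersion: v1
kind: Secret
metadata:
  name: telemetry-credentials   # hypothetical name; reference it in telemetry.secretRef
  namespace: confluent          # assumed namespace where CFK is deployed
type: Opaque
stringData:
  # The key must be telemetry.txt, with one property per line.
  telemetry.txt: |-
    api.key=<cloud_key>
    api.secret=<cloud_secret>
    proxy.url=<proxy_url>
    proxy.username=<proxy_username>
    proxy.password=<proxy_password>
```

With this Secret in place, the CFK values file would set telemetry.secretRef to telemetry-credentials.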
To apply changes in Telemetry settings, in the referenced Secret, or in the telemetry.txt file, manually restart CFK and Confluent Platform:
Restart CFK:
    kubectl rollout restart deployment/confluent-operator
Restart a Confluent Platform component:
    kubectl rollout restart sts/<name>
See Restart Confluent Platform Cluster for how to look up the <name> of a component.
Disable Telemetry for a Confluent Platform component¶
To disable Telemetry for a specific Confluent Platform component, set the following in the component CR and apply the change with the kubectl apply command:
    telemetry:
      global: false
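For context, the setting might sit in a component CR as in the sketch below. It assumes that the telemetry block belongs under spec and uses a Connect CR as the example; the resource name and namespace are hypothetical, and the other fields a real CR needs are omitted.

```yaml
# Sketch only: opting one component out of globally enabled Telemetry.
apiVersion: platform.confluent.io/v1beta1
kind: Connect
metadata:
  name: connect          # hypothetical component name
  namespace: confluent   # assumed namespace
spec:
  # ...other fields of the component spec omitted...
  telemetry:
    global: false        # assumed to sit under spec; disables Telemetry for this component
```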
JMX Metrics¶
CFK deploys all Confluent components with JMX metrics enabled by default. These JMX metrics are made available on all pods at the following endpoints:
- JMX metrics are available on port 7203 of each pod.
- Jolokia (a REST interface for JMX metrics) is available on port 7777 of each pod.
- JMX Prometheus exporter is available on port 7778. Authentication and encryption are not supported for the Prometheus exporter.
Configure security on JMX metrics endpoints¶
By default, the JMX/Prometheus metric endpoints have no authentication, encryption, or external access. You can configure authentication, TLS, and external access for these endpoints at the component CR level:
    spec:
      metrics:
        authentication:
          type:                    --- [1]
        prometheus:                --- [2]
          rules:                   --- [3]
            - attrNameSnakeCase:
              cache:
              help:
              labels:
              name:
              pattern:
              type:
              value:
              valueFactor:
          blackList:               --- [4]
          whiteList:               --- [5]
        tls:
          enabled:                 --- [6]
- [1] Set to mtls for mTLS authentication. If you set this to mtls, you must set tls.enabled: true ([6]).
- [2] Specify Prometheus configurations to override the default settings. See Prometheus for more information about the rules, blacklist, and whitelist properties.
- [3] A list of rules to apply. For example:
      spec:
        metrics:
          prometheus:
            rules:
              - pattern: 'org.apache.kafka.metrics<type=(\w+), name=(\w+)><>Value: (\d+)'
                name: "kafka_$1_$2"
                value: "$3"
                valueFactor: "0.000001"
                labels:
                  "$1": "$4"
                  "$2": "$3"
                help: "Kafka metric $1 $2"
                cache: false
                type: "GAUGE"
                attrNameSnakeCase: false
- [4] A pattern to identify what not to query. For example:
      spec:
        metrics:
          prometheus:
            blacklist: "org.apache.kafka.metrics:*"
- [5] A pattern to identify what to query. For example:
      spec:
        metrics:
          prometheus:
            whitelist: "org.apache.kafka.metrics:type=ColumnFamily,*"
- [6] If set to true, metrics are configured with global or component TLS as described in Configure Network Encryption with Confluent for Kubernetes.
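Putting [1] and [6] together, a component CR fragment that turns on mTLS for the metrics endpoints might look like the following. It is a minimal sketch built only from the fields listed above; the TLS certificates themselves are assumed to be configured elsewhere, globally or on the component.

```yaml
# Sketch only: mTLS on the metrics endpoints of one component.
spec:
  metrics:
    authentication:
      type: mtls       # [1] mTLS authentication
    tls:
      enabled: true    # [6] required whenever type is mtls
```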
Configure Prometheus and Grafana¶
You can configure Prometheus to capture and aggregate JMX metrics from Confluent components. Then you configure Grafana to visualize those metrics in a dashboard.
For an example configuration scenario, see Monitoring with Prometheus and Grafana.
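As a rough illustration of the Prometheus side, a scrape job could discover Confluent pods and keep only the targets on the exporter port 7778. The sketch below is not taken from the linked example: the job name and the confluent namespace are hypothetical, and it assumes that the pods declare 7778 as a container port and that the endpoint is left at its default of no TLS or authentication.

```yaml
# Sketch only: prometheus.yml fragment scraping the JMX Prometheus exporter.
scrape_configs:
  - job_name: confluent-platform        # hypothetical job name
    kubernetes_sd_configs:
      - role: pod                       # one target per declared container port
        namespaces:
          names:
            - confluent                 # assumed namespace of the Confluent components
    relabel_configs:
      # Keep only targets whose declared container port is the exporter port.
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        regex: "7778"
        action: keep
      # Copy the pod name into a label for use in Grafana dashboards.
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```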
Confluent Control Center¶
Confluent Control Center is a web-based tool for managing and monitoring Confluent Platform. Control Center provides a user interface that enables developers and operators to:
- Get a quick overview of cluster health
- Observe and control messages, topics, and Schema Registry
- Develop and run ksqlDB queries
For the metrics available for monitoring, see Metrics available in Control Center.
Configure Control Center to monitor Kafka clusters¶
The Confluent Metrics Reporter collects various metrics from an Apache Kafka® cluster. Control Center then uses those metrics to provide a detailed monitoring view of the Kafka cluster.
By default, the Confluent Metrics Reporter is enabled and configured to send metrics for the Kafka cluster to a set of topics on the same Kafka cluster.
To send metrics to a different cluster, or to configure specific authentication settings, configure the Kafka custom resource (CR):
    metricReporter:
      enabled:                     --- [1]
      authentication:
        type:                      --- [2]
        jaasConfigPassThrough:
          secretRef:               --- [3]
      tls:
        enabled:                   --- [4]
- [1] Set to true or false to enable or disable the metrics reporting.
- [2] Set to the authentication type to use for Kafka. See Configure authentication to access Kafka for details.
- [3] Set to the Kubernetes Secret name used to authenticate to Kafka.
- [4] Set to true if the Kafka cluster has TLS network encryption enabled.
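A filled-in version of this block might look like the sketch below. The values are illustrative only: plain is assumed as the authentication type, metrics-credentials is a hypothetical Secret name, and the block is assumed to sit under spec in the Kafka CR.

```yaml
# Sketch only: Kafka CR fragment configuring the Confluent Metrics Reporter.
spec:
  metricReporter:
    enabled: true                        # [1]
    authentication:
      type: plain                        # [2] assumed authentication type
      jaasConfigPassThrough:
        secretRef: metrics-credentials   # [3] hypothetical Secret name
    tls:
      enabled: true                      # [4] target Kafka cluster uses TLS
```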
Once the Confluent Metrics Reporter is set up for a Kafka cluster, configure Control Center to monitor the cluster.
By default, Control Center is set up to monitor the Kafka cluster it is using to store its own state.
To monitor another Kafka cluster, configure it in the Control Center CR:
    spec:
      monitoringKafkaClusters:
        - name:                    --- [1]
          bootstrapEndpoint:       --- [2]
- [1] Set to the Kafka cluster name.
- [2] Set to the Kafka bootstrap endpoint.
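For instance, a Control Center CR that monitors one additional cluster could carry an entry like the one sketched here. The cluster name and the bootstrap endpoint, including its host and port, are hypothetical values chosen for illustration.

```yaml
# Sketch only: Control Center CR fragment adding a second Kafka cluster to monitor.
spec:
  monitoringKafkaClusters:
    - name: kafka-dev                                           # [1] hypothetical cluster name
      bootstrapEndpoint: kafka.kafka-dev.svc.cluster.local:9071  # [2] hypothetical endpoint
```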
Configure Control Center to monitor ksqlDB and Connect clusters¶
You can configure Control Center to provide a detailed monitoring view of ksqlDB and Connect clusters.
For an example of configuring Control Center to monitor multiple ksqlDB and Connect clusters, see Connect Control Center to Multiple Connect and ksqlDB Clusters.