Monitor Confluent Platform with Confluent for Kubernetes¶
Monitor your Confluent for Kubernetes (CFK) environment using the following tools and resources:
- Confluent Health+ with Telemetry
- JMX metrics monitoring integrations
- Confluent Control Center
Confluent Health+¶
Confluent Health+ provides ongoing, real-time analysis of performance and configuration data for your Confluent Platform deployment. From this analysis, Health+ sends out notifications to alert users to potential environmental issues before they become critical problems.
For more information, see Confluent Health+.
Telemetry Reporter¶
The Confluent Telemetry Reporter is a plugin that runs inside each Confluent Platform service to push metadata about the service to Confluent. Telemetry Reporter enables product features based on the metadata, like Health+. Data is sent over HTTP using an encrypted connection.
Telemetry is disabled by default in CFK. You can enable and configure it globally at the CFK level.
When Telemetry is enabled globally, you can still disable it for specific Confluent Platform components.
Each Confluent Platform component CR reports, in status.conditions, whether Telemetry is enabled or disabled.
For more information and supported settings for Telemetry Reporter, see Confluent Telemetry Reporter.
For a list of the metrics that are collected for Health+, see Telemetry Reporter Metrics.
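As a quick way to inspect that status, the following sketch wraps a kubectl query that prints every entry in status.conditions of a component CR. The function name, the example resource (a Kafka CR named kafka in the confluent namespace), and the assumption that each condition carries type, status, and message fields are illustrative. The exact name of the Telemetry condition is not stated here, so look for it in the output.

```shell
#!/bin/sh
# Hypothetical helper: print one line per entry in status.conditions of a CR.
# Arguments: $1 = resource kind, $2 = CR name, $3 = namespace (all placeholders).
# Assumes each condition exposes type, status, and message fields.
show_status_conditions() {
  kubectl get "$1" "$2" --namespace "$3" \
    --output jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'
}

# Example (run against a live cluster):
#   show_status_conditions kafka kafka confluent
```

Printing all conditions rather than filtering for one avoids guessing the condition name.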
Globally configure Telemetry¶
To globally enable Telemetry Reporter for CFK and all Confluent Platform components:
Set the following in the CFK values file:
telemetry:
  operator:
    enabled: true     --- [1]
  enabled: true       --- [2]
- [1] Enable Telemetry Reporter for CFK.
- [2] Enable Telemetry Reporter for all Confluent Platform components.
Apply the change with the following command:
helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --values <path-to-values-file> \
  --namespace <namespace>
To globally configure the Telemetry Reporter settings for all Confluent Platform components:
Set the following in the CFK values file:
telemetry:
  secretRef:                    --- [1]
  directoryPathInContainer:     --- [2]
[1] [2] CFK supports the secretRef and directoryPathInContainer methods to load Telemetry configuration through Helm. Specify only one method.
The Telemetry configuration should be specified in telemetry.txt, and the file must contain api.key and api.secret.
If using a proxy, additional properties are required as shown below.
api.key=<cloud_key>
api.secret=<cloud_secret>
proxy.url=<proxy_url>               # Only required if proxy is enabled
proxy.username=<proxy_username>     # Only required if proxy requires credential
proxy.password=<proxy_password>     # Only required if proxy requires credential
[1] secretRef takes precedence over directoryPathInContainer if both are configured.
The expected key is telemetry.txt.
If the referenced secretRef cannot be read or its data is not in the expected format, CFK will fail to start.
[2] Provide the directory path where telemetry.txt is present.
If telemetry.txt is not in the expected format, CFK will fail to start.
See Provide secrets for Confluent Platform application CR for providing the keys/values and required annotations when using Vault.
To apply changes in Telemetry settings, in the referenced Secret, or in the telemetry.txt file, manually restart CFK and Confluent Platform:
Restart CFK:
kubectl rollout restart deployment/confluent-operator
Restart Confluent Platform components as described in Restart Confluent Platform Using Confluent for Kubernetes.
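Because a malformed telemetry.txt prevents CFK from starting, it can help to check the file before loading it into a Secret. The following is a minimal sketch under stated assumptions: the function names, the Secret name telemetry-secret, and the confluent namespace are hypothetical, and the check only confirms that non-empty api.key and api.secret entries exist. It does not reproduce CFK's own validation, which is not described here.

```shell
#!/bin/sh
# Hypothetical pre-flight check for telemetry.txt.
# Returns 0 only if the file defines non-empty api.key and api.secret entries.
# $1 = path to telemetry.txt
check_telemetry_file() {
  grep -Eq '^api\.key=.+' "$1" || { echo "api.key is missing in $1" >&2; return 1; }
  grep -Eq '^api\.secret=.+' "$1" || { echo "api.secret is missing in $1" >&2; return 1; }
}

# Hypothetical wrapper: create the Secret only when the file passes the check.
# $1 = Secret name, $2 = path to telemetry.txt, $3 = namespace (all placeholders).
create_telemetry_secret() {
  check_telemetry_file "$2" || return 1
  kubectl create secret generic "$1" \
    --from-file=telemetry.txt="$2" \
    --namespace "$3"
}

# Example (run against a live cluster):
#   create_telemetry_secret telemetry-secret ./telemetry.txt confluent
```

The --from-file=telemetry.txt=<path> form sets the Secret key to telemetry.txt regardless of the local file name, which matches the expected key.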
Disable Telemetry for a Confluent Platform component¶
To disable Telemetry for a specific Confluent Platform component, set the following in the component CR and apply the change with the kubectl apply command:
telemetry:
  global: false
JMX Metrics¶
CFK deploys all Confluent components with JMX metrics enabled by default. These JMX metrics are made available on all pods at the following endpoints:
- JMX metrics are available on port 7203 of each pod.
- Jolokia (a REST interface for JMX metrics) is available on port 7777 of each pod.
- JMX Prometheus exporter is available on port 7778.
  Authentication / encryption is not supported for the Prometheus exporter in CFK.
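To try the HTTP-based endpoints from a workstation, one option is to forward the pod ports locally and query them with curl. The sketch below only builds the URLs. The helper names, the pod name kafka-0, and the confluent namespace are hypothetical. It also assumes that Jolokia serves under its default /jolokia context and that the Prometheus exporter is scraped at /metrics, so verify both in your deployment.

```shell
#!/bin/sh
# Hypothetical helpers that build URLs for the Jolokia and Prometheus endpoints.
# Assumes the default /jolokia context path and a /metrics scrape path.

# $1 = host:port, $2 = MBean name, $3 = attribute
jolokia_read_url() {
  printf 'http://%s/jolokia/read/%s/%s\n' "$1" "$2" "$3"
}

# $1 = host:port
prometheus_metrics_url() {
  printf 'http://%s/metrics\n' "$1"
}

# Example (run against a live cluster; pod name and namespace are placeholders):
#   kubectl port-forward pod/kafka-0 7777:7777 7778:7778 --namespace confluent &
#   curl -s "$(jolokia_read_url localhost:7777 java.lang:type=Memory HeapMemoryUsage)"
#   curl -s "$(prometheus_metrics_url localhost:7778)" | head
```

Plain http is used because authentication and encryption are not enabled on these endpoints by default.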
Configure security on JMX metrics endpoints¶
By default, authentication and encryption are not enabled on JMX/Prometheus metric endpoints, but you have options to configure authentication and TLS for JMX/Prometheus metric endpoints in the component CR:
spec:
  metrics:
    authentication:
      type:                     --- [1]
    prometheus:                 --- [2]
      rules:                    --- [3]
      - attrNameSnakeCase:
        cache:
        help:
        labels:
        name:
        pattern:
        type:
        value:
        valueFactor:
      blacklist:                --- [4]
      whitelist:                --- [5]
    tls:
      enabled:                  --- [6]
[1] Set to mtls for mTLS authentication.
If you set this to mtls, you must set tls.enabled: true ([6]).
Note that CFK does not currently support mTLS authentication for Prometheus even with this set to mtls.
[2] Specify Prometheus configurations to override the default settings.
See Prometheus for more information about the rules, blacklist, and whitelist properties.
[3] A list of rules to apply.
For example:
spec:
  metrics:
    prometheus:
      rules:
      # The pattern is single-quoted so that \w and \d are not read as YAML escape sequences.
      - pattern: 'org.apache.kafka.metrics<type=(\w+), name=(\w+)><>Value: (\d+)'
        name: "kafka_$1_$2"
        value: "$3"
        valueFactor: "0.000001"
        labels:
          "$1": "$4"
          "$2": "$3"
        help: "Kafka metric $1 $2"
        cache: false
        type: "GAUGE"
        attrNameSnakeCase: false
[4] An array of patterns (in string format) that identify what not to query.
For example:
spec:
  metrics:
    prometheus:
      blacklist:
      - "org.apache.kafka.metrics:*"
[5] An array of patterns (in string format) that identify what to query.
For example:
spec:
  metrics:
    prometheus:
      whitelist:
      - "org.apache.kafka.metrics:type=ColumnFamily,*"
[6] If set to true, metrics are configured with global or component TLS as described in Configure Network Encryption for Confluent Platform Using Confluent for Kubernetes.
This setting is ignored for Prometheus. CFK currently does not support TLS for Prometheus.
Configure external access on JMX metrics¶
By default, CFK does not configure external access on JMX metric endpoints.
As an advanced configuration option, CFK supports enabling external access on JMX metrics endpoints. You need this setup only if the Jolokia HTTP-based solution is not enough.
To enable external access for a Confluent Platform component at the component CR level, use the annotation:
Set up a Kubernetes external access service that allows incoming requests to an external FQDN to be routed to the pod’s JMX/Prometheus metric endpoint port. See Networking Service.
Apply the annotation on the Confluent Platform component CR using a comma-separated list of the DNS names (or IP addresses) and ports you got in the previous step when you created the service.
"platform.confluent.io/jmx-rmi-server-hostnames": "pod1.external-address:<port1>,pod2.external-address:<port2>, ... ,podn.external-address:<portn>"
For example, using the external IP address of the node, 35.35.35.35, and nodePorts 30001 to 30003, the following CR configures node port external access on the Kafka nodes:
kind: Kafka
metadata:
  name: kafka
  namespace: operator
  annotations:
    "platform.confluent.io/jmx-rmi-server-urls": "35.35.35.35:30001,35.35.35.35:30002,35.35.35.35:30003"
spec:
  replicas: 3
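When every pod is reached through the same node address on consecutive node ports, as in the example above, the comma-separated annotation value can be generated rather than typed by hand. This is a minimal sketch: the function name is hypothetical, and it assumes one port per pod, numbered upward from the first port.

```shell
#!/bin/sh
# Hypothetical helper: build a comma-separated host:port list for the annotation value.
# $1 = external host or IP, $2 = first port, $3 = number of pods
jmx_rmi_endpoints() {
  host="$1"; port="$2"; count="$3"
  out=""
  i=0
  while [ "$i" -lt "$count" ]; do
    # Prepend a comma only when the list already has an entry.
    out="${out}${out:+,}${host}:$((port + i))"
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

jmx_rmi_endpoints 35.35.35.35 30001 3
# prints 35.35.35.35:30001,35.35.35.35:30002,35.35.35.35:30003
```

The output is only the annotation value. Paste it into the annotation on the component CR.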
Configure Prometheus and Grafana¶
You can configure Prometheus to capture and aggregate JMX metrics from Confluent components. Then you configure Grafana to visualize those metrics in a dashboard.
For example configuration scenarios, see an example of monitoring with Prometheus and Grafana in jmx-monitoring-stacks and an example of monitoring with Grafana metrics dashboard.
Confluent Control Center¶
Confluent Control Center is a web-based tool for managing and monitoring Confluent Platform. Control Center provides a user interface that enables developers and operators to:
- Get a quick overview of cluster health
- Observe and control messages, topics, and Schema Registry
- Develop and run ksqlDB queries
For the metrics available for monitoring, see Metrics available in Control Center.
Configure Control Center to monitor Kafka clusters¶
The Confluent Metrics Reporter collects various metrics from an Apache Kafka® cluster. Control Center then uses those metrics to provide a detailed monitoring view of the Kafka cluster.
By default, the Confluent Metrics Reporter is enabled and configured to send Kafka metrics to a set of topics on the same Kafka cluster.
To send metrics to a different cluster, or to configure specific authentication settings, configure the Kafka custom resource (CR):
kind: Kafka
spec:
  metricReporter:
    enabled:                        --- [1]
    authentication:
      type:                         --- [2]
      jaasConfigPassThrough:
        secretRef:                  --- [3]
        directoryPathInContainer:   --- [4]
    tls:
      enabled:                      --- [5]
[1] Set to true or false to enable or disable the metrics reporting.
[2] Set to the authentication type to use for Kafka. See Configure authentication to access Kafka for details.
[3] Set to the Kubernetes Secret name used to authenticate to Kafka.
[4] Set to the directory path in the Kafka container where the Kafka authentication credentials are injected by Vault.
See Provide secrets for Confluent Platform component CR for providing the credential and required annotations when using Vault.
[5] Set to true if the Kafka cluster has TLS network encryption enabled.
Once Confluent Metrics Reporter is set up for a Kafka cluster, configure Control Center to monitor the cluster.
By default, Control Center is set up to monitor the Kafka cluster it is using to store its own state. This Kafka cluster is defined using spec.dependencies.kafka in the Confluent Control Center CR.
If there is another Kafka cluster to monitor, you can configure that in the Control Center CR as below:
kind: ControlCenter
spec:
  monitoringKafkaClusters:
  - name:                             --- [1]
    bootstrapEndpoint:                --- [2]
    authentication:
      type:                           --- [3]
      jaasConfig:                     --- [4]
      jaasConfigPassThrough:          --- [5]
      oauthbearer:                    --- [6]
        secretRef:                    --- [7]
        directoryPathInContainer:     --- [8]
    tls:
      enabled:                        --- [9]
[1] Set to the Kafka cluster name.
[2] Set to the Kafka bootstrap endpoint.
[3] Set to the Kafka client authentication type.
When RBAC is not enabled, valid options are plain and mtls.
When RBAC is enabled, the only valid option is oauthbearer.
[4] [5] For authenticating to a Kafka cluster using SASL/PLAIN, see Client-side SASL/PLAIN authentication for Kafka.
[6] When Confluent Control Center authorization type is set to RBAC (spec.authorization.type: rbac) and the authentication type is set to oauthbearer in [3], use the OAuth method to authenticate to the Kafka cluster.
[7] The username and password are loaded through secretRef. The expected key is bearer.txt, and the value for the key is:
username=<username>
password=<password>
An example command to create a secret to use for this property:
kubectl create secret generic oauth-client \
  --from-file=bearer.txt=/some/path/bearer.txt \
  --namespace confluent
[8] The directory in the Confluent Control Center container where the expected Bearer credentials are injected by Vault. See above ([7]) for the expected format.
See Provide secrets for Confluent Platform component CR for providing the credential and required annotations when using Vault.
[9] For authenticating to a Kafka cluster using mTLS, see Client-side mTLS authentication for Kafka.
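The bearer.txt file described in [7] can be produced with a small helper before it is loaded into a Secret. This is a sketch only: the function name is hypothetical, and restricting the file permissions is an added precaution rather than a documented requirement.

```shell
#!/bin/sh
# Hypothetical helper: write bearer.txt in the username=/password= format.
# $1 = output path, $2 = username, $3 = password
write_bearer_file() {
  (
    umask 077   # keep the credentials file readable by its owner only
    printf 'username=%s\npassword=%s\n' "$2" "$3" > "$1"
  )
}

# Example:
#   write_bearer_file /some/path/bearer.txt <username> <password>
```

The subshell keeps the umask change from leaking into the calling shell.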
Configure Control Center to monitor remote Kafka clusters¶
To monitor a Kafka cluster in a different Kubernetes cluster:
Configure the Control Center CR as described in monitoring additional Kafka clusters.
When RBAC is enabled, you must set authentication.type to oauthbearer, and provide oauthbearer credentials in the Control Center CR.
Configure the replication listener with external access on the remote Kafka as described in Configure Kafka in MRC with external access URLs.
If RBAC is enabled on the remote Kafka cluster, the Kafka token listener piggybacks on the replication listener.
Configure DNS where Control Center runs to be able to resolve the Kafka replication listener endpoint.
Configure Control Center to monitor ksqlDB, Connect and Schema Registry clusters¶
You can configure Control Center to provide a detailed monitoring or management view of ksqlDB, Connect, and Schema Registry clusters.
The following is an example of the dependencies section in a Control Center CR. The example connects two Schema Registry clusters, two ksqlDB clusters, and two Connect clusters to Control Center:
kind: ControlCenter
spec:
  dependencies:
    schemaRegistry:
      url: https://schemaregistry.confluent.svc.cluster.local:8081
      tls:
        enabled: true
      authentication:
        type: mtls
      clusters:
      - name: schemaregistry-dev
        url: https://schemaregistry-dev.confluent.svc.cluster.local:8081
        tls:
          enabled: true
        authentication:
          type: mtls
    ksqldb:
    - name: ksql-dev
      url: https://ksqldb.confluent.svc.cluster.local:8088
      tls:
        enabled: true
      authentication:
        type: mtls
    - name: ksql-dev1
      url: https://ksqldb-dev.confluent.svc.cluster.local:8088
      tls:
        enabled: true
      authentication:
        type: mtls
    connect:
    - name: connect-dev
      url: https://connect.confluent.svc.cluster.local:8083
      tls:
        enabled: true
      authentication:
        type: mtls
    - name: connect-dev2
      url: https://connect-dev.confluent.svc.cluster.local:8083
      tls:
        enabled: true
      authentication:
        type: mtls
For an example scenario to configure Confluent Control Center to monitor multiple ksqlDB, Connect, and Schema Registry clusters, see Connect Control Center to Multiple Connect, ksqlDB, and Schema Registry Clusters.