Troubleshoot Confluent Operator¶
The following sections provide information about troubleshooting your Confluent Platform deployment.
Logs¶
Logs are sent directly to STDOUT for each pod. Use the command below to view the logs for a pod:
kubectl logs <pod-name> -n <namespace>
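To stream new log lines as they are written, rather than printing a snapshot, you can add the standard kubectl follow flag:

kubectl logs -f <pod-name> -n <namespace>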
Metrics¶
- JMX metrics are available on port 7203 of each pod.
- Jolokia (a REST interface for JMX metrics) is available on port 7777 of each pod.
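For example, assuming Jolokia serves its REST API at its default /jolokia root path on port 7777, you could port-forward to a pod and read an MBean over HTTP. The pod name kafka-0 and namespace operator below match the defaults used later in this document:

kubectl -n operator port-forward kafka-0 7777:7777

Then, in another terminal, query Jolokia's read endpoint for an example JVM MBean:

curl http://localhost:7777/jolokia/read/java.lang:type=Memory/HeapMemoryUsage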
Debug¶
Two types of problems can occur while using Operator:
- A problem exists at the infrastructure level. That is, something has gone wrong at the Kubernetes layer.
- A problem exists at the application level. This means that the infrastructure is fine but something has gone wrong with Confluent Platform itself, usually in how something is configured.
You should look for Kubernetes issues first.
Check for potential Kubernetes errors by entering the following command:
kubectl get events -n <namespace>
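If the namespace has many events, sorting them by creation timestamp can make the most recent errors easier to spot, for example:

kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp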
Then, to check for a specific resource issue, enter the following command (the example uses the resource type pods):
kubectl describe pods <pod-name> -n <namespace>
If everything looks okay after running the commands above, check the individual pod logs using the following command:
kubectl logs <pod-name> -n <namespace>
Confluent Platform containers are configured so application logs go straight to STDOUT. The logs can be read directly with this command. If there is anything wrong at the application level, like an invalid configuration, this will be evident in the logs.
Note

If a pod has been replaced because it crashed and you want to check the previous pod's logs, add --previous to the end of the command above.
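For example:

kubectl logs <pod-name> -n <namespace> --previous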
Problems caused by the datacenter infrastructure, such as virtual machine (VM) firewall rules, DNS configuration, etc., should be resolved by your infrastructure system administrator.
Test the deployment¶
Test and validate your deployment as described in the following sections.
Internal validation¶
Complete the following steps to validate internal communication.
On your local machine, enter the following command to display cluster namespace information (using the example namespace operator). This information contains the bootstrap endpoint you need to complete internal validation.
kubectl get kafka -n operator -oyaml
The bootstrap endpoint is shown on the bootstrap.servers line:

... omitted
internalClient: |-
  bootstrap.servers=kafka:9071
On your local machine, use kubectl exec to start a bash session on one of the pods in the cluster. The example uses the default pod name kafka-0 on a Kafka cluster using the default name kafka.

kubectl -n operator exec -it kafka-0 bash
On the pod, create and populate a file named kafka.properties. There is no text editor installed in the containers, so you use the cat command as shown below to create this file. Use CTRL+D to save the file.

Note

The example shows default SASL/PLAIN security parameters. A production environment requires additional security. See Configure Security with Confluent Operator for additional information.

cat << EOF > kafka.properties
bootstrap.servers=kafka:9071
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="test" password="test123";
sasl.mechanism=PLAIN
security.protocol=SASL_PLAINTEXT
EOF
On the pod, query the bootstrap server using the following command:
kafka-broker-api-versions --command-config kafka.properties --bootstrap-server kafka:9071
You should see output for each of the three Kafka brokers that resembles the following:
kafka-1.kafka.operator.svc.cluster.local:9071 (id: 1 rack: 0) -> (
  Produce(0): 0 to 7 [usable: 7],
  Fetch(1): 0 to 10 [usable: 10],
  ListOffsets(2): 0 to 4 [usable: 4],
  Metadata(3): 0 to 7 [usable: 7],
  LeaderAndIsr(4): 0 to 1 [usable: 1],
  StopReplica(5): 0 [usable: 0],
  UpdateMetadata(6): 0 to 4 [usable: 4],
  ControlledShutdown(7): 0 to 1 [usable: 1],
  OffsetCommit(8): 0 to 6 [usable: 6],
  OffsetFetch(9): 0 to 5 [usable: 5],
  FindCoordinator(10): 0 to 2 [usable: 2],
  JoinGroup(11): 0 to 3 [usable: 3],
  Heartbeat(12): 0 to 2 [usable: 2],
  ... omitted
This output validates internal communication within your cluster.
External validation¶
Take the following steps to validate external communication after you have enabled external access to Kafka and added DNS entries as described in External access to Kafka.
Note

The examples use default Confluent Platform component names and the default Kafka bootstrap prefix, kafka.
On your local machine, download the Confluent Platform. You only need to download and set the PATH and required environment variables to use the Confluent CLI. You do not need to start Confluent Platform on your local machine.

You use the Confluent CLI running on your local machine to complete external validation. The Confluent CLI is included with the Confluent Platform.
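A minimal shell setup might look like the following sketch. The install directory ~/confluent-x.y.z is a placeholder for wherever you extracted the Confluent Platform archive:

# Placeholder path; substitute your actual Confluent Platform install directory
export CONFLUENT_HOME=~/confluent-x.y.z
export PATH=$CONFLUENT_HOME/bin:$PATH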
On your local machine, run the following command to get the bootstrap server endpoint for external clients.
kubectl get kafka -n operator -oyaml
In the example output below, the bootstrap server endpoint is kafka.mydomain:9092.

... omitted
externalClient: |-
  bootstrap.servers=kafka.mydomain:9092
  sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="test" password="test123";
  sasl.mechanism=PLAIN
  security.protocol=SASL_PLAINTEXT
On your local machine where you have the Confluent Platform installed, create and populate a file named kafka.properties with the following content. Assign the external endpoint you retrieved in the above step to bootstrap.servers.

bootstrap.servers=<kafka bootstrap endpoint>
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="test" password="test123";
sasl.mechanism=PLAIN
security.protocol=SASL_PLAINTEXT
Note
The example shows default SASL/PLAIN security parameters. A production environment requires additional security. See Configure Security with Confluent Operator for additional information.
On your local machine, create a topic using the bootstrap endpoint <kafka bootstrap endpoint>. The example below creates a topic with 1 partition and 3 replicas.

kafka-topics --bootstrap-server <kafka bootstrap endpoint> \
--command-config kafka.properties \
--create --replication-factor 3 \
--partitions 1 --topic example
On your local machine, produce to the new topic using the bootstrap endpoint <kafka bootstrap endpoint>. Note that the bootstrap server endpoint is the only Kafka broker endpoint required because it provides gateway access to all Kafka brokers.

seq 10000 | kafka-console-producer \
--topic example --broker-list <kafka bootstrap endpoint> \
--producer.config kafka.properties
In a new terminal on your local machine, from the directory where you put kafka.properties, issue the Confluent CLI command to consume from the new topic.

kafka-console-consumer --from-beginning \
--topic example --bootstrap-server <kafka bootstrap endpoint> \
--consumer.config kafka.properties
Successful completion of these steps validates external communication with your cluster.
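When you are done testing, you can optionally remove the example topic with the same kafka-topics tool and kafka.properties file used above:

kafka-topics --bootstrap-server <kafka bootstrap endpoint> \
--command-config kafka.properties \
--delete --topic example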