Troubleshoot Confluent for Kubernetes

This topic provides information about troubleshooting your Confluent Platform deployment.

Support bundle

To provide Confluent with a support bundle containing the information required for debugging, run the following command using the Confluent plugin:

kubectl confluent support-bundle --namespace <namespace>

Confluent for Kubernetes aggregates information such as events, Kubernetes versions, logs, and the status of Confluent APIs into a tar.gz file that you can upload to the Confluent Support site.
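
Before uploading, it can be useful to check what the archive contains. The following is a minimal sketch, not part of the Confluent plugin: list_bundle is a hypothetical helper name, and the archive path is whatever file the support-bundle command wrote on your machine.

```shell
#!/usr/bin/env bash
# list_bundle: hypothetical helper that lists the entries of a support
# bundle without extracting it. The archive path is supplied by the caller.
list_bundle() {
  local archive="$1"
  if [ ! -f "$archive" ]; then
    echo "no such file: $archive" >&2
    return 1
  fi
  # -t lists entries, -z reads gzip-compressed input, -f names the archive.
  tar -tzf "$archive"
}

# Usage (replace the placeholder with the actual file name):
#   list_bundle ./<support-bundle>.tar.gz
```

Listing rather than extracting keeps the working directory clean and makes it easy to confirm that nothing unexpected is included before the upload.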

Logs

Logs are sent directly to STDOUT for each pod. Use the following command to view the logs for a pod:

kubectl logs <pod-name> -n <namespace>
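
Pod logs can be long, so a filter that keeps only problem lines may help. The sketch below assumes the logs carry conventional level names such as ERROR and WARN, which is an assumption about the log format. filter_problems is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# filter_problems: hypothetical helper that reads log lines on stdin and
# prints only those mentioning ERROR, FATAL, WARN, or Exception.
# Assumes conventional level names appear in the log lines.
filter_problems() {
  # grep exits non-zero when nothing matches; "|| true" keeps that from
  # aborting a script that runs with "set -e".
  grep -E 'ERROR|FATAL|WARN|Exception' || true
}

# Usage (--since limits output to the last hour):
#   kubectl logs <pod-name> -n <namespace> --since=1h | filter_problems
```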

Metrics

  • JMX metrics are available on port 7203 of each pod.
  • Jolokia (a REST interface for JMX metrics) is available on port 7777 of each pod.
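
One way to read a metric through Jolokia from a workstation is to forward the pod's port and query it over HTTP. The sketch below assumes the Jolokia agent's default /jolokia context path and an unauthenticated plain-HTTP endpoint. Your deployment may differ, for example if TLS or authentication is enabled. jolokia_read_url is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# jolokia_read_url: hypothetical helper that builds a Jolokia "read" URL.
# Assumes the agent's default /jolokia context path and plain HTTP.
jolokia_read_url() {
  local port="$1" mbean="$2" attribute="$3"
  printf 'http://localhost:%s/jolokia/read/%s/%s\n' "$port" "$mbean" "$attribute"
}

# Usage, with the pod's port forwarded to the local machine:
#   kubectl port-forward <pod-name> 7777:7777 -n <namespace> &
#   curl -s "$(jolokia_read_url 7777 'java.lang:type=Memory' HeapMemoryUsage)"
```

The java.lang:type=Memory MBean is a standard JVM MBean, so it makes a convenient first check that the endpoint is reachable.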

Debug

Several types of problems can occur while using Confluent for Kubernetes (CFK):

  • A problem happens while deploying CFK.

  • A problem exists at the infrastructure level.

    Something has gone wrong at the Kubernetes layer.

  • A problem exists at the application level.

    The infrastructure is fine but something has gone wrong with Confluent Platform itself. Typically, this is caused by how Confluent Platform components were configured.

To debug deployment problems, run the Helm install command with the --set debug="true" option to enable verbose output:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --namespace <namespace> \
  --set debug="true"

Look for Kubernetes issues first, then debug Confluent Platform.

  1. Check for potential Kubernetes errors by entering the following command:

    kubectl get events -n <namespace>
    
  2. To check for an issue with a specific resource, enter the following command (this example uses the pods resource type):

    kubectl describe pods <pod-name> -n <namespace>
    
  3. If everything looks okay after running the commands above, check the individual pod logs using the following command:

    kubectl logs <pod-name> -n <namespace>
    

    Confluent Platform containers are configured so application logs are printed to STDOUT. The logs can be read directly with this command. If there is anything wrong at the application level, like an invalid configuration, this will be evident in the logs.

    Note

    If a pod has been replaced because it crashed and you want to check the previous pod’s logs, add --previous to the end of the command above.
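
The three checks above can be chained into a small script. This is a minimal sketch rather than an official tool: triage is a hypothetical helper name, the KUBECTL override variable is an assumption added to allow a dry run, and the --sort-by and --tail options are additions for readability.

```shell
#!/usr/bin/env bash
# triage: hypothetical helper that runs the checks in order:
# events, resource description, current logs, then previous logs.
# Set KUBECTL to override the binary (for example, KUBECTL=echo for a dry run).
triage() {
  local ns="$1" pod="$2" kubectl="${KUBECTL:-kubectl}"
  # 1. Kubernetes events in the namespace, oldest first.
  "$kubectl" get events -n "$ns" --sort-by=.lastTimestamp
  # 2. Details for the specific pod.
  "$kubectl" describe pods "$pod" -n "$ns"
  # 3. Recent application logs from the pod.
  "$kubectl" logs "$pod" -n "$ns" --tail=200
  # Logs from before the last restart; ignored if there are none.
  "$kubectl" logs "$pod" -n "$ns" --previous --tail=200 || true
}

# Usage:
#   triage <namespace> <pod-name>
```

Running the checks in this order follows the advice to rule out Kubernetes issues before looking at Confluent Platform itself.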

Problems caused by the datacenter infrastructure, such as virtual machine (VM) firewall rules or DNS configuration, should be resolved by your infrastructure system administrator.