Enable Health+

Health+ continuously analyzes performance and configuration data from your Confluent Platform deployment in real time. Based on this analysis, Health+ sends notifications that alert users to potential environmental issues before they become critical problems. The Confluent Support team also uses the collected metadata for context about the performance and usage of the deployment, which reduces the time to resolution for support issues.

Prerequisites

The following steps show how to enable telemetry for Health+.

  1. Sign up for Confluent Health+, if you haven’t already.
  2. Follow the steps in Telemetry Reporter installation to enable telemetry.
  3. Log in to Confluent Cloud.
  4. Navigate to Clusters to open the Health+ clusters page.
  5. Ensure that your cluster is in the Running state.
  6. On the Health+ clusters page, click your cluster to view telemetry details, or click Intelligent alerts to add a new notification to email, Slack, or a webhook.

Subscribe to notifications

Subscribe to notifications in the Confluent Cloud Console by navigating to the Health+ Notifications page and creating a new notification subscription.

You can send notifications to a specific integration, such as Slack or email, or to a generic webhook that integrates with a custom application.

Webhook

When you send notifications to a custom application through a webhook, the webhook payload follows this JSON schema.

{
  id: string [required]
  rule_id: string [required]
  severity: string [ENUM: INFO, WARN, CRITICAL, ERROR] [required]
  status: string [ENUM: RAISED, CLEAR] [required]
  title: varchar[40]
  message: string
  created_at: string
}

Here’s an example payload:

{
  "id": "08157a8e-4fe0-4465-9854-711144790d76",
  "rule_id": "AlertRequestHandlerIdle",
  "severity": "CRITICAL",
  "status": "RAISED",
  "title": "ClusterId:WHdCB0BZQLSPRu8vItiAaA - RequestHandler idle alert",
  "message": "Cluster with ID WHdCB0BZQLSPRu8vItiAaA\n\nRequestHandlerAvgIdlePercent falls under the range 5.05 and 15.05\n\nCurrent Value = 7.05\n\nAction Recommended:\n\nSuggested Action:\n\nFurther Reading:\n\nDocumentation link\n\nQuestions? Please reach out to Confluent Support and reference this alert",
  "created_at": 1600214419660
}
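
A receiving application might validate and parse this payload along the following lines. This is a minimal sketch, not an official Confluent client: the HealthPlusAlert class and the parse_alert function are hypothetical names chosen for illustration. The schema lists created_at as a string while the example payload shows a number, so the sketch assumes the value is epoch milliseconds and accepts either form.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Enum values and required fields taken from the schema above.
SEVERITIES = {"INFO", "WARN", "CRITICAL", "ERROR"}
STATUSES = {"RAISED", "CLEAR"}
REQUIRED_FIELDS = ("id", "rule_id", "severity", "status")


@dataclass
class HealthPlusAlert:
    """Hypothetical container for one webhook notification."""
    id: str
    rule_id: str
    severity: str
    status: str
    title: Optional[str] = None
    message: Optional[str] = None
    created_at: Optional[datetime] = None


def parse_alert(body: str) -> HealthPlusAlert:
    """Parse a webhook request body and check it against the schema."""
    data = json.loads(body)

    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if data["severity"] not in SEVERITIES:
        raise ValueError(f"unknown severity: {data['severity']!r}")
    if data["status"] not in STATUSES:
        raise ValueError(f"unknown status: {data['status']!r}")

    created_at = None
    raw = data.get("created_at")
    if raw is not None:
        # Assumption: epoch milliseconds, as in the example payload.
        # int() also accepts a numeric string such as "1600214419660".
        created_at = datetime.fromtimestamp(int(raw) / 1000, tz=timezone.utc)

    return HealthPlusAlert(
        id=data["id"],
        rule_id=data["rule_id"],
        severity=data["severity"],
        status=data["status"],
        title=data.get("title"),
        message=data.get("message"),
        created_at=created_at,
    )
```

A handler could then branch on the parsed fields, for example paging an on-call engineer only when severity is CRITICAL and status is RAISED, and closing the incident when a CLEAR notification arrives for the same rule_id.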

Intelligent alerts

Confluent adds new Health+ rules on an ongoing basis as new optimizations are discovered, with no user intervention required. As a result, Confluent Platform deployments that have enabled Health+ continuously benefit from Confluent expertise.

Health+ rules are evaluated in real time and are intended to alert users to potential issues within minutes.

Data collection and processing

When enabled for Health+, Confluent Telemetry Reporter sends the following types of information back to Confluent:

  • Performance statistics internal to each Confluent Platform service
  • System utilization statistics
  • Cluster IDs
  • Topic names
  • Host names
  • Version information
  • Connector types
  • ksqlDB application IDs

This level of metadata is necessary to provide the Health+ service and enable the Confluent Support team to assist with issues efficiently and effectively. The data flowing through the topics in the Confluent Platform deployment is never collected.

You can view a representative set of metrics by consuming from the _confluent-telemetry-metrics topic in your Confluent Platform deployment.

For a complete list of Telemetry Reporter metrics, see Telemetry Reporter Metrics.