Enable Health+

Health+ provides ongoing, real-time analysis of performance and configuration data for your Confluent Platform deployment. Based on this analysis, Health+ sends notifications that alert users to potential environmental issues before they become critical problems. The Confluent Support team also uses the collected metadata, which gives context on the performance and usage of the deployment, to shorten the time to resolution for support issues.

Prerequisites

Enable Telemetry

After you sign up for Confluent Health+, get started by enabling Telemetry.

Subscribe to notifications

Subscribe to notifications in the Confluent Cloud Console by navigating to the Health+ Overview page and creating a new notification subscription.

You can send notifications to a specific integration, such as Slack or email, or to a generic webhook that integrates with a custom application.

Webhook

When notifications are sent to a custom application through a webhook, the webhook payload follows this JSON schema:

{
  id: string [required]
  rule_id: string [required]
  severity: string [ENUM: INFO, WARN, CRITICAL, ERROR] [required]
  status: string [ENUM: RAISED, CLEAR] [required]
  title: varchar[40]
  message: string
  created_at: string
}

Here’s an example payload:

{
  "id": "08157a8e-4fe0-4465-9854-711144790d76",
  "rule_id": "AlertRequestHandlerIdle",
  "severity": "CRITICAL",
  "status": "RAISED",
  "title": "ClusterId:WHdCF0FZQLSPRu8vItiAaA - RequestHandler idle alert",
  "message": "Cluster with ID WHdCF0FZQLSPRu8vItiAaA\n\nRequestHandlerAvgIdlePercent falls under the range 5.05 and 15.05\n\nCurrent Value = 7.05\n\nAction Recommended:\n\nSuggested Action:\n\nFurther Reading:\n\nDocumentation link\n\nQuestions? Please reach out to Confluent Support and reference this alert",
  "created_at": 1600214419660
}
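
A receiving application might validate each payload before acting on it. The following is a minimal sketch in Python, not a Confluent-provided client: the names `HealthPlusNotification`, `parse_notification`, and `should_page` are hypothetical, and the paging rule is only an illustrative assumption. Because the schema lists `created_at` as a string while the example payload carries a number, the sketch accepts either.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union

# Allowed values, taken from the schema above.
SEVERITIES = {"INFO", "WARN", "CRITICAL", "ERROR"}
STATUSES = {"RAISED", "CLEAR"}
REQUIRED_FIELDS = ("id", "rule_id", "severity", "status")


@dataclass
class HealthPlusNotification:
    """Hypothetical container for one webhook payload."""
    id: str
    rule_id: str
    severity: str
    status: str
    title: Optional[str] = None
    message: Optional[str] = None
    # Schema says string, example payload shows a number: accept both.
    created_at: Union[str, int, None] = None


def parse_notification(body: str) -> HealthPlusNotification:
    """Parse and validate a webhook request body; raise ValueError if invalid."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"body is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")

    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    if data["severity"] not in SEVERITIES:
        raise ValueError(f"unknown severity: {data['severity']!r}")
    if data["status"] not in STATUSES:
        raise ValueError(f"unknown status: {data['status']!r}")

    return HealthPlusNotification(
        id=data["id"],
        rule_id=data["rule_id"],
        severity=data["severity"],
        status=data["status"],
        title=data.get("title"),
        message=data.get("message"),
        created_at=data.get("created_at"),
    )


def should_page(notification: HealthPlusNotification) -> bool:
    """Illustrative policy (an assumption): page only on raised, high-severity alerts."""
    return (
        notification.status == "RAISED"
        and notification.severity in {"CRITICAL", "ERROR"}
    )
```

A web framework handler would pass the raw request body to `parse_notification` and respond with an HTTP 400 status when it raises `ValueError`.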

Intelligent alerts

Confluent adds new Health+ rules on an ongoing basis as new optimizations are discovered, with no user intervention required. This ensures that Confluent Platform deployments with Health+ enabled continuously benefit from Confluent expertise.

Health+ rules are evaluated in real time and are intended to alert users to potential issues within minutes.

Data collection and processing

When enabled for Health+, Confluent Telemetry Reporter sends the following types of information to Confluent:

  • Performance statistics internal to each Confluent Platform service
  • System utilization statistics
  • Cluster IDs
  • Topic names
  • Host names
  • Version information
  • Connector types
  • ksqlDB application IDs

This level of metadata is necessary to provide the Health+ service and enable the Confluent Support team to assist with issues efficiently and effectively. The data flowing through the topics in the Confluent Platform deployment is never collected.

You can view a representative set of metrics by consuming from the _confluent-telemetry-metrics topic in your Confluent Platform deployment.
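
One way to sample that topic is with a short consumer script. The sketch below assumes the third-party `confluent-kafka` Python client is installed and that a broker is reachable at the address you pass in. The helper names `consumer_config` and `sample_metrics`, the group ID, and the default limits are hypothetical. Record values are treated as opaque bytes, since their serialization format is not described here.

```python
from typing import Dict

TELEMETRY_TOPIC = "_confluent-telemetry-metrics"


def consumer_config(bootstrap_servers: str,
                    group_id: str = "telemetry-inspector") -> Dict[str, object]:
    """Build a read-only style consumer configuration (group ID is an assumption)."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "auto.offset.reset": "earliest",  # start from the oldest retained record
        "enable.auto.commit": False,      # inspection only, do not store offsets
    }


def sample_metrics(bootstrap_servers: str,
                   max_messages: int = 5,
                   timeout_s: float = 10.0) -> int:
    """Print the position and size of up to max_messages records; return the count."""
    # Imported lazily so the module loads even without the client installed.
    from confluent_kafka import Consumer  # pip install confluent-kafka

    consumer = Consumer(consumer_config(bootstrap_servers))
    seen = 0
    try:
        consumer.subscribe([TELEMETRY_TOPIC])
        while seen < max_messages:
            msg = consumer.poll(timeout_s)
            if msg is None:
                break  # nothing arrived within the timeout
            if msg.error():
                print(f"consumer error: {msg.error()}")
                continue
            value = msg.value() or b""
            print(f"partition={msg.partition()} offset={msg.offset()} "
                  f"bytes={len(value)}")
            seen += 1
    finally:
        consumer.close()
    return seen
```

For example, `sample_metrics("localhost:9092")` would print a few record positions and sizes, assuming a broker listens on that address without additional security settings.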

For a complete list of Telemetry Reporter metrics, see Telemetry Reporter Metrics.