Control Center Alerts Overview¶
Control Center enables you to detect anomalous events in the monitoring data and configure alerts to occur when those events are detected. For example, you can detect when a cluster goes down and configure an email to be sent.
Note
When RBAC is enabled in Control Center, there are some nuances to trigger, action, and alert access. For details, see About alerts access.
To learn how to access the Alerts page, see Access Control Center Alerts and Alert History.
Important
If you are operating your cluster in KRaft mode, controllers are currently reported as brokers, and alerts may not function as expected. For more information, see KRaft limitations and known issues.
Triggers and actions¶
An alert consists of a trigger and one or more actions.
The component types you can choose for a trigger, and the metrics you can define for it, depend on whether you are running Control Center in Normal mode or Reduced infrastructure mode. When you create a trigger, incompatible component types and metrics are filtered out.
Trigger component types that are only compatible with Normal mode:
- Topics
- Brokers
Trigger component types compatible with both Normal mode and Reduced infrastructure mode:
- Consumer groups
- Clusters
Each trigger is based on a metric with condition value criteria that determine when the trigger should fire. For more about triggers, see View and Manage Control Center Alert Triggers.
Any actions associated with the trigger are executed when the criteria are met.
Supported actions for a trigger include:
- Sending an email notification to one or more accounts
- Sending a Slack webhook notification
- Sending a PagerDuty webhook notification that creates an incident ticket
A trigger can be associated with any number of defined actions. For more about actions, see Manage Alert Actions in Control Center.
To learn how to configure some example triggers and actions, see Example Triggers and Actions for Control Center Alerts.
When a trigger fires, it executes all of its associated enabled actions for which the Max send rate has not been exceeded. If the Max send rate of a particular action has been exceeded, the trigger event is added to a queue associated with the action and is included in the action event the next time it is executed (actions can report a set of triggers, not just one trigger).
Note
Queuing does not occur while alerts (actions) are paused, because triggers are ignored during that time.
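The queuing behavior described above can be illustrated with a minimal Python sketch. This is not Control Center code: the `Action` class, its fields, and the modeling of the Max send rate as a minimum interval between sends (`min_interval_s`) are assumptions made for illustration. The sketch also assumes that a queued event is delivered the next time a trigger fires after the interval has elapsed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    """Hypothetical model of an alert action with a send-rate limit."""
    name: str
    min_interval_s: float              # assumption: Max send rate as a minimum gap between sends
    enabled: bool = True
    paused: bool = False
    last_sent: Optional[float] = None
    queue: List[str] = field(default_factory=list)      # trigger events waiting to be reported
    sent: List[List[str]] = field(default_factory=list)  # each entry is one executed action event

    def on_trigger(self, event: str, now: float) -> None:
        # Disabled or paused actions ignore the trigger entirely: nothing is queued.
        if not self.enabled or self.paused:
            return
        self.queue.append(event)
        # Execute only if the rate limit allows it; otherwise the event stays queued.
        if self.last_sent is None or now - self.last_sent >= self.min_interval_s:
            self.sent.append(list(self.queue))  # one action event can report several triggers
            self.queue.clear()
            self.last_sent = now


action = Action(name="email-oncall", min_interval_s=60.0)
action.on_trigger("cluster-down", now=0.0)    # sent immediately
action.on_trigger("lag-high", now=10.0)       # rate exceeded: queued
action.on_trigger("lag-high-again", now=70.0) # sent together with the queued event
print(action.sent)
```

In this sketch the second event is not lost: it is reported together with the third one in a single action event, which mirrors the statement that an action can report a set of triggers.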
The maximum triggered events per alert (default: 1000) is controlled by the confluent.controlcenter.max.trigger.events.per.alert.config option.
Detection of anomalous events (triggering criteria) is decoupled from the alert actions that are taken when a triggering event occurs. This means that triggers and actions are defined independently, which provides flexibility when setting one or more actions to perform when a trigger fires.
Each time interceptor data is received by Control Center, metric values (such as consumption difference and latency) of the corresponding time windows are updated to reflect the new data. All newly updated metric values are then checked against all configured triggers to determine whether a trigger should fire.
Note
Interceptors can report data for any point in time, so alerting works across all time windows, not just those near real time.
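As a rough illustration of the check described above, the following Python sketch compares newly updated metric values against a set of configured triggers. The `Trigger` class, the `check_updates` function, the metric names, and the set of supported conditions are all hypothetical and are not taken from Control Center itself.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed set of comparison conditions; the real product may offer a different set.
CONDITIONS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


@dataclass(frozen=True)
class Trigger:
    """Hypothetical trigger: a metric plus a condition and a value."""
    name: str
    metric: str
    condition: str
    value: float

    def should_fire(self, observed: float) -> bool:
        return CONDITIONS[self.condition](observed, self.value)


def check_updates(
    updated: Dict[Tuple[int, str], float],
    triggers: List[Trigger],
) -> List[Tuple[str, int]]:
    """Check every updated (window_start, metric) value against every trigger.

    Returns (trigger name, window_start) pairs for each trigger that should fire.
    Any time window can appear in `updated`, not only the most recent one.
    """
    fired = []
    for (window_start, metric), observed in updated.items():
        for trigger in triggers:
            if trigger.metric == metric and trigger.should_fire(observed):
                fired.append((trigger.name, window_start))
    return fired


triggers = [Trigger("high-latency", "latency_ms", ">", 500.0)]
updated = {(1000, "latency_ms"): 750.0, (2000, "latency_ms"): 120.0}
print(check_updates(updated, triggers))
```

Because the check is keyed by time window rather than by arrival time, late-arriving data for an older window is evaluated in the same way as data for the current window.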
Buffer for consumer group triggers (deprecated)¶
Tip
The buffer feature for this trigger has been deprecated. It will be removed from Control Center in a future release. Do not rely on the buffer value.
Triggers for consumer groups have an associated buffer value. The buffer lets you require an alertable state to persist for a configurable period of time, which helps avoid prematurely activating a consumer group trigger.
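The idea of requiring an alertable state to persist can be sketched in Python as follows. This only illustrates the general concept and does not describe how Control Center implements the deprecated buffer: the `BufferedCondition` class and its `buffer_s` parameter are hypothetical names.

```python
from typing import Optional


class BufferedCondition:
    """Hypothetical helper: report True only after a state has persisted for buffer_s seconds."""

    def __init__(self, buffer_s: float) -> None:
        self.buffer_s = buffer_s
        self.alertable_since: Optional[float] = None

    def update(self, alertable: bool, now: float) -> bool:
        if not alertable:
            # The state recovered, so the persistence clock resets.
            self.alertable_since = None
            return False
        if self.alertable_since is None:
            self.alertable_since = now
        # Fire only once the alertable state has lasted at least buffer_s seconds.
        return now - self.alertable_since >= self.buffer_s


cond = BufferedCondition(buffer_s=30.0)
print(cond.update(True, now=0.0))    # alertable state just started
print(cond.update(True, now=20.0))   # still inside the buffer period
print(cond.update(True, now=30.0))   # state has persisted for the full buffer
```

A short spike that clears before the buffer period ends resets the clock, so it never activates the trigger in this sketch.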