An alert consists of a trigger and one or more actions. Triggers can be defined for topics, brokers, consumer groups, and clusters. Each trigger is based on a metric, a condition, and a value, which together form the criteria that determine when the trigger should fire. Any actions associated with the trigger are executed when the criteria are met.
Supported actions include:
- Sending an email notification to one or more accounts
- Sending a Slack webhook notification
- Sending a PagerDuty webhook notification that creates an incident ticket
A trigger can be associated with any number of defined actions. When a trigger fires, it executes all of its associated enabled actions for which the Max send rate has not been exceeded. If the max send rate of a particular action has been exceeded, the trigger event is added to a queue associated with the action and is included in the action event the next time the action is executed (an action can report a set of triggers, not just one trigger).
Queuing does not occur while alerts (actions) are paused, because triggers are ignored during that time.
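The queuing behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not Control Center's implementation: the `Action` class, its fields, and the `on_trigger` method are hypothetical names, and the max send rate is simplified to an assumed minimum interval between sends.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    """Hypothetical model of an alert action (not a Control Center API)."""
    name: str
    min_interval_s: float              # assumed stand-in for "Max send rate"
    enabled: bool = True
    paused: bool = False
    last_sent_at: Optional[float] = None
    pending: List[str] = field(default_factory=list)  # queued trigger events

    def on_trigger(self, event: str, now: float) -> Optional[List[str]]:
        """Return the batch of events sent now, or None if nothing was sent."""
        if not self.enabled or self.paused:
            return None                # ignored entirely: nothing is queued
        self.pending.append(event)
        rate_exceeded = (
            self.last_sent_at is not None
            and now - self.last_sent_at < self.min_interval_s
        )
        if rate_exceeded:
            return None                # event stays queued for the next send
        batch, self.pending = self.pending, []
        self.last_sent_at = now
        return batch                   # one send can report several triggers
```

For example, with a 60-second interval, an event at t=0 is sent immediately, an event at t=10 is queued, and an event at t=70 is sent together with the queued one.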
The maximum triggered events per alert (default: 1000) is controlled by
Detection of anomalous events (triggering criteria) is decoupled from the alert actions that should be taken when a triggering event occurs. This means that triggers and actions are defined independently, which provides flexibility when setting one or more actions to perform when a trigger fires.
Each time interceptor data is received by Control Center, metric values (such as consumption difference and latency) of the corresponding time windows are updated to reflect the new data. All newly updated metric values are then checked against all configured triggers to determine whether a trigger should fire.
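As a rough illustration of this evaluation step, the sketch below checks a set of freshly updated metric values against configured triggers. The `Trigger` class, the condition names, and the metric keys are assumptions chosen for the example; they do not reflect Control Center's internal data model.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumed condition names mapped to comparison functions.
CONDITIONS: Dict[str, Callable[[float, float], bool]] = {
    "greater than": operator.gt,
    "less than": operator.lt,
    "equal to": operator.eq,
}


@dataclass
class Trigger:
    """Hypothetical trigger: a metric, a condition, and a threshold value."""
    name: str
    metric: str
    condition: str
    value: float

    def should_fire(self, metric_value: float) -> bool:
        return CONDITIONS[self.condition](metric_value, self.value)


def check_triggers(updated: Dict[str, float], triggers: List[Trigger]) -> List[str]:
    """Return the names of triggers whose criteria are met by updated metrics."""
    return [
        t.name
        for t in triggers
        if t.metric in updated and t.should_fire(updated[t.metric])
    ]
```

Only metrics present in the update are evaluated, which mirrors the idea that newly updated values, rather than every stored value, are checked each time data arrives.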
Interceptors can conceivably report data for any point in time, so alerting works across all time windows, not just those near real time.
Buffer for consumer group triggers
Because of normal lag in the system, time windows close to real time frequently have metric values that would be cause for concern if the time window were further behind real time. For this reason, triggers for consumer groups have an associated buffer value. The buffer lets you require an alertable state to persist for a configurable period of time, which avoids prematurely activating a consumer group trigger.
A triggered event that is within Buffer seconds of real time is not immediately registered against actions. When the time window ultimately moves more than the buffer seconds behind real time, any associated metric value that would still cause a trigger to fire is then registered against any appropriate actions.
Setting a condition for the buffer in seconds is only applicable to consumer group triggers.
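The buffer rule can be expressed as a small predicate. This is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, and the time window is represented only by an assumed end timestamp in seconds.

```python
def is_registered(window_end_s: float, now_s: float,
                  buffer_s: float, condition_met: bool) -> bool:
    """Decide whether a consumer group trigger event is registered against actions.

    The event is registered only if the trigger condition still holds and the
    time window has moved more than buffer_s seconds behind real time.
    """
    outside_buffer = (now_s - window_end_s) > buffer_s
    return condition_met and outside_buffer
```

With a 60-second buffer, a window that ended 30 seconds ago is not yet registered even if its metric value meets the condition; the same window is registered once it is more than 60 seconds old, provided the condition is still met at that point.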