Monitoring policies
The rate and number of incoming events serve as an important indicator of the state of the system. For example, you can detect when there are too many events, too few, or none at all. Monitoring policies are designed to detect such situations. In a policy, you can specify a lower threshold, an optional upper threshold, and the way the events are counted: by frequency or by total number.
The policy must be applied to the event source. You can apply one or more monitoring policies to a source. After applying the policy, you can monitor the status of the source on the List of event sources tab.
Policies for monitoring the sources of events are displayed in the table under Source status → Monitoring policies. You can sort the table by clicking the column header of the relevant setting. The maximum size of the policy list is not limited.
In the Sources column, you can click the Show button to view all event sources to which the policy is applied. When you click this button, you are taken to the List of event sources section, and the table of sources is filtered by the selected policy.
Algorithm of monitoring policies
Monitoring policies are applied to an event source in accordance with the following algorithm:
- The event stream is counted at the collector.
- The KUMA Core server gets information about the stream from the collectors every 15 seconds.
- The obtained data is stored on the KUMA Core server in the VictoriaMetrics time series database. The data storage depth on the KUMA Core server is 15 days.
- An inventory of event sources is taken once per minute.
- The stream is counted separately for each event source in accordance with the following rules:
- If a monitoring policy is applied to the event source, the displayed maximum number of events is calculated in accordance with the currently applied monitoring policies for the time interval specified in the policy.
Depending on the policy type, the event stream is counted as the number of events (for the byCount policy type) or as the number of events per second (EPS, for the byEPS policy type). You can look up how the stream is counted for the applied policy in the Stream column on the List of event sources page.
- If no monitoring policy is applied to the event source, the number for the event stream corresponds to the last value.
- Once a minute, the application checks if any monitoring policies exist that must be applied to event sources or stopped according to the monitoring policy schedule.
- Once a minute, the stream of events is checked for compliance with policy settings.
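The two counting modes and the threshold check can be illustrated with a minimal Python sketch. It assumes, for illustration only, that the data for one source is available as a list of event counts, one per 15-second reporting period, and that a threshold is crossed only when the value is strictly outside the limits. The function and parameter names are hypothetical and are not part of KUMA.

```python
from typing import Optional, Sequence

REPORT_PERIOD_SECONDS = 15  # collectors report stream data every 15 seconds


def stream_value(samples: Sequence[int], policy_type: str) -> float:
    """Return the stream value for one count interval.

    samples holds the number of events seen in each 15-second period
    of the interval (an assumed input format, not a KUMA data structure).
    """
    total = sum(samples)
    if policy_type == "byCount":
        # byCount: total number of events over the interval.
        return float(total)
    if policy_type == "byEPS":
        # byEPS: average events per second over the whole interval.
        seconds = len(samples) * REPORT_PERIOD_SECONDS
        return total / seconds if seconds else 0.0
    raise ValueError(f"unknown policy type: {policy_type}")


def crosses_thresholds(value: float, lower: float,
                       upper: Optional[float] = None) -> bool:
    """True if value is below the lower limit or above the optional upper limit.

    Strict comparisons are an assumption; the source does not say whether
    a value equal to a limit counts as crossing it.
    """
    if value < lower:
        return True
    return upper is not None and value > upper
```

For example, four 15-second samples of 150, 300, 0, and 150 events give a byCount value of 600 and a byEPS value of 10 (600 events over 60 seconds).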
If the event stream from the source crosses the thresholds specified in the monitoring policy, information about this is recorded in the following way:
- A notification about a monitoring policy getting triggered is sent to the email addresses specified in the policy. For each policy, you can also configure a notification template.
- A stream monitoring informational event of type 5 (Type=5) is generated. The fields of the event are described in the table below.

Fields of the monitoring event

| Event field name | Field value |
| --- | --- |
| ID | Unique ID of the event. |
| Timestamp | Event time. |
| Type | Type of the audit event. For the audit event, the value is 5 (monitoring). |
| Name | Name of the monitoring policy. |
| DeviceProduct | KUMA |
| DeviceCustomString1 | The value from the value field in the notification. Displays the value of the metric for which the notification was sent. |

The generated monitoring event is sent to the following resources:
- All storages of the Main tenant
- All correlators of the Main tenant
- All correlators of the tenant in which the event source is located
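The shape of the monitoring event and its routing can be sketched as follows. This is an illustrative model only: the field names come from the table above, but the class, the use of a UUID for the ID, the UTC timestamp, and the `destinations` helper are assumptions made for the example and do not describe KUMA internals.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple

MONITORING_EVENT_TYPE = 5  # Type=5 marks a monitoring event


@dataclass
class MonitoringEvent:
    Name: str                 # name of the triggered monitoring policy
    DeviceCustomString1: str  # metric value from the notification's "value" field
    # The ID format is an assumption; the source only says the ID is unique.
    ID: str = field(default_factory=lambda: str(uuid.uuid4()))
    Timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    Type: int = MONITORING_EVENT_TYPE
    DeviceProduct: str = "KUMA"


def destinations(source_tenant: str) -> List[Tuple[str, str]]:
    """List (resource kind, tenant) pairs that receive the event."""
    targets = [("storage", "Main"), ("correlator", "Main")]
    if source_tenant != "Main":
        # Correlators of the tenant that owns the event source.
        targets.append(("correlator", source_tenant))
    return targets
```

For a source in a hypothetical tenant named Branch, `destinations("Branch")` returns the storages and correlators of the Main tenant plus the correlators of Branch.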
Adding a monitoring policy
To add a new monitoring policy:
- In the KUMA Console, under Source status → Monitoring policies, click Add policy and configure the monitoring policy in the displayed window:
- In the Policy name field, enter a unique name for the policy you are creating. The name must contain 1 to 128 Unicode characters.
We recommend choosing a name that reflects the configured schedule of the monitoring policy.
- In the Tenant drop-down list, select the tenant that will own the policy. Your tenant selection determines the specific sources of events that can be covered by the monitoring policy.
- In the Type field, select one of the following monitoring policy types:
- by count—by the number of events over a certain period of time.
- by EPS—by the number of events per second (EPS) over a certain period of time. The average value over the entire period is calculated. You can additionally track spikes during specific periods.
- In the Count interval field, specify the period during which the monitoring policy must take into account the data from the monitoring source. You can use the drop-down list on the right to select a value in minutes, hours, or days. The maximum value is 14 days.
- If you selected the by EPS policy type, in the Control interval, minutes field, specify the control time interval (in minutes) within which the number of events must cross the threshold for the monitoring policy to trigger:
- If, during this time period, all checks (performed once per minute) find that the stream is crossing the threshold, the monitoring policy is triggered.
- If, during this time period, one of the checks (performed once per minute) finds that the stream is within the thresholds, the monitoring policy is not triggered, and the count of check results is reset.
If you do not specify the control interval, the monitoring policy is triggered immediately after the stream is found to cross the threshold.
- In the Lower limit and Upper limit fields, define the boundaries representing normal behavior. Deviations outside of these boundaries will trigger the monitoring policy, create an alert, and forward notifications.
The Lower limit setting is required.
- In the Evaluation interval field, specify the frequency with which the VMalert service will query VictoriaMetrics for policy data while the policy is being applied to the event source. You can use the drop-down list on the right to select a value in minutes, hours, or days. The default interval is 5 minutes.
When specifying the evaluation interval, keep in mind the policy schedule. For example, if you configured the policy to be applied once every few hours, we do not recommend configuring a short interval and causing excessive load on VictoriaMetrics.
- If necessary, in the Send notifications field, specify the email addresses to which notifications about the activation of the KUMA monitoring policy must be sent. To add an address, enter it in the field and press Enter or click Add. You can specify multiple email addresses.
To forward notifications, you must configure a connection to the SMTP server.
- In the Notification template drop-down list, select the template that you want to use for notifications. If necessary, click the Create new button to start creating a new notification template.
By default, the basic notification template is selected. You can reset the template selection and switch to the base template by clicking the X icon.
- In the Schedule section, configure how often you want to apply the monitoring policy to event sources. By default, the policy is applied every week, every day, from 00:00 to 23:59. To configure the monitoring policy schedule, do one or both of the following:
- If you want to apply the monitoring policy weekly on specific days of the week:
- Enable the Configure schedule by days of the week toggle switch.
- In the Days of the week drop-down list, select the days of the week on which you want the policy to be applied to the source.
If you want to clear the selection, click the X icon.
- In the Time field, specify the start and end time of the policy, with minute precision.
The policy applicability interval is inclusive of its bounds; for example, if the end time is set to 23:59, the policy will be applied until 23:59:59.999. The default interval is 00:00 to 23:59. The start time must be earlier than the end time.
- If you want to add another period, click the Add period button and repeat the previous two steps (selecting the days of the week and specifying the time).
You can add any number of periods.
- If you want to apply the monitoring policy on specific calendar dates:
- Enable the Configure schedule by days of the month toggle switch.
- Click the Days of the month field and use the calendar to select the dates on which you want to apply the policy to the source. You can select a period of several days or an individual day. The start date of the period must be earlier than the end date of the period.
The dates are configured without a year value, so the policy will be applied annually on the specified days until you delete this period. If you want to clear the selection, click the X icon.
- In the Time field, specify the start and end time of the policy, with minute precision.
The policy applicability interval is inclusive of its bounds; for example, if the end time is set to 23:59, the policy will be applied until 23:59:59.999. The default interval is 00:00 to 23:59. The start time must be earlier than the end time.
- If you want to add another period, click the Add period button and repeat the previous two steps (selecting the days of the month and specifying the time).
You can add any number of periods.
If you applied a schedule by day of the week and by day of the month at the same time, the day-of-the-month policy is applied first.
- Click Add.
The monitoring policy will be added.
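Two of the settings above, the control interval for the by EPS type and the weekly schedule, can be illustrated with a minimal Python sketch. It assumes that a control interval of N minutes corresponds to N consecutive once-per-minute checks, and it models only a single weekly period. The class and function names are hypothetical, and the sketch does not describe how KUMA implements these checks.

```python
from datetime import datetime, time
from typing import Optional, Set


class ControlIntervalTracker:
    """Counts consecutive once-per-minute checks that cross the threshold."""

    def __init__(self, control_interval_minutes: Optional[int] = None) -> None:
        # Without a control interval, a single crossing triggers the policy.
        # Treating N minutes as N checks is an assumption of this sketch.
        self.required = control_interval_minutes or 1
        self.consecutive = 0

    def record_check(self, crossing: bool) -> bool:
        """Record one per-minute check; return True if the policy triggers."""
        if not crossing:
            # One check within the thresholds resets the count.
            self.consecutive = 0
            return False
        self.consecutive += 1
        return self.consecutive >= self.required


def in_weekly_schedule(now: datetime, weekdays: Set[int],
                       start: time, end: time) -> bool:
    """True if now falls on a selected weekday within [start, end].

    weekdays uses datetime.weekday() numbering (Monday is 0). Both bounds
    are inclusive with minute precision, so an end time of 23:59 covers
    the whole minute up to 23:59:59.999.
    """
    if now.weekday() not in weekdays:
        return False
    minute_of_now = now.time().replace(second=0, microsecond=0)
    return start <= minute_of_now <= end
```

With a control interval of 3 minutes, the check sequence crossing, crossing, within limits, crossing, crossing, crossing triggers the policy only on the last check, because the third check resets the count.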
Editing monitoring policies
The Source status → Monitoring policies section displays the added monitoring policies and their settings that you specified when creating the policy. You can click a policy to display a sidebar with all of its settings. If necessary, you can edit the policy settings in this sidebar.
If you edit certain settings of a monitoring policy that is applied to an event source, you may need to update the policy to apply the changes. Every 30 minutes, KUMA checks whether any monitoring policies require updating and, if so, automatically runs a task to update them. You can also run the update task manually by clicking the Update policy button at the top of the table. One task updates all policies that need updating.
The Update policy button becomes active only if some monitoring policies need updating. Whether a policy needs updating is shown in the Policy update status column of the table of monitoring policies as one of the following statuses:
- Update required if one of the following monitoring policy settings was edited, but the changes have not been applied to event sources:
- Policy name
- Type
- Lower limit
- Upper limit
- Count interval
- Control interval
- Evaluation interval
- Updated in any of the following cases:
- After editing the policy, the task to apply the policy was started, and the changes were applied to event sources.
- You have edited one of the following policy settings, which does not require starting the update task:
- Send notifications
- Notification template
- Schedule
In this case, the edited policy settings are applied to event sources after a minute. Changes to the Notification template setting are applied instantly.
- The modified monitoring policy is not applied to event sources.
The date and time when the policy was last applied to event sources is displayed in the Policy last applied column.
While the policy update task is running, the Update policy button is unavailable for all users. If another user has edited the settings of the policy that necessitate an update, the Update policy button becomes active for you only after you refresh the page or edit the policy or an event source.
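The status rules above can be summarized in a short Python sketch. The setting names are taken from the lists in this section; the function itself is a hypothetical illustration of the rules, not KUMA code.

```python
from typing import Iterable

# Editing any of these settings requires the update task to be run.
REQUIRES_UPDATE_TASK = frozenset({
    "Policy name", "Type", "Lower limit", "Upper limit",
    "Count interval", "Control interval", "Evaluation interval",
})

# Edits to these settings are applied without the update task.
APPLIED_WITHOUT_TASK = frozenset({
    "Send notifications", "Notification template", "Schedule",
})


def policy_update_status(edited: Iterable[str], applied_to_sources: bool) -> str:
    """Return the expected Policy update status after editing settings."""
    if not applied_to_sources:
        # A modified policy that is not applied to sources stays "Updated".
        return "Updated"
    if any(name in REQUIRES_UPDATE_TASK for name in edited):
        return "Update required"
    return "Updated"
```

For example, editing only Schedule leaves the status as Updated, while editing Lower limit on an applied policy results in Update required until the update task runs.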
Applying monitoring policies
To apply monitoring policies to event sources:
- In the KUMA Console, in the Source status → List of event sources section, select one or more event sources in the table by selecting the check boxes in the first column next to the relevant sources. You can select several event sources by clicking the check box in the heading of the first column and selecting one of the following options:
- Select all to select all event sources on all pages of the table. If you have used search to filter sources, this will select all sources that match the search query.
- Select all in page to select all event sources that are loaded on the currently displayed page. If you have used search to filter sources, this will select all sources on the currently displayed page that match the search query.
In the lower left part of the table, you can find the number of selected sources and the total number of sources in the table.
After you select the event sources to which you want to apply the monitoring policy, the Enable policy button becomes available on the toolbar.
- Click Apply policy.
- This opens the Apply policy window; in that window, select one or more monitoring policies that you want to apply to the selected event sources. The table lists only monitoring policies that you can assign to the selected sources: policies that belong to the same tenant or to the Shared tenant, if you have access to it. If no shared policies exist for the selected event sources and you do not have access to the Shared tenant, the policy table is empty.
To select all available policies, you can select the check box in the heading of the first column. You can also use context search by policy name or sort the policies by clicking the heading of the column by which you want to sort the table and selecting Ascending or Descending.
Search and sorting are not available for the Sources, Schedule, Policy update status, and Policy last applied columns.
- Click Apply.
- In the table of sources, click Update policy to apply the changes to event sources.
The monitoring policies are applied to the selected event sources; the status of these sources changes to green. The names of the policies applied to the sources are displayed in the Monitoring policy column. A message is also displayed indicating the number of sources to which the policies have been applied. If the monitoring policy is triggered for an event source, the new status of that source is displayed after you manually refresh the page or it is refreshed automatically. We recommend configuring an automatic data refresh period to keep track of changes in the list of sources.
If you have selected more than 100,000 event sources and applied one or more policies to them, these policies are applied only to the first 100,000 sources to which these policies have not yet been applied. If you need to apply policies to the remaining sources, you can do one of the following:
- Select all sources again and apply the policies to them.
- Filter the table of sources by any parameter so that the table displays fewer than 100,000 sources, then apply the policies to them.
Repeat the action until the policies have been applied to all the sources that you need.
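The repeat-until-done procedure can be pictured as a simple batching loop. The sketch below is only a model of the manual steps: `apply_batch` is a hypothetical stand-in for applying policies to the selected sources in the KUMA Console, and no KUMA API is implied.

```python
from typing import Callable, List

BATCH_LIMIT = 100_000  # policies are applied to at most this many sources per action


def apply_in_batches(source_ids: List[str],
                     apply_batch: Callable[[List[str]], None],
                     limit: int = BATCH_LIMIT) -> int:
    """Apply policies to all sources, at most `limit` per pass.

    Returns the number of passes that were needed.
    """
    pending = list(source_ids)
    passes = 0
    while pending:
        # Take the next batch of sources that have not been processed yet.
        batch, pending = pending[:limit], pending[limit:]
        apply_batch(batch)
        passes += 1
    return passes
```

Under this model, 250,000 sources require three passes: two of 100,000 sources and one of 50,000.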
Disabling monitoring policies
To disable monitoring policies for event sources:
- In the KUMA Console, in the Source status → List of event sources section, select one or more event sources in the table by selecting the check boxes in the first column next to the relevant sources.
In the lower left part of the table, you can find the number of selected sources and the total number of sources in the table. After you select the event sources to which monitoring policies are applied in the list, the Disable policy button becomes available on the toolbar.
You can select several event sources by clicking the check box in the heading of the first column and selecting one of the following options:
- Select all to select all event sources on all pages of the table. If you have used search to filter sources, this will select all sources that match the search query.
- Select all in page to select all event sources that are loaded on the currently displayed page. If you have used search to filter sources, this will select all sources on the currently displayed page that match the search query.
- Click Disable policy.
- This opens the Disable policy window; in that window, select one or more monitoring policies that you want to disable for the selected event sources. The table lists all monitoring policies applied to at least one of the selected event sources.
To select all available policies, you can select the check box in the heading of the first column. You can also use context search or sort the policies by clicking the heading of the column by which you want to sort the table and selecting Ascending or Descending.
Search and sorting are not available for the Sources, Schedule, Policy update status, and Policy last applied columns.
- In the settings section above the policy table, do one of the following:
- If you want to temporarily suspend the policies, select For the specified time and specify the time in minutes, hours, or days after which the selected policies will be reapplied to event sources. Maximum values:
- For days: 30
- For hours: 743
- For minutes: 44579
- If you want to permanently disable the selected policies for event sources, select Until manually enabled.
The default selection is For the specified time, and the value is set to 5 minutes.
- Click Disable.
- In the table of sources, click Update policy to apply the changes to event sources.
The monitoring policies are disabled for selected event sources or suspended for the specified time. The status of these sources in the table changes to gray. A message is displayed indicating the number of sources for which the policies have been disabled.
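The limits on the suspension period can be expressed as a small validation sketch. The maximum values and the default come from the procedure above; the function name and the minimum value of 1 are assumptions made for the example.

```python
# Maximum suspension period per unit, as listed in the procedure above.
MAX_SUSPENSION = {"days": 30, "hours": 743, "minutes": 44579}

# Default selection: For the specified time, 5 minutes.
DEFAULT_SUSPENSION = (5, "minutes")


def is_valid_suspension(value: int, unit: str) -> bool:
    """True if the suspension period is within the documented maximum.

    The lower bound of 1 is an assumption; the source states only maxima.
    """
    if unit not in MAX_SUSPENSION:
        return False
    return 1 <= value <= MAX_SUSPENSION[unit]
```

For example, `is_valid_suspension(743, "hours")` is accepted, while `is_valid_suspension(31, "days")` is rejected.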
If you have selected more than 100,000 event sources and disabled one or more policies for them, these policies are disabled only for the first 100,000 sources to which these policies are applied. If you need to disable policies for the remaining sources, you can do one of the following:
- Select all sources again and disable the policies for them.
- Filter the table of sources by any parameter so that the table displays fewer than 100,000 sources, then disable the policies for them.
Repeat the action until the policies have been disabled for all the sources that you need.
Adding a new monitoring policy based on an existing policy
To create a new monitoring policy based on an existing policy:
- In the KUMA Console, in the Source status → Monitoring policies section, select the monitoring policy that you want to base the new policy on.
If necessary, you can find monitoring policies in the list using the Search field. The search will be carried out in the following columns: Name, Tenant, Type, Schedule (name of the day and time).
- Click Duplicate policy.
- This opens the Add policy window, in which you can edit policy settings.
By default, "- copy" is appended to the name of the new policy. The rest of the settings are the same as in the policy that you are duplicating.
- Click the Add button to create the new policy.
The monitoring policy is created based on an existing policy.
Deleting monitoring policies
To delete a monitoring policy:
- In the KUMA Console, in the Source status → Monitoring policies section, select one or more monitoring policies that you want to delete.
If necessary, you can find monitoring policies in the list using the Search field. The search will be carried out in the following columns: Name, Tenant, Type, Schedule (name of the day and time).
- Click Delete policy and confirm the action.
The selected monitoring policies are deleted.
You cannot remove predefined monitoring policies or policies that are assigned to data sources.