[Alerting] Add telemetry for number of scheduled actions during rule execution #128891

Merged
merged 8 commits into from
Apr 1, 2022

Conversation

@ymao1 ymao1 commented Mar 30, 2022

Towards #122535

Summary

Adds telemetry that calculates the p50/p90/p99 values for the number of actions a rule schedules during a single rule execution. The percentiles are calculated per rule type and across all rule types.
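
To illustrate what these numbers represent, here is a minimal in-memory sketch. It is not the code from this PR: the `ExecutionSample` shape, the function names, and the nearest-rank percentile method are all assumptions made for the example, and the real telemetry may compute percentiles differently.

```typescript
// Illustration only: nearest-rank percentiles over in-memory samples.
// All names below are hypothetical and do not come from the PR.
type Percentiles = { p50: number; p90: number; p99: number };

// Assumed sample shape: one entry per rule execution.
interface ExecutionSample {
  ruleTypeId: string;
  numScheduledActions: number;
}

// Nearest-rank percentile over an ascending-sorted array.
// Returns 0 for an empty input (an arbitrary choice for this sketch).
function percentile(sortedValues: number[], p: number): number {
  if (sortedValues.length === 0) return 0;
  const rank = Math.ceil((p * sortedValues.length) / 100);
  return sortedValues[Math.max(rank, 1) - 1];
}

function summarize(values: number[]): Percentiles {
  const sorted = [...values].sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p90: percentile(sorted, 90),
    p99: percentile(sorted, 99),
  };
}

// Computes percentiles across all rule types and per rule type.
function scheduledActionPercentiles(samples: ExecutionSample[]): {
  overall: Percentiles;
  byType: Record<string, Percentiles>;
} {
  const grouped = new Map<string, number[]>();
  for (const sample of samples) {
    const values = grouped.get(sample.ruleTypeId) || [];
    values.push(sample.numScheduledActions);
    grouped.set(sample.ruleTypeId, values);
  }

  const byType: Record<string, Percentiles> = {};
  grouped.forEach((values, ruleTypeId) => {
    byType[ruleTypeId] = summarize(values);
  });

  return {
    overall: summarize(samples.map((s) => s.numScheduledActions)),
    byType,
  };
}
```

The `overall` result corresponds to the "across all rule types" figure and `byType` to the per-rule-type breakdown.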

To Verify

  1. Update the alerting telemetry task to run more frequently:
--- a/x-pack/plugins/alerting/server/usage/task.ts
+++ b/x-pack/plugins/alerting/server/usage/task.ts
@@ -139,7 +139,7 @@ export function telemetryTaskRunner(
                   avg_execution_time_per_day: dailyExecutionCounts.avgExecutionTime,
                   avg_execution_time_by_type_per_day: dailyExecutionCounts.avgExecutionTimeByType,
                 },
-                runAt: getNextMidnight(),
+                runAt: new Date(), // getNextMidnight(),
               };
             }
           )
  2. Create some rules that schedule actions and let them run for a little while
  3. Navigate to https://localhost:5601/api/stats?extended=true&legacy=true and see the following fields inside alerting telemetry:
  • alerts.percentile_num_scheduled_actions_per_day
  • alerts.percentile_num_scheduled_actions_by_type_per_day
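
The check in the last step can be sketched as a small helper that looks for the two field names in the parsed stats response. Only the two field names come from the steps above; the helper names are hypothetical, and because the exact nesting of the response is not shown here, the sketch searches the whole object instead of assuming a path.

```typescript
// Field names taken from the verification steps; everything else is hypothetical.
const TELEMETRY_FIELDS = [
  "percentile_num_scheduled_actions_per_day",
  "percentile_num_scheduled_actions_by_type_per_day",
] as const;

// Depth-first search for the first property named `key` in parsed JSON.
function findField(value: unknown, key: string): unknown {
  if (value === null || typeof value !== "object") return undefined;
  if (Array.isArray(value)) {
    for (const item of value) {
      const found = findField(item, key);
      if (found !== undefined) return found;
    }
    return undefined;
  }
  const record = value as Record<string, unknown>;
  if (Object.prototype.hasOwnProperty.call(record, key)) return record[key];
  for (const childKey of Object.keys(record)) {
    const found = findField(record[childKey], key);
    if (found !== undefined) return found;
  }
  return undefined;
}

// Returns the names of expected telemetry fields that are absent from the response.
function missingTelemetryFields(statsResponse: unknown): string[] {
  return TELEMETRY_FIELDS.filter(
    (field) => findField(statsResponse, field) === undefined
  );
}
```

An empty array from `missingTelemetryFields` means both fields were found somewhere in the response body.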

ymao1 commented Mar 30, 2022

@elasticmachine merge upstream

@ymao1 ymao1 changed the title Alerting/actions telemetry [Alerting] Add telemetry for number of scheduled actions during rule execution Mar 31, 2022
@ymao1 ymao1 self-assigned this Mar 31, 2022
@ymao1 ymao1 added release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) auto-backport Deprecated - use backport:version if exact versions are needed Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework v8.2.0 v8.3.0 labels Mar 31, 2022
@ymao1 ymao1 marked this pull request as ready for review March 31, 2022 12:18
@ymao1 ymao1 requested review from a team as code owners March 31, 2022 12:18
@elasticmachine

Pinging @elastic/response-ops (Team:ResponseOps)

@afharo afharo left a comment

Telemetry changes LGTM!

@mikecote mikecote self-requested a review April 1, 2022 16:32
@mikecote mikecote left a comment

Changes LGTM!

ymao1 commented Apr 1, 2022

@elasticmachine merge upstream

@kibana-ci

💚 Build Succeeded

Metrics [docs]

Unknown metric groups

ESLint disabled line counts

id before after diff
alerting 34 35 +1

Total ESLint disabled count

id before after diff
alerting 36 37 +1

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ymao1

@pmuellr pmuellr left a comment

LGTM!

@ymao1 ymao1 merged commit e36a08c into elastic:main Apr 1, 2022
@ymao1 ymao1 deleted the alerting/actions-telemetry branch April 1, 2022 19:57
kibanamachine pushed a commit that referenced this pull request Apr 1, 2022
…execution (#128891)

* Adding telemetry for number of scheduled actions

* Adding percentile by type types

* Parsing percentiles by rule type and adding tests

* Adding functional tests

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
(cherry picked from commit e36a08c)
@kibanamachine

💚 All backports created successfully

Status Branch Result
8.2

Note: Successful backport PRs will be merged automatically after passing CI.

Questions?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Apr 1, 2022
…execution (#128891) (#129253)

* Adding telemetry for number of scheduled actions

* Adding percentile by type types

* Parsing percentiles by rule type and adding tests

* Adding functional tests

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
(cherry picked from commit e36a08c)

Co-authored-by: Ying Mao <ying.mao@elastic.co>
Labels
auto-backport Deprecated - use backport:version if exact versions are needed Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.2.0 v8.3.0
7 participants