Alerts that timeout during execution don't respect their schedule #83490

Closed
mikecote opened this issue Nov 16, 2020 · 2 comments · Fixed by #83682
Labels: bug, Feature:Alerting, Team:ResponseOps

Comments

@mikecote
Contributor

I did some local testing today to see how alerts behave after switching over to Task Manager's schedule. While testing the timeout behaviour, I noticed that a task that times out is scheduled to run again immediately, when it should instead run at its next scheduled time (see #39349).

To reproduce this issue in a timely manner, I made the following changes locally: mikecote@43683af. The main pieces are to get an alert executor to time out and to give the alert a schedule much longer than the timeout window. That way you can see the next execution happen right after the timeout rather than at the next scheduled run.

To run my example you can do the following:

node scripts/functional_tests_server.js --config=test/alerting_api_integration/security_and_spaces/config.ts
node scripts/functional_test_runner.js --config=test/alerting_api_integration/security_and_spaces/config.ts
@mikecote mikecote added the bug, Feature:Alerting, and Team:ResponseOps labels Nov 16, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@mikecote
Contributor Author

It seems like retryAt should be set to the next scheduled run or the end of the timeout window, whichever is later, but that may break the timeout logic.
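
The suggestion above could be sketched roughly as follows. This is only an illustration of the "whichever is greater" idea, not Task Manager's actual code: `computeRetryAt` and its parameters are hypothetical names, and it assumes both the next scheduled run and the timeout window are measured from the time the task started.

```typescript
// Hypothetical helper, not part of Task Manager's API.
// Assumption: both the schedule interval and the timeout are measured
// from the moment the task started running.
function computeRetryAt(
  startedAt: Date,
  scheduleIntervalMs: number,
  timeoutMs: number
): Date {
  const nextScheduledRun = startedAt.getTime() + scheduleIntervalMs;
  const timeoutExpiry = startedAt.getTime() + timeoutMs;
  // Pick whichever moment is later, so a timed-out task is not
  // picked up again before its next scheduled run.
  return new Date(Math.max(nextScheduledRun, timeoutExpiry));
}

const startedAt = new Date('2020-11-16T12:00:00.000Z');

// Schedule (1 minute) much longer than the timeout (5 seconds):
// the retry lands on the next scheduled run.
console.log(computeRetryAt(startedAt, 60_000, 5_000).toISOString());
// prints "2020-11-16T12:01:00.000Z"

// Schedule (1 second) shorter than the timeout (5 seconds):
// the retry waits for the timeout window to end.
console.log(computeRetryAt(startedAt, 1_000, 5_000).toISOString());
// prints "2020-11-16T12:00:05.000Z"
```

Whether this would interfere with how timed-out tasks are detected and reclaimed is exactly the open question raised above, so the sketch says nothing on that point.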

@gmmorris gmmorris self-assigned this Nov 18, 2020
@kobelb kobelb added the needs-team label Jan 31, 2022
@botelastic botelastic bot removed the needs-team label Jan 31, 2022