[Security Solution] Ensure alerts are scheduled when rule times out #128276

Merged: 7 commits, Mar 25, 2022

@madirey madirey commented Mar 22, 2022

Summary

Fixes: #121559

Related:
#120506

NOTE: Tests are very difficult to write for this case. There are tests in the alerting framework for the constructs used. Below is the method I used for manual testing, in detail.

To test:

  1. Start an HTTP server on localhost:
    sudo python3 -m http.server 5605, or use netcat.

  2. Create a rule that creates alerts and then times out. The easiest way I have found to accomplish this is shown below:

image

image

image

If you create more than 2 alerts, the first 2 alerts will be created and then the task will stall, and eventually time out.

  3. Check the rule status after the timeout. You should see the alerts created, and rule notifications should fire for those 2 alerts. (In the example below, POST is not supported by the simple Python web server, but you can see that the notification action did fire.)

image

image
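
The behavior these steps exercise can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `runRuleWithTimeout`, `RunResult`, `isCancelled`, and `notify` are hypothetical names, and the real executor is asynchronous. The point is only that alerts written before the timeout are still handed to notification scheduling.

```typescript
// Hypothetical sketch: these names are illustrative, not Kibana APIs.
interface RunResult {
  createdAlertIds: string[];
  notifiedAlertIds: string[];
  timedOut: boolean;
}

function runRuleWithTimeout(
  candidateAlertIds: string[],
  isCancelled: () => boolean,
  notify: (alertId: string) => void
): RunResult {
  const createdAlertIds: string[] = [];
  let timedOut = false;

  // Create alerts until the (assumed) cancellation check reports a timeout.
  for (const alertId of candidateAlertIds) {
    if (isCancelled()) {
      timedOut = true;
      break;
    }
    createdAlertIds.push(alertId);
  }

  // Notify for every alert that was actually created, even if the run timed out.
  const notifiedAlertIds: string[] = [];
  for (const alertId of createdAlertIds) {
    notify(alertId);
    notifiedAlertIds.push(alertId);
  }

  return { createdAlertIds, notifiedAlertIds, timedOut };
}
```

With four candidate alerts and a cancellation check that trips after two, this sketch reports two created alerts, two notifications, and `timedOut: true`, which mirrors the manual test above.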

@madirey madirey added the release_note:skip, Feature:Detection Alerts, Team:Detection Alerts, and 8.2 candidate labels Mar 22, 2022
@madirey madirey requested a review from a team as a code owner March 22, 2022 15:07
@madirey madirey added the v8.2.0 label Mar 23, 2022

madirey commented Mar 23, 2022

@elasticmachine merge upstream

@kibana-ci

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@marshallmain marshallmain left a comment

The changes here LGTM as is. There are some other areas related to cancellation that we may want to address as well, though.

We should remove the experimentalFeatures.securityRulesCancelEnabled flag and choose an appropriate value (possibly undefined) for ruleTaskTimeout based on the outcome of https://github.com/elastic/security-team/issues/3415. This can be a follow-up PR, but we should target 8.2 for that work if possible.

Threat match rules have a separate function buildExecutionIntervalValidator that implements timeout functionality as well. If we can remove that and rely on the search timeout instead that would unify the rule types more. We'll need to modify some of the logic in the threat match executor so it breaks out of the loop when searchAfterBulkCreate returns an error though.
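
One way the suggested loop change could look is sketched below. This is only an illustration: `BulkCreateResult`, `processThreatBatches`, and the `searchAndCreate` callback are hypothetical stand-ins (the latter for whatever wraps searchAfterBulkCreate), the result shape is assumed, and the real executor is asynchronous.

```typescript
// Hypothetical result shape; the real return type may differ.
interface BulkCreateResult {
  success: boolean;
  createdCount: number;
  errors: string[];
}

function processThreatBatches(
  batches: string[][],
  searchAndCreate: (batch: string[]) => BulkCreateResult
): BulkCreateResult {
  const combined: BulkCreateResult = { success: true, createdCount: 0, errors: [] };

  for (const batch of batches) {
    const result = searchAndCreate(batch);
    combined.createdCount += result.createdCount;
    combined.errors.push(...result.errors);

    if (!result.success) {
      // Stop here instead of issuing further searches after a failure
      // (for example, a search that was cancelled by a timeout).
      combined.success = false;
      break;
    }
  }

  return combined;
}
```

Breaking on the first failed batch keeps the alerts already created in the combined result while avoiding further searches that would also be cancelled.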

id: alertId,
kibanaSiemAppUrl: (meta as { kibana_siem_app_url?: string } | undefined)
?.kibana_siem_app_url,
outputIndex: ruleDataClient.indexNameWithNamespace(spaceId),
ruleId,
esClient: services.scopedClusterClient.asCurrentUser,

If we try to schedule throttled notifications after a rule is cancelled, the search executed inside this function will also be cancelled and we won't be able to schedule the actions. We may need the alerting framework to provide a secondary "un-cancellable" client that we can use during the actions scheduling process.
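
To make the concern concrete, here is a rough sketch of the two-client idea. Everything in it is hypothetical: `SearchClient`, `CancellationToken`, `makeCancellableClient`, and `countAlertsForNotification` are illustrative names, not alerting framework APIs, and the calls are synchronous for brevity.

```typescript
// Hypothetical sketch: none of these types exist in the alerting framework.
interface SearchClient {
  search(query: string): string[];
}

class CancellationToken {
  private cancelled = false;
  cancel(): void {
    this.cancelled = true;
  }
  isCancelled(): boolean {
    return this.cancelled;
  }
}

// Wraps a client so that every search fails once the rule has been cancelled.
function makeCancellableClient(inner: SearchClient, token: CancellationToken): SearchClient {
  return {
    search(query: string): string[] {
      if (token.isCancelled()) {
        throw new Error("search cancelled: rule execution timed out");
      }
      return inner.search(query);
    },
  };
}

// Stand-in for the search performed while scheduling throttled notifications.
function countAlertsForNotification(client: SearchClient, ruleId: string): number {
  return client.search(`alerts for rule ${ruleId}`).length;
}
```

In this sketch, passing the wrapped client to `countAlertsForNotification` after cancellation throws, while passing the unwrapped ("un-cancellable") client still returns a count, which is the distinction the comment above asks the framework to support.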

Contributor Author

Ah, good point!

@madirey madirey merged commit 1837a7f into elastic:main Mar 25, 2022
@madirey madirey deleted the de-rule-cancel-flow branch March 25, 2022 00:56
@kibanamachine

Friendly reminder: It looks like this PR hasn’t been backported yet.
To create backports, run node scripts/backport --pr 128276, or prevent reminders by adding the backport:skip label.

@kibanamachine kibanamachine added the backport missing label Mar 29, 2022
@marshallmain marshallmain added the backport:skip label and removed the backport missing label Mar 29, 2022
Labels
8.2 candidate, backport:skip, Feature:Detection Alerts, release_note:skip, Team:Detection Alerts, v8.2.0
Development

Successfully merging this pull request may close these issues.

[Security Solution] Alerts can be written without scheduling actions afterwards
4 participants