[Epic][Security Solution][Detections] Rule Execution Log - UI on the Rule Details page #101014
Comments
Pinging @elastic/security-detections-response (Team:Detections and Resp)
Pinging @elastic/security-solution (Team: SecuritySolution)
Discussed #124198 with team, holding for 8.2 to further iterate on implementation and UX.
@spong @banderror updated the MVP design based on our discussion (link to Figma):
Awesome -- thank you @yiyangliu9286! 🙂 Will implement these in #126215 and ping you when ready for review!
## Summary

Resolves #119598, #119599, #101014

Test plan ([internal doc](https://docs.google.com/document/d/1-prIUGYaPHiwGA79CgSdw1926lxIPKGWWkYOUD2BM1U/edit#heading=h.womzsfdt6zt8))

Adds `Rule Execution Log` table to Rule Details page:

<p align="center"> <img width="700" src="https://user-images.githubusercontent.com/2946766/158540840-e9cddb9b-f33d-4b95-86ad-cb3e0a00cf39.gif" /> </p>

### Implementation notes

The useful metrics within `event-log` for a given rule execution are spread across several platform events (`execute-start`, `execute`) and security events (`execution-metrics`, `status-change`). To provide consolidated metrics per rule execution, and to avoid the many empty cells and mismatched statuses shown in the image below, these rule execution events are aggregated by their `executionId`, and the fields from each event are then merged.

<p align="center"> <img width="700" src="https://user-images.githubusercontent.com/2946766/151933881-2e58f4d7-4cda-4528-9d44-37cb7bd5de9c.png" /> </p>

This PR was reworked to take advantage of the new event-log aggregation support added in #126948, and is no longer implemented as an in-memory aggregation on the server side.

* Due to restrictions around supplying search filters that may match multiple sub-agg buckets and missing data ([see discussion here](https://github.com/elastic/kibana/pull/127339/files#r825240516)), it was decided that we'd disable the search bar for the time being. We have both a near-term (writing a single rollup event) and a long-term (ES|QL) solution that will allow us to re-enable this functionality.
* Note, since a `terms` agg is used to fetch all execution events, an upper bound must be set. See [this discussion](https://github.com/elastic/kibana/pull/127339/files#r823035420) for more details. For the time being this max is set to `1000` events, and the total cardinality of execution events is returned within `total` so the UI can tell the user to narrow their search to better isolate and find possible issues. This should be a reasonable constraint for almost all rules: for a rule executing every 5 minutes, 1000 executions would cover over 3 days of execution time.

<p align="center"> <img width="700" src="https://user-images.githubusercontent.com/2946766/159045563-966896b4-3cd1-475d-9f0e-c2d300683546.png" /> </p>

The `Filter for alerts` action will be available on all `Succeeded`/`Partial Failure` executions, even if no alerts were generated, until #126210 is merged and we can start returning the alert count. At that point we can programmatically enable/disable this action based on alert count.

<p align="center"> <img width="300" src="https://user-images.githubusercontent.com/2946766/159051762-e2f97ba4-4ce1-4f67-8ae1-395e4b191cab.png" /> </p>
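To illustrate the merge-by-`executionId` idea and the `1000`-execution cap described above, here is a minimal in-memory sketch. It is not the PR's implementation (the PR uses event-log aggregations rather than in-memory merging), and the event shape, the `fields` bag, and the function names `mergeByExecutionId` and `coverageDays` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical event shape. Only `executionId` and the four event names
// come from the PR description; everything else is an assumption.
type EventKind = 'execute-start' | 'execute' | 'execution-metrics' | 'status-change';

interface ExecutionEvent {
  executionId: string;
  kind: EventKind;
  fields: Record<string, string | number>;
}

interface MergedExecution {
  executionId: string;
  fields: Record<string, string | number>;
}

// Upper bound on returned executions, mirroring the cap mentioned in the PR.
const MAX_EXECUTIONS = 1000;

// Group events by executionId and merge their fields into one row each.
// `total` reports how many distinct executions matched, so a caller can
// tell when the capped `executions` list is incomplete.
function mergeByExecutionId(
  events: ExecutionEvent[],
  max: number = MAX_EXECUTIONS
): { total: number; executions: MergedExecution[] } {
  const byId = new Map<string, MergedExecution>();
  for (const event of events) {
    const merged = byId.get(event.executionId) ?? {
      executionId: event.executionId,
      fields: {},
    };
    // Later events overwrite earlier ones when field names collide.
    Object.assign(merged.fields, event.fields);
    byId.set(event.executionId, merged);
  }
  const all = Array.from(byId.values());
  return { total: all.length, executions: all.slice(0, max) };
}

// How many days of history the cap covers for a given rule interval.
function coverageDays(intervalMinutes: number, maxExecutions: number = MAX_EXECUTIONS): number {
  return (intervalMinutes * maxExecutions) / (60 * 24);
}
```

For a rule running every 5 minutes, `coverageDays(5)` evaluates to 5000 / 1440, roughly 3.47 days, which matches the "over 3 days" estimate above. When `total` exceeds the length of `executions`, a UI could prompt the user to narrow the time range.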
Summary
Replace the existing "Failure History" tab with an enhanced Rule Execution Log UI on the Rule Details page.
New rule execution log documents are going to have some standard ECS fields plus some custom fields, which will make them technically similar to detection alerts and source events in terms of flexibility of analysis, display in UI tables, and so on. This will allow us to implement a more advanced Rule Execution Log UI: not only the last 5 failures, but all rule execution status updates (current statuses are `going to run`, `succeeded`, `warning`, `failed`) with no limit on their number, current execution metrics (querying time, indexing time, gaps, etc.), any new execution metrics (if needed), any additional events with arbitrary data, and generic log messages for observability purposes.

Ideas that came up while chatting with @yiyangliu9286 and @xcrzx:
Resources
To do
First iteration (simple log UI with basic filtering and pagination, no visualizations):