Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Endpoint] Display of isolation state for Microsoft Defender agents in alert details and response console #206317

Merged

Conversation

paul-tavares
Copy link
Contributor

@paul-tavares paul-tavares commented Jan 10, 2025

Summary

Connector Changes

  • Added support for sort field to the Machine Actions Microsoft Defender API method

Security Solution

  • Add error handling to the calls made to a Connector's .execute() and throws a more details error message
  • Added logic to the Agent Status client for MS Defender to calculate the agent's Isolated status by querying for Machine Actions
    • Note: Due to API rate limits, which I believe may be associated with the current Microsoft Defender test environment we are using, the agent status in kibana (ex. Alert flyout, console) may flip to Unenrolled periodically
image image

Checklist

@paul-tavares paul-tavares added release_note:skip Skip the PR/issue when compiling release notes Team:Defend Workflows “EDR Workflows” sub-team of Security Solution backport:prev-minor Backport to (9.0) the previous minor version (i.e. one version back from main) v8.18.0 labels Jan 10, 2025
@paul-tavares paul-tavares self-assigned this Jan 10, 2025
@paul-tavares paul-tavares changed the title [Security Solution][Endpoint] Improve bi-directional response actions errors [Security Solution][Endpoint] Display of isolation state for Microsoft Defender agents in alert details and response console Jan 14, 2025
@paul-tavares paul-tavares marked this pull request as ready for review January 14, 2025 17:19
@paul-tavares paul-tavares requested review from a team as code owners January 14, 2025 17:19
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-defend-workflows (Team:Defend Workflows)

@paul-tavares paul-tavares requested review from tomsonpl and removed request for pzl and ashokaditya January 14, 2025 17:19
@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

cc @paul-tavares

Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code LGTM, though in theory we aren't really CODEOWNERS (pinged because of some ongoing file movement in Kibana).

I assumed this connector was only used in UX-driven scenarios and not background task usage (via task manager directly or alerting actions indirectly), but my eyes hit a background_task string in the changes ... so I had to ask again :-)

We're probably ok, even if there are connector schema changes, since I believe this is behind a feature flag anyway, but I'd like to know for sure.

}) as Promise<ActionTypeExecutorResult<TResponse>>;
return this.connectorsClient
.execute({
requesterId: 'background_task',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have special concerns for tasks that run in "background". Specifically, when connectors can be used as an alerting rule action, we need to make sure there are no ZDT/BWC issues . That's because those connector executions are queued as task manager tasks. For connectors only used in UX-driven scenarios, that can use either HTTP APIs or ActionClient (in Kibana plugins), we don't have any ZDT/BWC issues.

So I was confused seeing background_task, and pretty much thinking this connector is UX driven, and not used as an alerting action.

Can this connector be used as an alerting action? If so, using the string background_task for it may be confusing to someone familiar with connectors/actions/task manager, thinking this is some kind of task manager task.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @pmuellr.
Thanks for the review.
No, this connector (or any connector associated with Security Solution bi-directional response actions, which now have are registered with a supportedFeatureId of EndpointSecurityConnectorFeatureId (link) are not used/available with the Alerting actions. These changes were introduced when we started the work with bi-directional response actions (SentinelOne connector was the first) and was recently updated to ensure proper RBAC was also made available for them (see this PR)

RE: Can this connector be used from Background tasks
Yes, it can and it is currently being used by security solution via a background task. In that flow, we use the Unsecured ActionsClient to interact with the connector.


You can follow up with @mikecote on this if needed - he was deeply engaged in driving the necessary changes in ResponseOps code to support our use cases.

Copy link
Contributor

@gergoabraham gergoabraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't test it, but looks good! 🚀

@pmuellr
Copy link
Member

pmuellr commented Jan 15, 2025

Yes, it can and it is currently being used by security solution via a background task. In that flow, we use the Unsecured ActionsClient to interact with the connector.

Ah, thanks!

So, yes, we've entered a new era where these Connectors are going to have to go through an "intermediate release" if they are being run out of a task. Explained in this doc: 2024-01 Requiring intermediate releases to change rule or connector schemas in serverless.

At first glance sortField and sortDirection appear to be new connector params, so I think the idea is that these fields would be added in the intermediate release, but not depended on for anything - we just need the shape there for a potential downgrade. The second release would add all the rest.

But this didn't set off our usual "schema checker" that would catch these new params, since it's behind a feature flag. This is something customers can't turn on, right? If so, and you're willing to break folks who've used the feature flag, I think we could skip the intermediate release. Presumably the connector could get errors during an upgrade / downgrade. Once the feature flag is removed such that this connector is always available, we'd HAVE to go through an intermediate release.

@paul-tavares
Copy link
Contributor Author

@pmuellr ,

Re: "This is something customers can't turn on, right?"

Correct., this is behind a feature flag and the feature has not been advertised or released. We are in the process of preparing to GA this (along with SentinelOne and Crowstrike) in the next two weeks for serverless and then subsequently with 8.18/9.0, which means we'll also be removed the Tech. Preview labels from it in addition to enabling keys.

Re: Intermediate release process

I'm going to add you to a Slack channel and ping others to get clarification on this. These connectors are not used in the alerting framework - they are only invoked/used by business logic on the security solution side, so I'm not sure that we should be bound by those same restrictions you mentioned above.

Let me know if there is anything else that is needed here to get this through :)

@pmuellr
Copy link
Member

pmuellr commented Jan 15, 2025

they are only invoked/used by business logic on the security solution side

The critical bit that intermediate releases are required for is if the connectors are being queued to run as task manager tasks. Because then multiple Kibanas running different versions of Kibana come into play (old schedules tasks, but picked up by new, or vice versa. Alerting rules always run as TM tasks, connectors can be run as TM tasks or directly (HTTP, ActionsClient).

I guess the critical question is, given the following, is: Are "background tasks" using task manager, or are they just long-running in-process async code executions not using task manager?

it can and it is currently being used by security solution via a background task

@paul-tavares
Copy link
Contributor Author

Re:

I guess the critical question is, given the following, is: Are "background tasks" using task manager, or are they just long-running in-process async code executions not using task manager?

Yes, our background task is being run by Task Manager. Still not sure why that would present a problem. The task has no reference in the schedule to the connector its self. When the task is run, we initialize a new Un-secured ActionsClient and invoke the connector that way. So the security solution code and the connector code will be in sync I am not seeing a case where different version of the code would be invoked (ex. new version of the connector code and older code of the security solution plugin, or the other way around )

@paul-tavares paul-tavares merged commit 4c6abde into elastic:main Jan 15, 2025
8 checks passed
@paul-tavares paul-tavares deleted the task/olm-improve-response-actions-errors branch January 15, 2025 21:12
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/12797103216

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jan 15, 2025
…t Defender agents in alert details and response console (elastic#206317)

## Summary

### Connector Changes

- Added support for sort field to the Machine Actions Microsoft Defender
API method

### Security Solution

- Add error handling to the calls made to a Connector's `.execute()` and
throws a more details error message
- Added logic to the Agent Status client for MS Defender to calculate
the agent's Isolated status by querying for Machine Actions
- Note: Due to API rate limits, which I believe may be associated with
the current Microsoft Defender test environment we are using, the agent
status in kibana (ex. Alert flyout, console) may flip to `Unenrolled`
periodically

(cherry picked from commit 4c6abde)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Jan 15, 2025
…crosoft Defender agents in alert details and response console (#206317) (#206868)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security Solution][Endpoint] Display of isolation state for
Microsoft Defender agents in alert details and response console
(#206317)](#206317)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Paul
Tavares","email":"56442535+paul-tavares@users.noreply.github.com"},"sourceCommit":{"committedDate":"2025-01-15T21:12:41Z","message":"[Security
Solution][Endpoint] Display of isolation state for Microsoft Defender
agents in alert details and response console (#206317)\n\n##
Summary\r\n\r\n### Connector Changes\r\n\r\n- Added support for sort
field to the Machine Actions Microsoft Defender\r\nAPI method\r\n\r\n###
Security Solution\r\n\r\n- Add error handling to the calls made to a
Connector's `.execute()` and\r\nthrows a more details error message\r\n-
Added logic to the Agent Status client for MS Defender to
calculate\r\nthe agent's Isolated status by querying for Machine
Actions\r\n- Note: Due to API rate limits, which I believe may be
associated with\r\nthe current Microsoft Defender test environment we
are using, the agent\r\nstatus in kibana (ex. Alert flyout, console) may
flip to
`Unenrolled`\r\nperiodically","sha":"4c6abdebdfe5940abc82503519db748eaac18da6","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","Team:Defend
Workflows","backport:prev-minor","v8.18.0"],"title":"[Security
Solution][Endpoint] Display of isolation state for Microsoft Defender
agents in alert details and response
console","number":206317,"url":"https://github.com/elastic/kibana/pull/206317","mergeCommit":{"message":"[Security
Solution][Endpoint] Display of isolation state for Microsoft Defender
agents in alert details and response console (#206317)\n\n##
Summary\r\n\r\n### Connector Changes\r\n\r\n- Added support for sort
field to the Machine Actions Microsoft Defender\r\nAPI method\r\n\r\n###
Security Solution\r\n\r\n- Add error handling to the calls made to a
Connector's `.execute()` and\r\nthrows a more details error message\r\n-
Added logic to the Agent Status client for MS Defender to
calculate\r\nthe agent's Isolated status by querying for Machine
Actions\r\n- Note: Due to API rate limits, which I believe may be
associated with\r\nthe current Microsoft Defender test environment we
are using, the agent\r\nstatus in kibana (ex. Alert flyout, console) may
flip to
`Unenrolled`\r\nperiodically","sha":"4c6abdebdfe5940abc82503519db748eaac18da6"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/206317","number":206317,"mergeCommit":{"message":"[Security
Solution][Endpoint] Display of isolation state for Microsoft Defender
agents in alert details and response console (#206317)\n\n##
Summary\r\n\r\n### Connector Changes\r\n\r\n- Added support for sort
field to the Machine Actions Microsoft Defender\r\nAPI method\r\n\r\n###
Security Solution\r\n\r\n- Add error handling to the calls made to a
Connector's `.execute()` and\r\nthrows a more details error message\r\n-
Added logic to the Agent Status client for MS Defender to
calculate\r\nthe agent's Isolated status by querying for Machine
Actions\r\n- Note: Due to API rate limits, which I believe may be
associated with\r\nthe current Microsoft Defender test environment we
are using, the agent\r\nstatus in kibana (ex. Alert flyout, console) may
flip to
`Unenrolled`\r\nperiodically","sha":"4c6abdebdfe5940abc82503519db748eaac18da6"}},{"branch":"8.x","label":"v8.18.0","branchLabelMappingKey":"^v8.18.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Paul Tavares <56442535+paul-tavares@users.noreply.github.com>
@pmuellr
Copy link
Member

pmuellr commented Jan 16, 2025

Yes, our background task is being run by Task Manager. Still not sure why that would present a problem. The task has no reference in the schedule to the connector its self. When the task is run, we initialize a new Un-secured ActionsClient and invoke the connector that way. So the security solution code and the connector code will be in sync I am not seeing a case where different version of the code would be invoked (ex. new version of the connector code and older code of the security solution plugin, or the other way around )

Ahhh!

I thought you had some user-driven code creating a task to run the connector, so the user-driven code and eventual task running the connector could be different Kibanas.

Instead, you have a task manager task that invokes connectors in the task, via the ActionsClient.

You're right, this won't be a problem; the problem is when a saved object can get accessed in multiple Kibanas, which isn't the situation you describe.

At least it's not as hairy of a problem. I think there may still be some gotchas for future changes, will discuss separately with MikeC. For now though, with the feature flag in place, I'm happy with having it merged.

viduni94 pushed a commit to viduni94/kibana that referenced this pull request Jan 23, 2025
…t Defender agents in alert details and response console (elastic#206317)

## Summary

### Connector Changes

- Added support for sort field to the Machine Actions Microsoft Defender
API method

### Security Solution

- Add error handling to the calls made to a Connector's `.execute()` and
throws a more details error message
- Added logic to the Agent Status client for MS Defender to calculate
the agent's Isolated status by querying for Machine Actions
- Note: Due to API rate limits, which I believe may be associated with
the current Microsoft Defender test environment we are using, the agent
status in kibana (ex. Alert flyout, console) may flip to `Unenrolled`
periodically
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (9.0) the previous minor version (i.e. one version back from main) release_note:skip Skip the PR/issue when compiling release notes Team:Defend Workflows “EDR Workflows” sub-team of Security Solution v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants