[Security Solution][Detections] Adoption telemetry #71102

rylnd · 2020-07-08T14:58:18Z

Summary

This adds a new UsageCollector that collects adoption data for both Detection Rules and ML Jobs within our solution's group.

The metrics being collected are:

number of custom detection rules enabled
number of custom detection rules disabled (to calculate the total rules they have created)
number of pre-built detection rules enabled
number of pre-built detection rules disabled (the number of pre-built rules changes with every release)
number of custom ML jobs enabled
number of custom ML jobs disabled
number of pre-built ML jobs enabled
number of pre-built ML jobs disabled

TODO:

verify schema
backfill unit tests

Checklist

Delete any items that are not applicable to this PR.

Documentation was added for features that require explanation or tutorials
[] Unit or functional tests were updated or added to match the most common scenarios

For maintainers

This was checked for breaking API changes and was labeled appropriately

This uses ML and raw ES calls to query our ML Jobs and Rules, and parse them into a format to be consumed by telemetry. Still to come:

* initialization
* tests

The service seems to convert colons to underscores, so let's just use an underscore.

This allows us to test our adherence to the collector API, focusing particularly on the fetch function.

We're going to have our usage data under one key corresponding to the app, so this nests the existing data under a 'detections' key while allowing another fetching function to be plugged into the main collector under a separate key.
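For reviewers, the nesting described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `UsageFetcher`, `composeFetchers`, and the stub fetchers are hypothetical names, not the Kibana UsageCollector API or the code in this PR.

```typescript
// Hypothetical types and names; this is not Kibana's UsageCollector API.
type UsageFetcher = () => Promise<Record<string, unknown>>;

// Builds one fetch function that nests each fetcher's result under its own key.
function composeFetchers(fetchers: Record<string, UsageFetcher>) {
  return async (): Promise<Record<string, Record<string, unknown>>> => {
    const usage: Record<string, Record<string, unknown>> = {};
    for (const key of Object.keys(fetchers)) {
      usage[key] = await fetchers[key]();
    }
    return usage;
  };
}

// Stub fetchers standing in for the real detections query and a future endpoint query.
const fetchDetectionsUsage: UsageFetcher = async () => ({ rules_enabled: 3 });
const fetchEndpointUsage: UsageFetcher = async () => ({ hosts: 0 });

const fetchSecurityUsage = composeFetchers({
  detections: fetchDetectionsUsage,
  endpoints: fetchEndpointUsage,
});
```

Plugging in another data source then only means adding one more entry to the object passed to `composeFetchers`.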

rylnd · 2020-07-09T02:02:38Z

@elasticmachine merge upstream

michaelolo24 · 2020-07-09T15:24:09Z

x-pack/plugins/security_solution/server/usage/detections_helpers.ts

+    filterPath: ['hits.hits._source.alert.enabled', 'hits.hits._source.alert.tags'],
+    ignoreUnavailable: true,
+    index,
+    size: 10000, // elasticsearch index.max_result_window default value


Are we ever likely to exceed this limit?

No, it doesn't seem likely that a user'd have 10k rules, but this ensures that if they did, we'd tally the first 10k instead of receiving an error and returning 0s.

This was used in another collector, but if you think there's risk here I'm happy to remove it!

I think it's fine. And I hope they don't have more than 10k rules 😅. I was really just curious whether there was a way past the limit other than scroll. I ran into an issue previously where I needed to set this max just to get all of the data I wanted (I think ES has a default max it returns?), but it looks fine.

* inlines collector options
* inlines schema object
* makes DetectionsUsage an interface instead of a type alias

Conflicts: x-pack/plugins/security_solution/server/plugin.ts

elasticmachine · 2020-07-09T18:38:56Z

Pinging @elastic/siem (Team:SIEM)

michaelolo24 · 2020-07-09T20:46:28Z

x-pack/plugins/security_solution/server/usage/detections.mocks.ts

+        config: {
+          job_type: 'anomaly_detector',
+          description:
+            'SIEM Auditbeat: Looks for unusual destination port activity that could indicate command-and-control, persistence mechanism, or data exfiltration activity (beta)',


Just checking we haven't migrated these detections to the SecuritySolution naming pattern?

I took these from live data and the rename hasn't happened yet. I'll be sure to update these appropriately once that happens 👍

michaelolo24 · 2020-07-09T20:50:19Z

x-pack/plugins/security_solution/server/usage/detections_helpers.ts

+    );
+
+    if (ruleResults.hits?.hits?.length > 0) {
+      ruleMetrics = ruleResults.hits.hits.map((hit) => ({


Not really an issue, but why not do the tally here rather than iterating twice? Or alternatively, just return ruleResults and then just iterate once in the usageBuilder

That's a good point, I'll change these to do a single loop 👍

michaelolo24 · 2020-07-09T20:53:46Z

x-pack/plugins/security_solution/server/usage/detections_helpers.ts

+      const moduleJobs = modules.flatMap((module) => module.jobs);
+      const jobs = await jobServiceProvider(mlCaller).jobsSummary(['siem']);
+
+      jobMetrics = jobs.map((job) => ({


Same thing here, but not sure how much of an issue it'll be to loop over all this data twice. I know we have about 60s till the usageCollector times out, but I don't think we'll run into that

We were previously performing two loops over each set of data: one to format it down to just the data we need, and another to convert that into usage data. We now perform both steps within a single loop.
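A minimal sketch of the single-loop tally, for illustration only. The hit shape follows the `filterPath` shown earlier in this thread; the `RuleHit` and `RuleTally` types, the `tallyRules` name, and the tag used to mark pre-built rules are assumptions, not the code merged in this PR.

```typescript
// Shape of a rule hit after applying the filterPath from the search call.
interface RuleHit {
  _source: { alert: { enabled: boolean; tags: string[] } };
}

interface RuleTally {
  custom: { enabled: number; disabled: number };
  elastic: { enabled: number; disabled: number };
}

// Assumption: pre-built (Elastic) rules carry this tag; the real marker may differ.
const IMMUTABLE_TAG = '__internal_immutable:true';

// Classifies and counts each hit in one pass, with no intermediate array.
function tallyRules(hits: RuleHit[]): RuleTally {
  const initial: RuleTally = {
    custom: { enabled: 0, disabled: 0 },
    elastic: { enabled: 0, disabled: 0 },
  };
  return hits.reduce((usage, hit) => {
    const { enabled, tags } = hit._source.alert;
    const origin = tags.some((tag) => tag === IMMUTABLE_TAG) ? 'elastic' : 'custom';
    const status = enabled ? 'enabled' : 'disabled';
    usage[origin][status] += 1;
    return usage;
  }, initial);
}
```

The same pattern would apply to the ML jobs, with the job summary list in place of the rule hits.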

jgowdyelastic · 2020-07-13T10:48:54Z

x-pack/plugins/security_solution/server/usage/detections_helpers.ts

+      const jobs = await jobServiceProvider(mlCaller).jobsSummary(['siem']);
+
+      jobsUsage = jobs.reduce((usage, job) => {
+        const isElastic = moduleJobs.some((moduleJob) => moduleJob.id === job.id);


does SIEM apply a prefix to the module when initially setting it up?
If so, this won't match and you'll have to do a partial match on the last part of job id.

Confirmed that this is not currently an issue, but good to know moving forward.
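If a prefix were ever applied, the partial match suggested above might look like the sketch below. `isModuleJob` and the job IDs in the usage note are hypothetical; this PR keeps the exact `moduleJob.id === job.id` comparison.

```typescript
// Returns true when jobId equals a module job ID or ends with one,
// which also covers IDs that were given a prefix at setup time.
function isModuleJob(jobId: string, moduleJobIds: string[]): boolean {
  return moduleJobIds.some((moduleJobId) => jobId.endsWith(moduleJobId));
}
```

For example, with a module job ID of `rare_process`, both `rare_process` and `teamA_rare_process` would count as pre-built. Note that suffix matching can misclassify a custom job whose ID happens to end with a module job ID, so a stricter check may be preferable if that risk matters.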

afharo

Kibana Telemetry changes LGTM!

afharo · 2020-07-13T12:26:35Z

x-pack/plugins/security_solution/server/usage/collector.ts

+      detections: {
+        detection_rules_custom_enabled: { type: 'number' },
+        detection_rules_custom_disabled: { type: 'number' },
+        detection_rules_elastic_enabled: { type: 'number' },
+        detection_rules_elastic_disabled: { type: 'number' },
+        ml_jobs_custom_enabled: { type: 'number' },
+        ml_jobs_custom_disabled: { type: 'number' },
+        ml_jobs_elastic_enabled: { type: 'number' },
+        ml_jobs_elastic_disabled: { type: 'number' },
+      },


NIT: Can we use type: 'long' instead (or 'float' if that's the intended format)?
This will help us understand the type of number to be indexed 🙂

NIT 2: Have you considered using a tree structure instead?

{
  "properties": {
    "detections": {
      "properties": {
        "detection_rules": {
          "custom": {
            "enabled": { "type": "long" },
            "disabled": { "type": "long" }
          },
          "elastic": {
            "enabled": { "type": "long" },
            "disabled": { "type": "long" }
          }
        },
        "ml_jobs": {
          "custom": {
            "enabled": { "type": "long" },
            "disabled": { "type": "long" }
          },
          "elastic": {
            "enabled": { "type": "long" },
            "disabled": { "type": "long" }
          }
        }
      }
    }
  }
}
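To make the suggested tree concrete, here is a rough sketch of how the flat keys from the diff above would map onto it. The interface and function names (`FlatDetectionsUsage`, `NestedDetectionsUsage`, `nestDetectionsUsage`) are hypothetical and only illustrate the shape; they are not taken from the PR.

```typescript
// Flat shape, using the keys from the schema in the diff above.
interface FlatDetectionsUsage {
  detection_rules_custom_enabled: number;
  detection_rules_custom_disabled: number;
  detection_rules_elastic_enabled: number;
  detection_rules_elastic_disabled: number;
  ml_jobs_custom_enabled: number;
  ml_jobs_custom_disabled: number;
  ml_jobs_elastic_enabled: number;
  ml_jobs_elastic_disabled: number;
}

interface Counts {
  enabled: number;
  disabled: number;
}

// Nested shape mirroring the suggested tree structure.
interface NestedDetectionsUsage {
  detection_rules: { custom: Counts; elastic: Counts };
  ml_jobs: { custom: Counts; elastic: Counts };
}

// Regroups the eight flat counters by feature, then origin, then status.
function nestDetectionsUsage(flat: FlatDetectionsUsage): NestedDetectionsUsage {
  return {
    detection_rules: {
      custom: {
        enabled: flat.detection_rules_custom_enabled,
        disabled: flat.detection_rules_custom_disabled,
      },
      elastic: {
        enabled: flat.detection_rules_elastic_enabled,
        disabled: flat.detection_rules_elastic_disabled,
      },
    },
    ml_jobs: {
      custom: {
        enabled: flat.ml_jobs_custom_enabled,
        disabled: flat.ml_jobs_custom_disabled,
      },
      elastic: {
        enabled: flat.ml_jobs_elastic_enabled,
        disabled: flat.ml_jobs_elastic_disabled,
      },
    },
  };
}
```

With the nested shape, totals such as "all custom rules" can be read from a single branch (`detection_rules.custom`) instead of being assembled from key-name prefixes.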

rylnd · 2020-07-13T15:52:42Z

@elasticmachine merge upstream

kibanamachine · 2020-07-13T18:08:44Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 25ac77c

Build metrics

✅ unchanged

History

💚 Build #60937 succeeded 9356910
💚 Build #60649 succeeded a4e885d
💚 Build #60318 succeeded 7c622a0
💚 Build #60288 succeeded a416fa6
💔 Build #60044 failed 221db19

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

* style: sort plugin interface

* WIP: UsageCollector for Security Adoption

  This uses ML and raw ES calls to query our ML Jobs and Rules, and parse them into a format to be consumed by telemetry. Still to come: initialization, tests

* Initialize usage collectors during plugin setup

* Rename usage key

  The service seems to convert colons to underscores, so let's just use an underscore.

* Collector is ready if we have a kibana index

* Refactor collector to generate options in a function

  This allows us to test our adherence to the collector API, focusing particularly on the fetch function.

* Refactor usage collector in anticipation of endpoint data

  We're going to have our usage data under one key corresponding to the app, so this nests the existing data under a 'detections' key while allowing another fetching function to be plugged into the main collector under a separate key.

* Update our collector to satisfy telemetry tooling

  inlines collector options; inlines schema object; makes DetectionsUsage an interface instead of a type alias

* Extracts telemetry mappings via scripts/telemetry_extract

* Refactor detections usage logic to perform one loop instead of two

  We were previously performing two loops over each set of data: one to format it down to just the data we need, and another to convert that into usage data. We now perform both steps within a single loop.

* Refactor detections telemetry to be nested

* Extract new nested detections telemetry mappings

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

elasticmachine · 2021-09-23T14:31:57Z

Pinging @elastic/security-solution (Team: SecuritySolution)

rylnd added 6 commits July 7, 2020 21:40

style: sort plugin interface

9c516bf

WIP: UsageCollector for Security Adoption

770204a

This uses ML and raw ES calls to query our ML Jobs and Rules, and parse them into a format to be consumed by telemetry. Still to come:

* initialization
* tests

Initialize usage collectors during plugin setup

1b952c8

Rename usage key

6eb35b1

The service seems to convert colons to underscores, so let's just use an underscore.

Collector is ready if we have a kibana index

35c58b9

Refactor collector to generate options in a function

e4c057f

This allows us to test our adherence to the collector API, focusing particularly on the fetch function.

rylnd added the Team:SIEM, v8.0.0, release_note:skip, and v7.9.0 labels Jul 8, 2020

rylnd self-assigned this Jul 8, 2020

rylnd force-pushed the security_adoption_telemetry branch from e73b13c to bbae701 Compare July 9, 2020 00:05

Merge branch 'master' into security_adoption_telemetry

221db19

michaelolo24 reviewed Jul 9, 2020


rylnd added 3 commits July 9, 2020 10:44

Update our collector to satisfy telemetry tooling

01c20a1

* inlines collector options
* inlines schema object
* makes DetectionsUsage an interface instead of a type alias

Extracts telemetry mappings via scripts/telemetry_extract

a416fa6

Merge branch 'master' into security_adoption_telemetry

7c622a0

Conflicts: x-pack/plugins/security_solution/server/plugin.ts

rylnd marked this pull request as ready for review July 9, 2020 18:38

rylnd requested review from a team as code owners July 9, 2020 18:38

michaelolo24 reviewed Jul 9, 2020


michaelolo24 approved these changes Jul 9, 2020


Refactor detections usage logic to perform one loop instead of two

a4e885d

We were previously performing two loops over each set of data: one to format it down to just the data we need, and another to convert that into usage data. We now perform both steps within a single loop.

jgowdyelastic reviewed Jul 13, 2020


afharo approved these changes Jul 13, 2020


afharo mentioned this pull request Jul 13, 2020

initial telemetry setup #69330

Merged

2 tasks

elasticmachine and others added 3 commits July 13, 2020 09:52

Merge branch 'master' into security_adoption_telemetry

9356910

Refactor detections telemetry to be nested

6034b18

Extract new nested detections telemetry mappings

25ac77c

rylnd merged commit 1afb0c4 into elastic:master Jul 13, 2020

rylnd deleted the security_adoption_telemetry branch July 13, 2020 18:18

rylnd mentioned this pull request Jul 13, 2020

[7.x] [Security Solution][Detections] Adoption telemetry (#71102) #71504

Merged

MindyRS added the Team: SecuritySolution label Sep 23, 2021

pjhampton mentioned this pull request Nov 18, 2021

Remove Detection Rule telemetry from Security Solution (8.0+) #119047

Closed

1 task
