-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add usage collection for savedObject tagging #83160
Add usage collection for savedObject tagging #83160
Conversation
x-pack/plugins/saved_objects_tagging/server/usage/tag_usage_collector.ts
Outdated
Show resolved
Hide resolved
x-pack/plugins/saved_objects_tagging/server/usage/fetch_tag_usage_data.ts
Show resolved
Hide resolved
Pinging @elastic/kibana-platform (Team:Platform) |
x-pack/plugins/saved_objects_tagging/server/usage/tag_usage_collector.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Mergable after removing the esClient
check or we can remove it once we fix the bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now that the esClient check is removed.
FYI: the missing |
taggedObjects: { type: 'integer' }, | ||
|
||
types: { | ||
dashboard: perTypeSchema, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we collect tags usage data for more than just these two saved object types. I think we should try to avoid having this kind of dynamic fields in our schema and it will also increase the field count on our telemetry cluster.
What about:
perSavedObjectType: {
usedTags: { type: 'integer' },
taggedObjects: { type: 'integer' },
savedObjectType: {type: 'string'},
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perSavedObjectType: { ...
Should we use a nested
or array
type in that case?
That would work. However I would be sure that is would be alright to the way telemetry consumes / uses the data.
When I look at the other collectors, none seems to be used nested fields, which is why I'm asking:
kibana/src/plugins/kibana_usage_collection/server/collectors/application_usage/schema.ts
Lines 52 to 59 in d67a421
export const applicationUsageSchema = { | |
// OSS | |
dashboards: commonSchema, | |
dev_tools: commonSchema, | |
discover: commonSchema, | |
home: commonSchema, | |
kibana: commonSchema, // It's a forward app so we'll likely never report it | |
management: commonSchema, |
note that this could only be a design flaw when the collector was implemented. @elastic/kibana-telemetry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey! This is a legacy issue that no longer applies: in the past, before having the schema implementation, the indexer in the remote service indexed the data in the same format it received it from Kibana. This meant that we had to keep the mappings in mind when developing the collectors (and nested
was never an option because Kibana does not natively support viz with that type yet).
New collectors/fields shouldn't have those contraints anymore because the new indexer actually reshapes the payloads and splits the info into multiple indices based on the different needs to consume the information by our PMs and Analysts.
If we foresee the number of keys growing dynamically, I think the array
approach would be best.
cc @mindbat so he can share his thoughts from the indexer-maintainer POV 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New collectors/fields shouldn't have those contraints anymore because the new indexer actually reshapes the payloads and splits the info into multiple indices based on the different needs to consume the information by our PMs and Analysts.
@afharo: We're not sending Kibana data to Telemetry Next yet and the legacy indexer doesn't reshape data by default. We would still need to update the indexer to consume the new data, reshape it and index it into a custom extended index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So should we stick with the static, per type, mappings for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we just index data as-is, can you elaborate why we couldn't use an array field type for the mappings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and nested was never an option because Kibana does not natively support viz with that type yet
I missed @afharo's comment and I assume this applies to array fields too #58175
It makes sense that we want to do some processing inside Telemetry Next, but I think it would be much more flexible if Kibana had more control around which documents are created. If telemetry is uploaded as NDJSON then usage collectors could return several documents instead of always having a single large payload that needs to be handled and split by the server. This is probably a big effort, but maintaining a static schema for dynamic data like applications and saved object types that change every release feels like a maintenance nightmare.
So I guess for this PR we just need to add all the known saved object types to the mappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I guess for this PR we just need to add all the known saved object types to the mappings.
We only need to add types that currently supports the tagging feature. That would just be dashboard, visualizations and maps. Will add maps. Further additions should only be done when the app/type implements the tagging feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 7b6d76e: added map
to the schema, and updated the readme to state that developers should update the schema when adding tagging support to additional SO types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we just index data as-is, can you elaborate why we couldn't use an array field type for the mappings?
Without any transformations on the remote service indexer, we'd end up with data mapped as an array in elasticsearch and face the usual problem of not being able to meaningfully query or visualize that data. It's a situation we're trying to fix with ui_metric
right now, for example.
@@ -1841,6 +1817,30 @@ | |||
} | |||
} | |||
}, | |||
"infraops": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elastic/kibana-telemetry Out of curiosity, any idea why the order of fields are changing here? Would be nice if we could make it stable to avoid unnecessary diffs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Bamieh do you have any idea how we can prevent reordering fields in the json files?
💚 Build SucceededMetrics [docs]Distributable file count
History
To update your PR or re-run it, just comment with: |
* add so tagging usage collection * update telemetry mappings * fix types * remove check on esClient presence * update schema and README
* master: (38 commits) [ML] Data frame analytics: Adds functionality to map view (elastic#83710) Add usage collection for savedObject tagging (elastic#83160) [SECURITY_SOLUTION] 145: Advanced Policy Tests (elastic#82898) [APM] Service overview transactions table (elastic#83429) [ML] Fix Single Metric Viewer not loading if job is metric with no partition (elastic#83880) do not export types from 3rd party modules as 'type' (elastic#83803) [Fleet] Allow to send SETTINGS action (elastic#83707) Fixes Failing test: Chrome X-Pack UI Functional Tests.x-pack/test/functional_with_es_ssl/apps/triggers_actions_ui/details·ts - Actions and Triggers app Alert Details Alert Instances renders the active alert instances (elastic#83478) [Uptime]Reduce chart height on monitor detail page (elastic#83777) [APM] Prefer `APIReturnType` over `PromiseReturnType` (elastic#83843) [Observability] Fix telemetry for Observability Overview (elastic#83847) [Alerting] Adds generic UI for the definition of conditions for Action Groups (elastic#83278) ensure workload agg doesnt run until next interval when it fails (elastic#83632) [ILM] Policy form should not throw away data (elastic#83077) [Monitoring] Stop collecting Kibana Usage in bulkUploader (elastic#83546) [TSVB] fix wrong imports (elastic#83798) [APM] Correlations UI POC (elastic#82256) list all the refs in tsconfig.json (elastic#83678) Bump jest (and related packages) to v26.6.3 (elastic#83724) Functional tests - stabilize reporting tests for cloud execution (elastic#83787) ...
Summary
Fix #81847
Add a new
saved_objects_tagging
usage collector to monitor tagging usages.Data structure:
Sample data:
Checklist