[Telemetry] Use server's `lastReported` on the browser #121656

afharo · 2021-12-20T16:26:10Z

Summary

Resolves #87846.

This PR normalizes the logic to decide whether to send telemetry or not:

Moves `isReportIntervalExpired` to `common` so both the server and public ends use the same logic to decide whether the `lastReported` date is expired.
On the UI, it requests the value stored on the server side so that, if the server or any other browser has already reported telemetry within the expected interval (24h), this browser skips the report (cc @thesmallestduck @elastic/infra-telemetry).
On the server, the stored value now applies as well, so the behaviour matches the UI: if another Kibana instance or any browser has reported telemetry in the last 24h, it won't report again.
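
As a rough illustration of the shared check described above, here is a minimal sketch of what a common `isReportIntervalExpired` helper could look like. It is not the merged implementation: the explicit `now` parameter is an assumption added for testability, and the interval is hard-coded to the 24h mentioned in this description.

```typescript
// Assumption: the interval is the 24h stated in the PR description.
const REPORT_INTERVAL_MS = 24 * 60 * 60 * 1000;

/**
 * Returns true when there is no usable timestamp of a previous report,
 * or when that report is older than the interval.
 * `now` is a hypothetical parameter so callers (and tests) can inject a clock.
 */
function isReportIntervalExpired(
  lastReportAt: number | undefined,
  now: number = Date.now()
): boolean {
  return (
    lastReportAt === undefined ||
    isNaN(lastReportAt) ||
    now - lastReportAt > REPORT_INTERVAL_MS
  );
}
```

Because the same function can be imported by both the server and the browser code, the two sides cannot drift apart on what "expired" means.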

NOTE: All these changes aim to reduce the number of requests and, hence, the load on Kibana and Elasticsearch caused by generating the telemetry report multiple times per day.

Checklist

Delete any items that are not applicable to this PR.

Unit or functional tests were updated or added to match the most common scenarios

Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to identify risks that should be tested prior to the change/feature release.

When forming the risk matrix, consider some of the following examples and how they may potentially impact the change:

Risk	Probability	Severity	Mitigation/Notes
It may fail to store the updated `lastReported` value after a successful send if the process goes down, the tab is closed, or there's a network issue right between the successful report and the SO update.	Low	Low	A new report may be generated if the "in-memory" value says so. However, this is not an issue per se: it only momentarily adds some extra load to the servers.
Having one sender per day reduces some visibility into our usage: it'll be harder to know how many different instances of Kibana are running for the same cluster, and we'll get fewer reports, meaning less variability in the user-agents.	High	Low	We should actually collect that information explicitly instead.

For maintainers

This was checked for breaking API changes and was labeled appropriately

afharo · 2021-12-20T17:00:35Z

src/plugins/telemetry/server/fetcher.ts

-    { savedObjects, elasticsearch }: CoreStart,
-    { telemetryCollectionManager }: FetcherTaskDepsStart
-  ) {
+  public start({ savedObjects }: CoreStart, { telemetryCollectionManager }: FetcherTaskDepsStart) {


Only removed the unused `elasticsearch` parameter.

afharo · 2021-12-20T17:01:38Z

src/plugins/telemetry/server/fetcher.ts

-      if (!this.lastReported || Date.now() - this.lastReported > REPORT_INTERVAL_MS) {
+      // Check both: in-memory and SO-driven value.
+      // This will avoid the server retrying over and over when it has issues with storing the state in the SO.
+      if (isReportIntervalExpired(this.lastReported) && isReportIntervalExpired(lastReported)) {


This implies the 2nd risk I listed in the description of the PR. I hope we're OK with it.
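
To make the double check in the diff above easier to follow, here is a minimal, self-contained sketch. `FetcherSketch` and `LastReportedStore` are hypothetical names standing in for the real fetcher and the saved object (SO); only `isReportIntervalExpired`, `REPORT_INTERVAL_MS` and `lastReported` come from this PR, and the helper body is an assumption.

```typescript
const REPORT_INTERVAL_MS = 24 * 60 * 60 * 1000; // 24h, as stated in the PR description

// Assumed shape of the shared helper; the real one lives in `common`.
function isReportIntervalExpired(lastReportAt: number | undefined, now: number): boolean {
  return (
    lastReportAt === undefined ||
    isNaN(lastReportAt) ||
    now - lastReportAt > REPORT_INTERVAL_MS
  );
}

// Hypothetical stand-in for the saved object that persists `lastReported`.
interface LastReportedStore {
  read(): number | undefined;
  write(timestamp: number): void; // may throw, e.g. on a network issue
}

class FetcherSketch {
  private lastReported: number | undefined = undefined;
  private readonly store: LastReportedStore;

  constructor(store: LastReportedStore) {
    this.store = store;
  }

  // Send only when BOTH the in-memory and the stored values are expired.
  shouldSend(now: number): boolean {
    const stored = this.store.read();
    return (
      isReportIntervalExpired(this.lastReported, now) &&
      isReportIntervalExpired(stored, now)
    );
  }

  markSent(now: number): void {
    this.lastReported = now;
    try {
      this.store.write(now);
    } catch {
      // The SO update failed; the in-memory value still stops this
      // instance from retrying over and over.
    }
  }
}
```

In this sketch, an instance whose SO write fails stays quiet for the rest of the interval thanks to the in-memory value, while any other instance that reads a recent stored value skips its own report.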

elasticmachine · 2021-12-21T10:33:41Z

Pinging @elastic/kibana-core (Team:Core)

afharo · 2021-12-23T11:07:11Z

@elasticmachine merge upstream

Bamieh

With this PR, we're basically removing any redundancy in reporting telemetry usage. Since we have no way to confirm that we actually got the data other than the status: 200 that tells us the endpoint is reachable, we'd risk losing usage data due to any connectivity issues.

Do you think adding another counter, redundancyFactor, that allows reporting usage 3 times a day, for example, before shouldSendUsage starts returning false would help? This way we'd keep some redundancy while still reducing the number of telemetry calls.

We'd also need product approval (@thesmallestduck ) that we are OK with losing any kind of redundancy we have.

Note that we do cache usage now every 4 hours so the cost of redundancy is greatly reduced.
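
For readers weighing the `redundancyFactor` suggestion above, here is a minimal sketch of how such a counter could work. This is an illustration of the proposal only, not code from this PR: `ReportWindow` and `recordReport` are hypothetical names, and the default of 3 reports per 24h window is taken from the example in the comment.

```typescript
const REPORT_INTERVAL_MS = 24 * 60 * 60 * 1000; // 24h window

// Hypothetical state: when the current window started and how many
// reports were sent within it.
interface ReportWindow {
  windowStart?: number;
  reportCount: number;
}

function isWindowExpired(state: ReportWindow, now: number): boolean {
  return state.windowStart === undefined || now - state.windowStart > REPORT_INTERVAL_MS;
}

// Allow up to `redundancyFactor` reports per window before returning false.
function shouldSendUsage(state: ReportWindow, now: number, redundancyFactor = 3): boolean {
  return isWindowExpired(state, now) || state.reportCount < redundancyFactor;
}

// Returns the state to persist after a successful report.
function recordReport(state: ReportWindow, now: number): ReportWindow {
  if (isWindowExpired(state, now)) {
    return { windowStart: now, reportCount: 1 };
  }
  return { ...state, reportCount: state.reportCount + 1 };
}
```

With `redundancyFactor = 1` this collapses to the one-report-per-interval behaviour the PR implements, which makes the trade-off between redundancy and load a single tunable number.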

afharo · 2021-12-23T15:26:55Z

> since we have no way to confirm that we actually got the data other than the status: 200

Shouldn't this highlight an issue on the receiving side instead? IMO, if the request made it to the receiving end, that's a success from the Kibana POV. IMO, we shouldn't overload Kibana at the cost of performance to overcome issues on the receiving end.

If the users find that Telemetry is heavily decreasing the performance of their deployments, they'll simply disable it. I'd rather lose 1 day worth of telemetry (if the receiving end falsely replies 200) vs. missing the entire dataset.

> We'd also need product approval (@thesmallestduck ) that we are OK with losing any kind of redundancy we have.

I completely agree! Adding him as a reviewer to make sure we wait for his approval.

> Note that we do cache usage now every 4 hours so the cost of redundancy is greatly reduced.

Indeed it does! However, IMO, there are some cases that are not covered by the caching solution: big deployments with multiple instances of Kibana behind a load balancer, serving a high number of users. Each additional Kibana instance in the deployment will come with the cost of one "empty cache". This PR allows scalability at zero cost.

The cache mechanism is still useful for any retries we may need due to the receiver being down.

afharo · 2022-01-05T11:04:57Z

@elasticmachine merge upstream

kibana-ci · 2022-01-05T12:25:04Z

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`telemetry`	28	29	+1

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`telemetry`	24.3KB	24.7KB	+457.0B

History

💛 Build #14978 was flaky 974a425
💚 Build #14438 succeeded ac45fa1
💔 Build #14312 failed fb9e7aa

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

thesmallestduck · 2022-01-05T17:54:27Z

Reducing to 1 browser-transmitted payload per day is okay by me.

kibanamachine · 2022-01-06T13:38:08Z

💔 Backport failed

The backport operation could not be completed due to the following error:
You must specify a valid Github repository

You can specify it via either:

Config file (recommended): ".backportrc.json". Read more: https://github.com/sqren/backport/blob/e119d71d6dc03cd061f6ad9b9a8b1cd995f98961/docs/configuration.md#project-config-backportrcjson
CLI: "--upstream elastic/kibana"

The backport PRs will be merged automatically after passing CI.

To backport manually run:
node scripts/backport --pr 121656

afharo · 2022-01-10T10:54:57Z

💚 All backports created successfully

Status	Branch	Result
✅	8.0

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

afharo commented Dec 20, 2021

[Telemetry] Use server's lastReported on the browser

fb9e7aa

afharo force-pushed the telemetry/browser-uses-servers-lastReported branch from a3292cc to fb9e7aa Compare December 20, 2021 17:20

afharo added 2 commits December 21, 2021 09:14

Fix broken test

46510fc

Merge branch 'main' of github.com:elastic/kibana into telemetry/brows…

ac45fa1

…er-uses-servers-lastReported

afharo marked this pull request as ready for review December 21, 2021 10:33

afharo requested a review from a team as a code owner December 21, 2021 10:33

Merge branch 'main' into telemetry/browser-uses-servers-lastReported

974a425

Bamieh reviewed Dec 23, 2021

afharo requested a review from thesmallestduck December 23, 2021 15:07

Merge branch 'main' into telemetry/browser-uses-servers-lastReported

4ca341a

afharo mentioned this pull request Jan 6, 2022

TelemetryAPIJourney - Retrieving the telemetry payload (cached vs. fresh) elastic/kibana-load-testing#211

Merged

Bamieh approved these changes Jan 6, 2022

afharo merged commit 2d17554 into elastic:main Jan 6, 2022

afharo deleted the telemetry/browser-uses-servers-lastReported branch January 6, 2022 13:37

afharo mentioned this pull request Jan 10, 2022

[8.0] [Telemetry] Use server's lastReported on the browser (#121656) #122525

Merged

afharo added a commit to afharo/kibana that referenced this pull request Jan 10, 2022

[Telemetry] Use server's lastReported on the browser (elastic#121656)

6a0c03d

(cherry picked from commit 2d17554)

afharo added a commit that referenced this pull request Jan 10, 2022

[Telemetry] Use server's lastReported on the browser (#121656) (#12…

927cc4d

…2525) (cherry picked from commit 2d17554)

gbamparop pushed a commit to gbamparop/kibana that referenced this pull request Jan 12, 2022

[Telemetry] Use server's lastReported on the browser (elastic#121656)

b63e9d6

afharo mentioned this pull request Mar 24, 2022

[Security Solution] Adds event log telemetry specific for security solution rules #128216

Merged

1 task

afharo mentioned this pull request Oct 27, 2022

[Telemetry] Snapshot collection may skip days #142058

Closed
