-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Telemetry] Custom logger appender? #89839
Comments
Pinging @elastic/kibana-core (Team:Core) |
+1 on this. We can add an explicit flag to the configs and default it to |
+1 I think we should aim to explicitly ignore expected errors though. E.g. reading a saved object can timeout when there's network instability, if we know this call will be repeated at some point in the future, we should just always ignore this error and not print any logs, not even a developer needs to see that it happened, it's part of the expected behaviour of the plugin. Then for the scenarios we're not sure we're handling correctly or that we don't know if we can recover from we can log in development. |
-1 wait, why would we have network instability when running Elasticsearch and Kibana on the same host? |
@LeeDr Elasticsearch can be non-responsive for multiple reasons even on the same host. Out of memory/space, node is restarting, kibana index is locked or misconfigured, etc. Our telemetry collectors are usually the first to scream when kibana is unable to reach ES for any reason. This usually means that customers open tickets that telemetry is causing kibana to fail although telemetry was only the first plugin to log an error in the server logs. If we silence the telemetry/collection logs on production or put them behind a config we'd be avoiding these situations which helps users identify the real cause of the issue. I'm assuming users would try to disable telemetry as a first step to debug such cases although the root cause is not related to telemetry, hence we'd be missing out from receiving usage from these clusters. |
For implementation details: the POC #95960 had it implemented. I think that the effort would be to:
|
@pjhampton added an important point: We should set Cloud to enable these logs as well by default so we can catch any potential bugs in those controlled environments. |
Follow up to #89588.
We are constantly switching
warn
/error
logs in the usage-collection/telemetry plugins todebug
to avoid making noise to our end-users. It's an easy way to silence those errors. However, it feels to me like debug errors can make it harder to identify any possible issues we may introduce (they are not noisy enough).How about we configure the loggers for the UsageCollection & Telemetry plugins to be silent when in production mode, but properly warn when in dev mode?
The text was updated successfully, but these errors were encountered: