Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport of [HCP Telemetry] Periodic Refresh for Dynamic Telemetry Configuration into release/1.15.x #18360

Conversation

Achooo
Copy link
Contributor

@Achooo Achooo commented Aug 2, 2023

The below text is copied from the body of the original PR.


Description

New feature for HCP Telemetry that allows us to dynamically re-configure metrics export to HCP.

Background
Currently, we need to restart the Consul server to bring in any updates to filters, a new endpoint or new labels from the Consul Telemetry Gateway in HCP.

Ideally, we are able to modify these dynamically, which requires logic to continuously fetch new configuration and update it as necessary.

Implementation
This PR enables this functionality, with changes to :

  1. New object: TelemetryConfigProvider:
  • Periodically fetches telemetry configuration from HCP, using a time.Ticker.
  • Holds configuration and exposes it via methods : GetEndpoint(), GetFilters() and GetLabels()
  • Updates configuration, ONLY if it has changed. Comparisons are done using a hash function on the structs holding the config.
  • Handles concurrency access to read the config and modify the config.
  1. OTELSink: uses a new ConfigProvider interface to access filters and labels.
  • Since labels are now dynamic, we can't add them to the OTEL resource (immutable) so a new function is added to compute OTEL labels dynamically.
  • Handle dynamic filters
  1. OTELExporter: uses a new EndpointProvider interface to access the endpoint.

** Cleanup / Refactor ***

  1. The client package was cleaned to separate out the TelemetryConfig object to parse and convert objects.
  2. A logger was added to the client to log bad filter information

This refactor was necessary since now the TelemetryConfigProvider also fetches telemetry config. This avoid duplication of logic for the parsing of the endpoint, labels ,etc.

Testing & Reproduction steps

  • End to end tests:
  • [x ] Unit tests:

Links

PR Checklist

  • updated test coverage
  • external facing docs updated
  • appropriate backport labels added
  • not a security concern

Overview of commits

…18168)

* OTElExporter now uses an EndpointProvider to discover the endpoint

* OTELSink uses a ConfigProvider to obtain filters and labels configuration

* improve tests for otel_sink

* Regex logic is moved into client for a method on the TelemetryConfig object

* Create a telemetry_config_provider and update deps to use it

* Fix conversion

* fix import newline

* Add logger to hcp client and move telemetry_config out of the client.go file

* Add a telemetry_config.go to refactor client.go

* Update deps

* update hcp deps test

* Modify telemetry_config_providers

* Check for nil filters

* PR review updates

* Fix comments and move around pieces

* Fix comments

* Remove context from client struct

* Moved ctx out of sink struct and fixed filters, added a test

* Remove named imports, use errors.New if not fformatting

* Remove HCP dependencies in telemetry package

* Add success metric and move lock only to grab the t.cfgHahs

* Update hash

* fix nits

* Create an equals method and add tests

* Improve telemetry_config_provider.go tests

* Add race test

* Add missing godoc

* Remove mock for MetricsClient

* Avoid goroutine test panics

* trying to kick CI lint issues by upgrading mod

* imprve test code and add hasher for testing

* Use structure logging for filters, fix error constants, and default to allow all regex

* removed hashin and modify logic to simplify

* Improve race test and fix PR feedback by removing hash equals and avoid testing the timer.Ticker logic, and instead unit test

* Ran make go-mod-tidy

* Use errtypes in the test

* Add changelog

* add safety check for exporter endpoint

* remove require.Contains by using error types, fix structure logging, and fix success metric typo in exporter

* Fixed race test to have changing config values

* Send success metric before modifying config

* Avoid the defer and move the success metric under
@Achooo
Copy link
Contributor Author

Achooo commented Aug 2, 2023

Manual backport due to failure in #18349:

  • Fix module conflicts
  • Removes test-integ/* files (which aren't on this branch) in case the file number change count is concerning that's why.

Copy link
Contributor

@chapmanc chapmanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Achooo Achooo merged commit 1fcd774 into release/1.15.x Aug 2, 2023
110 of 111 checks passed
@Achooo Achooo deleted the achooo/1.15.x/backport/cc-4960/hcp-telemetry-periodic-refresh branch August 2, 2023 17:38
@jira jira bot mentioned this pull request Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/dependencies PR specifically updates dependencies of project pr/no-backport
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants