Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't expose tracing endpoints over cos_agent when backend isn't available #228

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

mmkay
Copy link
Contributor

@mmkay mmkay commented Jan 7, 2025

Issue

When grafana-agent isn't related to a tracing backend, but a charm relates to grafana-agent over cos_agent requesting tracing receivers, charm event hooks take a significant amount of time as the charm tries to send traces to an endpoint that isn't open in grafana-agent.

Solution

Return null when tracing isn't available with a reasonable log message. Fixes #219.

Context

Backward compatibility was tested using https://github.com/canonical/postgresql-operator, observed behaviour:

  • if a charm with an older version of cos_agent connects to new grafana-agent and requests tracing endpoints when grafana-agent isn't connected to a tracing backend, its logs will show that the databag wasn't validated. As tracing receivers are the only part grafana-agent returns to the charm in this databag, it doesn't cause any issues, but the charm doesn't try to send charm traces, as desired.
  • if a charm with a current (rev13) version of cos_agent connects to new grafana-agent, the following log line will appear:
unit-postgresql-14: 12:08:10 WARNING unit.postgresql/14.juju-log database-peers:81: Endpoint for tracing wasn't provided as tracing backend isn't ready yet. If grafana-agent isn't connected to a tracing backend, integrate it. Otherwise this issue should resolve itself in a few events.

There's also an open question of whether grafana-agent (and grafana-agent-k8s) should be put into blocked state if a charm requests tracing, but grafana-agent isn't connected to the tracing backend. So far I think this should be a separate issue (and will create one once we agree on it), but I might be convinced otherwise.

Testing Instructions

Deploy the following bundle with a local build of grafana-agent from this branch.

default-base: ubuntu@22.04/stable
applications:
  grafana-agent:
    charm: grafana-agent
    channel: local
    revision: 0
  postgresql:
    charm: postgresql
    channel: 14/edge
    revision: 530
    num_units: 1
    to:
    - "2"
    constraints: arch=amd64
    storage:
      pgdata: rootfs,1,1024M
machines:
  "2":
    constraints: arch=amd64
relations:
- - postgresql:cos-agent
  - grafana-agent:cos-agent

Observe that connection exceptions in postgresql no longer appear in juju debug-log.

Upgrade Notes

@mmkay mmkay marked this pull request as ready for review January 8, 2025 11:48
@mmkay mmkay requested a review from a team as a code owner January 8, 2025 11:48
@@ -982,18 +1004,29 @@ def _on_relation_data_changed(self, event: RelationChangedEvent):

def update_tracing_receivers(self):
"""Updates the list of exposed tracing receivers in all relations."""
requested_tracing_protocols = self.requested_tracing_protocols()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of calling the requested_tracing_protocols method here versus in the for loop (original functionality)?

Copy link
Contributor

@MichaelThamm MichaelThamm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, but worth having a tracing member review the details as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inaccessible tracing URL is exposed over cos_agent if no tracing backend is connected
2 participants