
Add RDS tags on metrics #62

Conversation

@wanix (Contributor) commented Sep 9, 2022

Hi all, first thanks for this great tooling :)

Instead of using static credentials only, I propose using the default credential chain, which includes static credentials but also assumed roles and service accounts for Kubernetes usage.
I also propose adding the tags associated with an RDS instance, to be able to filter the metrics by tag.
The approach used is not really elegant (all tags in one label); I would have preferred to add one label per key=value tag pair, but I didn't find how to do it.

The code was tested locally; a previous version was tested successfully on one of our K8s clusters using an IAM role and a service account (code from this PR).
With the latest changes on your master branch, I have more difficulty building my own image (pkg or your quay.io image). I will try with the build from this PR.

@app-sre-bot

Can one of the admins verify this patch?

@wanix wanix changed the title Wanix/add tags to rds metrics plus default credentials Use default AWS SDK credentials management + add RDS metrics Sep 9, 2022
@janboll (Contributor) commented Sep 12, 2022

ok to test

@janboll (Contributor) commented Sep 12, 2022

Hi, and thank you for providing this Pull Request!

I would appreciate it if you could split the two features into individual PRs; would that be possible?

Regarding the tags, what is the use case behind this? Do you perhaps only need to export some specific tags? I find merging them all into a CSV list counterintuitive. What do you think of mapping tags from the instance to the metric, i.e. having something like this in the configuration:

rds: 
   tag_export:
      - foo: foo
      - bar: bar

With that, you could specify which tags to fetch and how to set them on the metrics.
I'd also consider this a feature that would be useful for other exporters as well. However, I'd understand if you went ahead with RDS first.

@wanix (Contributor, Author) commented Sep 12, 2022

Hi,
OK to split, I will do it.
About the tags and labels: the goal is to extract the tags and expose them as labels, to be able to filter the metrics (alarm routing, dashboards).
For the configuration, it could be something like this:

rds:
  tags_as_labels:
    - team
    - service
    - env

tags_as_labels lists the keys we will look for in the tags, exporting the tag values as label values. Each metric then gets new labels (with values taken from the RDS tags), like:
team=foo service=application-category env=dev
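As a rough sketch of what the tags_as_labels option proposed above could do (function and variable names here are illustrative, not the project's actual API), the filtering step might look like this in Go:

```go
package main

import "fmt"

// filterTags keeps only the tag keys listed under tags_as_labels in the
// configuration sketch above, so the exporter emits a bounded set of
// extra labels instead of every tag on the instance.
func filterTags(instanceTags map[string]string, tagsAsLabels []string) map[string]string {
	labels := map[string]string{}
	for _, key := range tagsAsLabels {
		if value, ok := instanceTags[key]; ok {
			labels[key] = value
		}
	}
	return labels
}

func main() {
	instanceTags := map[string]string{
		"team":     "foo",
		"service":  "application-category",
		"env":      "dev",
		"internal": "not-exported",
	}
	// Only team, service and env survive; "internal" is dropped.
	fmt.Println(filterTags(instanceTags, []string{"team", "service", "env"}))
}
```

Keys listed in the configuration but absent from the instance's tags are simply skipped, so no empty labels are emitted.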

@wanix (Contributor, Author) commented Sep 12, 2022

Split done with #63

@patjlm (Contributor) commented Sep 13, 2022

I merged #63, so this will need a rebase. Could you please update the title as well?

Concerning the content, I am not so keen on having a list of tag values in a single label. The more common approach is to have another metric (e.g. db_instance_tags) with a label for each tag. See for example kube_pod_annotations or kube_pod_labels in kube-state-metrics.

Now, that indeed means generating metrics with a dynamic list of labels. Maybe we can dive into how kube-state-metrics does it?

@wanix wanix force-pushed the wanix/add-tags-to-rds-metrics-plus-default-credentials branch from ef76bf1 to 2b63a2c on September 13, 2022 06:42
@wanix wanix changed the title Use default AWS SDK credentials management + add RDS metrics Add RDS tags on metrics Sep 13, 2022
@wanix (Contributor, Author) commented Sep 13, 2022

The goal is to have dynamic labels per metric (adding each tag as a new label).
For example, with the current code, we have:

aws_resources_exporter_rds_pendingmaintenanceactions{action="system-update",auto_apply_after="",aws_region="us-east-1",current_apply_date="",dbinstance_identifier="auth-sso-20220706093056048000000001",description="New Operating System update is available"} 1

and with code that handles tags, if there are any, we should have (assuming the RDS instance taken as example has three tags declared: owner, service, env):

aws_resources_exporter_rds_pendingmaintenanceactions{action="system-update",auto_apply_after="",aws_region="us-east-1",current_apply_date="",dbinstance_identifier="auth-sso-20220706093056048000000001",owner="teamA",service="auth",env="dev",description="New Operating System update is available"} 1

@wanix (Contributor, Author) commented Sep 13, 2022

I see two ways here. Either we don't want too many labels in Prometheus, and so we define in the configuration which tag keys we care about (when the tags exist), something like:

rds:
  tags_as_labels:
    - team
    - service
    - env

Or we don't have such a filter, and we expose all the tags we find as labels.

Another problem is: what do we do if there is a conflict? For example, if we find a tag with the key "action" in the given example.
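The conflict question needs a policy either way. A minimal Go sketch of one possible policy (the function name and the "exporter labels win" rule are assumptions for discussion, not project code; prefixing colliding tag keys would be another option):

```go
package main

import "fmt"

// mergeTagLabels adds tag-derived labels to a metric's existing label set,
// skipping any tag whose key collides with a label the exporter already
// sets (e.g. an RDS tag literally named "action").
func mergeTagLabels(metricLabels, tagLabels map[string]string) map[string]string {
	merged := map[string]string{}
	for k, v := range metricLabels {
		merged[k] = v
	}
	for k, v := range tagLabels {
		if _, exists := merged[k]; exists {
			continue // exporter-set labels win over instance tags
		}
		merged[k] = v
	}
	return merged
}

func main() {
	metricLabels := map[string]string{"action": "system-update", "aws_region": "us-east-1"}
	instanceTags := map[string]string{"action": "from-tag", "env": "dev"}
	// "action" keeps the exporter's value; "env" is added from the tags.
	fmt.Println(mergeTagLabels(metricLabels, instanceTags))
}
```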

@janboll (Contributor) commented Sep 19, 2022

Sorry for the late response. We groomed this proposal and think it needs some revision.

We'd like to go forward with a separate metric that just exports the tags, as seen in kube-state-metrics. Please take a look at the kube_pod_labels metric specified here: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/pod-metrics.md It is just a gauge with a static value of 1; there is one time series per pod/namespace that describes the labels set on that pod.

Translated to our project, we'd like to have one additional metric named e.g. aws_rds_tags, whose labels include information like:

  • AWS account ID (if possible)
  • region
  • RDS name
  • tags (automatically discovered list of tags, prefixed with aws_tag)

This implies we'd have to add an exporter that is dynamically registered per AWS RDS resource. This is a new concept, not yet seen in this project. Tags could additionally be filtered based on configuration, e.g. to avoid duplication or conflicts.
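One detail of the aws_tag-prefixed discovery above: AWS tag keys can contain characters that are not valid in Prometheus label names (which must match [a-zA-Z_][a-zA-Z0-9_]*), so they would need sanitizing. A hedged sketch of that step (function name and replacement policy are assumptions, not project code):

```go
package main

import (
	"fmt"
	"unicode"
)

// sanitizeTagKey converts an arbitrary AWS tag key into a valid
// Prometheus label name, prefixed with "aws_tag_" as proposed above.
// Any character outside letters, digits, and '_' is replaced with '_'.
func sanitizeTagKey(key string) string {
	runes := []rune(key)
	for i, r := range runes {
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) && r != '_' {
			runes[i] = '_'
		}
	}
	return "aws_tag_" + string(runes)
}

func main() {
	fmt.Println(sanitizeTagKey("Cost-Center")) // aws_tag_Cost_Center
	fmt.Println(sanitizeTagKey("env"))         // aws_tag_env
}
```

Two different tag keys can collapse to the same label name after sanitizing (e.g. "a-b" and "a.b"), which is one of the duplication cases configuration-based filtering could guard against.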

What do you think of this?

@wanix (Contributor, Author) commented Sep 19, 2022

Hi, because in our usage the RDS identifier is dynamic (we do IaC with Terraform) and can change at any moment for various reasons (temporary RDS, RDS restored in a DR region, RDS copy, etc.), we want to be able to identify the RDS instances (and their related metrics) using the tags we push with the RDS infrastructure code.
I'm looking at what we can do with some aggregation magic using a metric containing the tags: filter on that one, then get the real expected values from a useful metric like aws_resources_exporter_rds_pendingmaintenanceactions.

@janboll (Contributor) commented Sep 19, 2022

Right, that should be possible then by joining the target metric with the label metric. I created an example for the pod container metrics; see:

kube_pod_container_resource_limits * on(pod) group_left kube_pod_labels{label_app=~"aws-resource-exporter.*"}  > 1

@wanix (Contributor, Author) commented Sep 19, 2022

Yes, using the kube-state-metrics approach, I get the desired result with something like:

max(kube_pod_container_status_running) by (pod) * on (pod) group_right() kube_pod_labels{label_app_component="authentication"} >= 1

So with such a metric and the labels I described, an equivalent query would be:

min(aws_resources_exporter_rds_pendingmaintenanceactions) by (dbinstance_identifier) * on (dbinstance_identifier) group_right() aws_rds_tags{aws_tag_env="dev"} > 0

@wanix (Contributor, Author) commented Sep 19, 2022

What do you think of this?

Your proposed solution would be a good one for our need, I think.
How do we proceed? Do you add this to your roadmap, so I only have to watch the project? Should I try to propose a new PR (I'm not a Golang expert 😅) based on the description you provided?
In any case, should I simply close this PR?

@janboll (Contributor) commented Sep 20, 2022

Yes, I think we can close this PR. I added a story to our backlog to look at this issue. Given there are similar efforts on our side, I think we might pick it up soon (no promises though :)).

@janboll janboll closed this Sep 20, 2022
@wanix (Contributor, Author) commented Sep 22, 2022

thanks 😃

@wanix (Contributor, Author) commented Nov 21, 2022

Hi @janboll, any news on this subject?

@wanix (Contributor, Author) commented Feb 23, 2023

Hi @janboll, sorry to bother you; any news?

@janboll (Contributor) commented Feb 23, 2023

Hi @wanix, sorry for missing this the first time!
Implementing this isn't easy, because the Prometheus client we use does not support multiple metrics with the same name: https://github.com/prometheus/client_golang/blob/53e51c4f5338f760a766232610e574b00ea720d8/prometheus/registry.go#L311

We did not pursue this, because any possible solution (another Prometheus client, or a custom implementation) would be a large endeavor.
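For illustration only, here is a toy model of the uniqueness check the linked registry code performs. The real client_golang registry compares full metric descriptors (name, help string, label scheme), not just names, so this sketch only models the name bookkeeping that makes per-resource dynamic registration awkward:

```go
package main

import "fmt"

// registry is a toy stand-in for a Prometheus client registry that
// tracks which fully-qualified metric names are already registered.
type registry struct {
	names map[string]bool
}

// register records a metric name, or returns an error if a collector
// already claimed that name, mimicking the duplicate-registration
// rejection in the linked client_golang code.
func (r *registry) register(metricName string) error {
	if r.names[metricName] {
		return fmt.Errorf("duplicate metrics collector registration attempted: %s", metricName)
	}
	r.names[metricName] = true
	return nil
}

func main() {
	r := &registry{names: map[string]bool{}}
	fmt.Println(r.register("aws_rds_tags")) // first registration succeeds
	fmt.Println(r.register("aws_rds_tags")) // second registration is rejected
}
```

This is why registering one collector per RDS instance, each emitting an aws_rds_tags series, would collide unless all instances are served by a single collector.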
