-
Notifications
You must be signed in to change notification settings - Fork 459
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Target Allocator - ServiceMonitor scheme #1669
Comments
So I did some digging and I think this is not the |
Could you send the response from the target allocator's scrape_configs endpoint? (i.e. |
Hi Below is the relevant section I think. (I have removed the other discovered resources)
Target Allocator logs
Collector logs
If you need anything else let me know Thanks |
Service Monitor setup
|
|
You may want to hide that secret... |
It is not a valid one |
It seems that
In the scrape config is the issue... I think this may be a target allocator bug. At first I thought we may need to let the TA mount a secret, but it's doing target discovery fine, so I think the real issue is something related to how we marshal the config which is out of our control. Either way, this is going to take a bit of investigatory work. |
Ok, let me know if you need me to do anything. And thanks for your help |
ah okay, prometheus-operator has a bunch of functions we need to call to make this work something like this block. I'd say this work is possible, but not a tiny change... I don't have a ton of capacity to work on it currently unfortunately. I'll ask around in the next operator SIG meeting and in the slack and see if anyone else can take it. |
Ok, thanks for you help |
@matej-g offered to help out here, thank you! Please let me know if you need any more clarification here. |
Thanks for the pointers @jaronoff97, sorry for the delay 🙂 - PR is here #1710 |
Some additional context, I also chatted with @jaronoff97. It seems that currently, we won't be able to support specifying these credential fields that are coming from secrets, without further work, possibly on both target allocator and receiver. When we specify a secret in the scrape config, it will get redacted to On the other hand, this should not affect credentials that can be provide with a file path - this would only require that the collector has that particular secret mounted. |
Sorry for the silence on this one. It looks like we're not the only project dealing with this issue, potentially we might have a solution from prometheus/common#487 (we might have to warn our users that their secrets might be exposed via target allocator endpoint, alas that is the downside if they want to use credentials). In order to move forward with #1710 (it also includes addition of some unit tests which are useful outside of this issue as well), @jaronoff97 would you be fine going with your suggestion - i.e. we can state that for now we support only those types of credentials that can be provided in a file, document this and revisit this topic once prometheus/common#487 moves forward. This will unfortunately not unblock the present issue instantly, but allows us to at least provide partial solution. |
I think this was mistakenly marked as resolved, #1710 provided only partial solution, this should be re-open. |
So this popped up again on my radar and I would like to address this, together with #1844 and build on changes incoming in #2328. But I think we need some alignment on the next steps. It seems we will need to deal with two distinct use cases:
cc @jaronoff97 |
@matej-g i've been thinking about this as well... let me address each of the cases:
All of this brings me to an unfortunate conclusion... I think we are reaching the limit of what is possible with this servicemonitor CRD architecture. The decision to have the CRDs be pulled by the target allocator was one that was made prior to me joining the project and it didn't feel right to critique it at the time. Given the issue we are discussing is one of many that stems from this architecture choice and that we are discussing the next version of our own architecture, I think now is the time to figure out if there is a better way to architect this functionality. In my opinion, we should move the CRD functionality entirely to the operator. The operator would take on the onus to pull the prometheus CRDs, translate their YAML to a scrape config and simply write it as part of the collector's (and TA's) config map. This would allow the operator to provide much more convenience when it comes to secrets and configs because we would no longer need to deal with the secrets being marshalled. Furthermore, moving this functionality out of the target allocator and on to the operator would reduce the scope of the TA (and probably improve its performance) and improve the overall user experience for the prom CRDs. The only drawback of doing this is that non-operator users would no longer be able to take advantage of this. Given that doing so has always been done outside our recommendations, I think this is a worthwhile tradeoff. ex: sequenceDiagram
User->>+Operator: Applies collector w/ promCRD enabled
Operator->>+Operator: Reconciles collector CRD
Operator-->>-Operator: Pulls prom CRDs that match selector
Operator->>+Target-Allocator: Creates TA w/ CRD config
Operator->>+Collector: Creates Collector w/ CRD config
Collector->>Target-Allocator: Requests Targets
On any change from the operators prom crd watcher it simply rolls out a new version of the collector and target allocator which both will have a hash of the scrape config as an annotation. Slotting in to the above configuration for the issue at hand would look like this: sequenceDiagram
User->>+Operator: Applies collector w/ promCRD enabled
Operator->>+Operator: Reconciles collector CRD
Operator-->>-Operator: Pulls prom CRDs that match selector
Operator->>+User: Propagates warnings for missing secrets/config mounts
After this exists, we could visit the possibility of doing this automatically for a user as well, though I'm sure that's going to be a bit thornier. Let me know your thoughts! Thank you 🙇 cc @swiatekm-sumo who may have some better ideas about this. |
For reference, what prometheus-operator does here is simply writing the whole configuration, credentials included, into a Secret, which is then mounted in the Prometheus Pod and config-reloader takes care of making Prometheus load the new configuration. I mention this because if we want to do anything outside of this loop, we'll need to add it ourselves. A similar architecture, which I understand Jacob's proposal to be, requires that we deal with the configuration reload problem, which is quite thorny, and is the reason prometheus-operator uses a config-reloader program in the first place. Restarting everything every time this configuration changes is a brute force solution, and I don't think it would scale well to large clusters. It certainly feels wrong to me to force these restarts on users who don't need them (because they don't need autorization for their scrapes). The alternative is to use something like config-reloader, but we'd need it for both the collector and the target allocator, and from what I know it's relatively expensive to reload the collector. Making users mount the Secrets manually in the Collector seems like it could be a reasonable workaround for the time being, but we'd also need to do develop a convention for file naming, as ServiceMonitor only allows a Secret name to be specified for authorization, so Target Allocator would need to change these to file path references, and the user would actually need to mount the Secrets under these file paths. Maybe the mid-term solution is simply to secure the connection between the Collector and Target Allocator? It wouldn't be particularly difficult to set up client TLS authentication for this connection - the operator can easily generate the Secret and ensure it's mounted in all the Pods. This also has some negative consequences, like making troubleshooting TA harder, but along with custom unmarshaling, it solves all the problems, and is only marginally more complicated than the filename approach. |
Yes, the configuration reload problem is definitely thorny. I was thinking with the rate limiting you put in place for the TA, we would use that to limit how often we're rolling out new workloads. I think setting up TLS between the Collector and TA is a good thing to do regardless of this issue and should be possible given we have a dependency on cert-manager already. Let's discuss these ideas at next week's SIG meeting. |
I don't think any amount of rate limiting will fix this. We're adding a per-node allocation strategy in #2430, and when using that you'd have to recreate all the Pods of a DaemonSet, which can be hundreds in a large cluster. Putting the usability of this aside, users have come to expect performance similar to prometheus-operator, which takes under a minute from a ServiceMonitor change to the first new scrape. |
yeah that's very true. Ugh. I can check in with collector SIG people and see if their efforts for dynamic reloading have made progress. I see you already have an open issue 😮💨 |
My two cents would be that the underscores should be removed and replaced with lowerCamelCasing. From the end-user perspective, what purpose do these files serve? Are they references to files mounted by a secret? If so, how would I define the secret? If they're managed by the OTEL operator, why do I need to care about the certificate configuration at all? |
As mention above, since the secrets for the initial iteration will not be
managed by the operator, the users will be creating the secrets and
mounting them using the specs for the TA and the Collector.
…On Tue, May 14, 2024, 02:35 Danny Seymour ***@***.***> wrote:
My two cents would be that the underscores should be removed and replaced
with lowerCamelCasing.
From the end-user perspective, what purpose do these files serve? Are they
references to files mounted by a secret? If so, how would I define the
secret? If they're managed by the OTEL operator, why do I need to care
about the certificate configuration at all?
—
Reply to this email directly, view it on GitHub
<#1669 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AQFGLZJOLUYL52OKRSJAE4LZCFE4VAVCNFSM6AAAAAAXF7MV6GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMBYHE4TSNJRGY>
.
You are receiving this because you were assigned.Message ID:
***@***.***>
|
I was actually thinking that we should start with what Jacob mentioned - if the cert-manager CRDs are installed and the operator has permission to create them, then use cert-manager to provision the certificates. I'd rather avoid adding fields to the Collector CRDs until we're confident we know how this should work. And ideally, I'd rather avoid making this configurable at all. |
Since @jaronoff97 suggested the initial version of this change doesn't need to include managing certs and we will provide example configs in the meanwhile, i was going in this direction. |
I'd personally rather do the no-config version first, which means using cert-manager. There isn't really any benefit to the user orchestrating the certs themselves, other than us not needing to do it, and like I mentioned, I really don't want to add fields to the CRD unless absolutely necessary. |
If the readme specifically states that cert manager needs to be installed in the cluster, why do we need to check if the CRDs exist for this functionality? |
That, in my opinion, is a documentation failure we should fix. Cert-manager is used by the default kustomize manifests for provisioning webhook certificates, but it's very much possible to not use it. For example, the official Helm Chart can work without cert-manager by manually creating the necessary Secrets. I don't think we should be making the dependency on cert-manager deeper. |
Got it. Thanks! |
Done the initial implementation. Tested locally with curl, open_ssl, etc. Testing with the Collector i see that although open-telemetry/opentelemetry-collector-contrib#31449 was merged the Collector client doesn't provide the certificates with the requests to the TA. Opened a bug - open-telemetry/opentelemetry-collector-contrib#33370 |
@swiatekm-sumo whenever there is a basicAuth in the spec.endpoints for a serviceMonitor the /jobs is empty although /scrape_configs returns the expected result. No error in the logs of the TA. |
That's weird. Targets are emitted by Prometheus' own discovery manager, so maybe we're not using it correctly? If scrape_configs is correct, then the config generation at least works correctly. |
Are you sure the Target Allocator has RBAC permissions to read the required Secrets? I'm not sure if we'd get an error about that. |
In the /scrape_configs endpoint I see the secret (when working with https). Double checked the rbac rules. There is get on all secrets |
I just tested this on apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
name: ta
---
apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
name: collector
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: targetallocator-prometheuscr
rules:
- apiGroups:
- ""
resources:
- pods
- nodes
- services
- endpoints
- configmaps
- secrets
- namespaces
verbs:
- get
- watch
- list
- apiGroups:
- apps
resources:
- statefulsets
- services
- endpoints
verbs:
- get
- watch
- list
- apiGroups:
- discovery.k8s.io
resources:
- endpointslices
verbs:
- get
- watch
- list
- apiGroups:
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- watch
- list
- apiGroups:
- monitoring.coreos.com
resources:
- servicemonitors
- podmonitors
verbs:
- get
- watch
- list
- nonResourceURLs:
- /metrics
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: collector-prometheuscr
rules:
- apiGroups:
- ""
resources:
- pods
- nodes
- nodes/metrics
- services
- endpoints
- namespaces
verbs:
- get
- watch
- list
- apiGroups:
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- watch
- list
- nonResourceURLs:
- /metrics
- /metrics/cadvisor
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: ta
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: targetallocator-prometheuscr
subjects:
- kind: ServiceAccount
name: ta
namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: collector
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: collector-prometheuscr
subjects:
- kind: ServiceAccount
name: collector
namespace: default
---
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
name: prometheus-cr
spec:
config: |
receivers:
prometheus:
config:
scrape_configs: []
processors:
exporters:
prometheus:
endpoint: 0.0.0.0:9090
service:
pipelines:
metrics:
receivers: [prometheus]
processors: []
exporters: [prometheus]
mode: statefulset
serviceAccount: collector
targetAllocator:
enabled: true
prometheusCR:
enabled: true
scrapeInterval: 1s
serviceMonitorSelector: {}
serviceAccount: ta
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: prometheus-cr
spec:
endpoints:
- port: monitoring
basicAuth:
password:
name: basic-auth
key: password
username:
name: basic-auth
key: user
selector:
matchLabels:
app.kubernetes.io/managed-by: opentelemetry-operator
---
apiVersion: v1
kind: Secret
metadata:
name: basic-auth
data:
password: dG9vcg== # toor
user: YWRtaW4= # admin
type: Opaque If you're testing against a patched TA, maybe it's something in your patches that is causing the problem? |
In you manifests, you are not using TLS between the TA and the Collector. you should be seeing as before #2921 in the scrape_config. If there is no basicAuth - jobs are created. |
So you're missing jobs, but only over https? They exist over http? |
No. now even over http they are missing |
Must be a problem with the https server then, as the targets are clearly generated. You're the foremost expert on this, I think, I can't see anything obviously wrong with the https server setup. Maybe we can catch this via unit tests? |
Also, though you have basicAuth the actual /metrics endpoint doesn't require it. In not exactly 100% on the flow, but could it be that if basicAuth is there and should be used at the endpoint but fails somehow, no job will be created? |
I tried adding basicAuth to a serviceMonitor for which the TA does create a job (no auth is required at the /metrics endpoint) and it works as well. but as soon as i have a serviceMonitor for a service which requires basicAuth and the basicAuth is present in the ServiceMonitor, the job doesn't get created. |
Can we move this conversation to the CNCF Slack? This back and forth is clogging the issue, and it sounds like we need to really dive into the weeds for this one. |
@swiatekm-sumo Added #3015 draft. |
@ItielOlenick is this work completed now? I think we got in everything we need now right? |
I believe so, yes. |
Awesome, thank you so much for your incredible work here. For any on-lookers, please open new issues with relation to Target Allocator <> mTLS. |
Hi
I have the TA up and with the
PrometheusCR
enabled, this is to look at options of migrating to using the OTEL collector for more purposes.At present I am getting data as expected except for service monitors (to be fair in my test lab I only have a couple of these setup) that contain a "non default" scheme and have authentication data in them. eg
This config works with prometheus but using the otel collector (via the TA) I get
2023-04-20T20:04:56.248Z warn internal/transaction.go:121 Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_timestamp": 1682021096246, "target_labels": "{__name__=\"up\", container=\"opensearch\", endpoint=\"http\", instance=\"100.64.141.115:9200\", job=\"first-cluster-masters\", namespace=\"opensearch-first-cluster\", pod=\"first-cluster-masters-1\", service=\"first-cluster-masters\"}"}
Just looking to find out if this is a known "issue" or is, and it probably is, it that I am doing something wrong? Any help is appreciated
This is all being deployed using the opentelemetry-operator helm chart version 0.26.3
Target Allocator
Collector
The text was updated successfully, but these errors were encountered: