-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deploytime api fix #187
Deploytime api fix #187
Conversation
…eploytime-repicasets
…into file-cleanup
…eploytime-api-fix
Debug message added. |
@malacourse I think we need to rethink the logic here a bit. Looking at the looping it's taking 5+ minutes for the exporter to return data. For some reason, it seems to be returning things in odd groups as well, starting with 1 record, and then repetitively spitting out 2, 3, 4, all the way up to 81 lines.
|
You might need to build a cache so you aren't looking up the same values over and over again. We had to do this in the committime exporter. |
Note that it only takes 5+ minutes when one of the apiVersions is missing (I changed |
Another suggestion that I have (possibly for both deploytime and committime) is that maybe we should refactor the code so that we aren't gathering data per namespace. This results in multiplying the number of api calls we have to make by the number of namespaces in the cluster, and if we hit a lot of 404 type errors, that's an expensive api call to have to make multiple times. |
Hey team, I tried this fix and hit the following permissions error -
|
@willowmck did you redeploy with the latest exporter chart? It looks to grant those permissions. https://github.com/redhat-mal/pelorus/blob/deploytime-api-fix/charts/pelorus/charts/exporters/templates/rbac.yaml |
My goof, I was on the wrong branch. However, now I'm back to the original error.
Something weird is happening though, because line 78 in that file is a blank line. |
@willowmck in order to run the code from this branch on OpenShift, you need to point your values file to this fork/branch. That would look like:
|
Ok, that's probably the issue. I'll make that change. |
ok, success! Thanks for bearing with me on what is clearly a senior moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@malacourse Left a few comments related to debugging. Also, when I run this, I hit a KeyError looking up the owner reference.
Exception has occurred: KeyError
'deploytime-exporter-2'
File "/home/esauer/src/pelorus/exporters/deploytime/app.py", line 90, in generate_metrics
rc = replicas_dict[ownerRef.name]
File "/home/esauer/src/pelorus/exporters/deploytime/app.py", line 25, in collect
metrics = generate_metrics(self._namespaces)
File "/home/esauer/src/pelorus/exporters/deploytime/app.py", line 128, in <module>
REGISTRY.register(DeployTimeCollector(namespaces))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another small change.
exporters/deploytime/app.py
Outdated
logging.debug("Getting replica: %s, kind: %s, namespace: %s", | ||
ownerRef.name, ownerRef.kind, namespace) | ||
|
||
if replicas_dict.get(ownerRef.name+namespace): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the standard kubernetes notation for this would be directory or url style: namespace + "/" + ownerRef.name
exporters/deploytime/app.py
Outdated
apiResource = dyn_client.resources.get(api_version=apiVersion, kind=objectName) | ||
replicationobjects = apiResource.get(label_selector=pelorus.get_app_label()) | ||
for replica in replicationobjects.items: | ||
replicas[replica.metadata.name+replica.metadata.namespace] = replica |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the standard kubernetes notation for this would be directory or url style: replica.metadata.namespace + "/" + replica.metadata.name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@malacourse ok, code looks good, and not only did we fix the original errors but we cut the run time down from 30s - 3 or 4 minutes (when one apiVersion DNE), to 1 to 2 seconds!
# When all apis exist
$ time curl localhost:8080
...
real 0m1.179s
user 0m0.012s
sys 0m0.007s
# When one apiVersion is missing
$ time curl localhost:8080
real 0m1.179s
user 0m0.012s
sys 0m0.007s
LGTM!
Describe the behavior changes introduced in this PR
For later version of Openshift the "extensions/v1beta1" version of repicasets is not available. This causes the deploytime exporter to fail.
Linked Issues?
resolves #186
Testing Instructions
Test against higher version of OCP 4.4.16 or greater.
Please include any additional commands or pointers in addition to our standard PR testing process.
@redhat-cop/mdt