feat(metricprovider): add prometheus range query support #3704

Merged
merged 13 commits into from
Jul 10, 2024

Conversation

mclarke47
Contributor

@mclarke47 mclarke47 commented Jul 4, 2024

fixes: #3702

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this is a chore.
  • The title of the PR is (a) conventional with a list of types and scopes found here, (b) states what changed, and (c) suffixes the related issues number. E.g. "fix(controller): Updates such and such. Fixes #1234".
  • I've signed my commits with DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My builds are green. Try syncing with master if they are not.
  • My organization is added to USERS.md.

Signed-off-by: Matthew Clarke <mclarke@spotify.com>
Contributor

github-actions bot commented Jul 4, 2024

Go Published Test Results

2 171 tests   2 171 ✅  2m 54s ⏱️
  119 suites      0 💤
    1 files        0 ❌

Results for commit 98df9ed.

♻️ This comment has been updated with latest results.

Signed-off-by: Matthew Clarke <mclarke@spotify.com>

codecov bot commented Jul 4, 2024

Codecov Report

Attention: Patch coverage is 93.65079% with 4 lines in your changes missing coverage. Please review.

Project coverage is 80.30%. Comparing base (4f1edbe) to head (98df9ed).
Report is 89 commits behind head on master.

Files with missing lines Patch % Lines
utils/evaluate/evaluate.go 80.00% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3704      +/-   ##
==========================================
+ Coverage   80.27%   80.30%   +0.03%     
==========================================
  Files         156      156              
  Lines       17964    18018      +54     
==========================================
+ Hits        14420    14470      +50     
- Misses       2631     2634       +3     
- Partials      913      914       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Matthew Clarke <mclarke@spotify.com>
Contributor

github-actions bot commented Jul 4, 2024

E2E Tests Published Test Results

  4 files    4 suites   3h 27m 32s ⏱️
111 tests 100 ✅  6 💤 5 ❌
452 runs  420 ✅ 24 💤 8 ❌

For more details on these failures, see this check.

Results for commit 98df9ed.

♻️ This comment has been updated with latest results.

Signed-off-by: Matthew Clarke <mclarke@spotify.com>
@mclarke47 mclarke47 changed the title WIP feat(metricprovider): add prometheus range query support feat(metricprovider): add prometheus range query support WIP Jul 4, 2024
mclarke47 added 4 commits July 4, 2024 17:49
Signed-off-by: Matthew Clarke <mclarke@spotify.com>
Signed-off-by: Matthew Clarke <mclarke@spotify.com>
Signed-off-by: Matthew Clarke <mclarke@spotify.com>
@mclarke47 mclarke47 marked this pull request as ready for review July 5, 2024 18:00
@mclarke47 mclarke47 changed the title feat(metricprovider): add prometheus range query support WIP feat(metricprovider): add prometheus range query support Jul 5, 2024
@mclarke47
Contributor Author

mclarke47 commented Jul 5, 2024

It would be nice if we could also parameterize lookBackDuration through args, but I'm not 100% sure how to do this. I could also do that in a follow-up PR.

Signed-off-by: Matthew Clarke <mclarke@spotify.com>
@zachaller
Collaborator

zachaller commented Jul 8, 2024

What's the difference between this and say doing a query like http_requests_total{job="prometheus"}[5m]?

Is the golang QueryRange function just some syntactic sugar for a raw query?

https://www.robustperception.io/step-and-query_range/

@zachaller zachaller added this to the v1.8 milestone Jul 8, 2024
@mclarke47
Contributor Author

mclarke47 commented Jul 8, 2024

Query ranges are syntactic sugar for the query endpoint as it says in that blog post. However I think there are 2 main benefits to adding query ranges:

  • They allow writing queries similar to how they are viewed in graphs, which IMO makes the Prometheus Argo Rollouts provider more intuitive to use, since you can re-use queries from other sources without first converting them to queries that would work with the query endpoint
  • Query ranges support the step argument, which lets you specify the interval at which the query is evaluated over the range, so you get a little more control over how the data is returned from Prometheus

@zachaller
Collaborator

zachaller commented Jul 8, 2024

That's fair, I do see it being a bit easier to reason about. Also, do you think it makes sense to expose the start and end parameters vs hiding that behind lookBackDuration?

@mclarke47
Contributor Author

I think that would be good, although it should probably be relative to time.Now(); otherwise we would be hard-coding a datetime, which doesn't seem very useful.

@zachaller is there a way to access args of the analysis template inside the provider so that we can parameterize things?

@zachaller
Collaborator

zachaller commented Jul 8, 2024

I think that would be good, although it should probably be relative to time.Now(); otherwise we would be hard-coding a datetime, which doesn't seem very useful.

@zachaller is there a way to access args of the analysis template inside the provider so that we can parameterize things?

Pretty sure this works today because we do things like:

provider:
        prometheus:
          address: "http://prometheus:{{args.prometheus-port}}"
          query: |-
            (quantile(0.5, quantile_over_time(0.5, namespace_pod_cpu_utilization{namespace="{{args.namespace}}", pod=~".*-{{args.canary-hash}}-.*"}[11m])))

Might be worth checking if it still works with your code as well, just in case the template system explicitly looks for those fields etc.

I would also need to double check if we have access to say time.Now() or something in the templating system as well.

@mclarke47
Contributor Author

Will double check if this already works

@mclarke47
Contributor Author

Confirming:

...
  - name: lookback_duration
    value: "5m"
  - name: lookback_step
    value: "1m"
...    
    provider:
      prometheus:
        rangeQuery:
          lookBackDuration: "{{args.lookback_duration}}"
          step: "{{args.lookback_step}}"
...          

does work
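
As a rough illustration of what that substitution does to the rangeQuery fields, the sketch below replaces `{{args.<name>}}` placeholders with their values using only the Go standard library. It is a simplified stand-in, not the project's template code, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// substituteArgs replaces each {{args.<name>}} placeholder in s with the
// matching value from args. Placeholders without a matching arg are left
// untouched. This is a simplified stand-in for the real template step.
func substituteArgs(s string, args map[string]string) string {
	pairs := make([]string, 0, len(args)*2)
	for name, value := range args {
		pairs = append(pairs, "{{args."+name+"}}", value)
	}
	return strings.NewReplacer(pairs...).Replace(s)
}

func main() {
	args := map[string]string{
		"lookback_duration": "5m",
		"lookback_step":     "1m",
	}

	fmt.Println(substituteArgs("{{args.lookback_duration}}", args))
	fmt.Println(substituteArgs("{{args.lookback_step}}", args))
}
```

Because the replacement is purely textual, it applies to any string field of the provider, which is consistent with the new fields picking it up without extra work.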

@mclarke47
Contributor Author

There are time functions in expr, so I'm going to change the API to be start/end/step, with start's default being now(). I'll test this out and then update the docs.

@mclarke47
Contributor Author

mclarke47 commented Jul 8, 2024

Hmm, it seems this expression logic is only evaluated in success/failure conditions, not in the actual fields of the provider; only args substitution works there.

I could:

  • Stick with a lookback duration as is currently implemented
  • Put in some logic to evaluate these fields with utils/evaluate

WDYT @zachaller?

Screenshot 2024-07-08 at 2 15 10 PM

(The image doesn't seem to be embedding; it's at https://github.com/argoproj/argo-rollouts/assets/6514980/29710477-b02a-4a11-a71e-cbd4690ffea1)

@zachaller
Collaborator

Yeah, it looks like Analysis only uses fasttemplate, not expr, as seen here

It might actually be fairly easy to add expr support directly to parsing analysis. I could see it being pretty useful in other cases for other providers too. Let me know whether you think adding expr support to analysis makes sense. I would probably just start with time feature support.

Also, I think your example is missing the query section.

Signed-off-by: Matthew Clarke <mclarke@spotify.com>
@mclarke47
Contributor Author

Hey @zachaller, I pushed that change to treat the start/end fields as expr expressions. 🙏

mclarke47 added 2 commits July 8, 2024 15:13
Signed-off-by: Matthew Clarke <mclarke@spotify.com>
Signed-off-by: Matthew Clarke <mclarke@spotify.com>
Comment on lines 64 to 65
"asInt": asInt,
"asFloat": asFloat,
Collaborator

Let's remove asInt and asFloat and just use the built-in funcs: https://expr-lang.org/docs/language-definition#int

Contributor Author

fixed!

case time.Time:
return val, nil
default:
return time.Time{}, fmt.Errorf("expected string, but got %T", val)
Collaborator

expected time.Time

Contributor Author

fixed!

@@ -112,29 +136,51 @@ func (p *Provider) GarbageCollect(run *v1alpha1.AnalysisRun, metric v1alpha1.Met
return nil
}

func sampleValuesToFloatSlice(SampleValues []model.SampleValue) []float64 {
Collaborator

Can we lowercase SampleValues?

Contributor Author

fixed!
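
A sketch of what such a helper could look like with the lowercased parameter name follows. To stay self-contained it uses a local `sampleValue` type as a stand-in for `model.SampleValue` from github.com/prometheus/common, which is likewise defined as a float64. The body is an assumption, since only the function signature appears in the diff snippet.

```go
package main

import "fmt"

// sampleValue is a local stand-in for model.SampleValue, which is a float64
// under the hood.
type sampleValue float64

// sampleValuesToFloatSlice converts the samples of a range query result into
// plain float64 values, preserving their order.
func sampleValuesToFloatSlice(sampleValues []sampleValue) []float64 {
	results := make([]float64, 0, len(sampleValues))
	for _, v := range sampleValues {
		results = append(results, float64(v))
	}
	return results
}

func main() {
	fmt.Println(sampleValuesToFloatSlice([]sampleValue{0.5, 1, 2.25}))
}
```

Go convention reserves a leading capital for exported identifiers, so a parameter named `SampleValues` reads like a package-level name, which is the reason for the rename.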

return results
}

func sampleValuesToResultStr(SampleValues []model.SampleValue) string {
Collaborator

Can we lowercase SampleValues?

Contributor Author

fixed!

@zachaller
Collaborator

Just a few small changes then this LGTM

Signed-off-by: Matthew Clarke <mclarke@spotify.com>
@mclarke47
Contributor Author

mclarke47 commented Jul 9, 2024

@zachaller I have addressed your comments 🙏

@zachaller zachaller merged commit 3e4ea74 into argoproj:master Jul 10, 2024
23 checks passed
@mclarke47 mclarke47 deleted the add-prom-range-query-support branch July 10, 2024 14:15
@zachaller
Collaborator

Thank you for the contribution!

Development

Successfully merging this pull request may close these issues.

Support prometheus range queries
3 participants