Infer max query downsample resolution from promql query #7012

yeya24 · 2023-12-28T12:30:19Z

I added CHANGELOG entry for this change.
Change is not relevant to the end user.

Changes

This is useful to help with #6929.

No matter of auto downsample is enabled or not, always try to infer max allowed resolution based on promql query.

The main idea is from @fpetkovski to check Range and Func from hints to see if the promql function requires at least 2 samples in the range and then we adjust max allowed resolution accordingly.

For example I have a query like rate[2m]. Because rate function requires at least 2 samples to run so minimum resolution is 1m. This should avoid picking higher resolution like 5m or 1h.

Verification

Added unit test.

fpetkovski · 2023-12-28T14:01:03Z

Does it make sense to solve this at the store level? Since the range is passed in hints, we can decide whether to send downsampled or aggregated chunks to queriers.

yeya24 · 2023-12-28T15:07:21Z

Interesting idea. Yeah I think it works to get the range of matrix selector.
I don't have strong preference. But I wonder how we gonna do it at store. Do we ignore max_source_resolution from SeriesRequest and only infer it from the query range?
I think we still have the usecase where query range is quite large but we only want to query raw blocks.

Doing it at the query API level seems more flexible.

fpetkovski · 2023-12-28T15:56:56Z

Looking at the code, I think it would be best to set max resolution here based on the selected range. We can still query raw blocks over a large range, but we should cap the max resolution for small ranges to make sure downsampled blocks cannot be requested.

Otherwise we have to make many places in the codebase, and it will be hard to handle cases with different ranges in a binary expression.

MichaHoffmann · 2023-12-28T15:58:35Z

Can we make the engine set a hint since it already analyzes the query and then stores can just act on the hint that the engine provides?

yeya24 · 2023-12-29T00:08:39Z

Looking at the code, I think it would be best to set max resolution here based on the selected range. We can still query raw blocks over a large range, but we should cap the max resolution for small ranges to make sure downsampled blocks cannot be requested.

Umm, do we want to set different max resolution per select or use one max resolution value for the whole query?

My current implementation is the latter. Having different max resolution might also work, but I am wondering if it has some issues with our current results cache as it assumes a single resolution for the whole query.

fpetkovski

I see, then we can stay on the safe side and calculate the value based on all selectors.

yeya24 · 2023-12-30T02:28:25Z

Ideally we should remove max_source_resolution param totally and seamlessly query downsampled blocks. Users don't have to care about this param and it is more compatible to Prometheus API so that Cortex can also be benefited.
But it is a big change to me.

I think we still have the usecase where query range is quite large but we only want to query raw blocks.

I am wondering if this is a valid usecase. I can only think of people who want to run predict_linear over raw blocks for better precision because downsampled blocks don't downsample using this. But then it seems making non sense of downsampling.

Having different max resolution might also work, but I am wondering if it has some issues with our current results cache as it assumes a single resolution for the whole query.

It sounds fine to have different resolutions per vector selector.
Ideally max source resolution should be removed from the cache key since the resolution is inferred from the query.

GiedriusS

We were talking about this on KubeCon on Friday and the same day we had a question about why rate[5m] doesn't work on downsampled data so I think we should merge this. There is only one in-line suggestion.

GiedriusS · 2024-03-25T11:06:34Z

pkg/api/query/v1.go

+				if f, ok := node.(*parser.Call); ok {
+					switch f.Func.Name {
+					// Functions that require at least 2 samples.
+					case "rate", "irate", "increase", "idelta", "deriv", "predict_linear":


I'm 99% sure that we will forget to update this list once a new function appears that takes in a range vector. Perhaps this list could be converted into a list of functions like map[string]bool and then a unit test could be added that would check whether all parser.Functions are covered. This would make it so that it would be impossible to add another PromQL function without checking whether it needs to be added to this.

yeya24 · 2024-03-25T16:24:29Z

Now I am more towards the solution mentioned by @fpetkovski to use Select hints and Range to figure out the resolution.
I will try to finish the implementation and tests soon.

anarcher · 2024-10-03T03:18:16Z

Is there a plan to merge this PR? It seems like a really necessary feature for me :-) Thank you!

yeya24 · 2025-02-23T20:15:15Z

I have updated the PR with your suggestion. PTAL @fpetkovski @GiedriusS

Signed-off-by: Ben Ye <benye@amazon.com>

fpetkovski

This makes sense to me 👍

pull-request-size bot added the size/M label Dec 28, 2023

fpetkovski previously approved these changes Dec 29, 2023

View reviewed changes

GiedriusS reviewed Mar 25, 2024

View reviewed changes

yeya24 dismissed fpetkovski’s stale review via 8d17bba February 23, 2025 20:11

yeya24 force-pushed the downsample-query branch from 4b72f2c to 8d17bba Compare February 23, 2025 20:11

pull-request-size bot added size/L and removed size/M labels Feb 23, 2025

yeya24 changed the title ~~Consider query function when auto downsample~~ Infer max query downsample resolution from promql query Feb 23, 2025

Adjust max_source_resolution automatically based promql queries

6d6b81a

Signed-off-by: Ben Ye <benye@amazon.com>

yeya24 force-pushed the downsample-query branch from 8d17bba to 6d6b81a Compare February 23, 2025 20:16

fpetkovski approved these changes Feb 24, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Infer max query downsample resolution from promql query #7012

Infer max query downsample resolution from promql query #7012

yeya24 commented Dec 28, 2023 •

edited

Loading

fpetkovski commented Dec 28, 2023

yeya24 commented Dec 28, 2023

fpetkovski commented Dec 28, 2023 •

edited

Loading

MichaHoffmann commented Dec 28, 2023

yeya24 commented Dec 29, 2023

fpetkovski left a comment

yeya24 commented Dec 30, 2023 •

edited

Loading

GiedriusS left a comment

GiedriusS Mar 25, 2024 •

edited

Loading

yeya24 commented Mar 25, 2024

anarcher commented Oct 3, 2024

yeya24 commented Feb 23, 2025

fpetkovski left a comment

Infer max query downsample resolution from promql query #7012

Are you sure you want to change the base?

Infer max query downsample resolution from promql query #7012

Conversation

yeya24 commented Dec 28, 2023 • edited Loading

Changes

Verification

fpetkovski commented Dec 28, 2023

yeya24 commented Dec 28, 2023

fpetkovski commented Dec 28, 2023 • edited Loading

MichaHoffmann commented Dec 28, 2023

yeya24 commented Dec 29, 2023

fpetkovski left a comment

Choose a reason for hiding this comment

yeya24 commented Dec 30, 2023 • edited Loading

GiedriusS left a comment

Choose a reason for hiding this comment

GiedriusS Mar 25, 2024 • edited Loading

Choose a reason for hiding this comment

yeya24 commented Mar 25, 2024

anarcher commented Oct 3, 2024

yeya24 commented Feb 23, 2025

fpetkovski left a comment

Choose a reason for hiding this comment

yeya24 commented Dec 28, 2023 •

edited

Loading

fpetkovski commented Dec 28, 2023 •

edited

Loading

yeya24 commented Dec 30, 2023 •

edited

Loading

GiedriusS Mar 25, 2024 •

edited

Loading