Fix certain binary expression queries by optimizing through push down #6132
Instant queries of the following type fail and return an `unimplemented` error:
```
sum(count_over_time({foo="bar"} | logfmt | duration > 2s [3s])) / sum(count_over_time({foo="bar"} [3s]))
```
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
If either the left-hand side or the right-hand side of a binary expression is a noop, we need to return the original expression so that the whole expression is a noop as well and is therefore not executed using the downstream engine. Otherwise, a binary expression that has a noop on either side results in an `unimplemented` error when executed using the downstream engine.
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
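The rule in this commit message can be illustrated with a small, self-contained sketch. It does not use Loki's actual mapper types: `Expr`, `Leaf`, `BinOp`, `Map`, the `Shardable` flag, and the `downstream<...>` marker are all hypothetical names chosen for illustration. The only thing it is meant to show is the noop propagation: when either operand cannot be rewritten, the original binary expression is returned unchanged and flagged as a noop.

```go
package main

import "fmt"

// Expr is a hypothetical stand-in for a LogQL expression node.
type Expr interface{ String() string }

// Leaf is a hypothetical leaf expression, e.g. a vector aggregation.
type Leaf struct {
	Query     string
	Shardable bool // assumption: whether the mapper can rewrite this leaf
}

func (l Leaf) String() string { return l.Query }

// BinOp is a hypothetical binary expression such as lhs / rhs.
type BinOp struct {
	Op       string
	LHS, RHS Expr
}

func (b BinOp) String() string { return fmt.Sprintf("(%s %s %s)", b.LHS, b.Op, b.RHS) }

// Map rewrites e for sharded execution and reports whether the result is a
// noop, i.e. whether e was returned unchanged.
func Map(e Expr) (mapped Expr, noop bool) {
	switch n := e.(type) {
	case Leaf:
		if !n.Shardable {
			return n, true
		}
		// Mark the leaf as something the downstream engine would execute.
		return Leaf{Query: "downstream<" + n.Query + ">", Shardable: true}, false
	case BinOp:
		lhs, lhsNoop := Map(n.LHS)
		rhs, rhsNoop := Map(n.RHS)
		// If either side is a noop, keep the whole expression as a noop
		// instead of mixing mapped and unmapped operands.
		if lhsNoop || rhsNoop {
			return e, true
		}
		return BinOp{Op: n.Op, LHS: lhs, RHS: rhs}, false
	default:
		return e, true
	}
}

func main() {
	// Assumption for the example: the left operand cannot be rewritten.
	lhs := Leaf{Query: `sum(count_over_time({foo="bar"} | logfmt | duration > 2s [3s]))`}
	rhs := Leaf{Query: `sum(count_over_time({foo="bar"} [3s]))`, Shardable: true}
	mapped, noop := Map(BinOp{Op: "/", LHS: lhs, RHS: rhs})
	fmt.Println(noop)
	fmt.Println(mapped)
}
```

In this sketch, a binary expression is only rewritten when both operands were rewritten, so a half-mapped expression never reaches the downstream engine.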
This change optimizes queries that use a vector aggregation without grouping around a range aggregation with a label extraction stage such as `json` or `logfmt`. Since the vector aggregation can be pushed down to the downstream query, the downstream query does not create a massive number of streams, even though it contains a generic label extraction stage.
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki
Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.
+ ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
- logql -0.1%
+ loki 0%
LGTM!
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki
Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.
+ ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0%
LGTM
…#6132)
* Add test cases that fail
  Instant queries of the following type fail and return an `unimplemented` error:
  ```
  sum(count_over_time({foo="bar"} | logfmt | duration > 2s [3s])) / sum(count_over_time({foo="bar"} [3s]))
  ```
  Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
* Fix certain binary expressions for instant queries
  If either the left hand side or the right hand side of a binary expression is a noop, we need to return the original expression so the whole expression is a noop as well, and thus not executed using the downstream engine. Otherwise, a binary expression that has a noop on either side, results in an `unimplemented` error when executed using the downstream engine.
  Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
* Optimize instant vector aggregations with log extraction stage
  This change optimizes queries that use a vector aggregation without grouping around a range aggregation with a label extraction stage such as `json` or `logfmt`. Since the vector aggregation can be pushed down to the downstream query, the downstream query does not create a massive amount of streams, even though it contains a generic label extraction stage.
  Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
* fixup! Add test cases that fail
  Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
* fixup! fixup! Add test cases that fail
  Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
What this PR does / why we need it:
This PR optimizes instant queries that use a vector aggregation around a range aggregation with a log extraction stage (`json`, `logfmt`), such as `sum(count_over_time({foo="bar"} | logfmt | duration > 30s [24h]))`.
Since the vector aggregation can be pushed down to the downstream query, the downstream query does not create a massive number of streams, even though it contains a generic label extraction stage.
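As a rough illustration of why pushing the aggregation down helps, the sketch below compares two ways of computing an ungrouped `sum` over sharded results. It is not Loki code: the `shard` type and both functions are hypothetical, and the data is made up. It only shows that summing per-shard partial sums gives the same total as summing every series at the frontend, while each shard returns a single sample instead of one series per extracted label set.

```go
package main

import "fmt"

// shard holds hypothetical per-series results of a range aggregation such as
// count_over_time on one shard, keyed by the series' label set.
type shard map[string]float64

// sumAtFrontend models the case without push down: every shard returns all of
// its series and the frontend sums them.
func sumAtFrontend(shards []shard) (total float64, seriesReturned int) {
	for _, s := range shards {
		for _, v := range s {
			total += v
			seriesReturned++
		}
	}
	return total, seriesReturned
}

// sumPushedDown models the pushed-down case: each shard sums locally and
// returns one sample, and the frontend sums the partial results.
func sumPushedDown(shards []shard) (total float64, seriesReturned int) {
	for _, s := range shards {
		var partial float64
		for _, v := range s {
			partial += v
		}
		total += partial
		seriesReturned++
	}
	return total, seriesReturned
}

func main() {
	// Made-up data: label sets as they might look after a logfmt stage.
	shards := []shard{
		{`{foo="bar", duration="3s"}`: 3, `{foo="bar", duration="5s"}`: 5},
		{`{foo="bar", duration="40s"}`: 2},
	}
	t1, n1 := sumAtFrontend(shards)
	t2, n2 := sumPushedDown(shards)
	fmt.Println(t1, n1)
	fmt.Println(t2, n2)
}
```

This works because an ungrouped `sum` is decomposable: the sum of partial sums equals the overall sum, which is the property the push down relies on.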
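This also fixes binary expression queries that use said aggregation on one of their leaf nodes, such as the query from the added test cases: `sum(count_over_time({foo="bar"} | logfmt | duration > 2s [3s])) / sum(count_over_time({foo="bar"} [3s]))`.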
If either the left-hand side or the right-hand side of a binary expression is a noop, we need to return the original expression so that the whole expression is a noop as well and is therefore not executed using the downstream engine.
Otherwise, a binary expression that has a noop on either side results in an `unimplemented` error when executed using the downstream engine.
Which issue(s) this PR fixes:
Fixes #6130
Special notes for your reviewer:
Checklist
- `CHANGELOG.md`
- `docs/sources/upgrading/_index.md`