TraceQL performance improvement: dynamic reordering of binop branches #4163
Conversation
@@ -325,30 +325,50 @@ func (a Aggregate) evaluate(input []*Spanset) (output []*Spanset, err error) {

```go
}

func (o *BinaryOperation) execute(span Span) (Static, error) {
	recording := o.b.Recording
	if recording {
```
This code is crafted to stay out of the hot path as much as possible. There may be a way to improve it even further; however, I can say that previously pulling LHS.execute into a function pointer made things slower.
pkg/traceql/ast_execute.go
Outdated
```go
switch o.Op {
case OpAnd:
	if recording {
```
These penalties are probably the only tricky part to this approach. Happy to add more detail if things aren't clear.
Love it!
```go
}

// Start recording. Should be called immediately prior to a branch execution.
func (b *branchOptimizer) Start() {
```
What about `StartRecording` and `StopRecording`?
```go
// Start recording. Should be called immediately prior to a branch execution.
func (b *branchOptimizer) Start() {
	b.start = time.Now()
```
Why don't we use the same `last` to track the start value? That way we don't need to share that variable:

```go
b.last[branch] = time.Now()
```

and in the Finish method:

```go
b.last[branch] = time.Since(b.last[branch])
```

We can rename `last` to be more accurate.
Interesting idea. The variable types are different, `time.Time` vs `time.Duration`; will have to see if the overhead of getting unix nanoseconds is less than the var.
LGTM!
…grafana#4163)
* Dynamic reordering of binop branches
* consts and cleanup
* changelog
What this PR does:
This PR builds on #4114, which introduced short-circuiting boolean expressions like `true || false`. Now we go a step further and dynamically profile and reorder statements at runtime to choose the fastest path first. Consider a statement like `trueSlow || trueFast`. It's better to put `trueFast` first because it allows us to return faster and skip processing the `trueSlow` expression.

This PR is the result of digging even more into the particularly tough query run internally mentioned in #4114.

The way it works is by not short-circuiting for the first 1000 executions, and instead measuring the runtime and usefulness of each branch. Then it checks whether the right-hand side is the better choice and, if so, flips the expression: `trueSlow || trueFast` becomes `trueFast || trueSlow`.

1000 was derived experimentally. It's enough to reliably help without adding too much overhead; in the benchmarks, binary operations are run millions of times.
Benchmarks:
updated after faaee88
There is a very significant improvement of +26% on the tough query. However, there is a small regression on already optimally-ordered queries, because we still pay the profiling overhead when there is nothing to reorder.
Which issue(s) this PR fixes:
Fixes #
Checklist
`CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`