[processor/tailsamplingprocessor] `and` support for string invert match #9768

tillig · 2022-05-06T00:33:16Z

Description: Fix #9553: Adds support for tail sampling so the and policy can do string invert match.

Link to tracking Issue: #9553

Testing: Added unit tests to verify string attribute invert could both allow and disallow match.

Documentation: No documentation added. Documentation already notes string invert match usage should work, it just hadn't been implemented with and.

I didn't refactor the unit tests to be parameterized like those in string_tag_filter_test.go; unclear if that's required here.
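For readers unfamiliar with the invert decisions, here is a minimal sketch of how a string attribute filter could map a match to a decision. The `Decision` names come from this PR; `stringFilter`, its fields, and the rule that an inverted match yields `InvertNotSampled` are assumptions made for illustration, not the processor's actual implementation.

```go
package main

import "fmt"

// Decision uses the names seen in this PR; the numeric values are illustrative.
type Decision int

const (
	Unspecified Decision = iota
	Sampled
	NotSampled
	InvertSampled
	InvertNotSampled
)

// stringFilter is a hypothetical, simplified stand-in for a string_attribute policy.
type stringFilter struct {
	key    string
	values map[string]struct{}
	invert bool // corresponds to invert_match in the config
}

// evaluate receives one attribute map per span and reports the decision.
// Assumption: with invert set, a match yields InvertNotSampled and a
// non-match yields InvertSampled.
func (f stringFilter) evaluate(spans []map[string]string) Decision {
	matched := false
	for _, attrs := range spans {
		if v, ok := attrs[f.key]; ok {
			if _, hit := f.values[v]; hit {
				matched = true
				break
			}
		}
	}
	if f.invert {
		if matched {
			return InvertNotSampled
		}
		return InvertSampled
	}
	if matched {
		return Sampled
	}
	return NotSampled
}

func main() {
	f := stringFilter{
		key:    "http.route",
		values: map[string]struct{}{"/health": {}},
		invert: true,
	}
	health := []map[string]string{{"http.route": "/health"}}
	orders := []map[string]string{{"http.route": "/orders"}}
	fmt.Println(f.evaluate(health) == InvertNotSampled) // true
	fmt.Println(f.evaluate(orders) == InvertSampled)    // true
}
```

A unit test for the `and` policy would then need both cases: an inverted filter that lets the trace through and one that blocks it.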

pmm-sumo · 2022-05-08T15:05:39Z

@mottibec @fabiovn since it involves the and policy and invert matches, could you take a look as well?

pmm-sumo · 2022-05-08T15:12:40Z

processor/tailsamplingprocessor/internal/sampling/and.go

@@ -45,7 +45,7 @@ func (c *And) Evaluate(traceID pcommon.TraceID, trace *TraceData) (Decision, err
 		if err != nil {
 			return Unspecified, err
 		}
-		if decision == NotSampled {
+		if decision == NotSampled || decision == InvertNotSampled {


Nice find. Now I'm wondering whether other places that evaluate the decision should be updated as well. From a quick look at the code, this seems to be the case (though to be fair I haven't looked at the tailsamplingprocessor code in a while and might be missing something).
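To make the effect of the one-line change concrete, here is a rough sketch of the `and` evaluation loop with and without the extra check. Only the `if` condition is taken from the diff; `subPolicy`, `evaluateAnd`, the `treatInvertAsFailure` flag, and the final `return Sampled` are hypothetical names and assumptions used to keep the example self-contained.

```go
package main

import "fmt"

// Decision uses the names seen in this PR; the numeric values are illustrative.
type Decision int

const (
	Unspecified Decision = iota
	Sampled
	NotSampled
	InvertSampled
	InvertNotSampled
)

// subPolicy is a hypothetical stand-in for a sub-policy's Evaluate method.
type subPolicy func() (Decision, error)

// evaluateAnd sketches the loop from the diff. treatInvertAsFailure toggles
// between the old condition (false) and the new one (true).
func evaluateAnd(subs []subPolicy, treatInvertAsFailure bool) (Decision, error) {
	for _, sub := range subs {
		decision, err := sub()
		if err != nil {
			return Unspecified, err
		}
		failed := decision == NotSampled
		if treatInvertAsFailure {
			failed = failed || decision == InvertNotSampled
		}
		if failed {
			return NotSampled, nil
		}
	}
	// Assumption: every sub-policy passed, so the and policy samples.
	return Sampled, nil
}

// fixed returns a sub-policy that always yields the given decision.
func fixed(d Decision) subPolicy {
	return func() (Decision, error) { return d, nil }
}

func main() {
	subs := []subPolicy{fixed(Sampled), fixed(InvertNotSampled)}
	before, _ := evaluateAnd(subs, false)
	after, _ := evaluateAnd(subs, true)
	fmt.Println(before == Sampled, after == NotSampled) // true true
}
```

Under these assumptions, the old condition lets a trace through even though the inverted sub-policy rejected it, which matches the behavior reported in #9553.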

I admit I haven't dived deep into the string matching, but it did make me wonder why it matters if it's Sampled or InvertSampled. It wasn't immediately obvious why a regular decision and an invert decision would need to be treated differently.

It seems that's the case as well: I tested it and it failed. I am opening a second PR right now :)

@tillig I don't fully understand how the invert works but it looks like there's something related to the InvertSampled decision in the makeDecision function

Yeah, but even that seems really sort of weird to me.

// InvertNotSampled takes precedence over any other decision
if samplingDecision[sampling.InvertNotSampled] {
	finalDecision = sampling.NotSampled
} else if samplingDecision[sampling.Sampled] {
	finalDecision = sampling.Sampled
} else if samplingDecision[sampling.InvertSampled] && !samplingDecision[sampling.NotSampled] {
	finalDecision = sampling.Sampled
}

All the policies get executed before a decision is made. That's weird, and sounds like it's more of a composite thing than it should be. It feels like the policies should be executed in order and the first one that says "sample this thing" would allow sampling through. That'd perform better, too.

InvertNotSampled has the highest precedence, but InvertSampled has the lowest precedence and has to be specifically checked to see if it's also not contradicted by NotSampled.

I dunno, I'm coming at it late, I just don't see any doc on it and the special treatment there is kind of confusing. It gets even more confusing when I think about how this affects the way I might get an unexpected result from just a series of policies now. If I put an invert match policy in the mix and it says "no," it'll countermand any other policy I put in there. Even if it's not in an and policy. 🤔 Am I reading that wrong?
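A small worked example of the precedence rules quoted above may help answer that question. This is a sketch, not the processor's code: `finalDecision` here is a hypothetical function that reuses the quoted if/else chain, and the `NotSampled` default when no branch fires is an assumption.

```go
package main

import "fmt"

// Decision uses the names seen in this PR; the numeric values are illustrative.
type Decision int

const (
	Unspecified Decision = iota
	Sampled
	NotSampled
	InvertSampled
	InvertNotSampled
)

// finalDecision combines the set of decisions returned by all policies,
// following the chain quoted from makeDecision.
func finalDecision(seen map[Decision]bool) Decision {
	final := NotSampled // assumption: default when no branch applies
	// InvertNotSampled takes precedence over any other decision.
	if seen[InvertNotSampled] {
		final = NotSampled
	} else if seen[Sampled] {
		final = Sampled
	} else if seen[InvertSampled] && !seen[NotSampled] {
		final = Sampled
	}
	return final
}

func main() {
	// One policy says "sample", an invert-match policy says "no".
	veto := finalDecision(map[Decision]bool{Sampled: true, InvertNotSampled: true})
	// An invert-match policy says "yes", a regular policy says "no".
	weak := finalDecision(map[Decision]bool{InvertSampled: true, NotSampled: true})
	// Only an invert-match policy says "yes".
	alone := finalDecision(map[Decision]bool{InvertSampled: true})
	fmt.Println(veto == NotSampled, weak == NotSampled, alone == Sampled) // true true true
}
```

If this reading is right, a single `InvertNotSampled` does countermand a `Sampled` from any other top-level policy, while `InvertSampled` only wins when nothing else said `NotSampled`.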

I agree it's strange. Based on my understanding of how the tail-sampling processor is supposed to work, we should prioritise positive sampling decisions, since ultimately the policies are "or"-ed (unless the and or composite operator/policy is used).

BTW, we have had several discussions on how tail sampling should be arranged. In the past I proposed adopting the sampler I made (since renamed to cascadingfilterprocessor). It does a couple more things and has (I believe) a much simpler configuration. I am considering reviving the idea of upstreaming it. Would you find it useful, @tillig?

It'd be nice to get those design discussions on tail sampling codified somehow into the README or as inline code comments to explain why this does what it does, not just how (which is all the code explains). Or even just links to the discussion threads so folks like me could catch up.

I think cascadingfilterprocessor looks interesting. It shows doing exactly what I'm trying to do, which is:

- Keep long running ops
- Keep errors
- Drop all noise (health, metrics)
- Keep a small percentage of the remaining traces

But... that container doesn't have the dynatrace exporter in it, which I need, so it's a non-starter; and at the moment I'm mostly in POC-land trying to show that OpenTelemetry works (yeah, that's another issue entirely) so I don't have the wherewithal to build a custom Frankenstein container to make it happen. If the cascadingfilterprocessor was in here, I'd be all over trying it.

Sure thing, I was curious if you would find that useful. As for trying it out, it should be quite easy to include it via a builder

Cool, thanks. If/when that time comes, I'll have to actually start learning what "include it via a builder" means. I'm not a go dev and there's a long tail of things to learn. 😄

Yeah, I agree with that; the behavior of the tailsamplingprocessor is not what I would expect.
The README should definitely be updated to reflect that.
I know there was talk about deprecating the tailsamplingprocessor, so I like the idea of upstreaming cascadingfilterprocessor. I think it could be a great fit to replace it.

tillig · 2022-05-08T15:20:01Z

It appears some sort of ballast load test failed here, but I'm unclear how adding this one "if" check would have caused that. Is there something different I should be doing?

TylerHelmuth · 2022-05-08T15:27:04Z

> It appears some sort of ballast load test failed here, but I'm unclear how adding this one "if" check would have caused that. Is there something different I should be doing?

The load tests have been very flaky recently. The issue is likely not caused by your change. @djaglowski is working on a change to help fix the tests.

pmm-sumo

LGTM

I have prepared a PR with a similar fix for the composite policy: #9793

github-actions · 2022-05-26T05:16:50Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

tillig · 2022-05-26T14:12:42Z

The PR here is still very much of interest to me. It looks like the unit tests failed with some things unrelated to my change (errors about trying to use a plugin that's not installed?) and the load tests have been noted as flaky and getting a fix.

If there's something I need to do here, please do let me know. I have to admit I'm somewhat new to Go so I'm not sure how to help fix the unrelated issues, but maybe with a little hint to get me going in the right direction?

djaglowski · 2022-05-26T14:59:47Z

@tillig I've removed the Stale label and restarted unit tests. Assuming no persistent issues there, we'll get this merged.

tillig · 2022-05-26T15:37:12Z

It appears the tests failing have to do with this test in the logtransformprocessor which isn't something I messed with. Let me see if I can pull changes from main to see if it's fixed there.

tillig · 2022-05-26T15:43:19Z

Merged main in here. Someone will have to kick off the build again to see if those tests start passing.

tillig · 2022-05-26T18:45:19Z

Looks like unrelated link checks were failing due to the README in unrelated modules. Brought in additional changes from main in the hopes it'll get me over the hump. (Should I stop bringing in main?)

djaglowski

Looks good to me. Seems worth noting in the changelog though.

tillig · 2022-05-26T19:52:12Z

@djaglowski Added. Thanks!

…ch (open-telemetry#9768) * [processor/tailsamplingprocessor] and support for string invert * Add open-telemetry#9553 to changelog.


…hen inside and sub policy (#33671)

**Description:** This fixes the handling of AND policies that contain a sub-policy with invert_match=true. Previously if the decision from a policy evaluation was `NotSampled` or `InvertNotSampled` it would return a `NotSampled` decision regardless, effectively downgrading the result. This was breaking the documented behaviour that inverted decisions should take precedence over all others. This is related to the changes made in #9768 that introduced support for using invert_match within and sub policies.

**Link to tracking Issue:** #33656

**Testing:** I tested manually that this fixes the issue described in #33656 and also updated the tests. If you have any suggestions for more tests we could add let me know.

**Documentation:**

tillig added 2 commits May 5, 2022 15:49

[processor/tailsamplingprocessor] and support for string invert

f905ec5

Merge branch 'main' into issue-9553

d6b6856

tillig requested a review from a team May 6, 2022 00:33

tillig requested a review from jpkrohling as a code owner May 6, 2022 00:33

github-actions bot assigned codeboten May 6, 2022

tillig mentioned this pull request May 6, 2022

[processor/tailsamplingprocessor] and policy appears to ignore string_attribute match results #9553

Closed

pmm-sumo reviewed May 8, 2022


pmm-sumo mentioned this pull request May 8, 2022

[processor/tailsampling] Fix composite policy with inverse matching #9793

Merged

pmm-sumo approved these changes May 8, 2022


github-actions bot added the Stale label May 26, 2022

djaglowski removed the Stale label May 26, 2022

Merge branch 'main' into issue-9553

f97fd20

Merge branch 'main' into issue-9553

d572efc

djaglowski approved these changes May 26, 2022


Add open-telemetry#9553 to changelog.

80bab7d

djaglowski merged commit d9e0d00 into open-telemetry:main May 26, 2022

tillig deleted the issue-9553 branch May 26, 2022 20:20

jamesrwhite mentioned this pull request Jun 19, 2024

[processor/tailsampling] invert_match not given precedence when inside and policy #33656

Open

jamesrwhite mentioned this pull request Jun 20, 2024

[processor/tailsampling] fix InvertNotSampled decision precedence when inside and sub policy #33671

Merged


pmm-sumo May 8, 2022

tillig May 8, 2022

pmm-sumo May 8, 2022

mottibec May 9, 2022

tillig May 9, 2022

pmm-sumo May 9, 2022

tillig May 9, 2022

pmm-sumo May 9, 2022

tillig May 9, 2022

mottibec May 11, 2022
