Advice: Is there a way of sampling a % of spans #36876

Open
amccague opened this issue Dec 17, 2024 · 8 comments
Labels
processor/probabilisticsampler Probabilistic Sampler processor processor/tailsampling Tail sampling processor question Further information is requested

Comments

@amccague

Component(s)

pkg/ottl

Describe the issue you're reporting

I have some noisy spans that I want to filter. But equally I want to know if they're still being produced and from where.

Ideally I'd implement a filter with a percentage, so that 1% of the spans still get through. However, I can't see a rand function or anything like modulo, although I could possibly craft modulo using divide. In the absence of rand, I could try to use something like the last digit of some field as my random source.

This doesn't seem like a novel idea, though, so it may well be a solved problem and my searching skills just aren't up to scratch.

These are leaf spans, so filtering them wouldn't impact any children.
I don't think this is a worthy use of the tail sampling filter either.
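
The idea of using part of an ID as the random source can be sketched outside the Collector. The Python below is a minimal illustration only, not an OTTL or Collector feature: the `keep_span` helper and its parameters are hypothetical, and it assumes span IDs are uniformly distributed 64-bit values written as hex strings.

```python
import random


def keep_span(span_id_hex: str, keep_percent: float) -> bool:
    """Deterministically decide whether to keep a span.

    Hypothetical helper: treats the span ID as a uniform random number
    and keeps the span when it lands in the lowest `keep_percent`
    percent of 10,000 buckets.
    """
    if not 0.0 <= keep_percent <= 100.0:
        raise ValueError("keep_percent must be between 0 and 100")
    span_id = int(span_id_hex, 16)
    # 10,000 buckets give a resolution of 0.01 percent.
    bucket = span_id % 10_000
    # round() avoids float artefacts such as 0.1 * 100 == 10.000000000000002.
    threshold = round(keep_percent * 100)
    return bucket < threshold


# Usage: simulate 100,000 random span IDs and keep about 1% of them.
rng = random.Random(42)
ids = [f"{rng.getrandbits(64):016x}" for _ in range(100_000)]
kept = sum(keep_span(i, 1.0) for i in ids)
print(f"kept {kept} of {len(ids)} spans")
```

Because the decision depends only on the span ID, the same span always gets the same answer, which is the property the "last digit" approach relies on.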

@amccague amccague added the needs triage New item requiring triage label Dec 17, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@mkromkamp

mkromkamp commented Dec 19, 2024

Both the tail sampling and the probabilistic sampler processors support this scenario, I think.

There is a short explanation of when to choose which over here. Based on the ottl label, I suspect the tail sampler might be the better fit 🙂

Hope that helps.

@amccague
Author

Hi @mkromkamp - I have reviewed those two processors and to be honest I can't quite determine from the configuration examples in the readme how to achieve this. They appear to sample traces first and foremost.

I can't see the specific configurations to target a span by say attribute value or name. Could you be more specific about the configuration or samplers you had in mind?

Thanks

@mkromkamp

mkromkamp commented Dec 22, 2024

> Hi @mkromkamp - I have reviewed those two processors and to be honest I can't quite determine from the configuration examples in the readme how to achieve this. They appear to sample traces first and foremost.
>
> I can't see the specific configurations to target a span by say attribute value or name. Could you be more specific about the configuration or samplers you had in mind?
>
> Thanks

Yeah, sure. I might need a concrete example from you, but "# Rule 2: low sampling for readiness/liveness probes" from the readme examples sounds roughly like what you are looking for.

It keeps 0.1% of traces (sampling_percentage: 0.1 is a percentage, not a fraction) for the /live and /ready endpoints of services named service-1, service-2, and service-3.

Reformatted snippet to avoid having to search:

name: team_a-probe
type: and
and:
  and_sub_policy:
    - name: service-name-policy
      type: string_attribute
      string_attribute:
        key: service.name
        values:
          - service-1
          - service-2
          - service-3
    - name: route-live-ready-policy
      type: string_attribute
      string_attribute:
        key: http.route
        values:
          - /live
          - /ready
        enabled_regex_matching: true
    - name: probabilistic-policy
      type: probabilistic
      probabilistic:
        sampling_percentage: 0.1
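
To make the semantics of this policy concrete, here is a rough Python model of how an `and` policy combines its sub-policies: a trace is kept only when every sub-policy matches. It is an illustrative sketch, not the tail sampling processor's implementation. The `Trace` shape and the function names are hypothetical, the unanchored regex search is an assumption of the sketch, and a seeded random generator stands in for whatever the processor uses internally.

```python
import random
import re
from dataclasses import dataclass, field


@dataclass
class Trace:
    # Hypothetical, simplified trace: one flat attribute map per span.
    spans: list = field(default_factory=list)


def string_attribute(trace, key, patterns):
    """Match if any span has `key` with a value matching one of `patterns`."""
    return any(
        key in span and any(re.search(p, span[key]) for p in patterns)
        for span in trace.spans
    )


def probabilistic(rng, sampling_percentage):
    """Keep roughly `sampling_percentage` percent of traces (0 to 100 scale)."""
    return rng.random() * 100 < sampling_percentage


def team_a_probe(trace, rng):
    """Model of the `and` policy: every sub-policy must match."""
    return (
        string_attribute(trace, "service.name", ["service-1", "service-2", "service-3"])
        and string_attribute(trace, "http.route", ["/live", "/ready"])
        and probabilistic(rng, 0.1)
    )


# Usage: probe traces from service-1 are kept at about 0.1 percent.
rng = random.Random(7)
probe = Trace(spans=[{"service.name": "service-1", "http.route": "/live"}])
other = Trace(spans=[{"service.name": "service-9", "http.route": "/live"}])
kept = sum(team_a_probe(probe, rng) for _ in range(100_000))
print(f"probe traces kept: {kept} of 100000")
print("other service matched:", team_a_probe(other, rng))
```

Note that the decision is made once per trace, so every span in a kept trace survives and every span in a dropped trace is lost.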

@amccague
Author

@mkromkamp

As I understand it, this would sample the whole (sub)trace for that route. Per my original request, I want to probabilistically sample one particular span and keep the rest of the spans for that route.
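
The span-level behaviour being asked for can be shown with a small Python sketch. It is illustrative only: the span dictionaries, the `noisy-leaf` name, and the `thin_noisy_spans` helper are hypothetical, and a seeded random generator stands in for whatever source of randomness a real pipeline would use.

```python
import random


def thin_noisy_spans(spans, noisy_name, keep_percent, rng):
    """Keep every span except those named `noisy_name`, of which only
    about `keep_percent` percent are kept."""
    kept = []
    for span in spans:
        if span["name"] != noisy_name:
            kept.append(span)  # unrelated spans always pass through
        elif rng.random() * 100 < keep_percent:
            kept.append(span)  # a small share of noisy spans passes
    return kept


# Usage: one ordinary span plus 10,000 noisy leaf spans, thinned to about 1%.
rng = random.Random(1)
spans = [{"name": "GET /orders"}] + [{"name": "noisy-leaf"} for _ in range(10_000)]
result = thin_noisy_spans(spans, "noisy-leaf", 1.0, rng)
noisy_kept = sum(s["name"] == "noisy-leaf" for s in result)
print(f"{noisy_kept} noisy spans kept out of 10000")
```

Unlike a trace-level decision, the other spans on the route are untouched here, and this is only safe because the thinned spans are leaves with no children to orphan.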

@mkromkamp

> @mkromkamp
>
> As I understand it, this would sample the whole (sub)trace for that route. Per my original request, I want to probabilistically sample one particular span and keep the rest of the spans for that route.

I see, that wasn't clear to me from the original post. That is indeed not possible with this approach 🙂

@TylerHelmuth TylerHelmuth added processor/tailsampling Tail sampling processor processor/probabilisticsampler Probabilistic Sampler processor question Further information is requested and removed pkg/ottl needs triage New item requiring triage labels Jan 7, 2025
Contributor

github-actions bot commented Jan 7, 2025

Pinging code owners for processor/tailsampling: @jpkrohling. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.

Contributor

github-actions bot commented Jan 7, 2025

Pinging code owners for processor/probabilisticsampler: @jpkrohling @jmacd. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.


3 participants