Ignored transactions and consistent probability sampling #3307

Open
PeterF778 opened this issue Mar 8, 2023 · 1 comment
Labels
spec:trace Related to the specification/trace directory

PeterF778 commented Mar 8, 2023

Let's consider the following hypothetical example.

An OpenTelemetry user deploys a distributed application with tiers A, B, and C. Customer requests hit tier A, which calls tier B, which in turn calls tier C. The OTel user wants to trace only some of the transactions; the others are to be ignored (for whatever reason). A custom sampler at tier A distinguishes ignored from traced transactions based on the URL of the incoming request.
When tiers B and C use ParentBasedSamplers with the default configuration, things work as expected, but the user may run into performance issues because the sampling decisions made by tier A propagate to tiers B and C without any adjustment (the fan-out problem).
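
To make the setup concrete, here is a minimal sketch in plain Python. It is a toy model and not the OpenTelemetry SDK API: `IGNORED_PREFIXES`, `tier_a_sampler`, `ParentContext`, and `parent_based` are hypothetical names, and the URL prefixes are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical URL prefixes that the user wants to ignore entirely.
IGNORED_PREFIXES = ("/health", "/internal")


@dataclass(frozen=True)
class ParentContext:
    """Toy stand-in for the propagated span context."""
    sampled: bool  # the "sampled" trace flag received from the caller


def tier_a_sampler(url: str) -> bool:
    """Custom root sampler at tier A: trace everything except ignored URLs."""
    return not url.startswith(IGNORED_PREFIXES)


def parent_based(parent: Optional[ParentContext], root_decision: bool) -> bool:
    """Default parent-based behavior as modeled here: follow the parent's flag."""
    if parent is None:
        return root_decision  # root span: delegate to the root sampler
    return parent.sampled     # child span: copy the parent's decision


# Tier A decides, tiers B and C simply follow.
decision_a = tier_a_sampler("/health/ping")
decision_b = parent_based(ParentContext(sampled=decision_a), root_decision=True)
decision_c = parent_based(ParentContext(sampled=decision_b), root_decision=True)
```

In this model an ignored request stays ignored at every tier, but B and C also have no way to sample at a lower rate than A.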

Employing Consistent Probability Samplers may fix the performance issue because the sampling rates at tiers B and C can now differ from the rate at tier A. However, calls to tiers B and C may now be traced even if they belong to transactions that the user wanted to ignore. This is because Consistent Probability Samplers ignore the sampled flag received from the parent. This behavior may not be what the OTel user wants: for the ignored transactions, the traces will always be incomplete, missing the root span.
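
The behavior described above can be sketched as follows. This is a simplified model, not the SDK implementation: it assumes the convention that a p-value `p` means a sampling probability of `2**-p`, that r-values lie in the range 0 to 62, and that a span is sampled when `p <= r`. The function names and the dictionary used for the TraceState are hypothetical.

```python
import random
from typing import Dict, Optional, Tuple


def generate_r_value(rng: random.Random) -> int:
    """Count leading zero bits in 62 random bits, so P(r >= k) == 2**-k."""
    return 62 - rng.getrandbits(62).bit_length()


def consistent_should_sample(
    p_value: int,
    parent_sampled: bool,
    tracestate: Dict[str, int],
    rng: Optional[random.Random] = None,
) -> Tuple[bool, Dict[str, int]]:
    """Toy consistent probability sampler for a non-root span."""
    rng = rng or random.Random()
    r_value = tracestate.get("r")
    if r_value is None:
        # Missing r-value: a new one is generated, even for a child span.
        r_value = generate_r_value(rng)
        tracestate = {**tracestate, "r": r_value}
    # Note that parent_sampled is never consulted.
    return p_value <= r_value, tracestate


# Tier A ignored the request (flag not set, no r-value), yet tier B,
# configured with p=0 (probability 1), samples it anyway.
sampled_at_b, state_at_b = consistent_should_sample(
    p_value=0, parent_sampled=False, tracestate={}
)
```

With `p_value=0` the decision at tier B is always to sample, which produces exactly the rootless trace described above.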

But is it a valid use case?

If it is, a possible fix could be to remove the requirement that a Consistent Probability Sampler always generates a new r-value for non-root spans (think tiers B and C) when the r-value is missing. It should do so only when the parent's sampled flag is set. If the flag is false and there is no r-value, it should simply decide not to sample and leave the TraceState as is.
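
A rough sketch of this proposed rule, under the same simplifying assumptions about p-values and r-values (sample when `p <= r`, r-values in the range 0 to 62). The function name `proposed_should_sample` and its return shape are hypothetical and only illustrate the proposal.

```python
import random
from typing import Optional, Tuple


def proposed_should_sample(
    p_value: int,
    parent_sampled: bool,
    r_value: Optional[int],
    rng: Optional[random.Random] = None,
) -> Tuple[bool, Optional[int]]:
    """Return (decision, r-value to propagate) for a non-root span."""
    if r_value is None:
        if not parent_sampled:
            # Proposed change: do not sample, leave the TraceState untouched.
            return False, None
        # The parent was sampled but sent no r-value: generate one as before.
        rng = rng or random.Random()
        r_value = 62 - rng.getrandbits(62).bit_length()
    # An r-value is available: the usual consistent decision applies.
    return p_value <= r_value, r_value
```

Under this rule, `proposed_should_sample(0, False, None)` returns `(False, None)`, so a request ignored at tier A stays ignored downstream regardless of the local sampling probability.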

@PeterF778
Author

After giving the issue some thought, I now believe this is not a valid use case. By that I do not mean that it should be forbidden to completely ignore a class of requests from an observability perspective, but it should be discouraged, and my proposed "fix" would make things even worse.
A downstream service (C), which might have a different owner than the root service A, needs to be able to trace or sample all incoming requests if it so desires. Otherwise, it would have no way to calculate basic metrics, such as the incoming request rate.
Another reason not to apply my proposed "fix" is that truly random trace-id bits (see randomness of trace id) open an opportunity to omit the r-value from the trace state. This would force each consistent probability sampler to (consistently) derive the r-value from the trace-id, but it also means that the absence of an r-value could not be used to assume anything about the trace.
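
To illustrate the last point, here is one way a sampler could derive an r-value deterministically from the trace-id. This is purely an assumption for illustration: the choice of the lowest 56 bits as the random part and the leading-zero count are hypothetical, and the comment above does not prescribe any particular derivation.

```python
RANDOM_BITS = 56  # assumption: the lowest 56 bits of the trace-id are random


def r_value_from_trace_id(trace_id_hex: str) -> int:
    """Derive an r-value as the number of leading zeros in the random bits."""
    trace_id = int(trace_id_hex, 16)
    low_bits = trace_id & ((1 << RANDOM_BITS) - 1)
    return RANDOM_BITS - low_bits.bit_length()


# Every tier computes the same value from the same trace-id,
# so no r-value needs to be propagated in the trace state.
r_at_b = r_value_from_trace_id("4bf92f3577b34da6a3ce929d0e0e4736")
r_at_c = r_value_from_trace_id("4bf92f3577b34da6a3ce929d0e0e4736")
```

Because every trace-id yields an r-value in such a scheme, a missing r-value in the trace state carries no information about whether an upstream tier meant to ignore the request.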
