docs: Document the ability to use prefix in dynamic sampler FieldList #1396

Merged
merged 1 commit into from
Oct 23, 2024
8 changes: 8 additions & 0 deletions config/metadata/rulesMeta.yaml
@@ -102,6 +102,14 @@ groups:
all endpoints under normal traffic and call out when there is
failing traffic to any endpoint.

As of Refinery 2.8.0, the `root.` prefix can be used to limit the
field value to that of the root span. For example,
`root.http.response.status_code` will only consider the
`http.response.status_code` field from the root span rather than a
combination of all the spans in the trace. This is useful when you
want to sample based on the root span's properties rather than the
entire trace, and helps to reduce the cardinality of the sampler key.

In contrast, for example, consider as a bad set of fields: a
combination of `HTTP endpoint`, `status code`, and `pod id`, since it
would result in keys that are all unique, and therefore result in
15 changes: 15 additions & 0 deletions refinery_rules.md
@@ -97,6 +97,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
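
To make the effect of the `root.` prefix concrete, here is a minimal Python sketch of how a sampler key could be assembled from a field list. It is a simplified model for illustration only, not Refinery's actual implementation: the span representation (dicts with an `is_root` flag), the function name `build_key`, and the key formatting are all assumptions.

```python
ROOT_PREFIX = "root."


def build_key(spans, field_list):
    """Build an illustrative sampler key from a trace.

    spans: list of dicts, each with a "fields" dict and an "is_root" bool
    (a hypothetical representation, not Refinery's data model).
    """
    parts = []
    for field in field_list:
        if field.startswith(ROOT_PREFIX):
            # Prefixed field: only the root span contributes a value.
            name = field[len(ROOT_PREFIX):]
            candidates = [s for s in spans if s["is_root"]]
        else:
            # Unprefixed field: every span in the trace may contribute.
            name = field
            candidates = spans
        # Collect the unique values for this field, sorted for a stable key.
        values = sorted(
            {str(s["fields"][name]) for s in candidates if name in s["fields"]}
        )
        parts.append(",".join(values))
    return "|".join(parts)


trace = [
    {"is_root": True, "fields": {"http.response.status_code": 200}},
    {"is_root": False, "fields": {"http.response.status_code": 500}},
    {"is_root": False, "fields": {"http.response.status_code": 404}},
]

print(build_key(trace, ["http.response.status_code"]))       # 200,404,500
print(build_key(trace, ["root.http.response.status_code"]))  # 200
```

In this model the unprefixed field folds every status code seen in the trace into the key, while the prefixed field contributes only the root span's value, which is why the prefix tends to lower key cardinality.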
@@ -199,6 +202,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
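
The cardinality trade-off in this passage can be checked with a small experiment. The sketch below counts how many distinct keys different field choices produce over synthetic traffic. The data is invented for illustration (including the extreme assumption that every trace lands on a different pod), and the field names `endpoint`, `method`, `status`, and `pod_id` are placeholders rather than real Refinery field names.

```python
from itertools import product


def distinct_keys(traces, fields):
    """Count the distinct keys produced by the chosen fields."""
    return len({tuple(t[f] for f in fields) for t in traces})


# Synthetic traffic: 3 endpoints x 2 methods x 2 status codes, 5 traces each,
# with every trace handled by a different pod (hypothetical data).
combos = list(product(["/a", "/b", "/c"], ["GET", "POST"], [200, 500])) * 5
traces = [
    {"endpoint": e, "method": m, "status": s, "pod_id": f"pod-{i}"}
    for i, (e, m, s) in enumerate(combos)
]

print(len(traces))                                              # 60
print(distinct_keys(traces, ["endpoint", "method", "status"]))  # 12
print(distinct_keys(traces, ["endpoint", "status", "pod_id"]))  # 60
print(distinct_keys(traces, ["endpoint"]))                      # 3
```

With the pod identifier included, every trace gets its own key, so nothing can be grouped and everything is kept. With only the endpoint, the `500` responses share a key with the `200` responses and cannot be singled out.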
@@ -313,6 +319,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -398,6 +407,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -608,6 +620,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
17 changes: 16 additions & 1 deletion rules.md
@@ -1,7 +1,7 @@
# Honeycomb Refinery Rules Documentation

This is the documentation for the rules configuration for Honeycomb's Refinery.
It was automatically generated on 2024-10-11 at 16:33:02 UTC.
It was automatically generated on 2024-10-22 at 22:51:47 UTC.

## The Rules file

@@ -118,6 +118,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -223,6 +226,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -340,6 +346,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -428,6 +437,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -651,6 +663,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
5 changes: 3 additions & 2 deletions rules_complete.yaml
@@ -34,7 +34,7 @@ Samplers:
ClearFrequency: 1m0s
FieldList:
- request.method
- http.target
- root.http.target
- response.status_code
UseTraceLength: true
env2:
@@ -47,7 +47,7 @@ Samplers:
BurstDetectionDelay: 3
FieldList:
- request.method
- http.target
- root.http.target
- response.status_code
UseTraceLength: true
env3:
@@ -134,3 +134,4 @@ Samplers:
GoalThroughputPerSec: 100
FieldList:
- request.method
- root.http.target
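
Because the documentation in this PR ties the `root.` prefix to Refinery 2.8.0, a rules file that adopts it may be worth checking against the version actually deployed. The following is a hypothetical pre-deploy check, not part of Refinery or its tooling: the helper names and the idea of passing the version as a string are assumptions, and the field list is copied from the example above.

```python
ROOT_PREFIX = "root."
# Minimum version taken from the documentation text added in this PR.
ROOT_PREFIX_MIN_VERSION = (2, 8, 0)


def parse_version(text):
    """Turn a dotted version string such as "2.8.0" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def unsupported_fields(field_list, refinery_version):
    """Return the root.-prefixed fields that the given version predates."""
    if parse_version(refinery_version) >= ROOT_PREFIX_MIN_VERSION:
        return []
    return [f for f in field_list if f.startswith(ROOT_PREFIX)]


field_list = ["request.method", "root.http.target", "response.status_code"]

print(unsupported_fields(field_list, "2.7.1"))  # ['root.http.target']
print(unsupported_fields(field_list, "2.8.0"))  # []
```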