Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddtrace/tracer: adding support for single span matching #1357

Merged
merged 35 commits into from
Aug 2, 2022
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
7044c89
pkg/tracer: added glob matcher for span sampling
dianashevchenko Jun 27, 2022
d8fa045
pkg/tracer: added single span sampling rules & updated the matcher
dianashevchenko Jun 27, 2022
1f9b759
pkg/tracer: added single span sampler
dianashevchenko Jun 29, 2022
d056ab1
pkg/tracer: added single span sampling at trace finish
dianashevchenko Jun 29, 2022
bd20964
pkg/tracer: filter out single spans to send to the agent if the decis…
dianashevchenko Jun 29, 2022
408f38d
updated according to comments
dianashevchenko Jun 30, 2022
f3de96c
removed outdated test
dianashevchenko Jun 30, 2022
a975fde
refactored according to comments Moved traceSampler and singleSpanSam…
dianashevchenko Jul 5, 2022
c465cae
fixed a test - wrong sampler method called
dianashevchenko Jul 5, 2022
db105b7
reworded log message
dianashevchenko Jul 5, 2022
95ee83b
Merge branch 'main' into shevchenko/single-span-ingestion
dianashevchenko Jul 5, 2022
92506eb
removed outdated constant in a test
dianashevchenko Jul 5, 2022
9939059
updated according to comments
dianashevchenko Jul 6, 2022
f554cda
added type field to marshalling sampling rules
dianashevchenko Jul 6, 2022
d7c149a
fixed comments, error formatting, added glob matcher benchmark
dianashevchenko Jul 7, 2022
4ab5b8c
added test for a rule marshalling, simple benchmark for single span r…
dianashevchenko Jul 8, 2022
11394fe
removed the top_level tag, added another glob match example
dianashevchenko Jul 13, 2022
b595aca
refactored according to comments
dianashevchenko Jul 19, 2022
8a134c5
Merge branch 'main' into shevchenko/single-span-ingestion
dianashevchenko Jul 19, 2022
b4d8d68
Merge branch 'main' into shevchenko/single-span-ingestion
dianashevchenko Jul 19, 2022
1348caa
updated according to comments
dianashevchenko Jul 19, 2022
4ce0635
moved single span sampling to the worker part
dianashevchenko Jul 20, 2022
9f73819
Merge branch 'shevchenko/single-span-ingestion' of https://github.com…
dianashevchenko Jul 20, 2022
f33308b
added docs to sampling rule struct
dianashevchenko Jul 20, 2022
e4e6f3c
added `partial` tag to dropped_p0_traces metric
dianashevchenko Jul 21, 2022
fbe94ea
fixed flaky sampling test
dianashevchenko Jul 21, 2022
2a625d5
Merge branch 'main' into shevchenko/single-span-ingestion
dianashevchenko Jul 22, 2022
d295eee
updated doc comments; set sampling priority to 2; do not remove the t…
dianashevchenko Aug 1, 2022
1de6ded
Merge branch 'main' into shevchenko/single-span-ingestion
dianashevchenko Aug 1, 2022
58ff019
changed Sampling Rule marshalling for better readability
dianashevchenko Aug 1, 2022
c0d7e02
Merge branch 'shevchenko/single-span-ingestion' of https://github.com…
dianashevchenko Aug 1, 2022
46b0fe6
fixed regex
dianashevchenko Aug 1, 2022
e8d62c7
refactored according to comments
dianashevchenko Aug 2, 2022
bafbfd9
removed unnecessary time.Sleep
dianashevchenko Aug 2, 2022
6dc3bc7
Merge branch 'main' into shevchenko/single-span-ingestion
dianashevchenko Aug 2, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions ddtrace/ext/tags.go
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,19 @@ const (
// Search & Analytics event.
AnalyticsEvent = "analytics.event"

// SpanSamplingMechanism is the metric key holding a span sampling rule that a span was kept on
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
SpanSamplingMechanism = "_dd.span_sampling.mechanism"

// SingleSpanSamplingMechanism is a constant reserved to indicate that a span was kept on account of a span sampling rule.
SingleSpanSamplingMechanism = 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are any of these constants meant to be used by users? Why are they public API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is good point, movedt to tracer/sampler

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's only resolve the comment once the change is committed please.


// SingleSpanSamplingRuleRate is the metric key containing the configured sampling probability for the span sampling rule.
SingleSpanSamplingRuleRate = "_dd.span_sampling.rule_rate"

// SingleSpanSamplingMPS is the metric key contains the configured limit for the span sampling rule that the span matched.
// // If there is no configured limit, then this tag is omitted.
ajgajg1134 marked this conversation as resolved.
Show resolved Hide resolved
SingleSpanSamplingMPS = "_dd.span_sampling.max_per_second"

// ManualKeep is a tag which specifies that the trace to which this span
// belongs to should be kept when set to true.
ManualKeep = "manual.keep"
Expand Down
4 changes: 2 additions & 2 deletions ddtrace/tracer/log_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -126,8 +126,8 @@ func TestLogSamplingRules(t *testing.T) {

lines := removeAppSec(tp.Lines())
assert.Len(lines, 2)
assert.Contains(lines[0], "WARN: at index 4: ignoring rule {Service: Name: Rate:9.10}: rate is out of [0.0, 1.0] range")
assert.Regexp(`Datadog Tracer v[0-9]+\.[0-9]+\.[0-9]+ WARN: DIAGNOSTICS Error\(s\) parsing DD_TRACE_SAMPLING_RULES: found errors:\n\tat index 1: rate not provided\n\tat index 3: rate not provided$`, lines[1])
assert.Contains(lines[0], "WARN: at index 4: ignoring rule {Service: Name: Rate:9.10 MaxPerSecond:0}: rate is out of [0.0, 1.0] range")
assert.Regexp(`Datadog Tracer v[0-9]+\.[0-9]+\.[0-9]+ WARN: DIAGNOSTICS Error\(s\) parsing sampling rules: found errors:\n\tat index 1: rate not provided\n\tat index 3: rate not provided$`, lines[1])
}

func TestLogAgentReachable(t *testing.T) {
Expand Down
113 changes: 99 additions & 14 deletions ddtrace/tracer/sampler.go
Original file line number Diff line number Diff line change
Expand Up @@ -188,19 +188,62 @@ func newRulesSampler(rules []SamplingRule) *rulesSampler {
}
}

// samplingRulesFromEnv parses sampling rules from the DD_TRACE_SAMPLING_RULES
// environment variable.
// samplingRulesFromEnv parses sampling rules from the DD_TRACE_SAMPLING_RULES environment variable
// and combines them with spanSamplingRules
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
func samplingRulesFromEnv() ([]SamplingRule, error) {
rulesFromEnv := os.Getenv("DD_TRACE_SAMPLING_RULES")
if rulesFromEnv == "" {
errs := []string{}
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
spanRules, err := spanSamplingRulesFromEnv()
if err != nil {
errs = append(errs, err.Error())
}
traceRules, err := processSamplingRulesFromEnv([]byte(os.Getenv("DD_TRACE_SAMPLING_RULES")), TraceSamplingRuleType)
if err != nil {
errs = append(errs, err.Error())
}

rules := append(spanRules, traceRules...)
if len(errs) != 0 {
return rules, fmt.Errorf("%s", strings.Join(errs, "\n\t"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you look into what this error would look like?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if this should be separated by ", " instead, as well.

}
return rules, nil
}

// spanSamplingRulesFromEnv parses sampling rules from the DD_SPAN_SAMPLING_RULES and
// DD_SPAN_SAMPLING_RULES_FILE environment variables.
func spanSamplingRulesFromEnv() ([]SamplingRule, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would absorb this function into the one above. It's a bit too much. Do we need both? They aren't really used anywhere and they don't necessarily make figuring this out easier.

Copy link
Contributor Author

@dianashevchenko dianashevchenko Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted too but didn't like how it looked. I guess I can try again 🙂
Nvm, got it!

errs := []string{}
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
rules, err := processSamplingRulesFromEnv([]byte(os.Getenv("DD_SPAN_SAMPLING_RULES")), SingleSpanSamplingType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do the same check here as we do for trace sampling rules? e.g.

	rulesFromEnv := os.Getenv("DD_SPAN_SAMPLING_RULES")
	if rulesFromEnv == "" {
		return nil, nil
	}
	return processSamplingRulesFromEnv([]byte(rulesFromEnv), SingleSpanSamplingType)

if err != nil {
errs = append(errs, err.Error())
}
if len(rules) != 0 {
if os.Getenv("DD_SPAN_SAMPLING_RULES_FILE") != "" {
log.Warn("DIAGNOSTICS Error(s): DD_SPAN_SAMPLING_RULES is available and will take precedence over DD_SPAN_SAMPLING_RULES_FILE")
}
return rules, err
}
rulesFile := os.Getenv("DD_SPAN_SAMPLING_RULES_FILE")
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
if rulesFile == "" {
return rules, err
}
rulesFromEnvFile, err := os.ReadFile(rulesFile)
if err != nil {
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
log.Warn("Couldn't read file from DD_SPAN_SAMPLING_RULES_FILE")
}
return processSamplingRulesFromEnv(rulesFromEnvFile, SingleSpanSamplingType)
}

func processSamplingRulesFromEnv(b []byte, ruleType SamplingRuleType) ([]SamplingRule, error) {
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
if len(b) == 0 {
return nil, nil
}
jsonRules := []struct {
Service string `json:"service"`
Name string `json:"name"`
Rate json.Number `json:"sample_rate"`
}{}
err := json.Unmarshal([]byte(rulesFromEnv), &jsonRules)
var jsonRules []struct {
Service string `json:"service"`
Name string `json:"name"`
Rate json.Number `json:"sample_rate"`
MaxPerSecond float64 `json:"max_per_second"`
}
err := json.Unmarshal(b, &jsonRules)
if err != nil {
return nil, fmt.Errorf("error unmarshalling JSON: %v", err)
}
Expand All @@ -220,6 +263,32 @@ func samplingRulesFromEnv() ([]SamplingRule, error) {
log.Warn("at index %d: ignoring rule %+v: rate is out of [0.0, 1.0] range", i, v)
continue
}
if ruleType == SingleSpanSamplingType {
if v.Service == "" {
v.Service = "*"
}
srvGlob, err := globMatch(v.Service)
if err != nil {
log.Warn("at index %d: ignoring rule %+v: service name regex pattern can't be compiled", i, v)
continue
}
if v.Name == "" {
v.Name = "*"
}
opGlob, err := globMatch(v.Name)
if err != nil {
log.Warn("at index %d: ignoring rule %+v: operation name regex pattern can't be compiled", i, v)
continue
}
rules = append(rules, SamplingRule{
Service: srvGlob,
Name: opGlob,
Rate: rate,
MaxPerSecond: v.MaxPerSecond,
Type: SingleSpanSamplingType,
})
continue
}
switch {
case v.Service != "" && v.Name != "":
rules = append(rules, NameServiceRule(v.Name, v.Service, rate))
Expand All @@ -230,7 +299,7 @@ func samplingRulesFromEnv() ([]SamplingRule, error) {
}
}
if len(errs) != 0 {
return rules, fmt.Errorf("found errors:\n\t%s", strings.Join(errs, "\n\t"))
return rules, fmt.Errorf("\n\t%s", strings.Join(errs, "\n\t"))
}
return rules, nil
}
Expand Down Expand Up @@ -300,6 +369,13 @@ func (rs *rulesSampler) apply(span *span) bool {
if rule.match(span) {
matched = true
rate = rule.Rate
if rule.Type == SingleSpanSamplingType {
span.setMetric(ext.SpanSamplingMechanism, ext.SingleSpanSamplingMechanism)
span.setMetric(ext.SingleSpanSamplingRuleRate, rule.Rate)
if rule.MaxPerSecond != 0 {
span.setMetric(ext.SingleSpanSamplingMPS, rule.MaxPerSecond)
}
}
break
}
}
Expand Down Expand Up @@ -338,13 +414,22 @@ func (rs *rulesSampler) limit() (float64, bool) {
return math.NaN(), false
}

type SamplingRuleType int
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved

const (
TraceSamplingRuleType = iota
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
SingleSpanSamplingType
dianashevchenko marked this conversation as resolved.
Show resolved Hide resolved
)

// SamplingRule is used for applying sampling rates to spans that match
// the service name, operation name or both.
// For basic usage, consider using the helper functions ServiceRule, NameRule, etc.
type SamplingRule struct {
Service *regexp.Regexp
Name *regexp.Regexp
Rate float64
Service *regexp.Regexp
Name *regexp.Regexp
Rate float64
MaxPerSecond float64
Type SamplingRuleType

exactService string
exactName string
Expand Down
Loading