-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[apm] peer.service aggregation for trace stats, option to compute stats based on span.kind #16103
Conversation
pkg/trace/stats/concentrator_test.go
Outdated
sp.Meta = map[string]string{"peer.service": "remote-service"} | ||
spans := []*pb.Span{sp} | ||
traceutil.ComputeTopLevel(spans) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we update concentrator.go
to compute stats for all peer.service
spans, regardless of whether it is _top_level
or _dd.measured
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, although this implies that the tracer needs to explicitly set _dd.measured:1
for any span with peer.service
set if they want stats. Should we be collecting stats for all peer.service
spans? Is there danger of us computing too many stats if we do this?
Co-authored-by: Diana Shevchenko <40775148+dianashevchenko@users.noreply.github.com>
Co-authored-by: Diana Shevchenko <40775148+dianashevchenko@users.noreply.github.com>
….com:DataDog/datadog-agent into apm/peer-service-aggregation-for-trace-stats
…span.kind; add more tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, two copy suggestions!
releasenotes/notes/add-peer-service-for-trace-stats-225f1b90c9627c18.yaml
Outdated
Show resolved
Hide resolved
releasenotes/notes/add-peer-service-for-trace-stats-225f1b90c9627c18.yaml
Outdated
Show resolved
Hide resolved
Co-authored-by: Peter Kalmakis <peter.kalmakis@datadoghq.com>
…627c18.yaml Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
…627c18.yaml Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
Co-authored-by: Ahmed Mezghani <38987709+ahmed-mez@users.noreply.github.com>
Co-authored-by: Ahmed Mezghani <38987709+ahmed-mez@users.noreply.github.com>
Co-authored-by: Peter Kalmakis <peter.kalmakis@datadoghq.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let's just document this in pkg/config/config_template.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ecdeaa4 looks good from OTel
What does this PR do?
Adds
peer.service
to stats payloads emitted by the trace agent.This PR also adds the ability for the agent to check a span's
span.kind
field to determine if the span is eligible for stats. This is a somewhat complementary but arguably necessary check to ensure that spans withpeer.service
(client/producer spans) get picked up for stats. Additionally, we'd want to ensure that the ingress spans (ones with kind equal to server/consumer) also are picked up for trace stats computation.Motivation
Feature enhancement:
peer.service
lets us compute edge-specific statistics between APM-instrumented services and remote services.Additional Notes
N/A
Possible Drawbacks / Trade-offs
Additional aggregation in the agent.
Describe how to test/QA your changes
apm_config.peer_service_aggregation
andapm_config.compute_stats_by_span_kind
.peer.service
set in the meta.PeerService
set in the grouped stats.PeerService
set in the grouped stats.Reviewer's Checklist
Triage
milestone is set.major_change
label if your change either has a major impact on the code base, is impacting multiple teams or is changing important well-established internals of the Agent. This label will be use during QA to make sure each team pay extra attention to the changed behavior. For any customer facing change use a releasenote.changelog/no-changelog
label has been applied.qa/skip-qa
label is not applied.team/..
label has been applied, indicating the team(s) that should QA this change.need-change/operator
andneed-change/helm
labels have been applied.k8s/<min-version>
label, indicating the lowest Kubernetes version compatible with this feature.