
[Tracing] Fix WebRequest bug with existing distributed tracing headers #4770

Merged

Conversation

zacharycmontoya (Collaborator)

Summary of changes

When we pre-emptively add distributed tracing headers in WebRequest.BeginGetRequestStream, WebRequest.BeginGetResponse, and WebRequest.GetRequestStream, we now also add the header collection to a cache to signal to the later integrations that we were the last to update the headers and are responsible for the distributed tracing headers in the WebRequest object.

When we check for the pre-emptively added distributed tracing headers in WebRequest.EndGetResponse, WebRequest.GetResponse, and WebRequest.GetResponseAsync, we now first check that cache to know whether we added the distributed tracing headers to the WebRequest object. If we did, we continue to use the injected trace context; otherwise, we create a new context.
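
A minimal sketch of that logic, assuming a small HeadersInjectedCache helper along the lines of the one described under "Implementation details" below; the class and method names here are illustrative, not the tracer's actual API:

using System.Net;

internal static class WebRequestIntegrationSketch
{
    // Called from the BeginGetRequestStream / GetRequestStream / BeginGetResponse integrations
    // after the trace context has been injected, while the headers are still writable.
    public static void OnHeadersInjected(HttpWebRequest request)
    {
        HeadersInjectedCache.SetInjectedHeaders(request.Headers);   // hypothetical helper name
    }

    // Called from the GetResponse / EndGetResponse / GetResponseAsync integrations:
    // only reuse the trace context found in the headers if we were the ones who wrote it.
    public static bool CanReuseInjectedContext(HttpWebRequest request)
    {
        return HeadersInjectedCache.ContainsInjectedHeaders(request.Headers);   // hypothetical helper name
    }
}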

Reason for change

In our GetResponse integrations, we check the WebRequest headers to see if we had to pre-emptively add distributed tracing headers before the headers became locked, and if they are present then we use that context for the newly created WebRequest span.

The problem with this, as seen in a user issue, is that some applications may proxy incoming headers to the outbound HTTP requests. Before this change, our handling would cause our integrations for WebRequest.EndGetResponse, WebRequest.GetResponse, and WebRequest.GetResponseAsync to re-use the trace context in the headers for the WebRequest span, which results in the server span and the outgoing HTTP span having the same trace-id and span-id. After the fix, we only re-use the trace context when we know that we had last updated the headers.

Implementation details

For our cache, we use (surprise!) a ConditionalWeakTable to keep track of request headers that we injected a trace context into. We use this data structure because it will remove stale entries when the keys are no longer referenced. In the GetResponse integrations we only use the trace context from the request headers if our automatic instrumentation had injected the headers.
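
As an illustration, here is a minimal sketch of such a cache. The class name, the sentinel field, and the ConditionalWeakTable<WebHeaderCollection, object> field mirror the diff excerpt quoted later in the review thread; the two method names are assumptions made for this sketch:

using System.Net;
using System.Runtime.CompilerServices;

internal static class HeadersInjectedCache
{
    // Sentinel stored for every header collection we injected into; only its presence matters.
    private static readonly object InjectedValue = new();

    // Keys are held weakly, so entries are dropped once the WebRequest and its headers are collected.
    private static readonly ConditionalWeakTable<WebHeaderCollection, object> Cache = new();

    // Record that the automatic instrumentation wrote the distributed tracing headers.
    // GetValue adds the sentinel if the key is not already present (idempotent add).
    public static void SetInjectedHeaders(WebHeaderCollection headers)
        => Cache.GetValue(headers, _ => InjectedValue);

    // True if the automatic instrumentation injected the trace context into these headers.
    public static bool ContainsInjectedHeaders(WebHeaderCollection headers)
        => Cache.TryGetValue(headers, out _);
}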

Test coverage

Update the WebRequest test application to manually add distributed tracing headers, and rewrite the integration tests to collect all spans from the application and assert that there are no duplicate trace-id/span-id combinations. Before the product change, we would see duplicates from the GetResponse and GetResponseAsync calls.

Other details

N/A

@zacharycmontoya requested a review from a team as a code owner on October 27, 2023 00:03
agent.SpanFilters.Add(s => s.Type == SpanTypes.Http);
var spans = agent.WaitForSpans(expectedSpanCount).OrderBy(s => s.Start);
var allSpans = agent.WaitForSpans(expectedAllSpansCount).OrderBy(s => s.Start);
allSpans.Should().OnlyHaveUniqueItems(s => new { s.SpanId, s.TraceId });
@zacharycmontoya (Collaborator, Author) commented Oct 27, 2023

Before the product change (and after we started collecting all spans from the application), we would see this error message from the test. Notice how the parent span and the WebRequest spans have the same SpanId.


Expected allSpans to only have unique items on new <>f__AnonymousType11`2(SpanId == s.SpanId, TraceId == s.TraceId), but items
{TraceId: 3286320956883792616, SpanId: 7849642159726264990, Name: GetResponseWithDistributedTracingHeaders, Resource: GetResponseWithDistributedTracingHeaders, Service: Samples.WebRequest,
 TraceId: 3286320956883792616, SpanId: 7849642159726264990, Name: http.client.request, Resource: GET localhost:49189/?/, Service: Samples.WebRequest,
 TraceId: 3286320956883792616, SpanId: 12757857473955457543, Name: GetResponseAsyncWithDistributedTracingHeaders, Resource: GetResponseAsyncWithDistributedTracingHeaders, Service: Samples.WebRequest,
 TraceId: 3286320956883792616, SpanId: 12757857473955457543, Name: http.client.request, Resource: GET localhost:49189/?/, Service: Samples.WebRequest}
are not unique.

datadog-ddstaging (bot) commented Oct 27, 2023

Datadog Report

Branch report: zach/bug-fix/web-request-propagation-headers
Commit report: 6fe933f

❄️ dd-trace-dotnet: 0 Failed, 1 New Flaky, 298129 Passed, 942 Skipped, 36m 41.55s Wall Time

New Flaky Tests (1)

  • SubmitsTraces - Datadog.Trace.ClrProfiler.IntegrationTests.KafkaTests - Last Failure

     Expected allData to contain a single item matching x.IsRequestType("app-closing"), but no such item was found.
     Expected latestIntegrations not to be empty.
     Expected varintegration=latestIntegrations {empty} to contain key "Kafka".
     Expected integration.Enabled to be True because should only be enabled if we generate a span, but found False.
     Expected True because should only be auto-enabled if available, but found <null>.
    

andrewlock (Member) commented Oct 27, 2023

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program and are intended to measure one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch test with a statistical significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4770) - mean (72ms)  : 62, 83
     .   : milestone, 72,
    master - mean (71ms)  : 63, 78
     .   : milestone, 71,

    section CallTarget+Inlining+NGEN
    This PR (4770) - mean (995ms)  : 976, 1014
     .   : milestone, 995,
    master - mean (991ms)  : 970, 1012
     .   : milestone, 991,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4770) - mean (106ms)  : 101, 110
     .   : milestone, 106,
    master - mean (106ms)  : 101, 110
     .   : milestone, 106,

    section CallTarget+Inlining+NGEN
    This PR (4770) - mean (683ms)  : 665, 701
     .   : milestone, 683,
    master - mean (683ms)  : 663, 702
     .   : milestone, 683,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4770) - mean (90ms)  : 87, 94
     .   : milestone, 90,
    master - mean (90ms)  : 88, 91
     .   : milestone, 90,

    section CallTarget+Inlining+NGEN
    This PR (4770) - mean (652ms)  : 629, 675
     .   : milestone, 652,
    master - mean (650ms)  : 631, 669
     .   : milestone, 650,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4770) - mean (188ms)  : 185, 191
     .   : milestone, 188,
    master - mean (188ms)  : 185, 190
     .   : milestone, 188,

    section CallTarget+Inlining+NGEN
    This PR (4770) - mean (1,134ms)  : 1111, 1157
     .   : milestone, 1134,
    master - mean (1,137ms)  : 1115, 1160
     .   : milestone, 1137,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4770) - mean (272ms)  : 267, 276
     .   : milestone, 272,
    master - mean (271ms)  : 267, 275
     .   : milestone, 271,

    section CallTarget+Inlining+NGEN
    This PR (4770) - mean (1,099ms)  : 1073, 1126
     .   : milestone, 1099,
    master - mean (1,103ms)  : 1076, 1129
     .   : milestone, 1103,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4770) - mean (263ms)  : 260, 266
     .   : milestone, 263,
    master - mean (262ms)  : 258, 267
     .   : milestone, 262,

    section CallTarget+Inlining+NGEN
    This PR (4770) - mean (1,066ms)  : 1043, 1088
     .   : milestone, 1066,
    master - mean (1,068ms)  : 1044, 1091
     .   : milestone, 1068,


andrewlock (Member) commented Oct 27, 2023

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where throughput results for the PR are worse than the latest master (a 5% drop or greater) are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4770) (10.861M)   : 0, 10860898
    master (10.293M)   : 0, 10292812
    benchmarks/2.38.0 (11.867M)   : 0, 11866956
    benchmarks/2.9.0 (10.958M)   : 0, 10957550

    section Automatic
    This PR (4770) (7.846M)   : 0, 7845545
    master (7.626M)   : 0, 7626259
    benchmarks/2.38.0 (8.176M)   : 0, 8175636
    benchmarks/2.9.0 (7.908M)   : 0, 7907971

    section Trace stats
    This PR (4770) (8.038M)   : 0, 8037599
    master (7.981M)   : 0, 7980764
    benchmarks/2.38.0 (8.450M)   : 0, 8450370

    section Manual
    This PR (4770) (9.620M)   : 0, 9619892
    master (9.696M)   : 0, 9696455
    benchmarks/2.38.0 (10.334M)   : 0, 10334368

    section Manual + Automatic
    This PR (4770) (7.243M)   : 0, 7243143
    master (7.342M)   : 0, 7342321
    benchmarks/2.38.0 (7.750M)   : 0, 7750484

    section Version Conflict
    This PR (4770) (6.564M)   : 0, 6563856
    master (6.511M)   : 0, 6510524
    benchmarks/2.38.0 (7.137M)   : 0, 7136691

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4770) (9.515M)   : 0, 9515163
    master (9.373M)   : 0, 9372820
    benchmarks/2.38.0 (9.548M)   : 0, 9548121
    benchmarks/2.9.0 (9.589M)   : 0, 9589306

    section Automatic
    This PR (4770) (6.628M)   : 0, 6627651
    master (6.736M)   : 0, 6736494
    benchmarks/2.38.0 (6.747M)   : 0, 6747227

    section Trace stats
    This PR (4770) (7.005M)   : 0, 7005269
    master (6.838M)   : 0, 6838459
    benchmarks/2.38.0 (6.815M)   : 0, 6814846

    section Manual
    This PR (4770) (8.446M)   : 0, 8445701
    master (8.359M)   : 0, 8359027
    benchmarks/2.38.0 (8.263M)   : 0, 8263131

    section Manual + Automatic
    This PR (4770) (6.237M)   : 0, 6236969
    master (6.341M)   : 0, 6341109
    benchmarks/2.38.0 (6.275M)   : 0, 6275411

    section Version Conflict
    This PR (4770) (5.625M)   : 0, 5625412
    master (5.798M)   : 0, 5797802
    benchmarks/2.38.0 (5.670M)   : 0, 5669744

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4770) (9.602M)   : 0, 9601520
    master (9.505M)   : 0, 9504792
    benchmarks/2.38.0 (9.007M)   : 0, 9007066
    benchmarks/2.9.0 (9.595M)   : 0, 9594704

    section Automatic
    This PR (4770) (6.686M)   : 0, 6685823
    master (6.739M)   : 0, 6739142
    benchmarks/2.38.0 (6.423M)   : 0, 6422756
    benchmarks/2.9.0 (7.049M)   : 0, 7048800

    section Trace stats
    This PR (4770) (7.050M)   : 0, 7050477
    master (7.014M)   : 0, 7013989
    benchmarks/2.38.0 (6.721M)   : 0, 6720838

    section Manual
    This PR (4770) (8.427M)   : 0, 8426964
    master (8.381M)   : 0, 8380765
    benchmarks/2.38.0 (7.977M)   : 0, 7977494

    section Manual + Automatic
    This PR (4770) (6.452M)   : 0, 6451609
    master (6.515M)   : 0, 6515344
    benchmarks/2.38.0 (6.349M)   : 0, 6349066

    section Version Conflict
    This PR (4770) (5.949M)   : 0, 5949452
    master (6.043M)   : 0, 6043023
    benchmarks/2.38.0 (5.682M)   : 0, 5681654

gantt
    title Throughput Linux x64 (ASM) (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    master (7.912M)   : 0, 7912308
    benchmarks/2.38.0 (7.578M)   : 0, 7577544
    benchmarks/2.9.0 (7.477M)   : 0, 7476547

    section No attack
    master (2.269M)   : 0, 2269345
    benchmarks/2.38.0 (2.184M)   : 0, 2184279
    benchmarks/2.9.0 (3.151M)   : 0, 3151044

    section Attack
    master (1.782M)   : 0, 1782477
    benchmarks/2.38.0 (1.711M)   : 0, 1710542
    benchmarks/2.9.0 (2.361M)   : 0, 2361022

    section Blocking
    master (3.615M)   : 0, 3614991
    benchmarks/2.38.0 (3.486M)   : 0, 3486124

    section IAST default
    master (7.079M)   : 0, 7078508

    section IAST full
    master (6.347M)   : 0, 6346955

    section Base vuln
    master (0.957M)   : 0, 956568

    section IAST vuln
    master (0.931M)   : 0, 930647


andrewlock (Member) commented Oct 27, 2023

Benchmarks Report 🐌

Benchmarks for #4770 compared to master:

  • 1 benchmark is faster, with geometric mean 1.179
  • 1 benchmark is slower, with geometric mean 1.123
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with a statistical significance level of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartStopWithChild net6.0 8.12μs 44.2ns 272ns 0.021 0.00839 0 7.29 KB
master StartStopWithChild netcoreapp3.1 10μs 53.3ns 287ns 0.024 0.0048 0 7.39 KB
master StartStopWithChild net472 15.7μs 47ns 182ns 1.29 0.335 0.101 7.66 KB
#4770 StartStopWithChild net6.0 8.52μs 44.3ns 221ns 0.0217 0.00869 0 7.29 KB
#4770 StartStopWithChild netcoreapp3.1 10.5μs 55.5ns 299ns 0.0255 0.0102 0 7.38 KB
#4770 StartStopWithChild net472 15.8μs 36.8ns 138ns 1.28 0.331 0.102 7.66 KB
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 479μs 336ns 1.3μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 653μs 272ns 1.05μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces net472 783μs 447ns 1.67μs 0.391 0 0 3.3 KB
#4770 WriteAndFlushEnrichedTraces net6.0 494μs 145ns 521ns 0 0 0 2.7 KB
#4770 WriteAndFlushEnrichedTraces netcoreapp3.1 649μs 191ns 715ns 0 0 0 2.7 KB
#4770 WriteAndFlushEnrichedTraces net472 795μs 378ns 1.41μs 0.396 0 0 3.3 KB
Benchmarks.Trace.Asm.AppSecBodyBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master AllCycleSimpleBody net6.0 44.8μs 20.9ns 81.1ns 0.022 0 0 2.03 KB
master AllCycleSimpleBody netcoreapp3.1 45.4μs 83.7ns 324ns 0.0223 0 0 2.01 KB
master AllCycleSimpleBody net472 47μs 88.6ns 343ns 0.325 0 0 2.08 KB
master AllCycleMoreComplexBody net6.0 228μs 84ns 325ns 0.114 0 0 8.63 KB
master AllCycleMoreComplexBody netcoreapp3.1 233μs 126ns 487ns 0.117 0 0 8.52 KB
master AllCycleMoreComplexBody net472 238μs 100ns 388ns 1.31 0 0 8.7 KB
master ObjectExtractorSimpleBody net6.0 120ns 0.0871ns 0.337ns 0.00397 0 0 280 B
master ObjectExtractorSimpleBody netcoreapp3.1 178ns 0.141ns 0.545ns 0.00379 0 0 272 B
master ObjectExtractorSimpleBody net472 150ns 0.238ns 0.921ns 0.0446 0 0 281 B
master ObjectExtractorMoreComplexBody net6.0 3μs 1.07ns 4.01ns 0.054 0 0 3.88 KB
master ObjectExtractorMoreComplexBody netcoreapp3.1 4.13μs 1.52ns 5.9ns 0.0516 0 0 3.78 KB
master ObjectExtractorMoreComplexBody net472 4.1μs 3.54ns 13.3ns 0.618 0.00627 0 3.89 KB
#4770 AllCycleSimpleBody net6.0 43.6μs 113ns 436ns 0.022 0 0 2.03 KB
#4770 AllCycleSimpleBody netcoreapp3.1 45.7μs 33.3ns 125ns 0.0225 0 0 2.01 KB
#4770 AllCycleSimpleBody net472 46.6μs 45.9ns 165ns 0.308 0 0 2.08 KB
#4770 AllCycleMoreComplexBody net6.0 228μs 75.6ns 273ns 0.114 0 0 8.63 KB
#4770 AllCycleMoreComplexBody netcoreapp3.1 234μs 82.4ns 319ns 0.117 0 0 8.52 KB
#4770 AllCycleMoreComplexBody net472 240μs 83.9ns 314ns 1.31 0 0 8.7 KB
#4770 ObjectExtractorSimpleBody net6.0 119ns 0.111ns 0.416ns 0.00397 0 0 280 B
#4770 ObjectExtractorSimpleBody netcoreapp3.1 178ns 0.0647ns 0.233ns 0.00369 0 0 272 B
#4770 ObjectExtractorSimpleBody net472 146ns 0.15ns 0.58ns 0.0446 0 0 281 B
#4770 ObjectExtractorMoreComplexBody net6.0 3.07μs 2.15ns 8.05ns 0.0539 0 0 3.88 KB
#4770 ObjectExtractorMoreComplexBody netcoreapp3.1 4.12μs 1.43ns 5.53ns 0.0514 0 0 3.78 KB
#4770 ObjectExtractorMoreComplexBody net472 4.17μs 2.48ns 9.59ns 0.618 0.00622 0 3.89 KB
Benchmarks.Trace.Asm.AppSecWafBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunWaf(args=NestedMap (10)) net6.0 12.9μs 2.1ns 7.57ns 0.129 0 0 9.42 KB
master RunWaf(args=NestedMap (10)) netcoreapp3.1 19.6μs 41.5ns 155ns 0.127 0 0 9.42 KB
master RunWaf(args=NestedMap (10)) net472 28.4μs 20.4ns 79ns 1.5 0 0 9.48 KB
master RunWafWithAttack(args=Neste(...)tack) [22]) net6.0 76.4μs 127ns 492ns 0.191 0 0 15.77 KB
master RunWafWithAttack(args=Neste(...)tack) [22]) netcoreapp3.1 85.3μs 83ns 311ns 0.214 0 0 15.72 KB
master RunWafWithAttack(args=Neste(...)tack) [22]) net472 98.3μs 66.2ns 256ns 2.54 0 0 16.04 KB
master RunWaf(args=NestedMap (100)) net6.0 24μs 8.51ns 31.8ns 0.276 0 0 19.66 KB
master RunWaf(args=NestedMap (100)) netcoreapp3.1 39.4μs 24ns 89.8ns 0.275 0 0 20.42 KB
master RunWaf(args=NestedMap (100)) net472 55.7μs 114ns 441ns 3.26 0.0276 0 20.63 KB
master RunWafWithAttack(args=Neste(...)tack) [23]) net6.0 91μs 224ns 869ns 0.359 0 0 26.01 KB
master RunWafWithAttack(args=Neste(...)tack) [23]) netcoreapp3.1 109μs 230ns 860ns 0.329 0 0 26.72 KB
master RunWafWithAttack(args=Neste(...)tack) [23]) net472 134μs 80.7ns 312ns 4.29 0.066 0 27.19 KB
master RunWaf(args=NestedMap (20)) net6.0 24.1μs 5ns 18.7ns 0.264 0 0 19.4 KB
master RunWaf(args=NestedMap (20)) netcoreapp3.1 38.2μs 11.7ns 43.9ns 0.267 0 0 19.84 KB
master RunWaf(args=NestedMap (20)) net472 56.1μs 18.3ns 63.5ns 3.17 0.0562 0 20.04 KB
master RunWafWithAttack(args=Neste(...)tack) [22]) net6.0 90.5μs 205ns 792ns 0.358 0 0 25.74 KB
master RunWafWithAttack(args=Neste(...)tack) [22]) netcoreapp3.1 108μs 113ns 436ns 0.323 0 0 26.14 KB
master RunWafWithAttack(args=Neste(...)tack) [22]) net472 130μs 196ns 732ns 4.17 0.0641 0 26.6 KB
#4770 RunWaf(args=NestedMap (10)) net6.0 12.6μs 4.85ns 18.8ns 0.132 0 0 9.42 KB
#4770 RunWaf(args=NestedMap (10)) netcoreapp3.1 19.5μs 7.05ns 26.4ns 0.126 0 0 9.42 KB
#4770 RunWaf(args=NestedMap (10)) net472 28.6μs 9.79ns 35.3ns 1.5 0 0 9.48 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [22]) net6.0 75.7μs 43.5ns 163ns 0.19 0 0 15.77 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [22]) netcoreapp3.1 85.2μs 207ns 801ns 0.214 0 0 15.72 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [22]) net472 97.6μs 192ns 743ns 2.52 0 0 16.04 KB
#4770 RunWaf(args=NestedMap (100)) net6.0 24μs 9.48ns 35.5ns 0.274 0 0 19.66 KB
#4770 RunWaf(args=NestedMap (100)) netcoreapp3.1 38.2μs 10.3ns 39.8ns 0.268 0 0 20.42 KB
#4770 RunWaf(args=NestedMap (100)) net472 56.8μs 64.7ns 251ns 3.27 0.0282 0 20.63 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [23]) net6.0 91.9μs 45.2ns 175ns 0.363 0 0 26.01 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [23]) netcoreapp3.1 109μs 86.6ns 312ns 0.326 0 0 26.72 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [23]) net472 132μs 87.8ns 317ns 4.31 0.0652 0 27.19 KB
#4770 RunWaf(args=NestedMap (20)) net6.0 23.7μs 8.89ns 32ns 0.273 0 0 19.4 KB
#4770 RunWaf(args=NestedMap (20)) netcoreapp3.1 37.6μs 9.69ns 37.5ns 0.262 0 0 19.84 KB
#4770 RunWaf(args=NestedMap (20)) net472 53.5μs 20.7ns 77.6ns 3.18 0.0535 0 20.04 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [22]) net6.0 88.7μs 53.5ns 207ns 0.36 0 0 25.74 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [22]) netcoreapp3.1 107μs 198ns 767ns 0.324 0 0 26.14 KB
#4770 RunWafWithAttack(args=Neste(...)tack) [22]) net472 130μs 128ns 461ns 4.21 0.0648 0 26.6 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 167μs 165ns 637ns 0.25 0 0 18.04 KB
master SendRequest netcoreapp3.1 189μs 273ns 985ns 0.19 0 0 20.2 KB
master SendRequest net472 0.000326ns 0.000183ns 0.00071ns 0 0 0 0 b
#4770 SendRequest net6.0 169μs 107ns 413ns 0.17 0 0 18.04 KB
#4770 SendRequest netcoreapp3.1 189μs 230ns 892ns 0.189 0 0 20.2 KB
#4770 SendRequest net472 0ns 0ns 0ns 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 534μs 362ns 1.3μs 0.539 0 0 41.54 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 632μs 781ns 3.02μs 0.321 0 0 41.7 KB
master WriteAndFlushEnrichedTraces net472 853μs 3.69μs 14.3μs 8.45 2.53 0.422 53.24 KB
#4770 WriteAndFlushEnrichedTraces net6.0 551μs 993ns 3.85μs 0.561 0 0 41.73 KB
#4770 WriteAndFlushEnrichedTraces netcoreapp3.1 667μs 1.02μs 3.95μs 0.329 0 0 41.74 KB
#4770 WriteAndFlushEnrichedTraces net472 825μs 3.73μs 16.3μs 8.17 2.45 0.408 53.27 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 1.05μs 0.601ns 2.33ns 0.0105 0 0 768 B
master ExecuteNonQuery netcoreapp3.1 1.31μs 0.749ns 2.9ns 0.0106 0 0 768 B
master ExecuteNonQuery net472 1.67μs 3.7ns 14.3ns 0.115 0 0 730 B
#4770 ExecuteNonQuery net6.0 1.02μs 0.468ns 1.81ns 0.0108 0 0 768 B
#4770 ExecuteNonQuery netcoreapp3.1 1.36μs 1.44ns 5.58ns 0.0102 0 0 768 B
#4770 ExecuteNonQuery net472 1.59μs 0.757ns 2.93ns 0.116 0 0 730 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.22μs 0.706ns 2.74ns 0.0128 0 0 936 B
master CallElasticsearch netcoreapp3.1 1.44μs 0.658ns 2.46ns 0.0131 0 0 936 B
master CallElasticsearch net472 2.47μs 0.925ns 3.58ns 0.151 0 0 955 B
master CallElasticsearchAsync net6.0 1.31μs 0.838ns 3.25ns 0.013 0 0 912 B
master CallElasticsearchAsync netcoreapp3.1 1.63μs 0.757ns 2.93ns 0.013 0 0 984 B
master CallElasticsearchAsync net472 2.55μs 0.875ns 3.27ns 0.161 0 0 1.01 KB
#4770 CallElasticsearch net6.0 1.15μs 0.291ns 1.05ns 0.0133 0 0 936 B
#4770 CallElasticsearch netcoreapp3.1 1.5μs 0.488ns 1.83ns 0.012 0 0 936 B
#4770 CallElasticsearch net472 2.35μs 0.681ns 2.64ns 0.151 0 0 955 B
#4770 CallElasticsearchAsync net6.0 1.31μs 1.4ns 5.06ns 0.0124 0 0 912 B
#4770 CallElasticsearchAsync netcoreapp3.1 1.51μs 0.404ns 1.51ns 0.0128 0 0 984 B
#4770 CallElasticsearchAsync net472 2.54μs 0.803ns 3.11ns 0.16 0 0 1.01 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.16μs 1.06ns 3.98ns 0.0128 0 0 912 B
master ExecuteAsync netcoreapp3.1 1.49μs 1.01ns 3.77ns 0.012 0 0 912 B
master ExecuteAsync net472 1.68μs 2.48ns 9.62ns 0.139 0 0 875 B
#4770 ExecuteAsync net6.0 1.11μs 0.794ns 2.97ns 0.0126 0 0 912 B
#4770 ExecuteAsync netcoreapp3.1 1.49μs 1.73ns 6.72ns 0.0127 0 0 912 B
#4770 ExecuteAsync net472 1.67μs 1.35ns 5.22ns 0.139 0 0 875 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 3.73μs 1.27ns 4.75ns 0.026 0 0 1.9 KB
master SendAsync netcoreapp3.1 4.31μs 2.94ns 11.4ns 0.0325 0 0 2.43 KB
master SendAsync net472 7.09μs 15.6ns 60.4ns 0.473 0 0 2.99 KB
#4770 SendAsync net6.0 3.75μs 4.98ns 18.6ns 0.0264 0 0 1.9 KB
#4770 SendAsync netcoreapp3.1 4.47μs 1.44ns 5.19ns 0.0334 0 0 2.43 KB
#4770 SendAsync net472 6.97μs 2.15ns 8.03ns 0.474 0 0 2.99 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 1.36μs 0.796ns 3.08ns 0.0217 0 0 1.57 KB
master EnrichedLog netcoreapp3.1 2μs 0.585ns 2.19ns 0.021 0 0 1.57 KB
master EnrichedLog net472 2.3μs 4.03ns 15.6ns 0.237 0 0 1.49 KB
#4770 EnrichedLog net6.0 1.39μs 0.664ns 2.49ns 0.0223 0 0 1.57 KB
#4770 EnrichedLog netcoreapp3.1 2.05μs 1.03ns 3.84ns 0.0215 0 0 1.57 KB
#4770 EnrichedLog net472 2.32μs 2.07ns 8.01ns 0.237 0 0 1.49 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 112μs 77.2ns 299ns 0.0559 0 0 4.21 KB
master EnrichedLog netcoreapp3.1 117μs 125ns 482ns 0.0582 0 0 4.21 KB
master EnrichedLog net472 147μs 75.7ns 283ns 0.668 0.223 0 4.38 KB
#4770 EnrichedLog net6.0 114μs 106ns 409ns 0 0 0 4.21 KB
#4770 EnrichedLog netcoreapp3.1 118μs 149ns 577ns 0.06 0 0 4.21 KB
#4770 EnrichedLog net472 149μs 156ns 604ns 0.676 0.225 0 4.38 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.84μs 1.03ns 4ns 0.0295 0 0 2.13 KB
master EnrichedLog netcoreapp3.1 4.01μs 1.1ns 4.26ns 0.0282 0 0 2.13 KB
master EnrichedLog net472 4.48μs 0.921ns 3.44ns 0.306 0 0 1.93 KB
#4770 EnrichedLog net6.0 2.9μs 1.13ns 4.37ns 0.0304 0 0 2.13 KB
#4770 EnrichedLog netcoreapp3.1 3.96μs 1.19ns 4.63ns 0.0278 0 0 2.13 KB
#4770 EnrichedLog net472 4.51μs 3.13ns 12.1ns 0.307 0 0 1.93 KB
Benchmarks.Trace.RedisBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #4770

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.RedisBenchmark.SendReceive‑net6.0 1.123 1,247.38 1,400.52

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net6.0 1.25μs 0.731ns 2.83ns 0.0156 0 0 1.1 KB
master SendReceive netcoreapp3.1 1.64μs 1.17ns 4.04ns 0.0147 0 0 1.1 KB
master SendReceive net472 2.01μs 0.967ns 3.75ns 0.177 0 0 1.12 KB
#4770 SendReceive net6.0 1.4μs 0.438ns 1.58ns 0.0154 0 0 1.1 KB
#4770 SendReceive netcoreapp3.1 1.65μs 0.524ns 2.03ns 0.0149 0 0 1.1 KB
#4770 SendReceive net472 1.94μs 1.13ns 4.36ns 0.176 0 0 1.12 KB
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.64μs 1.42ns 5.52ns 0.0211 0 0 1.53 KB
master EnrichedLog netcoreapp3.1 3.6μs 0.896ns 3.35ns 0.0215 0 0 1.58 KB
master EnrichedLog net472 4.01μs 1.6ns 6.21ns 0.311 0 0 1.96 KB
#4770 EnrichedLog net6.0 2.71μs 0.785ns 3.04ns 0.0217 0 0 1.53 KB
#4770 EnrichedLog netcoreapp3.1 3.59μs 1.6ns 6.21ns 0.0216 0 0 1.58 KB
#4770 EnrichedLog net472 4.05μs 0.915ns 3.3ns 0.31 0 0 1.96 KB
Benchmarks.Trace.SpanBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #4770

Benchmark base/diff Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0 1.179 460.37 390.53

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net6.0 460ns 0.344ns 1.33ns 0.00741 0 0 536 B
master StartFinishSpan netcoreapp3.1 577ns 0.276ns 1.07ns 0.00726 0 0 536 B
master StartFinishSpan net472 650ns 0.335ns 1.3ns 0.0853 0 0 538 B
master StartFinishScope net6.0 492ns 0.146ns 0.565ns 0.00925 0 0 656 B
master StartFinishScope netcoreapp3.1 774ns 0.3ns 1.12ns 0.00892 0 0 656 B
master StartFinishScope net472 809ns 0.287ns 1.11ns 0.0978 0 0 618 B
#4770 StartFinishSpan net6.0 391ns 0.121ns 0.47ns 0.00762 0 0 536 B
#4770 StartFinishSpan netcoreapp3.1 606ns 0.158ns 0.592ns 0.00728 0 0 536 B
#4770 StartFinishSpan net472 601ns 0.274ns 0.988ns 0.0853 0 0 538 B
#4770 StartFinishScope net6.0 541ns 0.198ns 0.768ns 0.00913 0 0 656 B
#4770 StartFinishScope netcoreapp3.1 760ns 0.31ns 1.2ns 0.00903 0 0 656 B
#4770 StartFinishScope net472 872ns 0.42ns 1.63ns 0.098 0 0 618 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net6.0 608ns 0.233ns 0.901ns 0.00936 0 0 656 B
master RunOnMethodBegin netcoreapp3.1 756ns 0.504ns 1.95ns 0.00882 0 0 656 B
master RunOnMethodBegin net472 945ns 0.218ns 0.786ns 0.0979 0 0 618 B
#4770 RunOnMethodBegin net6.0 634ns 0.441ns 1.59ns 0.00923 0 0 656 B
#4770 RunOnMethodBegin netcoreapp3.1 818ns 1.44ns 5.56ns 0.0089 0 0 656 B
#4770 RunOnMethodBegin net472 945ns 1.03ns 3.98ns 0.0978 0 0 618 B

@andrewlock (Member) left a comment

I think I'm slightly confused by this:

  • AFAICT, we always overwrite any manually-added spans by injecting our own.
  • In that case, shouldn't it always be safe to use the headers we find?

I'm assuming there's some edge case where we don't inject the headers, but I'm not sure what that is?

Also, this is a "bad practice" by the customer, right? It feels like in an OTel world this would be a great way to get some really weird traces, too?

internal static class HeadersInjectedCache
{
    private static readonly object InjectedValue = new();
    private static readonly ConditionalWeakTable<WebHeaderCollection, object> Cache = new();
Member

I'm starting to think you're a shill for "Big ConditionalWeakTable"🕵️‍♂️

Collaborator Author

It's too late for me, I can't turn back now 😂

zacharycmontoya (Collaborator, Author) commented Oct 27, 2023

I think I'm slightly confused by this:

  • AFAICT, we always overwrite any manually-added spans by injecting our own.
  • In that case, shouldn't it always be safe to use the headers we find?

I'm assuming there's some edge case where we don't inject the headers, but I'm not sure what that is?

Unfortunately, we do not always overwrite manually-added distributed tracing headers. In WebRequest.GetResponse and WebRequest.GetResponseAsync we have to check whether distributed tracing headers already exist, to work around the WebRequest GetRequestStream APIs freezing the headers. If none of the other WebRequest APIs are actually called, then we consume the headers set by the application.

So the error case looks like this:

  • A request enters a server application with distributed tracing headers
    • Headers: x-datadog-trace-id:A,x-datadog-parent-id:B
  • Our automatic instrumentation creates a new ASP.NET span based on incoming distributed tracing headers
    • Span: TraceId:A,SpanId:C
  • The server application creates a new HTTP request and propagates the incoming distributed tracing headers
    • Headers: x-datadog-trace-id:A,x-datadog-parent-id:B
  • The server application calls WebRequest.GetResponse or WebRequest.GetResponseAsync to issue the downstream request, which also triggers our automatic instrumentation to create an HTTP span
    • Span: TraceId:A,SpanId:B

In the end, two spans are reported to Datadog with a matching TraceId and SpanId, causing ✨ problems ✨
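
For illustration, a hedged sketch of the kind of application code that triggers this scenario (the endpoint URL is made up; the header names are the Datadog ones listed above):

using System.Net;

class HeaderProxyExample
{
    // Problematic pattern: the server application copies the *incoming* distributed tracing
    // headers onto an outbound WebRequest and never opens the request stream, so the tracer
    // cannot overwrite them before GetResponse runs.
    static WebResponse ForwardWithIncomingHeaders(string incomingTraceId, string incomingParentId)
    {
        var request = (HttpWebRequest)WebRequest.Create("http://downstream.example.com/api"); // made-up URL
        request.Headers["x-datadog-trace-id"] = incomingTraceId;   // A
        request.Headers["x-datadog-parent-id"] = incomingParentId; // B

        // Before this fix, the GetResponse integration reused TraceId A and SpanId B from these
        // headers for the new HTTP span instead of creating a fresh context, producing the
        // duplicate trace-id/span-id pair described above.
        return request.GetResponse();
    }
}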

andrewlock (Member) commented Nov 1, 2023

The server application calls WebRequest.GetResponse or WebRequest.GetResponseAsync to issue the downstream request

Ok, I had totally missed that you don't have to call WebRequest.GetRequestStream() et al, so that part makes sense, thanks 👍

OK, so just to confirm, the only problematic scenario is:

  • The customer doesn't call WebRequest.GetRequestStream() (if they do, we inject the correct headers, and then they're frozen so the customer can't change them).
  • The customer injects the (incorrect) headers, and then calls...
  • WebRequest.GetResponse or WebRequest.GetResponseAsync. At the start of that we try to retrieve the headers from the request and 💥

Thanks, rubber ducking over! 😄

@andrewlock (Member) left a comment

Nice work + nice tests!

…ted tracing headers to their WebRequest headers. This will cause the following WebRequest integrations to incorrectly use the existing distributed tracing headers:

- GetResponse
- GetResponseAsync
…a static, self-cleansing cache to know when we injected distributed tracing headers into the WebRequest (not the application), so WebRequest.GetResponse and WebRequest.GetResponseAsync can use the correct trace context.
@zacharycmontoya force-pushed the zach/bug-fix/web-request-propagation-headers branch from d2fea00 to 6fe933f on November 1, 2023 18:00
@zacharycmontoya merged commit 7124515 into master on Nov 2, 2023
@zacharycmontoya deleted the zach/bug-fix/web-request-propagation-headers branch on November 2, 2023 13:38
@github-actions bot added this to the vNext milestone on Nov 2, 2023