
RUM-4894 Fix the CoreTracer flaky tests #2081

Merged
merged 1 commit into develop from mconstantin/rum-4894/fix-coretracer-flaky-tests on Jun 13, 2024

Conversation

mariusc83
Member

What does this PR do?

A brief description of the change being made with this pull request.

Motivation

What inspired you to submit this pull request?

Additional Notes

Anything else we should know when reviewing?

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Make sure you discussed the feature or bugfix with the maintaining team in an Issue
  • Make sure each commit and the PR mention the Issue number (cf the CONTRIBUTING doc)

@mariusc83 mariusc83 self-assigned this Jun 12, 2024
@mariusc83 mariusc83 requested review from a team as code owners June 12, 2024 14:26
@@ -803,7 +803,7 @@ internal class OtelTracerProviderTest {
 val keptSpans = spansWritten.filter {
     it.getInt(SAMPLING_PRIORITY_KEY) == PrioritySampling.USER_KEEP.toInt()
 }
-val offset = 10
+val offset = 20
Member

I think the test will still be flaky, even with this change. And 20% is quite a big offset; it probably only makes sense for small absolute values. And if the actual value is, say, 1 and the expected value is 2, this won't help (because the diff is 50% for such numbers).

Member Author

I ran the test 1000 times repeatedly and it never failed. I think it is fine, but my question is also about the offset, as it is quite high. Should we still keep this test?

Member

I don't have a definitive answer here. I think we have similar tests checking the sampling rate in other modules, and they probably don't have such a high offset? Maybe we can modify the test somehow?

Member Author

yes... I'll try to have a look at what else I can do

Member

Note

One way to limit flakiness is to increase the numberOfSpans
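The trade-off between the offset size and the number of spans can be sketched with a quick binomial estimate. This is an illustrative back-of-the-envelope calculation, not code from the PR; the span counts, sample rate, and sigma band below are made-up values:

```python
import math

def required_offset(number_of_spans, sample_rate=0.5, sigmas=4.0):
    """The kept-span count is roughly Binomial(n, p), so an offset of a
    few standard deviations makes a flaky failure extremely unlikely."""
    std_dev = math.sqrt(number_of_spans * sample_rate * (1 - sample_rate))
    return sigmas * std_dev

# The absolute offset grows with n, but relative to n it shrinks,
# which is why increasing numberOfSpans reduces flakiness.
for n in (100, 300, 10_000):
    offset = required_offset(n)
    print(n, round(offset, 1), f"{offset / n:.1%}")
```

Under these assumptions, 100 spans at a 50% sample rate need a roughly 20% offset for a four-sigma band, while 10,000 spans need only about 2% for the same confidence.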

@mariusc83 mariusc83 requested a review from 0xnm June 12, 2024 14:53
@codecov-commenter

codecov-commenter commented Jun 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.58%. Comparing base (5197289) to head (dc254f2).
Report is 2 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2081      +/-   ##
===========================================
- Coverage    68.73%   68.58%   -0.15%     
===========================================
  Files          719      719              
  Lines        26571    26569       -2     
  Branches      4472     4472              
===========================================
- Hits         18262    18220      -42     
- Misses        7123     7139      +16     
- Partials      1186     1210      +24     

see 22 files with indirect coverage changes

@mariusc83 mariusc83 force-pushed the mconstantin/rum-4894/fix-coretracer-flaky-tests branch 2 times, most recently from 8f5f6b1 to 7ee5529 Compare June 13, 2024 07:17
 fun `M use user-keep or user-drop priority W buildSpan { tracer was provided a sample rate }`(
     @StringForgery fakeInstrumentationName: String,
-    @DoubleForgery(min = 0.0, max = 100.0) sampleRate: Double,
+    @IntForgery(min = 0, max = 100) sampleRate: Int,
Member

Note

Why do we need to generate an Int and cast it to a Double instead of generating a Double directly?

Member Author

yes, good catch, it was a leftover from my tests trying to see why it was flaky

-val offset = 10
+// Because of the way sampling works the deviation can be quite high so we will have to use an offset of 20
+// here to make sure this test will never be flaky
+val offset = 20
Member

Note

The offset should be based on numberOfSpans and not hardcoded, e.g.:

    val offset = (numberOfSpans * 15) / 100


@mariusc83 mariusc83 force-pushed the mconstantin/rum-4894/fix-coretracer-flaky-tests branch from 7ee5529 to 4c94c57 Compare June 13, 2024 08:34
@mariusc83 mariusc83 requested review from xgouchet and 0xnm June 13, 2024 08:46
val span = tracer.spanBuilder(forge.anAlphabeticalString()).startSpan()
// there is a throttle on the sampler which drops all the spans over the limit of 100 per second,
// so we need to sleep a bit to make sure the spans are not dropped because of throttling
Thread.sleep(100)
Member

this makes the test very slow: 100ms * 300 spans is at least 30 seconds spent waiting in this test, which is a lot.

drops all the spans over the 100 limit in 1 second

it means we could sleep for 10ms instead of 100ms to stay within the limit? But ideally we should disable throttling for the test or change the threshold.
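The timing argument can be checked with simple arithmetic; the 300-span count and the 100-spans-per-second throttle come from this thread, and the sketch below just restates them:

```python
SPANS = 300                # spans started by the test (from the discussion)
THROTTLE_PER_SECOND = 100  # sampler drops spans above this rate

def total_wait_seconds(sleep_ms):
    """Total time the test spends sleeping between spans."""
    return SPANS * sleep_ms / 1000.0

def spans_per_second(sleep_ms):
    """Rate at which spans are started when sleeping between each one."""
    return 1000.0 / sleep_ms

# 100ms keeps the rate far below the throttle but wastes 30s of wall time;
# 10ms sits exactly at the 100/s limit and waits only 3s in total.
for sleep_ms in (100, 10):
    print(sleep_ms, total_wait_seconds(sleep_ms), spans_per_second(sleep_ms))
```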

Member Author

yes, I wanted to be on the safe side, but indeed we can sleep for 10ms

Member

still, ideally we wouldn't have to use sleep at all, but 3s is much better than 30s.

@mariusc83 mariusc83 force-pushed the mconstantin/rum-4894/fix-coretracer-flaky-tests branch from 4c94c57 to dc254f2 Compare June 13, 2024 11:54
@mariusc83 mariusc83 requested a review from 0xnm June 13, 2024 11:54
-val offset = 10
+// Because of the way sampling works the deviation can be quite high so we will have to use an offset of 15%
+// here to make sure this test will never be flaky
+val offset = numberOfSpans * 15 / 100
Member

@0xnm 0xnm Jun 13, 2024

nit: I think we should use float types here, otherwise we lose a lot of precision to the integer division.

Suggested change
-val offset = numberOfSpans * 15 / 100
+val offset = numberOfSpans * 15f / 100

Member Author

the offset needs to be an Int in the end, so I will keep it like that.
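The integer-versus-float point can be made concrete with a small sketch (Python's `//` standing in for Kotlin's `Int` division; the span counts below are illustrative, not from the test):

```python
def offset_int(number_of_spans):
    # Multiply first, then divide: truncation costs at most one span.
    return number_of_spans * 15 // 100

def offset_float(number_of_spans):
    # Float division keeps the fraction, but the result must be
    # converted back to a whole count of spans anyway.
    return number_of_spans * 15.0 / 100.0

for n in (99, 100, 300):
    print(n, offset_int(n), offset_float(n))
```

Because the multiplication happens before the division, the integer result differs from the float result by less than one span, which is why keeping the `Int` version is a reasonable call; dividing first (`number_of_spans // 100 * 15`) would be the dangerous variant.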

@mariusc83 mariusc83 merged commit 98c43e2 into develop Jun 13, 2024
20 checks passed
@mariusc83 mariusc83 deleted the mconstantin/rum-4894/fix-coretracer-flaky-tests branch June 13, 2024 13:24
@xgouchet xgouchet added this to the 2.11.x milestone Jul 31, 2024