-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUM-5447 Fix the racing condition in the RotatingDnsResolver logic #2127
RUM-5447 Fix the racing condition in the RotatingDnsResolver logic #2127
Conversation
818f959
to
289993a
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #2127 +/- ##
===========================================
- Coverage 70.03% 70.02% -0.01%
===========================================
Files 721 721
Lines 26773 26785 +12
Branches 4499 4501 +2
===========================================
+ Hits 18749 18754 +5
- Misses 6771 6779 +8
+ Partials 1253 1252 -1
|
@@ -111,6 +114,30 @@ internal class RotatingDnsResolverTest { | |||
assertThat(result2).containsExactlyElementsOf(fakeInetAddresses) | |||
} | |||
|
|||
@RepeatedTest(30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to repeat it 30 times? won't it be too long? we are already spinning 100 threads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes but it's randomly re-produceable. I am afraid that if we lower it it might be a false positive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it runs no more than ~5s for these 30 repetitions, than it is fine
Thread { | ||
assertThatNoException().isThrownBy { | ||
Thread.sleep(forge.aLong(min = 0, max = 100)) | ||
testedDns.lookup(fakeHostname) | ||
} | ||
countDownLatch.countDown() | ||
}.start() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I'm not wrong, this will never throw in the end, the thread spinned will be killed, but error won't be propagated further and won't stop the test.
Moreover, since countDownLatch.countDown()
won't happen in case of the error, countDownLatch.await()
will be waiting forever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm not sure based on my tests, theoretically assertThatNoException
should stop the test with a fail. In that case the countDownLatch
will not wait anymore. I will double check this though.
289993a
to
3967404
Compare
@@ -111,6 +114,30 @@ internal class RotatingDnsResolverTest { | |||
assertThat(result2).containsExactlyElementsOf(fakeInetAddresses) | |||
} | |||
|
|||
@RepeatedTest(30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it runs no more than ~5s for these 30 repetitions, than it is fine
} | ||
} | ||
|
||
private fun safeCopy(list: List<InetAddress>): List<InetAddress> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
Is this a function that might be useful in other places in the project ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked myself the same question but for now I could not see any other place. My rule of thumb is that an extension function should have at least 2 different classes where is being used in order to be extracted so that's the reason why I decided to keep it private, non extension here.
What does this PR do?
In this PR we are fixing the race condition problem that was caused by a concurrent access to an unprotected list in the
RotatingDnsResolver
class. This was raised by the Support team in this ticket: https://datadog.zendesk.com/agent/tickets/1761693Motivation
What inspired you to submit this pull request?
Additional Notes
Anything else we should know when reviewing?
Review checklist (to be filled by reviewers)