Fix race condition in ExponentiallyDecayingReservoir's rescale method #1033
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes issues #1005 and #793 - note that in #793 the
longPeriodsOfInactivityShouldNotCorruptSamplingState
test did not cover multiple concurrent requests after a long idle period.The race condition currently occurs when two threads update the reservoir concurrently - after the first thread has successfully updated
nextScaleTime
(line 162) but before the write lock is obtained (line 163) a second thread updates the reservoir (line 95+). With several threads pushing updates concurrently this presents a narrow window during which the second thread may obtain the lock before the first resulting in a corrupt reservoir state. Note, however, that if a snapshot is requested just prior to the two concurrent updates the read lock will already be active when the two threads enterrescaleIfNeeded
, resulting in a much larger window for this race condition.