Backport 1.3 fixes #8578
Merged
The monitor goroutine calls enable compactions every 10s to spin down (or start up) goroutines for cold shards. This frequent Lock may be causing lock contention for writes and queries, which block while trying to acquire an RLock. Per the Go RWMutex documentation, new RLock calls block while a Lock call is pending. Switching the common path to use an RLock should avoid the Lock and reduce lock contention for writes and queries.
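A minimal sketch of the read-locked common path described above. The `Engine` type, its fields, and the `Transitions` counter are hypothetical stand-ins for illustration, not the code changed in this PR; the point is only that the exclusive Lock is taken when the state actually has to change.

```go
package main

import (
	"fmt"
	"sync"
)

// Engine is a hypothetical stand-in for a storage engine.
type Engine struct {
	mu                 sync.RWMutex
	compactionsEnabled bool
	transitions        int // how many times the exclusive lock changed state
}

// SetCompactionsEnabled checks the current state under an RLock and only
// falls through to the exclusive Lock when the state must change.
func (e *Engine) SetCompactionsEnabled(enabled bool) {
	e.mu.RLock()
	same := e.compactionsEnabled == enabled
	e.mu.RUnlock()
	if same {
		return // common path: no pending Lock, so readers are not stalled
	}

	e.mu.Lock()
	defer e.mu.Unlock()
	// Re-check, since another goroutine may have changed the state
	// between RUnlock and Lock.
	if e.compactionsEnabled == enabled {
		return
	}
	e.compactionsEnabled = enabled
	e.transitions++
}

// Transitions reports how many state changes took the exclusive lock.
func (e *Engine) Transitions() int {
	e.mu.RLock()
	defer e.mu.RUnlock()
	return e.transitions
}

func main() {
	e := &Engine{}
	// Simulate the monitor calling in repeatedly with the same value.
	for i := 0; i < 5; i++ {
		e.SetCompactionsEnabled(true)
	}
	fmt.Println(e.Transitions())
}
```

With a periodic caller that usually passes the same value, almost every call returns from the RLock branch, so writes and queries holding or requesting an RLock are not queued behind a pending Lock.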
The in-memory index can get out of sync when deletes and writes to the same measurement are running concurrently. The index is updated independently from data on disk, and it's possible for the index to unassign a shard when data still exists on disk. What happens is that there are TSM files on disk, but the index does not know that the series in those files still belong to the shard. Restarting the server reloads the index and the data is visible again. From an end-user perspective, this can look like more data was deleted than should have been, or like deleted data reappears after a restart or after new writes to the shard. There isn't an easy way to resolve this since the index and storage are not transactional resources and we cannot atomically commit or roll back changes to both at once. As a workaround, after new TSM files are installed, we refresh the index with the series keys that exist in the new TSM files as well as any lingering data still in the cache. There is a small window of time when the index may be missing series, but they will reappear after the refresh completes.
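The refresh workaround can be pictured roughly as follows. This is a sketch under stated assumptions: `Index`, `Refresh`, and `Has` are hypothetical names, and the series keys are passed in as plain string slices rather than read from real TSM files or the cache.

```go
package main

import (
	"fmt"
	"sync"
)

// Index is a hypothetical in-memory set of series keys for one shard.
type Index struct {
	mu     sync.RWMutex
	series map[string]struct{}
}

func NewIndex() *Index {
	return &Index{series: make(map[string]struct{})}
}

// Refresh re-adds every series key found in newly installed TSM files
// and in the cache, so keys dropped by a concurrent delete reappear.
func (i *Index) Refresh(tsmKeys, cacheKeys []string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	for _, k := range tsmKeys {
		i.series[k] = struct{}{}
	}
	for _, k := range cacheKeys {
		i.series[k] = struct{}{}
	}
}

// Has reports whether the index currently knows about a series key.
func (i *Index) Has(key string) bool {
	i.mu.RLock()
	defer i.mu.RUnlock()
	_, ok := i.series[key]
	return ok
}

func main() {
	idx := NewIndex()
	// The index lost "cpu,host=a" although the data is still on disk.
	idx.Refresh([]string{"cpu,host=a"}, []string{"mem,host=b"})
	fmt.Println(idx.Has("cpu,host=a"), idx.Has("mem,host=b"))
}
```

Adding keys is idempotent, so running the refresh after every TSM file install is safe even when the index was already consistent.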
The min key was not used in OverlapsKeyRange, which caused it to return false when it should have returned true. As a result, deletes would not write tombstones for files that actually contained the data they were supposed to delete.
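For illustration, a key-range overlap check that uses both bounds might look like the sketch below. The `keyRange` type and its fields are assumptions made for this example; only the `OverlapsKeyRange` name comes from the commit message, and this is not the patched code itself.

```go
package main

import (
	"bytes"
	"fmt"
)

// keyRange is a hypothetical stand-in for the smallest and largest series
// key stored in one TSM file.
type keyRange struct {
	minKey, maxKey []byte
}

// OverlapsKeyRange reports whether [min, max] intersects the file's
// [minKey, maxKey]. Both bounds must take part in the comparison:
// the file starts at or before max AND ends at or after min.
func (r keyRange) OverlapsKeyRange(min, max []byte) bool {
	return bytes.Compare(r.minKey, max) <= 0 && bytes.Compare(r.maxKey, min) >= 0
}

func main() {
	file := keyRange{minKey: []byte("cpu"), maxKey: []byte("mem")}
	// A delete of "disk" falls inside the file's range, so a tombstone
	// would need to be written for this file.
	fmt.Println(file.OverlapsKeyRange([]byte("disk"), []byte("disk")))
	// A delete of "net".."zzz" lies entirely after the file's range.
	fmt.Println(file.OverlapsKeyRange([]byte("net"), []byte("zzz")))
}
```

Dropping either half of the conjunction makes the check wrong for ranges that sit entirely on one side of the file, which is the class of error the commit describes.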
There was a race in the WAL writeToLog and scheduleSync which could lead to a writing goroutine blocking indefinitely on its syncErr channel. The issue was that the clearing of the syncCount happened after the WAL was unlocked. If a goroutine was able to lock, write and call scheduleSync before the existing scheduleSync goroutine returned and ran the defer to clear the syncCount, then a new scheduleSync goroutine would not get started. This left the writing goroutine blocked with nothing to signal it. While in this state, an RLock on the engine was held. If a Lock was requested on the engine during this time, all future writes and queries would block waiting on the blocked WAL writer. The fix is to move the atomic clearing of syncCount before the Lock is released.
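The ordering that the fix relies on can be sketched as below. This is a simplified model, not the WAL code from this PR: the `WAL` type, `Write`, `waiters`, and `writeAll` are hypothetical, and no timer or fsync is involved. What it shows is that `syncCount` is cleared while the mutex is still held, so a writer that registers afterwards always sees 0 and starts a new sync goroutine.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// WAL is a hypothetical, heavily simplified write-ahead log.
type WAL struct {
	syncCount uint64 // 1 while a sync goroutine is active
	mu        sync.Mutex
	waiters   []chan error
}

// Write registers a waiter, schedules a sync, and blocks until signaled.
func (w *WAL) Write() error {
	done := make(chan error, 1)
	w.mu.Lock()
	w.waiters = append(w.waiters, done)
	w.mu.Unlock()
	w.scheduleSync()
	return <-done
}

// scheduleSync starts a sync goroutine unless one is already active.
func (w *WAL) scheduleSync() {
	if !atomic.CompareAndSwapUint64(&w.syncCount, 0, 1) {
		return // another goroutine is syncing on our behalf
	}
	go func() {
		for {
			w.mu.Lock()
			if len(w.waiters) == 0 {
				// Clear syncCount BEFORE releasing the lock. Clearing it
				// after Unlock would let a writer register and fail the
				// CAS above while this goroutine is already exiting.
				atomic.StoreUint64(&w.syncCount, 0)
				w.mu.Unlock()
				return
			}
			for _, ch := range w.waiters {
				ch <- nil // buffered, never blocks
			}
			w.waiters = nil
			w.mu.Unlock()
		}
	}()
}

// writeAll runs n concurrent writers and returns how many completed.
func writeAll(n int) int {
	w := &WAL{}
	var wg sync.WaitGroup
	var ok int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if w.Write() == nil {
				atomic.AddInt64(&ok, 1)
			}
		}()
	}
	wg.Wait()
	return int(ok)
}

func main() {
	fmt.Println(writeAll(100))
}
```

Because a writer can only append its waiter while holding the mutex, the sync goroutine either sees that waiter on its final check or has already reset `syncCount` by the time the writer calls `scheduleSync`.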
stuartcarnie
approved these changes
Jul 7, 2017
LGTM
Required for all non-trivial PRs
Backport #8577 #8576 #8567 #8518