Backport 1.3 fixes #8578

jwilder · 2017-07-07T20:44:44Z

Required for all non-trivial PRs

Rebased/mergeable
Tests pass
CHANGELOG.md updated
Sign CLA (if not already signed)

The monitor goroutine enables compactions every 10s to spin down (or start up) goroutines for cold shards. This frequent Lock may be causing lock contention for writes and queries, which get blocked trying to acquire an RLock. The Go RWMutex documentation says that new RLock calls block while a pending Lock call is itself blocked. Switching the common path to use an RLock should avoid the Lock and reduce lock contention for writes and queries.
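A minimal sketch of the "RLock on the common path" idea follows. The `engine` type, its fields, and `setCompactionsEnabled` are hypothetical names chosen for illustration and are not taken from the InfluxDB source; only `sync.RWMutex` is real.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a hypothetical stand-in for the storage engine.
type engine struct {
	mu                 sync.RWMutex
	compactionsEnabled bool
	stateChanges       int // how many times the state was actually flipped
}

// setCompactionsEnabled checks the current state under a read lock and only
// requests the exclusive lock when the state really has to change.
func (e *engine) setCompactionsEnabled(enabled bool) {
	e.mu.RLock()
	same := e.compactionsEnabled == enabled
	e.mu.RUnlock()
	if same {
		return // common path: no exclusive Lock, so readers are not held up
	}

	e.mu.Lock()
	defer e.mu.Unlock()
	// Re-check, since another goroutine may have changed the state between
	// the RUnlock above and this Lock.
	if e.compactionsEnabled == enabled {
		return
	}
	e.compactionsEnabled = enabled
	e.stateChanges++
}

func main() {
	e := &engine{}
	// Simulate the periodic monitor calling in repeatedly with the same value.
	for i := 0; i < 5; i++ {
		e.setCompactionsEnabled(true)
	}
	fmt.Println(e.stateChanges) // prints 1
}
```

Only the first call takes the exclusive lock; the remaining calls return after the read-locked check, so they never queue a pending Lock in front of writers and queries.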

The in-memory index can get out of sync when deletes and writes to the same measurement run concurrently. The index is updated independently of the data on disk, so it is possible for the index to unassign a shard while data still exists on disk. The result is that there are TSM files on disk, but the index does not know that the series in those files still belong to the shard. Restarting the server reloads the index and the data is visible again. From an end-user perspective, this can look like more data was deleted than should have been, or like deleted data re-appears after a restart or after new writes to the shard. There isn't an easy way to resolve this, since the index and storage are not transactional resources and we cannot atomically commit or roll back changes to both at once. As a workaround, after new TSM files are installed, we refresh the index with the series keys that exist in the new TSM files as well as any lingering data still in the cache. There is a small window of time when the index may be missing series, but they re-appear after the refresh completes.
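The refresh workaround can be pictured roughly as below. This is a simplified sketch: `seriesIndex`, `refresh`, and the example series keys are all hypothetical, and the real index tracks far more than a set of keys.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// seriesIndex is a hypothetical, heavily simplified in-memory index:
// just the set of series keys believed to exist in a shard.
type seriesIndex struct {
	mu   sync.RWMutex
	keys map[string]struct{}
}

// refresh re-adds every series key found in newly installed files and in
// the cache, so keys dropped by a concurrent delete become visible again.
func (i *seriesIndex) refresh(fileKeys, cacheKeys []string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	for _, k := range fileKeys {
		i.keys[k] = struct{}{}
	}
	for _, k := range cacheKeys {
		i.keys[k] = struct{}{}
	}
}

// sorted returns the indexed keys in lexical order.
func (i *seriesIndex) sorted() []string {
	i.mu.RLock()
	defer i.mu.RUnlock()
	out := make([]string, 0, len(i.keys))
	for k := range i.keys {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Assume a concurrent delete removed "cpu,host=a" from the index even
	// though the key is still present in a newly written file.
	idx := &seriesIndex{keys: map[string]struct{}{"cpu,host=b": {}}}
	idx.refresh([]string{"cpu,host=a", "cpu,host=b"}, []string{"mem,host=a"})
	fmt.Println(idx.sorted()) // prints [cpu,host=a cpu,host=b mem,host=a]
}
```

The refresh is idempotent, so re-adding keys that were never lost is harmless; the cost is the short window before it runs, as described above.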

The min key was not used in OverlapsKeyRange, which caused it to return false when it should have returned true. As a result, deletes would not write tombstones for files that actually contained the data they were supposed to delete.
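For reference, a correct closed-interval overlap test has to compare against both bounds of the requested range. The sketch below uses a hypothetical `keyRange` type with string keys; it illustrates the general condition and is not a copy of the patched function.

```go
package main

import "fmt"

// keyRange is a hypothetical stand-in for the min/max series keys that a
// TSM file's index covers.
type keyRange struct {
	minKey, maxKey string
}

// overlapsKeyRange reports whether [min, max] intersects the file's range.
// Both bounds matter: the file must start at or before max AND end at or
// after min.
func (r keyRange) overlapsKeyRange(min, max string) bool {
	return r.minKey <= max && r.maxKey >= min
}

func main() {
	file := keyRange{minKey: "cpu,host=a", maxKey: "mem,host=z"}

	// A delete range that falls inside the file's keys overlaps.
	fmt.Println(file.overlapsKeyRange("disk", "disk")) // prints true

	// A delete range entirely after the file's max key does not.
	fmt.Println(file.overlapsKeyRange("net", "swap")) // prints false
}
```

If the check wrongly reports no overlap, the delete path skips the file and no tombstone is written, which matches the symptom described above.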

There was a race in the WAL writeToLog and scheduleSync which could lead to a writing goroutine blocking indefinitely on its syncErr channel. The issue was that the clearing of the syncCount happened after the WAL was unlocked. If a goroutine was able to lock, write and call scheduleSync before the existing scheduleSync goroutine returned and ran the defer to clear the syncCount, then a new scheduleSync goroutine would not get started. This left the writing goroutine blocked with nothing to signal it. While in this state, an RLock on the engine was held. If a Lock was requested on the engine during this time, all future writes and queries would block waiting on the blocked WAL writer. The fix is to move the atomic clearing of syncCount before the Lock is released.
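The ordering that the fix relies on can be sketched as follows. This is a simplified model under stated assumptions: the `wal` type, the slice of waiter channels, and `runWriters` are hypothetical, and the actual fsync is omitted. The point it illustrates is that `syncCount` is cleared while the mutex is still held, so a writer that registers under the same mutex either is seen by the running sync goroutine or successfully starts a new one.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// wal is a hypothetical, simplified write-ahead log.
type wal struct {
	mu          sync.Mutex
	syncCount   uint64       // 1 while a sync goroutine is running
	syncWaiters []chan error // writers waiting for the next sync
}

// writeToLog registers a waiter and schedules a sync under the lock, then
// blocks until the sync goroutine signals completion.
func (l *wal) writeToLog() error {
	syncErr := make(chan error, 1)
	l.mu.Lock()
	l.syncWaiters = append(l.syncWaiters, syncErr)
	l.scheduleSync()
	l.mu.Unlock()
	return <-syncErr
}

// scheduleSync starts a sync goroutine unless one is already running.
func (l *wal) scheduleSync() {
	if !atomic.CompareAndSwapUint64(&l.syncCount, 0, 1) {
		return // an existing goroutine will pick up our waiter
	}
	go func() {
		for {
			l.mu.Lock()
			if len(l.syncWaiters) == 0 {
				// Clear syncCount BEFORE releasing the lock. Clearing it
				// after Unlock would let a writer register, fail the CAS,
				// and then wait forever with no sync goroutine running.
				atomic.StoreUint64(&l.syncCount, 0)
				l.mu.Unlock()
				return
			}
			waiters := l.syncWaiters
			l.syncWaiters = nil
			l.mu.Unlock()
			for _, c := range waiters {
				c <- nil // a real WAL would fsync first and report its error
			}
		}
	}()
}

// runWriters issues n concurrent writes and returns how many completed.
func runWriters(l *wal, n int) uint64 {
	var wg sync.WaitGroup
	var done uint64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := l.writeToLog(); err == nil {
				atomic.AddUint64(&done, 1)
			}
		}()
	}
	wg.Wait()
	return done
}

func main() {
	fmt.Println(runWriters(&wal{}, 100)) // prints 100
}
```

Because both the waiter registration and the clearing of `syncCount` happen under `mu`, the two cannot interleave in a way that leaves a waiter without a goroutine to signal it.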

stuartcarnie

LGTM

jwilder added 5 commits July 7, 2017 14:25

Fix incorrect condition in OverlapsKeyRange

4de21ac

The min key was not used in OverlapsKeyRange which caused it to return false when it should be true. This causes a bug where deletes would not write tombstones for files that actually contained the data it was supposed to delete.

Fix possible deadlocks in inmem index

893cc88

jwilder added the review label Jul 7, 2017

jwilder added this to the 1.3.0 milestone Jul 7, 2017

stuartcarnie self-requested a review July 7, 2017 21:16

stuartcarnie approved these changes Jul 7, 2017

jwilder merged commit 7dbc803 into 1.3 Jul 7, 2017

jwilder deleted the jw-13-backports branch July 7, 2017 21:43

jwilder removed the review label Jul 7, 2017
