Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport 1.3 fixes #8578

Merged
merged 5 commits into from
Jul 7, 2017
Merged

Backport 1.3 fixes #8578

merged 5 commits into from
Jul 7, 2017

Conversation

jwilder
Copy link
Contributor

@jwilder jwilder commented Jul 7, 2017

Required for all non-trivial PRs
  • Rebased/mergable
  • Tests pass
  • CHANGELOG.md updated
  • Sign CLA (if not already signed)

Backport #8577 #8576 #8567 #8518

jwilder added 5 commits July 7, 2017 14:25
The monitor goroutine calls enable compactions every 10s to spin down
(or start up) goroutines for cold shards.  This frequent Lock may be
causing lock contention for writes and queries which get blocked trying
to acquire an RLock.

The go RWMutex says that new RLock calls will block if there is a
pending Lock call that is blocked.  Switching the common path to use
an RLock should avoid the Lock and reduce lock contention for writes
and queries.
The in-memory index can get out of sync when deletes and writes
to the same measurement are running concurrently.  The index is
updated independently from data on disk and it's possible for the
index to unassign a shard when data still exists on disk.  What happens
is that there are TSM files on disk, but the index does not know that
the series that exist in those files still are in the shard.  Restarting
the server reloads the index and the data is visible again.  From and
end user perspective, this can look like more data is deleted than should
have been or that deleted data re-appears after a restart or writes to the
shard occur again.

There isn't an easy way to resolve this since the index and storage
are not transactional resources and we cannot atomically commit or
rollback changes to both at once.

As a workaround, after new TSM files are installed, we refresh the
index with series keys that exist in the new tsm files as well as
any lingering data still in the cache.  There is a small window of time
when the index may be missing series, but it will re-appear after the refresh
completes.
The min key was not used in OverlapsKeyRange which caused it to return
false when it should be true.  This causes a bug where deletes would not
write tombstones for files that actually contained the data it was supposed
to delete.
There was a race in the WAL writeToLog and scheduleSync which could
lead to a writing goroutine blocking indefinitely on its syncErr channel.

The issue was that the clearing of the syncCount happenend after the
wal was unlock.  If a goroutine was able to lock, write and call scheduleSync
before the existing scheduleSync goroutine returns and ran the defer to
clear the syncCount, then a new scheduleSync goroutine would not get started.
This left the writing goroutine block with nothing to signal it.

While in this state, a RLock on the engine was held.  If a Lock was requested
on the engine during this time, all future writes and queries would block waiting
on the blocked wal writer.

The fix is to move the atomic clearing of syncCount before the Lock is released.
@jwilder jwilder added the review label Jul 7, 2017
@jwilder jwilder added this to the 1.3.0 milestone Jul 7, 2017
@stuartcarnie stuartcarnie self-requested a review July 7, 2017 21:16
Copy link
Contributor

@stuartcarnie stuartcarnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jwilder jwilder merged commit 7dbc803 into 1.3 Jul 7, 2017
@jwilder jwilder deleted the jw-13-backports branch July 7, 2017 21:43
@jwilder jwilder removed the review label Jul 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants