
fix: deadlock when reloading configs #398

Merged
vreynolds merged 1 commit into main from vera.fix-deadlock on Feb 10, 2022
Conversation

vreynolds
Contributor

Which problem is this PR solving?

This manifested in k8s because we get double notified about config changes.

Previously, onChange would acquire a read lock, and then each callback would also acquire a read lock when reading whatever config value it cares about. onChange also acquires a write lock when unmarshalling. With additional goroutines calling onChange (e.g. in k8s), we were almost guaranteed to end up with a pending write lock in between two recursive read locks.

From Go RLock docs:

// It should not be used for recursive read locking; a blocked Lock
// call excludes new readers from acquiring the lock. See the
// documentation on the RWMutex type.

A helpful answer from StackOverflow: https://stackoverflow.com/a/30549188

Fixes #380
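
To make the failure mode concrete, here is a minimal sketch of the pattern described above; the type, field, and method names are illustrative assumptions, not the actual config code in this repo.

```go
package config

import "sync"

// Illustrative sketch only; names are assumptions, not the real implementation.
type fileConfig struct {
	mux       sync.RWMutex
	callbacks []func()
	value     string
}

// Every config accessor takes its own read lock.
func (f *fileConfig) GetValue() string {
	f.mux.RLock()
	defer f.mux.RUnlock()
	return f.value
}

// Pre-fix shape of onChange: a read lock wrapped around the callbacks.
// Each callback reads config through an accessor, so it takes a second
// (recursive) RLock. If another goroutine running onChange reaches its
// mux.Lock() while unmarshalling, that blocked Lock excludes new readers:
// the inner RLock waits on the writer, and the writer waits on the outer
// RLock. Deadlock.
func (f *fileConfig) onChange() {
	f.mux.RLock()
	defer f.mux.RUnlock()
	for _, cb := range f.callbacks {
		cb() // e.g. a callback that calls GetValue()
	}
}
```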

Short description of the changes

Removing the read lock around callbacks. Every config accessor function already has a read lock, and we don't care about locking config if a callback doesn't read config values.
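
Using the same illustrative names as the sketch above, the change amounts to dropping the outer read lock; the accessors the callbacks use are already synchronized on their own:

```go
// Post-fix shape: no lock is held while invoking callbacks. Accessors such
// as GetValue still take their own RLock, so config reads stay synchronized
// without any recursive read locking.
func (f *fileConfig) onChange() {
	for _, cb := range f.callbacks {
		cb()
	}
}
```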

@vreynolds vreynolds requested a review from a team February 10, 2022 00:08
@vreynolds vreynolds added the labels type: bug (Something isn't working) and version: bump patch (A PR with release-worthy changes and is backwards-compatible) on Feb 10, 2022
Contributor

@seh seh left a comment

Such a small thing, with such drastic consequences. Thank you, Vera.

We look forward to a release including this fix.

@vreynolds vreynolds merged commit 5b5bc1a into main Feb 10, 2022
@vreynolds vreynolds deleted the vera.fix-deadlock branch February 10, 2022 17:59
ghost pushed a commit to opsramp/tracing-proxy that referenced this pull request Jul 5, 2024
Labels
type: bug, version: bump patch
Development

Successfully merging this pull request may close these issues:

Live reload of rules breaks cluster and requires restart (#380)
3 participants