Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DefaultServiceDiscoveryRetryStrategy may emit duplicated SD events #1586

Merged

Conversation

idelpivnitskiy
Copy link
Member

@idelpivnitskiy idelpivnitskiy commented May 27, 2021

Motivation:

When DefaultServiceDiscoveryRetryStrategy retains addresses till
success and recovers from the previous SD error, it doesn't check which
of new addresses were previously retained. As the result, it may
propagate duplicated "available" events to RoundRobinLoadBalancer.
RRLB does not support duplicated entries. It may lead to retaining
duplicated addresses after they become unavailable, causing failures
of requests.

Modifications:

  • Filter out previously retained events when we recover from SD-error;
  • Do not return duplicated unavailable events on subsequent SD-errors;
  • Enhance tests to catch these use-cases;

Result:

DefaultServiceDiscoveryRetryStrategyTest does not emit duplicated SD
events for the same availability state => RRLB does not retain duplicated
entries.

Motivation:

When `DefaultServiceDiscoveryRetryStrategy` retains addresses till
success and recovers from the previous SD error, it doesn't check which
of new addresses were previously retained. As the result, it may
propagate duplicated "available" events to `RoundRobinLoadBalancer`.

Modifications:

- Filter out previously retained events when we recover from SD-error;
- Enhance `overlapAddressPostRetry()` tests to catch this use-case;
- Filter out events for duplicated addresses on each resolution;
- Add `duplicatedAddresses()` test to catch this use-case;

Result:

`DefaultServiceDiscoveryRetryStrategyTest` does not emit duplicated SD
events for the same availability state.
}
// FIXME: remove addresses that are available and unavailable in the same events collection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "last event wins" is the expected strategy. what did you have in mind here?

Copy link
Member Author

@idelpivnitskiy idelpivnitskiy May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Collection has two entries:

  1. foo, available
  2. foo, unavailable

we will propagate both events to RRLB instead of propagating nothing. It will add more work for RRLB to add the address and remove it immediately. Due to concurrency between threads, that address can be used by a client before the 2nd event is processed

Copy link
Contributor

@bondolo bondolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not confident I understand it will enough to approve or disapprove.

@idelpivnitskiy idelpivnitskiy requested a review from Scottmitch May 28, 2021 00:30
Copy link
Member

@Scottmitch Scottmitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch @idelpivnitskiy !

@idelpivnitskiy idelpivnitskiy merged commit a51d2f2 into apple:main May 28, 2021
@idelpivnitskiy idelpivnitskiy deleted the DefaultServiceDiscoveryRetryStrategy branch May 28, 2021 03:12
idelpivnitskiy added a commit that referenced this pull request May 28, 2021
…1586)

Motivation:

When `DefaultServiceDiscoveryRetryStrategy` retains addresses till
success and recovers from the previous SD error, it doesn't check which
of new addresses were previously retained. As the result, it may
propagate duplicated "available" events to `RoundRobinLoadBalancer`.
RRLB does not support duplicated entries. It may lead to retaining
duplicated addresses after they become unavailable, causing failures
of requests.

Modifications:

- Filter out previously retained events when we recover from SD-error;
- Do not return duplicated `unavailable` events on subsequent SD-errors;
- Enhance tests to catch these use-cases;

Result:

`DefaultServiceDiscoveryRetryStrategyTest` does not emit duplicated SD
events for the same availability state => RRLB does not retain duplicated
entries.
idelpivnitskiy added a commit that referenced this pull request May 28, 2021
…1586)

Motivation:

When `DefaultServiceDiscoveryRetryStrategy` retains addresses till
success and recovers from the previous SD error, it doesn't check which
of new addresses were previously retained. As the result, it may
propagate duplicated "available" events to `RoundRobinLoadBalancer`.
RRLB does not support duplicated entries. It may lead to retaining
duplicated addresses after they become unavailable, causing failures
of requests.

Modifications:

- Filter out previously retained events when we recover from SD-error;
- Do not return duplicated `unavailable` events on subsequent SD-errors;
- Enhance tests to catch these use-cases;

Result:

`DefaultServiceDiscoveryRetryStrategyTest` does not emit duplicated SD
events for the same availability state => RRLB does not retain duplicated
entries.
idelpivnitskiy added a commit that referenced this pull request May 28, 2021
…1586)

Motivation:

When `DefaultServiceDiscoveryRetryStrategy` retains addresses till
success and recovers from the previous SD error, it doesn't check which
of new addresses were previously retained. As the result, it may
propagate duplicated "available" events to `RoundRobinLoadBalancer`.
RRLB does not support duplicated entries. It may lead to retaining
duplicated addresses after they become unavailable, causing failures
of requests.

Modifications:

- Filter out previously retained events when we recover from SD-error;
- Do not return duplicated `unavailable` events on subsequent SD-errors;
- Enhance tests to catch these use-cases;

Result:

`DefaultServiceDiscoveryRetryStrategyTest` does not emit duplicated SD
events for the same availability state => RRLB does not retain duplicated
entries.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants