Merge partitioned clusters #42

Open
jwolski opened this issue Mar 4, 2015 · 6 comments

@jwolski
Contributor

jwolski commented Mar 4, 2015

Ringpop should be able to handle network partitions that temporarily cause two or more clusters to form. A ringpop instance never attempts to rejoin faulty members, but it could, since faulty members are retained in the membership list.
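A minimal sketch of what such a rejoin pass could look like. All of the names here (`Member`, `MemberStatus`, `rejoinFaultyMembers`, `ping`) are hypothetical, not ringpop's actual API:

```ts
// Hypothetical sketch; not ringpop's real types or API.
type MemberStatus = 'alive' | 'suspect' | 'faulty';

interface Member {
  address: string;
  status: MemberStatus;
}

// Faulty members stay in the membership list, so once a partition heals,
// re-attempting contact with them would let the two halves merge via gossip.
async function rejoinFaultyMembers(
  members: Member[],
  ping: (address: string) => Promise<boolean>
): Promise<void> {
  for (const member of members.filter((m) => m.status === 'faulty')) {
    if (await ping(member.address)) {
      member.status = 'alive'; // reachable again; gossip will reconcile state
    }
  }
}
```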

@jcorbin
Contributor

jcorbin commented Mar 4, 2015

I'd posit that such logic is very similar to airlock/prober-style logic. Let's keep this case in mind when architecting tchannel peering strategies, something I'll be focusing on shortly.

@jwolski
Contributor Author

jwolski commented Mar 4, 2015

I don't see how this is related to airlock behavior.

This problem occurs at the gossip/membership level. If the network partitions, members of the original cluster will eventually converge on either side of the partition. Once the partition heals, nothing within ringpop will attempt to merge the two halves. Serf has a mechanism by which nodes attempt to rejoin faulty members; we'll need similar behavior.

Excuse the brevity of the original description.

@jcorbin
Contributor

jcorbin commented Mar 5, 2015

What I meant was:

  • you have some scheme for deciding that a peer is unhealthy
  • that decision is then a forever decision
  • you probably want to probe remembered unhealthy peers in some fashion on an increasingly infrequent basis until they recover or are removed from the peer list (see the sketch below)
  • such a strategy sounds very similar to how, at the tchannel level, we'll be marking peers as unhealthy and then maybe probing them opportunistically, airlock-style and/or with TChannel pings
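A minimal sketch of that kind of increasingly infrequent probing, assuming a simple doubling backoff with a cap; the names and constants are illustrative, not tchannel's or ringpop's actual API:

```ts
// Hypothetical sketch of backoff probing for remembered unhealthy peers.
interface UnhealthyPeer {
  address: string;
  nextProbeAt: number;     // epoch ms of the next allowed probe
  probeIntervalMs: number; // grows after each failed probe
}

const MAX_PROBE_INTERVAL_MS = 5 * 60_000; // illustrative cap

function scheduleNextProbe(peer: UnhealthyPeer, now: number): void {
  peer.probeIntervalMs = Math.min(peer.probeIntervalMs * 2, MAX_PROBE_INTERVAL_MS);
  peer.nextProbeAt = now + peer.probeIntervalMs;
}

// Probe every due peer; return the addresses that "unbroke" so the caller
// can restore them, and back off further on the ones still unreachable.
async function probeUnhealthyPeers(
  peers: UnhealthyPeer[],
  ping: (address: string) => Promise<boolean>
): Promise<string[]> {
  const recovered: string[] = [];
  const now = Date.now();
  for (const peer of peers.filter((p) => p.nextProbeAt <= now)) {
    if (await ping(peer.address)) {
      recovered.push(peer.address);
    } else {
      scheduleNextProbe(peer, now);
    }
  }
  return recovered;
}
```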

@jwolski
Contributor Author

jwolski commented Mar 5, 2015

This makes more sense now.

Yes, the approach may very well be similar.

Ringpop currently chooses whom to ping during its protocol period based on member status alone, but it could easily take status plus the time of the last health probe into account, and reverse a faulty member to an alive one as the result of a valid response.
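A rough sketch of such a selection rule, under the assumption of a fixed minimum interval between probes of faulty members; `TrackedMember`, `eligibleForPing`, `onPingAck`, and the constant are all hypothetical:

```ts
// Hypothetical sketch: ping-target selection by status + last health probe.
interface TrackedMember {
  address: string;
  status: 'alive' | 'suspect' | 'faulty';
  lastProbedAt: number; // epoch ms
}

const FAULTY_PROBE_INTERVAL_MS = 60_000; // illustrative

// Alive and suspect members are always eligible; faulty members only when
// their last probe is old enough, so they don't dominate the protocol period.
function eligibleForPing(member: TrackedMember, now: number): boolean {
  if (member.status !== 'faulty') return true;
  return now - member.lastProbedAt >= FAULTY_PROBE_INTERVAL_MS;
}

// On a valid response, reverse faulty -> alive.
function onPingAck(member: TrackedMember): void {
  if (member.status === 'faulty') {
    member.status = 'alive';
  }
}
```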

@mranney

mranney commented Mar 5, 2015

One potential issue with using the primary protocol for this is that hosts that are down might always run up against the ping timeout. If you lose 10 out of 100 nodes, you might end up waiting for 10 ping timeouts before advancing. I think it would be better to have a second protocol with a slower period that attempts to revive any previously up nodes.
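A minimal sketch of that two-loop arrangement, with illustrative periods; `pingOneAliveMember` and `probeOneFaultyMember` are hypothetical stand-ins for the real protocol steps:

```ts
// Hypothetical sketch: keep the primary gossip loop fast, and revive
// previously-up nodes from a second, slower loop so down hosts can't
// stall the main protocol period on ping timeouts.
const PROTOCOL_PERIOD_MS = 200;   // illustrative primary period
const REVIVAL_PERIOD_MS = 10_000; // illustrative, much slower period

function startProtocolLoops(
  pingOneAliveMember: () => Promise<void>,
  probeOneFaultyMember: () => Promise<void>
): void {
  // Primary protocol: only pings members believed alive, so it never
  // waits out a dead host's timeout.
  setInterval(() => { void pingOneAliveMember(); }, PROTOCOL_PERIOD_MS);

  // Revival protocol: slow enough that timeouts from down hosts are cheap.
  setInterval(() => { void probeOneFaultyMember(); }, REVIVAL_PERIOD_MS);
}
```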

@jwolski
Contributor Author

jwolski commented Mar 5, 2015

@mranney Yep, good point. From the sounds of Hashicorp's presentation, they also maintain a separate faulty-member loop. I thought it'd be nice/clean to fit this into the membership iterator used during normal protocol period operation, but delaying pings because of faulty/slow members would suck.
