Merge partitioned clusters #42

Open
jwolski opened this issue Mar 4, 2015 · 6 comments

@jwolski
Contributor

jwolski commented Mar 4, 2015

Ringpop should be able to handle network partitions that temporarily cause two or more clusters to form. A ringpop instance never attempts to rejoin faulty members, but it could, since faulty members are retained in the membership list.
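A minimal sketch of what such a rejoin pass could look like. All of the names here (`Member`, `MemberStatus`, `rejoinFaultyMembers`, `ping`) are hypothetical, not ringpop's actual API:

```ts
// Hypothetical sketch; not ringpop's real types or API.
type MemberStatus = 'alive' | 'suspect' | 'faulty';

interface Member {
  address: string;
  status: MemberStatus;
}

// Faulty members stay in the membership list, so once a partition heals,
// re-attempting contact with them would let the two halves merge via gossip.
async function rejoinFaultyMembers(
  members: Member[],
  ping: (address: string) => Promise<boolean>
): Promise<void> {
  for (const member of members.filter((m) => m.status === 'faulty')) {
    if (await ping(member.address)) {
      member.status = 'alive'; // reachable again; gossip will reconcile state
    }
  }
}
```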

@jcorbin
Contributor

jcorbin commented Mar 4, 2015

I'd posit that such logic is very similar to airlock/prober-style logic. Let's keep this case in mind when architecting tchannel peering strategies, something I'll be focusing on shortly.

@jwolski
Contributor Author

jwolski commented Mar 4, 2015

I don't see how this is related to airlock behavior.

This problem occurs at the gossip/membership level. If the network partitions, members of the original cluster will eventually converge on either side of the partition. Once the partition heals, nothing within ringpop will attempt to merge the two halves. Serf has a mechanism by which nodes attempt to rejoin faulty members; we'll need similar behavior.

Excuse the brevity of the original description.

@jcorbin
Contributor

jcorbin commented Mar 5, 2015

What I meant was:

  • you have some scheme for deciding that a peer is unhealthy
  • that decision is then a forever decision
  • you probably want to probe remembered unhealthy peers in some fashion on an increasingly infrequent basis until they recover or are removed from the peer list (see the sketch below)
  • such a strategy sounds very similar to how, at the tchannel level, we'll be marking peers as unhealthy and then maybe probing them opportunistically, airlock-style and/or with TChannel pings
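A minimal sketch of that kind of increasingly infrequent probing, assuming a simple doubling backoff with a cap; the names and constants are illustrative, not tchannel's or ringpop's actual API:

```ts
// Hypothetical sketch of backoff probing for remembered unhealthy peers.
interface UnhealthyPeer {
  address: string;
  nextProbeAt: number;     // epoch ms of the next allowed probe
  probeIntervalMs: number; // grows after each failed probe
}

const MAX_PROBE_INTERVAL_MS = 5 * 60_000; // illustrative cap

function scheduleNextProbe(peer: UnhealthyPeer, now: number): void {
  peer.probeIntervalMs = Math.min(peer.probeIntervalMs * 2, MAX_PROBE_INTERVAL_MS);
  peer.nextProbeAt = now + peer.probeIntervalMs;
}

// Probe every due peer; return the addresses that "unbroke" so the caller
// can restore them, and back off further on the ones still unreachable.
async function probeUnhealthyPeers(
  peers: UnhealthyPeer[],
  ping: (address: string) => Promise<boolean>
): Promise<string[]> {
  const recovered: string[] = [];
  const now = Date.now();
  for (const peer of peers.filter((p) => p.nextProbeAt <= now)) {
    if (await ping(peer.address)) {
      recovered.push(peer.address);
    } else {
      scheduleNextProbe(peer, now);
    }
  }
  return recovered;
}
```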

@jwolski
Contributor Author

jwolski commented Mar 5, 2015

This makes more sense now.

Yes, the approach may very well be similar.

Ringpop currently chooses whom to ping during its protocol period based on member status alone, but it could easily take status plus the time of the last health probe into account, and reverse a faulty member to an alive one as the result of a valid response.
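A rough sketch of such a selection rule, under the assumption of a fixed minimum interval between probes of faulty members; `TrackedMember`, `eligibleForPing`, `onPingAck`, and the constant are all hypothetical:

```ts
// Hypothetical sketch: ping-target selection by status + last health probe.
interface TrackedMember {
  address: string;
  status: 'alive' | 'suspect' | 'faulty';
  lastProbedAt: number; // epoch ms
}

const FAULTY_PROBE_INTERVAL_MS = 60_000; // illustrative

// Alive and suspect members are always eligible; faulty members only when
// their last probe is old enough, so they don't dominate the protocol period.
function eligibleForPing(member: TrackedMember, now: number): boolean {
  if (member.status !== 'faulty') return true;
  return now - member.lastProbedAt >= FAULTY_PROBE_INTERVAL_MS;
}

// On a valid response, reverse faulty -> alive.
function onPingAck(member: TrackedMember): void {
  if (member.status === 'faulty') {
    member.status = 'alive';
  }
}
```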

@mranney

mranney commented Mar 5, 2015

One potential issue with using the primary protocol for this is that hosts that are down might always run up against the ping timeout. If you lose 10 out of 100 nodes, you might end up waiting for 10 ping timeouts before advancing. I think it would be better to have a second protocol with a slower period that attempts to revive any previously up nodes.
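A minimal sketch of that two-loop arrangement, with illustrative periods; `pingOneAliveMember` and `probeOneFaultyMember` are hypothetical stand-ins for the real protocol steps:

```ts
// Hypothetical sketch: keep the primary gossip loop fast, and revive
// previously-up nodes from a second, slower loop so down hosts can't
// stall the main protocol period on ping timeouts.
const PROTOCOL_PERIOD_MS = 200;   // illustrative primary period
const REVIVAL_PERIOD_MS = 10_000; // illustrative, much slower period

function startProtocolLoops(
  pingOneAliveMember: () => Promise<void>,
  probeOneFaultyMember: () => Promise<void>
): void {
  // Primary protocol: only pings members believed alive, so it never
  // waits out a dead host's timeout.
  setInterval(() => { void pingOneAliveMember(); }, PROTOCOL_PERIOD_MS);

  // Revival protocol: slow enough that timeouts from down hosts are cheap.
  setInterval(() => { void probeOneFaultyMember(); }, REVIVAL_PERIOD_MS);
}
```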

@jwolski
Contributor Author

jwolski commented Mar 5, 2015

@mranney Yep, good point. From the sounds of Hashicorp's presentation, they also maintain a separate faulty-member loop. I thought it'd be nice/clean to fit this into the membership iterator used during normal protocol period operation, but delaying pings because of faulty/slow members would suck.
