-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Heal raft cluster #4685
Heal raft cluster #4685
Conversation
} | ||
|
||
if sz == 0 { | ||
return internal.RPCType_Error, nil, fmt.Errorf("invalid message size: %d", sz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jwilder this originally returned 0
which didn't seem valid, so I updated it to internal.RPCType_Error
Needs a changelog and just some minor naming comments, but 👍. |
2e2766c
to
cc74eed
Compare
@@ -63,6 +65,8 @@ func (r *rpc) proxyLeader(conn *net.TCPConn) { | |||
defer leaderConn.Close() | |||
|
|||
leaderConn.Write([]byte{MuxRPCHeader}) | |||
// re-write the original message to the leader | |||
leaderConn.Write(buf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's going on here? Was the red blob removed below moved into this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used to always forward everything to the leader. No, depending on the message, it may not be intended for the leader. In the case of promoting a new raft node, we explicitly don't want to talk to the leader, but talk to the node that needs to be promoted to the raft cluster. The read was refactored into two other methods to make it easier to interrogate the message coming in. Then, if we did want to talk to the leader, we have to re-write the original message, which is what we are doing above.
This is a pretty significant feature, that some people may not want. I could envision some unstable clusters where nodes come and go, and auto-promotion makes it harder to understand what is happening. So can I suggest a config option to allow this to be disabled? In the https://github.com/influxdb/influxdb/blob/master/etc/config.sample.toml#L27
for example. |
@otoolep I agree about the config setting to disable this feature. I'll add that. I'll do it as |
cc74eed
to
9f8e014
Compare
if err := s.rpc.enableRaft(n.Host, peers); err != nil { | ||
return fmt.Errorf("error notifying raft peer: %s", err) | ||
} | ||
s.Logger.Printf("promoted nodeID %d, host %s to raft peer", n.ID, n.Host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@otoolep added log entry for successful promotion of node to raft peer.
115e368
to
da32c82
Compare
@@ -35,6 +35,7 @@ | |||
- [#4721](https://github.com/influxdb/influxdb/pull/4721): Export tsdb.InterfaceValues | |||
- [#4681](https://github.com/influxdb/influxdb/pull/4681): Increase default buffer size for collectd and graphite listeners | |||
- [#4659](https://github.com/influxdb/influxdb/pull/4659): Support IF EXISTS for DROP DATABASE | |||
- [#4685](https://github.com/influxdb/influxdb/pull/4685): Heal Raft Cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I think this could be a bit more descriptive. E.g. "Cluster auto-promotes data node to Raft nodes on loss of Raft node". Just a suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only promotes a raft node when it sees a gap in the peers. It can only see a gap in the peers if someone issues a DROP SERVER
.
They type of healing you are referring too is beyond the scope of this PR, and I'm not sure we would actually want to take those steps or not.
@corylanou -- what will happen in the following scenario?
What is the status of the Raft node that comes back? What if it has data on it, that was solely replicated on that node? Will that data be queryable? |
If a partition happens, we don't change any of the peers. You have to manually issue a drop server before another node would promote itself. |
Ah, OK, this only happens in response to |
da32c82
to
4187fbb
Compare
+1 |
This PR enables the ability for a node to be promoted to a raft node when needed to fully heal the raft cluster.