
Deal with Total Cluster Failure in diskless log scenario #75

Open
andrewjstone opened this issue Jan 13, 2017 · 0 comments

Comments

@andrewjstone
Contributor

andrewjstone commented Jan 13, 2017

The Viewstamped Replication Revisited protocol used by Haret differs from both Raft and Paxos in that it does not require syncing to disk in order to operate and tolerate a minority of replica failures. The trade-off is that if a majority of replicas fail at the same time, the system becomes unrecoverable.

By utilizing snapshots, Haret can minimize the amount of data lost in this scenario and allow failed replicas to be restarted. There also needs to be some way to rejoin them to the cluster manually via an admin disaster recovery protocol. etcd has good docs on this for their system.
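A rough sketch of what that admin recovery flow could look like is below. All of the type and function names here are hypothetical, not Haret APIs: the idea is just to pick the newest snapshot among whichever survivors the admin can reach, accept the loss of anything after it, reseed a single-member cluster from that snapshot, and then rejoin the restarted replicas through the normal reconfiguration path.

```rust
// Hypothetical admin disaster-recovery sketch for total cluster failure in the
// diskless-log configuration. None of these types exist in Haret today.

use std::collections::BTreeMap;

/// Snapshot of the replicated KV state plus the view/op it covers (hypothetical).
#[derive(Clone, Debug)]
struct Snapshot {
    view: u64,
    last_op: u64,
    data: BTreeMap<String, String>,
}

/// Replica identifier (hypothetical).
#[derive(Clone, Debug, PartialEq, Eq)]
struct ReplicaId(String);

/// Pick the most recent snapshot among reachable survivors. Everything after the
/// chosen snapshot's `last_op` is lost; the admin must accept that explicitly.
fn newest_snapshot(snapshots: &[(ReplicaId, Snapshot)]) -> Option<&(ReplicaId, Snapshot)> {
    snapshots.iter().max_by_key(|(_, s)| (s.view, s.last_op))
}

/// Reseed the cluster from the chosen snapshot: the survivor becomes a
/// single-member group, then restarted replicas are joined back one at a time
/// (each join would state-transfer the snapshot).
fn force_recover(survivor: ReplicaId, snapshot: Snapshot, rejoining: Vec<ReplicaId>) -> Vec<ReplicaId> {
    let mut members = vec![survivor];
    println!(
        "reseeding cluster at view {} / op {} with {} keys",
        snapshot.view, snapshot.last_op, snapshot.data.len()
    );
    for r in rejoining {
        println!("rejoining {:?}", r);
        members.push(r);
    }
    members
}

fn main() {
    let snaps = vec![
        (ReplicaId("r1".into()), Snapshot { view: 3, last_op: 120, data: BTreeMap::new() }),
        (ReplicaId("r2".into()), Snapshot { view: 3, last_op: 150, data: BTreeMap::new() }),
    ];
    if let Some((id, snap)) = newest_snapshot(&snaps) {
        let members = force_recover(
            id.clone(),
            snap.clone(),
            vec![ReplicaId("r3".into()), ReplicaId("r4".into())],
        );
        println!("recovered membership: {:?}", members);
    }
}
```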

@andrewjstone andrewjstone added this to the Diskless-log-KV-1.0 milestone Jun 22, 2017
@andrewjstone andrewjstone changed the title Deal with Total Cluster Failure Deal with Total Cluster Failure in diskless log scenario Jun 22, 2017