etcdserver: when using --unsafe-no-fsync write data · etcd-io/etcd@94634fc

Commit

etcdserver: when using --unsafe-no-fsync write data

There are situations where we don't wish to fsync but we do want to
write the data.

Typically this occurs in clusters where fsync latency (often the
result of firmware) transiently spikes.  For Kubernetes clusters this
causes (many) leader elections, which have knock-on effects: the API
server transiently fails, causing other components to fail in turn.

By writing the data (buffered and asynchronously flushed, so in most
situations the write is fast) and avoiding the fsync, we no longer
trigger this situation and still opportunistically write out the data.

Anecdotally:
  Because the fsync is missing, there is the argument that certain
  types of failure events will cause data corruption or loss; in
  testing this wasn't seen.  If this were to occur, the expectation is
  that the member can be re-added to a cluster or, worst case, restored
  from a robust persisted snapshot.

  The etcd members are deployed across isolated racks with different
  power feeds.  An instantaneous failure of all of them simultaneously
  is unlikely.

  Testing was usually of the form:
   * create (Kubernetes) etcd write-churn by creating replicasets of
     some 1000s of pods
   * break/fail the leader

  Failure testing included:
   * hard node power-off events
   * disk removal
   * orderly reboots/shutdown

  In all cases when the node recovered it was able to rejoin the
  cluster and synchronize.


cwedgwood committed Mar 5, 2021

1 parent afd6d8a commit 94634fc

wal/wal.go

@@ -789,14 +789,16 @@ func (w *WAL) cut() error {
     }
     func (w *WAL) sync() error {
-    	if w.unsafeNoSync {
-    		return nil
-    	}
     	if w.encoder != nil {
     		if err := w.encoder.flush(); err != nil {
     			return err
     		}
     	}
+    	if w.unsafeNoSync {
+    		return nil
+    	}
     	start := time.Now()
     	err := fileutil.Fdatasync(w.tail().File)
