Backup & Restore #2053
Conversation
This commit adds the backup command to the influxd binary and implements a SnapshotWriter in the influxdb package. By default the snapshot handler binds to 127.0.0.1 so it cannot be accessed from outside the local machine.
This commit adds the "influxd restore" command to the CLI. This allows a snapshot that has been produced by "influxd backup" to be restored to a config location and the broker and raft directories will be bootstrapped based on the state of the snapshot.
} else if err != nil {
	return fmt.Errorf("next: entry=%s, err=%s", sf.Name, err)
}
Should it log output here just to let the user know that each file has been processed? That way they get incremental output as the restore progresses.
Yeah, that's a good idea.
Perhaps the unpack() call should take a progress callback. I've seen this pattern in other places, but can't remember where.
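For illustration, here is a minimal sketch of the suggested pattern, with a hypothetical unpack signature (the PR's real arguments may differ):

```go
package main

import (
	"archive/tar"
	"io"
	"log"
	"os"
)

// unpack iterates a snapshot tar archive and invokes fn after each
// entry is processed so the caller can report progress. This is a
// hypothetical shape, not the PR's actual implementation.
func unpack(path string, fn func(name string)) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	tr := tar.NewReader(f)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		} else if err != nil {
			return err
		}
		// ...restore the entry's contents to disk here...
		if fn != nil {
			fn(hdr.Name)
		}
	}
	return nil
}

func main() {
	err := unpack("mysnapshot", func(name string) {
		log.Printf("restored %s", name)
	})
	if err != nil {
		log.Fatal(err)
	}
}
```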
Added progress: 4bc92c3
Is it possible to specify the path for the snapshot so you can save it on another volume?
@pauldix Yep, that's supported. Sorry that wasn't clearer in the PR description:

$ influxd backup /path/to/my/snapshot
backup downloads a snapshot of a data node and saves it to disk.

-host <url>
Minor: would this help output be more accurate if it read -host <host:port>?
The fact that it is used as part of a URL could be considered an implementation detail, no?
We parse it as a URL so it needs to specify the scheme as well (http or https).
Then the default shown should include the scheme, no?
Fixed: 4bc92c3
// Ensure the backup returns an error if it cannot connect to the server.
func TestBackupCommand_ErrConnectionRefused(t *testing.T) {
	// Start and immediate stop a server so we have a dead port.
Super nit-pick: "immediately".
Fixed: 4bc92c3
Generally makes sense, happy to play with this once it's merged. +1

Did you consider compressing as well as tarring?
This commit adds incremental backup support. Snapshotting from the server now creates a full backup if one does not exist and creates numbered incremental backups after that. For example, if you ran:

$ influxd backup /tmp/snapshot

Then you'll see a full snapshot in /tmp/snapshot. If you run the same command again then an incremental snapshot will be created at /tmp/snapshot.0. Running it again will create /tmp/snapshot.1.
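To illustrate the numbering scheme (nextSnapshotPath is a hypothetical helper, not code from this PR):

```go
package main

import (
	"fmt"
	"os"
)

// nextSnapshotPath returns the base path if no full snapshot exists
// yet, otherwise the first unused numbered path: base.0, base.1, ...
func nextSnapshotPath(base string) string {
	if _, err := os.Stat(base); os.IsNotExist(err) {
		return base
	}
	for i := 0; ; i++ {
		p := fmt.Sprintf("%s.%d", base, i)
		if _, err := os.Stat(p); os.IsNotExist(err) {
			return p
		}
	}
}

func main() {
	fmt.Println(nextSnapshotPath("/tmp/snapshot"))
}
```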
Overview
This pull request adds the ability to snapshot a single data node at a point-in-time and restore it.
Usage
While a data node is running, you can create a hot backup to a snapshot file (mysnapshot), as shown below. By default, this can only be run from the data node itself; see the configuration options below to snapshot from another machine.
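Presumably the invocation mirrors the one shown earlier in the thread, with mysnapshot as the target path:

$ influxd backup mysnapshot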
Once you have your snapshot file, you can copy it to another machine and restore it:
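The restore invocation presumably takes the config file plus the snapshot path; the -config flag name here is an assumption based on the description below, which says the directories to replace are read from the configuration file provided:

$ influxd restore -config /etc/influxdb.conf mysnapshot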
This command will remove the broker and data directories listed in the configuration file provided and replace them with the data in the snapshot. Once the restore is complete, you can start the influxd server normally.

Configuration Options
A configuration section has been added for the snapshot handler with the following defaults:
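Given the commit message's note that the snapshot handler binds to 127.0.0.1 by default, the section presumably looks something like this (the section name, key names, and port are assumptions):

```toml
[snapshot]
bind-address = "127.0.0.1"
port = 8087
```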
The bind address restricts snapshots so they can only be taken from the local machine.
API
The following are the primary API additions:
Snapshots
Snapshot Reader
Snapshot Writer
Server Snapshot Writer Creation
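As a rough illustration of how these pieces might fit together (only the Snapshot and SnapshotWriter names come from the PR text; all fields, signatures, and the reader type are assumptions):

```go
package influxdb

import "io"

// Snapshot describes a point-in-time copy of a data node: its meta
// file plus an entry for each shard file.
type Snapshot struct {
	Files []SnapshotFile
}

// SnapshotFile describes one file inside a snapshot archive.
type SnapshotFile struct {
	Name string // path within the tar archive
	Size int64  // file size in bytes
}

// SnapshotReader reads a snapshot archive from an underlying reader.
type SnapshotReader struct {
	r io.Reader
}

// NewSnapshotReader returns a reader over a snapshot archive.
func NewSnapshotReader(r io.Reader) *SnapshotReader {
	return &SnapshotReader{r: r}
}

// SnapshotWriter streams a snapshot as a tar archive: the manifest
// first, then the meta file, then each shard file.
type SnapshotWriter struct {
	Snapshot *Snapshot
}

// WriteTo implements io.WriterTo. The body is elided; a real
// implementation would copy each file in Snapshot.Files into the tar.
func (sw *SnapshotWriter) WriteTo(w io.Writer) (n int64, err error) {
	return 0, nil
}

// Server is a stub standing in for the influxdb server type.
type Server struct{}

// CreateSnapshotWriter would return a writer positioned over a
// point-in-time snapshot of the server's metastore and shards.
func (s *Server) CreateSnapshotWriter() (*SnapshotWriter, error) {
	return &SnapshotWriter{Snapshot: &Snapshot{}}, nil
}
```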
Implementation
The snapshot file is a single tar archive that contains a manifest file at the beginning, the data node's meta file next, and then a list of all shard files. The metastore and shards all use Bolt, so they contain a point-in-time copy of the database when the backup was initiated.

The broker node is not backed up because it can be materialized from the data in the data node. The restore command generates a broker meta store based on the highest index in the data node and generates a raft configuration based on the InfluxDB config passed in.
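To make the layout concrete, here is a standalone sketch that writes an archive in that order with Go's archive/tar package (entry names and file paths are illustrative placeholders, not the PR's actual values):

```go
package main

import (
	"archive/tar"
	"log"
	"os"
)

// writeFile appends one file from disk to the archive under name.
func writeFile(tw *tar.Writer, name, path string) error {
	b, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	hdr := &tar.Header{Name: name, Mode: 0o600, Size: int64(len(b))}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	_, err = tw.Write(b)
	return err
}

func main() {
	out, err := os.Create("mysnapshot")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	tw := tar.NewWriter(out)
	defer tw.Close()

	// Layout described above: manifest first, then the meta file,
	// then every shard file.
	files := []struct{ name, path string }{
		{"manifest", "/tmp/manifest"},
		{"meta", "/var/opt/influxdb/meta"},
		{"shards/1", "/var/opt/influxdb/data/1"},
	}
	for _, f := range files {
		if err := writeFile(tw, f.name, f.path); err != nil {
			log.Fatal(err)
		}
	}
}
```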
Caveats
This approach currently only works in clusters where the replication factor equals the number of nodes in the cluster. A cluster-wide backup and restore will be done in the future.
TODO
Fixes: #1468 #1947