Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC + Support of BACKUP and RESTORE statements #15274

Merged
merged 10 commits into from
Apr 30, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
331 changes: 331 additions & 0 deletions docs/design/2020-04-20-brie.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,331 @@
# Proposal: Backup/Restore/Import/Export via SQL

- Author(s): [kennytm](https://github.com/kennytm)
- Last updated: 2020-04-20
- Discussion at: https://github.com/pingcap/tidb/pull/15274

## Abstract

Support running backup, restore, import and export (BRIE) tasks directly on TiDB via SQL.

## Background

[BR] is a TiDB ecosystem tool providing fast, KV-level logical backup and restore in [SST] format.
[Lightning] is another tool for importing SQL and CSV data dump into TiDB, using SST as intermediate storage.

We would like to integrate these tools directly into TiDB as SQL statements because

* BR and Lightning requires direct connection with TiKV and PD.
Since direct operation on TiKV bypass TiDB's authentication, these ports must be closed from public.
In highly protected systems like DBaaS, this means BR and Lightning as separate tools cannot be used at all.

* Supporting SQL access also improves usability, as you don't need to install new components to use these functions,
and SQL should be more familiar with database operators.

[BR]: https://github.com/pingcap/br/
[SST]: https://github.com/facebook/rocksdb/wiki/Creating-and-Ingesting-SST-files
[Lightning]: https://github.com/pingcap/tidb-lightning

## Proposal

### Syntax

Introduce 4 kinds of statements for executing backup, restore, import and export:

```sql
BACKUP DATABASE * TO 's3://storage/url';
RESTORE DATABASE * FROM 's3://storage/url';
IMPORT DATABASE * FROM 'file:///data/dump/';
EXPORT DATABASE * TO 's3://storage/url';
```

(Note: there is no plan to implement `EXPORT` as of TiDB 4.0.)

`SCHEMA` can be used as alias of `DATABASE`, as usual.

These can be restricted to selected databases or tables only:

```sql
BACKUP DATABASE db1, db2, db3 TO 's3://storage/url';
BACKUP TABLE db1.tbl1, db2.tbl2, db3.tbl3 TO 's3://storage/url';
```

Use query parameters to describe about the data source. Other options can be supplied at the end of the statement.

```sql
RESTORE DATABASE *
FROM 's3://storage/url?access-key=AAAAA&secret-access-key=SSSSS&force-path-style=1&endpoint=http://10.0.1.55:8888'
RATE_LIMIT = 512 MB/SECOND;
```

The `=` sign in the options can be omitted.

### List of options

<details><summary>Data source parameters</summary>

* s3://
* endpoint=http://s3.example.com:8080
* region=us-east-1
* storage-class=STANDARD
* sse=AES256
* acl=READ
* access-key=AKIAIOSFODNN7EXAMPLE
* secret-access-key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
SunRunAway marked this conversation as resolved.
Show resolved Hide resolved
* provider=aws
* force-path-style=1
* use-accelerate-endpoint=0
* gcs://
* endpoint=http://gcs.example.com:8080
* storage-class=STANDARD
* predefined-acl=READER
* credentials-file=/home/tidb/gcp_creds.json

</details>

<details><summary>Common options</summary>

* SEND_CREDENTIALS_TO_TIKV = 1
* CONCURRENCY = 32
* CHECKSUM = 1
* RATE_LIMIT = 125 MB/SECOND
* CHECKPOINT = 1

</details>

<details><summary>Backup/Export options</summary>

* SNAPSHOT = 4 HOUR AGO
* SNAPSHOT = '2020-04-20 15:00:00'
* SNAPSHOT = 416110777682551
* LAST_BACKUP = '2020-04-20 15:00:00'
* LAST_BACKUP = 416110777682551

</details>

<details><summary>Restore/Import options</summary>

* ONLINE = 1

</details>

<details><summary>Import/Export CSV options</summary>

* CSV_SEPARATOR = ','
* CSV_DELIMITER = '"'
* CSV_HEADER = COLUMNS
* CSV_HEADER = 0
* CSV_NOT_NULL = 1
* CSV_NULL = '\\N'
* CSV_BACKSLASH_ESCAPE = 1
* CSV_TRIM_LAST_SEPARATORS = 1

</details>

<details><summary>Other Import options</summary>

* ANALYZE = 1
* SKIP_SCHEMA_FILES = 1
* BACKEND = 'local'
* TIKV_IMPORTER = '127.0.0.1:8287'
* ON_DUPLICATE = 'replace'
* STRICT_FORMAT = 1

</details>

Some niche configuration (e.g. Lightning's `io-concurrency`) are not exposed currently.

### Execution

All 4 kinds of BRIE statements are **synchronous** — they block the current session until completion.
Nevertheless, these queries can be canceled using the standard `KILL TIDB QUERY n` statement.

Additionally, because the BRIE tasks are not using transactions, they cannot be executed concurrently.
When one BRIE task is already running, another BRIE statement must not start until the existing task has completed.

Because the BRIE statements interact with TiKV directly and can read from / write to arbitrary path,
all 4 statements require SUPER privilege to execute.

If the statement is canceled or an error is encountered, it will report an error e.g.
```
MySQL [tpcc]> backup table tpcc.stock to 'local:///tmp/br_tpcc_30';
ERROR 8124 (HY000): Backup failed: backup meta exists, may be some backup files in the path already
```

If the statement completed successfully, it will report one row of summary e.g.
```
MySQL [tpcc]> backup database tpcc to 'local:///tmp/br_tpcc_32';
+-------------------------+-----------+-----------------+---------------------+---------------------+
| Destination | Size | BackupTS | Queue Time | Execution Time |
+-------------------------+-----------+-----------------+---------------------+---------------------+
| local:///tmp/br_tpcc_32 | 248665063 | 416099531454472 | 2020-04-12 23:09:48 | 2020-04-12 23:09:48 |
+-------------------------+-----------+-----------------+---------------------+---------------------+
1 row in set (58.453 sec)
```

> **Note:**
>
> The `RESTORE` (BR) and `IMPORT` (Lightning) operations are not using transactions and are not ACID compliant.
>
> The data and index KV pairs are ingested separately, i.e. a table being restored would have in an intermediate state
> where some rows are missing index, and some index doesn't have corresponding rows.
> If the operation has failed or canceled, the table would be stuck in this inconsistent state.
>
> There is currently no plan to change this behavior.

### Status query

The status of the queued and running tasks can be queried using the `SHOW` statement:

```sql
SHOW BACKUPS;
SHOW RESTORES;
SHOW IMPORTS;
SHOW EXPORTS;
```

Example:
```
MySQL [(none)]> show backups;
+-------------------------+---------+-------------------+---------------------+---------------------+------------+
| Destination | State | Progress | Queue_Time | Execution_Time | Connection |
+-------------------------+---------+-------------------+---------------------+---------------------+------------+
| local:///tmp/br_tpcc_30 | Backup | 98.38709677419355 | 2020-04-12 23:09:03 | 2020-04-12 23:09:25 | 4 |
| s3://bucket/prefix | Wait | 0 | 2020-04-12 23:09:48 | 0000-00-00 00:00:00 | 9 |
+-------------------------+---------+-------------------+---------------------+---------------------+------------+
```

The content can be filtered using `LIKE` and `WHERE` clauses, e.g.
```sql
SHOW BACKUP LIKE 'local:%'; -- same as `Destination` LIKE 'local:%'
SHOW BACKUP WHERE `State` = 'Wait';
```

Similar to the BRIE statements, these `SHOW` statements also require SUPER privilege.

To protect against leakage of secret keys,
XuHuaiyu marked this conversation as resolved.
Show resolved Hide resolved
all query parameters of S3 and GCS destinations are discarded in the SHOW statements.
SunRunAway marked this conversation as resolved.
Show resolved Hide resolved

## Rationale

### Synchronous execution

We have chosen synchronous execution entirely due to usability.
Internal tests find that it is very confusing that running `BACKUP` returns immediately while nothing is written out,
and not knowing when the `BACKUP` has finished.

This is also similar to the treatment of other long-running tasks in MySQL like `CREATE INDEX` and `ANALYZE`.

In most existing RDBMS which supports BRIE via SQL, they are executed synchronously:

| Dialect | Sync? | Syntax |
|---------|-------|--------|
| [Transact-SQL](https://docs.microsoft.com/en-us/sql/t-sql/statements/restore-statements-transact-sql?view=sql-server-ver15) | Sync | `RESTORE DATABASE db1 FROM URL = 'https://example.com/backup1.bak'` |
| [IBM DB2](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0001976.html) | Sync | `RESTORE DATABASE db1 FROM 'db2remote://example//backup/1'` |
| [SAP HANA](https://help.sap.com/viewer/4fe29514fd584807ac9f2a04f6754767/2.0.04/en-US/93637a07e3b544398aa02de1541b903c.html) | Sync | `RECOVER DATA USING FILE ('/data/backup1') CLEAR LOG` |
| [MemSQL](https://docs.memsql.com/v7.0/reference/sql-reference/pipelines-commands/create-pipeline/) (ETL) | Async | `CREATE PIPELINE restore_pl AS LOAD DATA S3 'bucket/backup1' CREDENTIALS '{…}' INTO TABLE db1.tbl1; START PIPELINE restore_pl` |
| [CockroachDB](https://www.cockroachlabs.com/docs/stable/restore.html) | Sync | `RESTORE DATABASE db1 FROM 's3://bucket/backup1'` |

On the other hand, synchronous execution is prone to accidental cancellation due to losing connection.

Anyway, if we do want to make it asynchronous, the syntax should be changed to emphasize this aspect (similar to MemSQL's syntax).

### Syntax alternatives

The current syntax inherits from an early design, but they may be tweaked, e.g.

* We use `DATABASE *` to indicate no filtering, but perhaps we should omit it?

```sql
BACKUP TO 's3://storage/url';
RESTORE FROM 's3://storage/url';
IMPORT FROM 'file:///data/dump/';
EXPORT TO 's3://storage/url';
```

### Configuring the data source

Currently we encode all storage-related parameters into the data source URL.
There are two other alternatives:

1. Use options:

```sql
RESTORE DATABASE * FROM 's3://storage/url'
S3_ACCESS_KEY = 'AAAAA'
S3_SECRET_ACCESS_KEY = 'SSSSS'
S3_FORCE_PATH_STYLE = 1
S3_ENDPOINT = 'http://10.0.1.55:8888'
RATE_LIMIT = 512 MB/SECOND;
```

2. Use a JSON configuration:

```sql
RESTORE DATABASE * FROM 's3://storage/url'
CONFIG = '{"s3":{"access-key":"AAAAA", "secret-access-key":"SSSSS", "force-path-style":true, "endpoint":"http://10.0.1.55:8888"}}'
RATE_LIMIT = 512 MB/SECOND;
```

The current BR tool's command line is close to the first option, changed from using query parameters.
This was mainly introduced to avoid accidentally leaking the secret keys.
In this integration we need to strip off the query parameter as soon as possible.

For comparison, to configure a cloud destination in other RDBMS,

| Dialect | Configuration |
|---------|---------------|
| Transact-SQL | Automatically configured inside Azure, not supported otherwise |
| IBM DB2 | User needs to configure the `db2remote://` first, before executing the BRIE statements |
| SAP HANA | Cloud storage seems not supported |
| MemSQL | Two JSON strings, one for credentials and one for non-secret settings (e.g. region, endpoint) |
| CockroachDB | URL query parameters |

## Compatibility and Migration Plan

The four BRIE statements are pure additions in TiDB 4.0+, and thus compatible with older versions.

During data migration, the `BACKUP` and `EXPORT` statements from upstream should be ignored.

`RESTORE` and `IMPORT` statements cannot be properly replicated.
Both of these will execute some DDL statements via SQL (which can be replicated).
However, the data ingestion skips the SQL layer, so binlog replication fails.
Even at TiKV level, because downstream doesn't have the same SST files, CDC may not work properly.

Currently `RESTORE` and `IMPORT` only support *offline* execution on a *fresh* cluster, i.e.
the cluster is not supposed to be serving external actors, and replication should not be set up,
so this is not expected to be a problem.
However, we should take this in mind when we support true *online* restore.

## Implementation

The BRIE statements are implemented by embedding [BR] and [Lightning] as modules into TiDB.

These modules will expose an entry point (e.g. [the `RunBackup` function] in BR) which

* accepts a configuration structure defining entirely how a task should run, and
* executes the task in the current goroutine

The entry point function will be shared by both the CLI tool and TiDB executor.
TiDB should provide every information it knows into the configuration, including:

* PD addresses
* TLS certificate and key paths

The tools should upstream logs and metrics (which should happen automatically).

[the `RunBackup` function]: https://github.com/pingcap/br/blob/f5c37f36457b582710634f31fc2b9a1fb66254c6/pkg/task/backup.go#L94

On the Backup/Restore part, PRs are already submitted to implement this:

* Parser — https://github.com/pingcap/parser/pull/746
* TiDB — https://github.com/pingcap/tidb/pull/15274

The Import (Lightning) part will start after the Backup/Restore design is accepted.

There is currently no plan to implement the Export part.

## Testing Plan

Since BRIE relies on proper TiKV, the built-in testing framework using mocktikv will not work.
Testing is going to be relying entirely on integration tests.
4 changes: 4 additions & 0 deletions errno/errcode.go
Original file line number Diff line number Diff line change
Expand Up @@ -1038,6 +1038,10 @@ const (
ErrPrivilegeCheckFail = 8121
ErrInvalidWildCard = 8122
ErrMixOfGroupFuncAndFieldsIncompatible = 8123
ErrBRIEBackupFailed = 8124
ErrBRIERestoreFailed = 8125
ErrBRIEImportFailed = 8126
ErrBRIEExportFailed = 8127

// Error codes used by TiDB ddl package
ErrUnsupportedDDLOperation = 8200
Expand Down
5 changes: 5 additions & 0 deletions errno/errname.go
Original file line number Diff line number Diff line change
Expand Up @@ -1068,6 +1068,11 @@ var MySQLErrName = map[uint16]string{
ErrUnsupportedSecondArgumentType: "JSON_OBJECTAGG: unsupported second argument type %v",
ErrLockExpire: "TTL manager has timed out, pessimistic locks may expire, please commit or rollback this transaction",

ErrBRIEBackupFailed: "Backup failed: %s",
ErrBRIERestoreFailed: "Restore failed: %s",
ErrBRIEImportFailed: "Import failed: %s",
ErrBRIEExportFailed: "Export failed: %s",

// TiKV/PD errors.
ErrPDServerTimeout: "PD server timeout",
ErrTiKVServerTimeout: "TiKV server timeout",
Expand Down
Loading