Commit
en, zh: update doc for backup and restore (#233)
* update doc for backup and restore

* update toc

* Apply suggestions from code review

Co-authored-by: Ran <huangran@pingcap.com>

* update notes

* update title

* en: update description and format

* en, zh: update term and descriptions

* zh: update titles and format

Co-authored-by: Ran <huangran@pingcap.com>
Co-authored-by: lilin90 <lilin@pingcap.com>
3 people authored May 8, 2020
1 parent 7f398d9 commit 749f460
Showing 19 changed files with 226 additions and 118 deletions.
6 changes: 3 additions & 3 deletions en/TOC.md
@@ -45,10 +45,10 @@
+ Backup and Restore
- [Use Helm Charts](backup-and-restore-using-helm-charts.md)
+ Use CRDs
- [Back up Data to GCS](backup-to-gcs.md)
- [Restore Data from GCS](restore-from-gcs.md)
- [Back up Data to GCS Using Mydumper](backup-to-gcs.md)
- [Restore Data from GCS Using TiDB Lightning](restore-from-gcs.md)
- [Back up Data to S3-Compatible Storage Using Mydumper](backup-to-s3.md)
- [Restore Data from S3-Compatible Storage Using Loader](restore-from-s3.md)
- [Restore Data from S3-Compatible Storage Using TiDB Lightning](restore-from-s3.md)
- [Back up Data to S3-Compatible Storage Using BR](backup-to-aws-s3-using-br.md)
- [Restore Data from S3-Compatible Storage Using BR](restore-from-aws-s3-using-br.md)
- [Restore Data Using TiDB Lightning](restore-data-using-tidb-lightning.md)
6 changes: 3 additions & 3 deletions en/backup-and-restore-using-helm-charts.md
@@ -12,10 +12,10 @@ This document describes how to back up and restore the data of a TiDB cluster in
For TiDB Operator 1.1 or later versions, it is recommended that you use the backup and restoration methods based on CustomResourceDefinition (CRD).

+ If the TiDB cluster version < v3.1, refer to the following documents:
- [Back up Data to GCS](backup-to-gcs.md)
- [Restore Data from GCS](restore-from-gcs.md)
- [Back up Data to GCS Using Mydumper](backup-to-gcs.md)
- [Restore Data from GCS Using TiDB Lightning](restore-from-gcs.md)
- [Back up Data to S3-Compatible Storage Using Mydumper](backup-to-s3.md)
- [Restore Data from S3-Compatible Storage Using Loader](restore-from-s3.md)
- [Restore Data from S3-Compatible Storage Using TiDB Lightning](restore-from-s3.md)
+ If the TiDB cluster version >= v3.1, refer to the following documents:
- [Back up Data to S3-Compatible Storage Using BR](backup-to-aws-s3-using-br.md)
- [Restore Data from S3-Compatible Storage Using BR](restore-from-aws-s3-using-br.md)
4 changes: 2 additions & 2 deletions en/backup-to-aws-s3-using-br.md
@@ -1,6 +1,6 @@
---
title: Back up Data to S3-Compatible Storage Using BR
summary: Learn how to back up data to AWS S3 using BR.
summary: Learn how to back up data to Amazon S3 using BR.
category: how-to
---

@@ -96,7 +96,7 @@ Before you perform ad-hoc full backup, AWS account permissions need to be granted
3. Create the IAM role:

- To create an IAM role for the account, refer to [Create an IAM User](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html).
- Give the required permission to the IAM role you have created. Refer to [Adding and Removing IAM Identity Permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) for details. Because `Backup` needs to access the AWS S3 storage, IAM is granted the `AmazonS3FullAccess` permission.
- Give the required permission to the IAM role you have created. Refer to [Adding and Removing IAM Identity Permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) for details. Because `Backup` needs to access the Amazon S3 storage, IAM is granted the `AmazonS3FullAccess` permission.

4. Associate IAM with TiKV Pod:

74 changes: 58 additions & 16 deletions en/backup-to-s3.md
@@ -27,7 +27,22 @@ Refer to [Ad-hoc full backup prerequisites](backup-to-aws-s3-using-br.md#prerequ

### Ad-hoc backup process

+ Create the `Backup` CR, and back up cluster data to AWS S3 by importing AccessKey and SecretKey to grant permissions:
> **Note:**
>
> Because of the `rclone` [issue](https://rclone.org/s3/#key-management-system-kms), if the backup data is stored in Amazon S3 and the `AWS-KMS` encryption is enabled, you need to add the following `spec.s3.options` configuration to the YAML file in the examples of this section:
>
> ```yaml
> spec:
> ...
> s3:
> ...
> options:
> - --ignore-checksum
> ```
**Examples:**
+ Create the `Backup` CR, and back up cluster data to Amazon S3 by importing AccessKey and SecretKey to grant permissions:
{{< copyable "shell-regular" >}}
@@ -55,7 +70,8 @@ Refer to [Ad-hoc full backup prerequisites](backup-to-aws-s3-using-br.md#prerequ
s3:
provider: aws
secretName: s3-secret
# region: us-east-1
region: ${region}
bucket: ${bucket}
# storageClass: STANDARD_IA
# acl: private
# endpoint:
@@ -89,12 +105,13 @@ Refer to [Ad-hoc full backup prerequisites](backup-to-aws-s3-using-br.md#prerequ
s3:
provider: ceph
secretName: s3-secret
endpoint: http://10.0.0.1:30074
endpoint: ${endpoint}
bucket: ${bucket}
storageClassName: local-storage
storageSize: 10Gi
```
+ Create the `Backup` CR, and back up data by binding IAM with Pod to grant permissions:
+ Create the `Backup` CR, and back up data to Amazon S3 by binding IAM with Pod to grant permissions:
{{< copyable "shell-regular" >}}
@@ -122,15 +139,16 @@ Refer to [Ad-hoc full backup prerequisites](backup-to-aws-s3-using-br.md#prerequ
secretName: backup-demo1-tidb-secret
s3:
provider: aws
# region: us-east-1
region: ${region}
bucket: ${bucket}
# storageClass: STANDARD_IA
# acl: private
# endpoint:
storageClassName: local-storage
storageSize: 10Gi
```
+ Create the `Backup` CR, and back up data by binding IAM with ServiceAccount to grant permissions:
+ Create the `Backup` CR, and back up data to Amazon S3 by binding IAM with ServiceAccount to grant permissions:
{{< copyable "shell-regular" >}}
@@ -157,15 +175,16 @@ Refer to [Ad-hoc full backup prerequisites](backup-to-aws-s3-using-br.md#prerequ
secretName: backup-demo1-tidb-secret
s3:
provider: aws
# region: us-east-1
region: ${region}
bucket: ${bucket}
# storageClass: STANDARD_IA
# acl: private
# endpoint:
storageClassName: local-storage
storageSize: 10Gi
```
In the above two examples, all data of the TiDB cluster is exported and backed up to Amazon S3 and Ceph respectively. You can ignore the `region`, `acl`, `endpoint`, and `storageClass` configuration items in the Amazon S3 configuration. S3-compatible storage types other than Amazon S3 can also use configuration similar to that of Amazon S3. You can also leave the configuration item fields empty if you do not need to configure these items as shown in the above Ceph configuration.
In the examples above, all data of the TiDB cluster is exported and backed up to Amazon S3 and Ceph respectively. In the Amazon S3 configuration, you can ignore the `acl`, `endpoint`, and `storageClass` configuration items. S3-compatible storage types other than Amazon S3 can use a configuration similar to that of Amazon S3. As shown in the Ceph configuration above, you can also leave a configuration item empty if you do not need to configure it.
Amazon S3 supports the following access-control list (ACL) policies:
@@ -203,9 +222,11 @@ More `Backup` CRs are described as follows:
* `.spec.from.host`: the address of the TiDB cluster to be backed up.
* `.spec.from.port`: the port of the TiDB cluster to be backed up.
* `.spec.from.user`: the accessing user of the TiDB cluster to be backed up.
* `.spec.from.tidbSecretName`: the secret of the credential needed by the TiDB cluster to be backed up.
* `.spec.storageClassName`: the persistent volume (PV) type specified for the backup operation. If this item is not specified, the value of the `default-backup-storage-class-name` parameter (`standard` by default, specified when TiDB Operator is started) is used by default.
* `.spec.storageSize`: the PV size specified for the backup operation. This value must be greater than the size of the TiDB cluster to be backed up.
* `.spec.from.secretName`: the secret that contains the password of `.spec.from.user`.
* `.spec.s3.region`: the region of Amazon S3.
* `.spec.s3.bucket`: the bucket name of S3.
* `.spec.storageClassName`: the persistent volume (PV) type specified for the backup operation.
* `.spec.storageSize`: the PV size specified for the backup operation. This value must be greater than the backup data size of the TiDB cluster.
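The fields listed above can be assembled into a minimal `Backup` CR. The sketch below is illustrative only: the names, host, and values are placeholders, not values taken from this document, and the full examples earlier in this section remain the authoritative reference:

```yaml
apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: demo1-backup-s3
  namespace: test1
spec:
  from:
    host: ${tidb_host}        # address of the TiDB cluster to be backed up
    port: 4000
    user: backup
    secretName: backup-demo1-tidb-secret  # secret containing the password of .spec.from.user
  s3:
    provider: aws
    region: ${region}
    bucket: ${bucket}
  storageClassName: local-storage  # PV type used by the backup job
  storageSize: 10Gi                # must be larger than the backup data size
```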
More S3-compatible `provider`s are described as follows:
@@ -228,6 +249,23 @@ The prerequisites for the scheduled backup is the same as the [prerequisites for
### Scheduled backup process
> **Note:**
>
> Because of the `rclone` [issue](https://rclone.org/s3/#key-management-system-kms), if the backup data is stored in Amazon S3 and the `AWS-KMS` encryption is enabled, you need to add the following `spec.backupTemplate.s3.options` configuration to the YAML file in the examples of this section:
>
> ```yaml
> spec:
> ...
> backupTemplate:
> ...
> s3:
> ...
> options:
> - --ignore-checksum
> ```
**Examples:**
+ Create the `BackupSchedule` CR to enable the scheduled full backup to Amazon S3 by importing AccessKey and SecretKey to grant permissions:
{{< copyable "shell-regular" >}}
@@ -259,7 +297,8 @@ The prerequisites for the scheduled backup is the same as the [prerequisites for
s3:
provider: aws
secretName: s3-secret
# region: us-east-1
region: ${region}
bucket: ${bucket}
# storageClass: STANDARD_IA
# acl: private
# endpoint:
@@ -298,7 +337,8 @@ The prerequisites for the scheduled backup is the same as the [prerequisites for
s3:
provider: ceph
secretName: s3-secret
endpoint: http://10.0.0.1:30074
endpoint: ${endpoint}
bucket: ${bucket}
storageClassName: local-storage
storageSize: 10Gi
```
@@ -335,7 +375,8 @@ The prerequisites for the scheduled backup is the same as the [prerequisites for
secretName: backup-demo1-tidb-secret
s3:
provider: aws
# region: us-east-1
region: ${region}
bucket: ${bucket}
# storageClass: STANDARD_IA
# acl: private
# endpoint:
@@ -374,7 +415,8 @@ The prerequisites for the scheduled backup is the same as the [prerequisites for
secretName: backup-demo1-tidb-secret
s3:
provider: aws
# region: us-east-1
region: ${region}
bucket: ${bucket}
# storageClass: STANDARD_IA
# acl: private
# endpoint:
@@ -398,7 +440,7 @@ You can use the following command to check all the backup items:
kubectl get bk -l tidb.pingcap.com/backup-schedule=demo1-backup-schedule-s3 -n test1
```
From the above two examples, you can see that the `backupSchedule` configuration consists of two parts. One is the unique configuration of `backupSchedule`, and the other is `backupTemplate`. `backupTemple` specifies the configuration related to the S3-compatible storage, which is the same as the configuration of the ad-hoc full backup to the S3-compatible storage (refer to [Ad-hoc backup process](#ad-hoc-backup-process) for details). The following are the unique configuration items of `backupSchedule`:
From the examples above, you can see that the `backupSchedule` configuration consists of two parts. One is the unique configuration of `backupSchedule`, and the other is `backupTemplate`. `backupTemplate` specifies the configuration related to the S3-compatible storage, which is the same as the configuration of the ad-hoc full backup to the S3-compatible storage (refer to [Ad-hoc backup process](#ad-hoc-backup-process) for details). The following are the unique configuration items of `backupSchedule`:
+ `.spec.maxBackups`: A backup retention policy, which determines the maximum number of backup items to be retained. When this value is exceeded, the outdated backup items will be deleted. If you set this configuration item to `0`, all backup items are retained.
+ `.spec.maxReservedTime`: A backup retention policy based on time. For example, if you set the value of this configuration to `24h`, only backup items within the recent 24 hours are retained. All backup items out of this time are deleted. For the time format, refer to [`func ParseDuration`](https://golang.org/pkg/time/#ParseDuration). If you have set the maximum number of backup items and the longest retention time of backup items at the same time, the latter setting takes effect.
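As a sketch, both retention settings sit alongside `schedule` at the top level of the `BackupSchedule` spec. The values below are illustrative placeholders, and the `backupTemplate` body is elided because it matches the ad-hoc `Backup` configuration shown earlier:

```yaml
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: demo1-backup-schedule-s3
  namespace: test1
spec:
  schedule: "0 0 * * *"    # cron format: run a full backup daily at midnight
  maxBackups: 5            # keep at most 5 backup items; 0 keeps all
  maxReservedTime: "24h"   # keep only backups from the last 24 hours; overrides maxBackups when both are set
  backupTemplate:
    ...                    # same fields as an ad-hoc Backup CR (s3, storageClassName, storageSize, ...)
```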
6 changes: 3 additions & 3 deletions en/notes-tidb-operator-v1.1.md
@@ -130,9 +130,9 @@ After TiDB Operator is upgraded to v1.1, you can perform full backup using the B

After the TiDB Operator is upgraded to v1.1, you can restore data using the Restore CR.

- If the TiDB cluster version < v3.1, refer to [Restore data using Loader](restore-from-s3.md).
- If the TiDB cluster version >= v3.1, refer to [Restore data using BR](restore-from-aws-s3-using-br.md).
- If the TiDB cluster version < v3.1, refer to [Restore Data from S3-Compatible Storage Using TiDB Lightning](restore-from-s3.md).
- If the TiDB cluster version >= v3.1, refer to [Restore Data from S3-Compatible Storage Using BR](restore-from-aws-s3-using-br.md).

> **Note:**
>
> Currently, with Backup CR, you can restore data only from S3 and GCS using Loader, and restore data from S3 using BR. If you need to restore the backup data from local Persistent Volume Claim (PVC), you cannot switch to the CR management.
> Currently, with Restore CR, you can use TiDB Lightning to restore data from S3 and GCS, and use BR to restore data only from S3. If you need to restore the backup data from local Persistent Volume Claim (PVC), you cannot switch to the CR management.
16 changes: 8 additions & 8 deletions en/restore-data-using-tidb-lightning.md
@@ -90,17 +90,17 @@ TiDB Lightning Helm chart supports both local and remote data sources.

* Remote

Unlike the local mode, the remote mode needs to use [rclone](https://rclone.org) to download Mydumper backup tarball file from a network storage to a PV. Any cloud storage supported by rclone should work, but currently only the following have been tested: [Google Cloud Storage (GCS)](https://cloud.google.com/storage/), [AWS S3](https://aws.amazon.com/s3/), [Ceph Object Storage](https://ceph.com/ceph-storage/object-storage/).
Unlike the local mode, the remote mode needs to use [rclone](https://rclone.org) to download the Mydumper backup tarball file from network storage to a PV. Any cloud storage supported by rclone should work, but currently only the following have been tested: [Google Cloud Storage (GCS)](https://cloud.google.com/storage/), [Amazon S3](https://aws.amazon.com/s3/), and [Ceph Object Storage](https://ceph.com/ceph-storage/object-storage/).

To restore backup data from the remote source, take the following steps:

1. Make sure that `dataSource.local.nodeName` and `dataSource.local.hostPath` in `values.yaml` are commented out.

2. Create a `Secret` containing the rclone configuration. A sample configuration is listed below. Only one cloud storage configuration is required. For other cloud storages, refer to [rclone documentation](https://rclone.org/). Using AWS S3 as the storage is the same as restoring data using BR and Mydumper.
2. Create a `Secret` containing the rclone configuration. A sample configuration is listed below. Only one cloud storage configuration is required. For other cloud storages, refer to the [rclone documentation](https://rclone.org/). The configuration for using Amazon S3 as the storage is the same as that for restoring data using BR and Mydumper.

There are three methods to grant permissions. The configuration varies with different methods. For details, see [Backup the TiDB Cluster on AWS using BR](backup-to-aws-s3-using-br.md#three-methods-to-grant-aws-account-permissions).

* If you grant permissions by importing AWS S3 AccessKey and SecretKey, or if you use Ceph or GCS as the storage, use the following configuration:
* If you grant permissions by importing Amazon S3 AccessKey and SecretKey, or if you use Ceph or GCS as the storage, use the following configuration:

{{< copyable "" >}}

@@ -136,7 +136,7 @@ TiDB Lightning Helm chart supports both local and remote data sources.
service_account_credentials = ${service_account_json_file_content}
```

* If you grant permissions by associating AWS S3 IAM with Pod or with ServiceAccount, you can ignore `s3.access_key_id` and `s3.secret_access_key`:
* If you grant permissions by associating Amazon S3 IAM with Pod or with ServiceAccount, you can ignore `s3.access_key_id` and `s3.secret_access_key`:

{{< copyable "" >}}

@@ -165,19 +165,19 @@

The method of deploying TiDB Lightning varies with different methods of granting permissions and with different storages.

* If you grant permissions by importing AWS S3 AccessKey and SecretKey, or if you use Ceph or GCS as the storage, run the following command to deploy TiDB Lightning:
* If you grant permissions by importing Amazon S3 AccessKey and SecretKey, or if you use Ceph or GCS as the storage, run the following command to deploy TiDB Lightning:

{{< copyable "shell-regular" >}}

```shell
helm install pingcap/tidb-lightning --name=${release_name} --namespace=${namespace} --set failFast=true -f tidb-lightning-values.yaml --version=${chart_version}
```

* If you grant permissions by associating AWS S3 IAM with Pod, take the following steps:
* If you grant permissions by associating Amazon S3 IAM with Pod, take the following steps:

1. Create the IAM role:

[Create an IAM role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html) for the account, and [grant the required permission](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) to the role. The IAM role requires the `AmazonS3FullAccess` permission because TiDB Lightning needs to access AWS S3 storage.
[Create an IAM role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html) for the account, and [grant the required permission](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) to the role. The IAM role requires the `AmazonS3FullAccess` permission because TiDB Lightning needs to access Amazon S3 storage.

2. Modify `tidb-lightning-values.yaml`, and add the `iam.amazonaws.com/role: arn:aws:iam::123456789012:role/user` annotation in the `annotations` field.

@@ -193,7 +193,7 @@ The method of deploying TiDB Lightning varies with different methods of granting
>
> `arn:aws:iam::123456789012:role/user` is the IAM role created in Step 1.

* If you grant permissions by associating AWS S3 with ServiceAccount, take the following steps:
* If you grant permissions by associating Amazon S3 with ServiceAccount, take the following steps:

1. Enable the IAM role for the service account on the cluster:
