Add document of dumpling, the new export tool #3151

Merged 25 commits on May 26, 2020
Changes from 2 commits
Commits
764f465
dumpling: add a how-to of dumpling.
May 19, 2020
df46891
Merge branch 'docs-special-week' into sw-dumpling
YuJuncen May 19, 2020
feb154c
dumpling: change document by code review.
May 20, 2020
7b767a5
Merge branch 'sw-dumpling' of https://github.com/YuJuncen/docs-cn int…
May 20, 2020
d784de9
Update mydumper-overview.md
YuJuncen May 20, 2020
44b09f7
Merge branch 'docs-special-week' into sw-dumpling
YuJuncen May 20, 2020
f3821cc
Merge branch 'docs-special-week' into sw-dumpling
lilin90 May 20, 2020
446d081
*: udpate file name and wording
lilin90 May 20, 2020
de2afb9
Apply suggestions from code review
YuJuncen May 21, 2020
0e03726
dumpling: move intrduction of --sql and --where.
YuJuncen May 21, 2020
63012c8
dumpling: move a warnning of --where flag
YuJuncen May 21, 2020
66bd9c8
Rename export-data-using-dumpling.md to export-or-backup-using-dumpli…
YuJuncen May 21, 2020
a39267a
dumpling: fix some links
YuJuncen May 21, 2020
4be6349
Merge branch 'docs-special-week' into sw-dumpling
YuJuncen May 21, 2020
1ab5ee9
Update export-or-backup-using-dumpling.md
YuJuncen May 21, 2020
e95c470
Merge branch 'docs-special-week' into sw-dumpling
YuJuncen May 24, 2020
7276c64
Merge branch 'docs-special-week' into sw-dumpling
YuJuncen May 25, 2020
d50d517
mydumper: remove deprecated info for mydumper
May 25, 2020
09f0854
Merge branch 'docs-special-week' into sw-dumpling
YuJuncen May 25, 2020
8818321
Merge branch 'docs-special-week' into sw-dumpling
3pointer May 25, 2020
10370c0
Merge branch 'docs-special-week' into sw-dumpling
3pointer May 25, 2020
325c78a
Update export-or-backup-using-dumpling.md
YuJuncen May 26, 2020
4fe19d4
Update export-or-backup-using-dumpling.md
YuJuncen May 26, 2020
3b389f1
Update export-or-backup-using-dumpling.md
lilin90 May 26, 2020
134f954
Merge branch 'docs-special-week' into sw-dumpling
sre-bot May 26, 2020
1 change: 1 addition & 0 deletions TOC.md
@@ -190,6 +190,7 @@
- [Common Ansible operations](/maintain-tidb-using-ansible.md)
+ Backup and restore
- [Back up and restore using Mydumper/TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md)
- [Export data using Dumpling](/export-using-dumpling.md)
- [Back up and restore using BR](/br/backup-and-restore-tool.md)
- [BR backup and restore use cases](/br/backup-and-restore-use-cases.md)
+ Identify abnormal queries
6 changes: 6 additions & 0 deletions backup-and-restore-using-mydumper-lightning.md
@@ -6,6 +6,12 @@ aliases: ['/docs-cn/dev/how-to/maintain/backup-and-restore/mydumper-lightning/',

# Back up and restore using Mydumper/TiDB Lightning
Review comment (Contributor):

How about simply retitling this document "Back up and restore using Dumpling/TiDB Lightning"?
Then there would be no need for a special note below about replacing Mydumper with Dumpling.
@3pointer, what do you think?

Review comment (Contributor, @kissmydb, May 19, 2020):

If Dumpling is positioned as ready to replace Mydumper for TiDB backups, I suggest titling this document "Back up and restore using Dumpling/TiDB Lightning", moving Mydumper into a separate document, and marking that document as Deprecated.

An overview also seems to be missing: for example, when to choose Dumpling plus Lightning, when to choose BR backup, when to use CDC for incremental backup, and when to use BR for incremental backup.

Review comment (Contributor):

That guidance already exists in the tools use guide.

Review comment (Contributor):

The tools guide is not the right place for it. Users who arrive through the backup section will miss it, so an overview should be added here in the backup section.


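> **Warning:**
>
> `MyDumper`, as described in this document, is no longer maintained by us.
>
> We recommend migrating where possible to the new data export tool [Dumpling](/export-using-dumpling.md), together with the existing [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md), for full data backup and restore.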
YuJuncen marked this conversation as resolved.

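This document describes in detail how to use Mydumper/TiDB Lightning to perform full backup and restore of TiDB. For incremental backup and restore, use [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md).

The TiDB service information is assumed to be as follows: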
96 changes: 96 additions & 0 deletions export-using-dumpling.md
@@ -0,0 +1,96 @@
---
title: Export Data Using Dumpling
summary: Use the new export tool Dumpling to export or back up data.
category: how-to
---

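# Export Data Using Dumpling

This document describes how to use the new export tool Dumpling.
It exports data stored in TiDB in SQL or CSV format, and can be used for a logical full backup or export.
If you need to back up SST files (key-value pairs) directly, or need incremental backups that are not latency-sensitive, see [BR](/br/backup-and-restore-tool.md).
If you need real-time incremental backup, see [TiCDC](/ticdc/ticdc-overview.md).

The rest of this document assumes that the cluster to export from is already running. If you only want to try the tool out, you can also spin up a cluster on the spot with [TiUP](/tiup/tiup-overview.md):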
YuJuncen marked this conversation as resolved.

{{< copyable "shell-regular" >}}

```shell
tiup playground
```

Then load test data with `sysbench`:

{{< copyable "shell-regular" >}}

```shell
sysbench --mysql-host=127.0.0.1 \
--mysql-user=root \
--mysql-port=4000 \
--mysql-db=test \
oltp_insert \
--tables=3 --table-size=1000 prepare
```
YuJuncen marked this conversation as resolved.

## Export data from TiDB

Export the data with the following command:

{{< copyable "shell-regular" >}}

```shell
dumpling \
-u root \
-P 4000 \
-H 127.0.0.1 \
--filetype sql \
--threads 32 \
-o /tmp/test \
-F $(( 1024 * 1024 * 256 ))
```

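In this command, `-H`, `-P`, and `-u` are the classic address, port, and user triple. If password authentication is required, pass the password to Dumpling with `-p $YOUR_SECRET_PASSWORD`.

By default, Dumpling exports the tables of the entire database, except those in system databases. You can use `--where` to select the records to export. If the export format is CSV (use `--filetype csv` to export CSV files), you can also use `--sql` to export the records selected by a given SQL statement.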
Review comment (Contributor):

If exporting specific tables is not supported, how does `--where column=xxx` specify its condition? Examples of `--where` and `--sql` could be added.
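
As a rough illustration of what the reviewer asks for, the two flags might be used as sketched below. This is only a sketch under stated assumptions: the connection flags are copied from the export command in this document, the helper names `dump_where` and `dump_sql` are hypothetical, and the `sbtest1` table with an `id` column is assumed from the sysbench data set. Since no table can be named, the `--where` condition is presumably applied to every exported table.

```shell
# Sketch only. Connection flags are copied from the export command in this
# document; the helper names are hypothetical.

# Export only the rows that match a condition.
dump_where() {
  # $1 = output directory, $2 = condition passed to --where
  dumpling -u root -P 4000 -H 127.0.0.1 \
    --filetype sql \
    -o "$1" \
    --where "$2"
}

# Export the result of a single query as CSV (the text ties --sql to CSV output).
dump_sql() {
  # $1 = output directory, $2 = SELECT statement passed to --sql
  dumpling -u root -P 4000 -H 127.0.0.1 \
    --filetype csv \
    -o "$1" \
    --sql "$2"
}
```

For example, `dump_where /tmp/test-where 'id < 100'` or `dump_sql /tmp/test-sql 'select * from test.sbtest1 where id < 100'`, where the output directories and the queried table are assumptions.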


> **Note:**
>
> At the time of writing, Dumpling does not yet support exporting only a user-specified subset of tables (see [this issue](https://github.com/pingcap/dumpling/issues/76)).
> If you really need this feature, use [MyDumper](/backup-and-restore-using-mydumper-lightning.md) for now.

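By default, the exported files are stored in the `./export-<current local time>` directory. Use `-o` to choose the directory for the exported files. The `-F` option specifies the maximum size of a single file (unlike MyDumper, the unit here is bytes). Similarly, the `-r` option specifies the maximum number of records (that is, database rows) in a single file. These options can be used to raise Dumpling's degree of parallelism.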
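
Because `-F` takes a byte count, it can help to compute the value rather than type it by hand. The following is a minimal sketch, assuming the same connection flags as the export command in this document. The helper names `mib_to_bytes` and `dump_chunked` are hypothetical, and the source does not say whether combining `-F` and `-r` is advisable.

```shell
# Convert MiB to bytes, since -F is described here as taking a byte count.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Sketch: cap both the size of each file (-F, bytes) and its row count (-r).
dump_chunked() {
  # $1 = output directory, $2 = max file size in MiB, $3 = max rows per file
  dumpling -u root -P 4000 -H 127.0.0.1 \
    --filetype sql \
    --threads 32 \
    -o "$1" \
    -F "$(mib_to_bytes "$2")" \
    -r "$3"
}
```

`mib_to_bytes 256` prints `268435456`, the same value as `$(( 1024 * 1024 * 256 ))` in the export command.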

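In addition, you can use the `--snapshot` flag to specify the timestamp of the snapshot to export. Related to it is `--consistency`, which controls how the consistency of the exported data is guaranteed. For TiDB, consistency is guaranteed by default by taking a snapshot at a given timestamp, which is why `--snapshot` can be used to specify the timestamp to back up.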
Review comment (Contributor):

This sentence is not very clear. Does `--consistency` have to be specified before `--snapshot` can be used? Otherwise, is the export inconsistent? I suggest rewording it.

Review comment (Contributor, Author):

Both places have been revised. How does it read now?
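
To make the relationship between the two flags concrete, here is a hedged sketch that passes only `--snapshot` and leaves `--consistency` at its default, which the paragraph under review says is snapshot-based for TiDB. The helper name `dump_at_snapshot` is hypothetical. The accepted timestamp format is not specified in this document, so the value is simply passed through.

```shell
# Sketch: export the data as of a given snapshot, leaving --consistency at
# its default.
# $1 = snapshot timestamp (format assumed; not specified in this document)
# $2 = output directory
dump_at_snapshot() {
  dumpling -u root -P 4000 -H 127.0.0.1 \
    --filetype sql \
    -o "$2" \
    --snapshot "$1"
}
```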


Once everything is done, you should see the exported files in `/tmp/test`:

``` shell
$ ls -lh /tmp/test | awk '{print $5 "\t" $9}'

140B metadata
66B test-schema-create.sql
300B test.sbtest1-schema.sql
190K test.sbtest1.0.sql
300B test.sbtest2-schema.sql
190K test.sbtest2.0.sql
300B test.sbtest3-schema.sql
190K test.sbtest3.0.sql
```
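
A quick sanity check on such a directory might look like the sketch below. It assumes the naming pattern seen in the listing above (`<db>.<table>-schema.sql` plus `<db>.<table>.<n>.sql`), which is inferred from this example and not from a Dumpling specification. The helper name `check_export` is hypothetical. An empty table might legitimately have no data file, so a warning is a prompt to look, not proof of failure.

```shell
# Check that every <db>.<table>-schema.sql file in an export directory has at
# least one matching <db>.<table>.<n>.sql data file.
check_export() {
  dir=$1
  status=0
  for schema in "$dir"/*-schema.sql; do
    [ -e "$schema" ] || continue          # glob matched nothing
    base=${schema%-schema.sql}            # e.g. /tmp/test/test.sbtest1
    found=0
    for data in "$base".*.sql; do
      if [ -e "$data" ]; then
        found=1
        break
      fi
    done
    if [ "$found" -eq 0 ]; then
      echo "missing data file for ${base##*/}"
      status=1
    fi
  done
  return "$status"
}
```

Usage: `check_export /tmp/test` returns a non-zero status and names the table if a schema file has no data file next to it.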

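In addition, if the data volume is very large, you can extend the GC lifetime in advance so that the export does not fail because GC runs during the export: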

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time';
```

After the operation finishes, set the GC lifetime back to its original value (the default is `10m`):

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time';
```
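
The extend, export, restore sequence can be wrapped in a small script so that the restore step is not forgotten. This is a sketch under assumptions: it relies on the `mysql` command-line client being installed and able to reach TiDB as `root` on `127.0.0.1:4000`. The helper names are hypothetical. It restores `10m` on the assumption that the cluster was using the default before the export.

```shell
# Run one of the two UPDATE statements from this document through the mysql client.
set_gc_life_time() {
  # $1 = new value, for example 720h or 10m
  mysql -h 127.0.0.1 -P 4000 -u root -e \
    "update mysql.tidb set VARIABLE_VALUE = '$1' where VARIABLE_NAME = 'tikv_gc_life_time';"
}

# Sketch: lengthen the GC lifetime, export, then restore the default.
dump_with_long_gc() {
  # $1 = output directory
  set_gc_life_time 720h || return 1
  rc=0
  dumpling -u root -P 4000 -H 127.0.0.1 --filetype sql -o "$1" || rc=$?
  # Restore the default even if the export failed (assumes 10m was in use before).
  set_gc_life_time 10m
  return "$rc"
}
```

Usage: `dump_with_long_gc /tmp/test`. The function does not handle interruption (for example Ctrl-C), so after an aborted run the GC lifetime still needs to be reset by hand.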

Finally, all of the exported data can be imported back into TiDB with [Lightning](/tidb-lightning/tidb-lightning-tidb-backend.md).
6 changes: 6 additions & 0 deletions mydumper-overview.md
@@ -7,6 +7,12 @@ aliases: ['/docs-cn/dev/reference/tools/mydumper/']

# Mydumper User Guide

> **Warning:**
>
> `MyDumper`, as described in this document, is no longer maintained by us.
>
> We recommend migrating where possible to the new data export tool [Dumpling](/export-using-dumpling.md).
YuJuncen marked this conversation as resolved.

## Introduction to Mydumper

[Mydumper](https://github.com/pingcap/mydumper) is a fork project optimized for TiDB's features. It is the recommended tool for logical backups of TiDB.
YuJuncen marked this conversation as resolved.