Commit

Apply suggestions from code review
Co-authored-by: ekexium <eke@fastmail.com>
qiancai and ekexium authored Dec 31, 2024
1 parent 2e33e21 commit 3c394f1
Showing 2 changed files with 3 additions and 3 deletions.
2 changes: 1 addition & 1 deletion batch-processing.md
@@ -46,7 +46,7 @@ Pipelined DML is an experimental feature introduced in TiDB v8.0.0. In v8.5.0, t

#### Key benefits

-- Streams data to the storage layer during transaction execution instead of caching it entirely in memory, allowing transaction size no longer limited by TiDB memory and supporting ultra-large-scale data processing
+- Streams data to the storage layer during transaction execution instead of buffering it entirely in memory, allowing transaction size no longer limited by TiDB memory and supporting ultra-large-scale data processing
- Achieves faster performance compared to standard DML
- Can be enabled through system variables without SQL modifications

4 changes: 2 additions & 2 deletions pipelined-dml.md.md
@@ -13,7 +13,7 @@ This document introduces the use cases, methods, limitations, and common issues

## Overview

-Pipelined DML is an experimental feature introduced in TiDB v8.0.0 to improve the performance of large-scale data write operations. When this feature is enabled, TiDB streams data directly to the storage layer during DML operations, instead of caching it entirely in memory. This pipeline-like approach simultaneously reads data (input) and writes it to the storage layer (output), effectively resolving common challenges in large-scale DML operations as follows:
+Pipelined DML is an experimental feature introduced in TiDB v8.0.0 to improve the performance of large-scale data write operations. When this feature is enabled, TiDB streams data directly to the storage layer during DML operations, instead of buffering it entirely in memory. This pipeline-like approach simultaneously reads data (input) and writes it to the storage layer (output), effectively resolving common challenges in large-scale DML operations as follows:

- Memory limits: traditional DML operations might encounter out-of-memory (OOM) errors when handling large datasets.
- Performance bottlenecks: large transactions are often inefficient and is prone to causing workload fluctuations.
@@ -113,7 +113,7 @@ You can monitor the execution of Pipelined DML using the following methods:
- Check the [`tidb_last_txn_info`](/system-variables.md#tidb_last_txn_info-new-in-v409) system variable to get information about the last transaction executed in the current session, including whether Pipelined DML was used.
- Look for lines containing `"[pipelined dml]"` in TiDB logs to understand the execution process and progress of Pipelined DML, including the current stage and the amount of data written.
- View the `affected rows` field in the [`expensive query`](/identify-expensive-queries.md#expensive-query-log-example) logs to track the progress of long-running statements.
-- Query the [`INFORMATION_SCHEMA.PROCESSLIST`](/information-schema/information-schema-processlist.md) table to view transaction execution progress. Pipelined DML is typically used for large transactions, so you can use this table monitor their execution progress.
+- Query the [`INFORMATION_SCHEMA.PROCESSLIST`](/information-schema/information-schema-processlist.md) table to view transaction execution progress. Pipelined DML is typically used for large transactions, so you can use this table to monitor their execution progress.

## FAQs
