diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index 91e0b0586b971..2c786c5df5f5e 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -280,3 +280,5 @@ For v3.0.4 and v2.1.16 or earlier, the `approximate_keys` of regions are inaccur
 If a TiKV node fails, PD defaults to setting the corresponding node to the **down** state after 30 minutes (customizable by configuration item `max-store-down-time`), and rebalancing replicas for regions involved.
 
 Practically, if a node failure is considered unrecoverable, you can immediately take it offline. This makes PD replenish replicas soon in another node and reduces the risk of data loss. In contrast, if a node is considered recoverable, but the recovery cannot be done in 30 minutes, you can temporarily adjust `max-store-down-time` to a larger value to avoid unnecessary replenishment of the replicas and resources waste after the timeout.
+
+In TiDB v5.2.0, TiKV introduces a mechanism to detect slow TiKV nodes. By sampling the requests processed in TiKV, this mechanism calculates a score ranging from 1 to 100. A TiKV node with a score greater than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config) to detect and schedule slow nodes. When there is one and only one slow node and its slow score reaches the upper limit (100 by default), all Region leaders on that node are evicted.
\ No newline at end of file
diff --git a/pd-control.md b/pd-control.md
index fe003c7457ba6..b6e0ac083c029 100644
--- a/pd-control.md
+++ b/pd-control.md
@@ -700,7 +700,8 @@ Usage:
 >> scheduler add evict-leader-scheduler 1 // Move all the Region leaders on store 1 out
 >> scheduler config evict-leader-scheduler // Display the stores in which the scheduler is located since v4.0.0
 >> scheduler add shuffle-leader-scheduler // Randomly exchange the leader on different stores
->> scheduler add shuffle-region-scheduler // Randomly scheduling the regions on different stores
+>> scheduler add shuffle-region-scheduler // Randomly scheduling the Regions on different stores
+>> scheduler add evict-slow-store-scheduler // When there is one and only one slow store, evict all Region leaders of that store
 >> scheduler remove grant-leader-scheduler-1 // Remove the corresponding scheduler, and `-1` corresponds to the store ID
 >> scheduler pause balance-region-scheduler 10 // Pause the balance-region scheduler for 10 seconds
 >> scheduler pause all 10 // Pause all schedulers for 10 seconds
diff --git a/tikv-configuration-file.md b/tikv-configuration-file.md
index 07b05cc8e1c62..5e0e6b238301e 100644
--- a/tikv-configuration-file.md
+++ b/tikv-configuration-file.md
@@ -676,6 +676,13 @@ Configuration items related to Raftstore
 + Controls whether to enable batch processing of the requests. When it is enabled, the write performance is significantly improved.
 + Default value: `true`
 
+### `inspect-interval`
+
++ At a certain interval, TiKV inspects the latency of the Raftstore component. This configuration item specifies the interval of the inspection. If the latency exceeds this value, the inspection is marked as timed out.
++ TiKV determines whether a node is slow based on the ratio of timed-out inspections.
++ Default value: `"500ms"`
++ Minimum value: `"1ms"`
+
 ## Coprocessor
 
 Configuration items related to Coprocessor
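To try out the scheduler added by this change, a minimal pd-ctl session could look like the following sketch. The `pd-ctl` binary and the PD endpoint `http://127.0.0.1:2379` are assumptions about the deployment; only the `scheduler add` and `scheduler show` subcommands come from the documentation above.

```bash
# Add the scheduler that evicts all Region leaders from a store once it is
# the only store marked as slow (the endpoint address is a placeholder).
pd-ctl -u http://127.0.0.1:2379 scheduler add evict-slow-store-scheduler

# Verify that the scheduler now appears in the list of created schedulers.
pd-ctl -u http://127.0.0.1:2379 scheduler show
```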
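Similarly, a sketch of where the new `inspect-interval` item would sit in the TiKV configuration file, assuming it belongs to the `[raftstore]` section of `tikv.toml` (the section the documented page covers); the value simply restates the documented default.

```toml
# tikv.toml (excerpt)
[raftstore]
# Interval at which TiKV inspects Raftstore latency. An inspection whose
# latency exceeds this value is counted as timed out; the ratio of timed-out
# inspections is used to decide whether the node is slow.
inspect-interval = "500ms"
```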