diff --git a/ydb/docs/en/core/dev/index.md b/ydb/docs/en/core/dev/index.md index e344e0829d68..cb4ccb140566 100644 --- a/ydb/docs/en/core/dev/index.md +++ b/ydb/docs/en/core/dev/index.md @@ -27,6 +27,4 @@ Main resources: - [{#T}](../postgresql/intro.md) - [{#T}](../reference/kafka-api/index.md) -- [{#T}](troubleshooting/index.md) - -If you're interested in developing {{ ydb-short-name }} core or satellite projects, refer to the [documentation for contributors](../contributor/index.md). \ No newline at end of file +If you're interested in developing {{ ydb-short-name }} core or satellite projects, refer to the [documentation for contributors](../contributor/index.md). diff --git a/ydb/docs/en/core/dev/toc_p.yaml b/ydb/docs/en/core/dev/toc_p.yaml index 4d9e04dd192a..30c072102005 100644 --- a/ydb/docs/en/core/dev/toc_p.yaml +++ b/ydb/docs/en/core/dev/toc_p.yaml @@ -18,11 +18,6 @@ items: path: primary-key/toc_p.yaml - name: Secondary indexes href: secondary-indexes.md -- name: Troubleshooting - href: troubleshooting/index.md - include: - mode: link - path: troubleshooting/toc_p.yaml - name: Query plans optimization href: query-plans-optimization.md - name: Batch upload diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/cpu-bottleneck.md b/ydb/docs/en/core/dev/troubleshooting/performance/hardware/cpu-bottleneck.md deleted file mode 100644 index 5fdc79e13044..000000000000 --- a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/cpu-bottleneck.md +++ /dev/null @@ -1,14 +0,0 @@ -# CPU bottleneck - -High CPU usage can lead to slow query processing and increased response times. When CPU resources are constrained, the database may have difficulty handling complex queries or large transaction volumes. - -{{ ydb-short-name }} nodes primarily consume CPU resources for running [actors](../../../../concepts/glossary.md#actor). On each node, actors are executed using multiple [actor system pools](../../../../concepts/glossary.md#actor-system-pools). 
The resource consumption of each pool is measured separately which allows to identify what kind of activity changed its behavior. - -## Diagnostics - - -{% include notitle [#](_includes/cpu-bottleneck.md) %} - -## Recommendation - -Add additional [database nodes](../../../../concepts/glossary.md#database-node) to the cluster or allocate more CPU cores to the existing nodes. If that's not possible, consider distributing CPU cores between pools differently. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/disk-space.md b/ydb/docs/en/core/dev/troubleshooting/performance/hardware/disk-space.md deleted file mode 100644 index fc35621e122a..000000000000 --- a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/disk-space.md +++ /dev/null @@ -1,29 +0,0 @@ -# Disk space - -A lack of available disk space can prevent the database from storing new data, resulting in the database becoming read-only. This can also cause slowdowns as the system tries to reclaim disk space by compacting existing data more aggressively. - -## Diagnostics - -1. See if the **[DB overview > Storage](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** charts in Grafana show any spikes. - -1. In [Embedded UI](../../../../reference/embedded-ui/index.md), on the **Storage** tab, analyze the list of available storage groups and nodes and their disk usage. - - {% note tip %} - - Use the **Out of Space** filter to list only the storage groups with full disks. - - {% endnote %} - - ![](_assets/storage-groups-disk-space.png) - -{% note info %} - -It is also recommended to use the [Healthcheck API](../../../../reference/ydb-sdk/health-check-api.md) to get this information. - -{% endnote %} - -## Recommendations - -Add more [storage groups](../../../../concepts/glossary.md#storage-group) to the database. - -If the cluster doesn't have spare storage groups, configure them first. 
Add additional [storage nodes](../../../../concepts/glossary.md#storage-node), if necessary. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/splits-merges.md b/ydb/docs/en/core/dev/troubleshooting/performance/schemas/splits-merges.md deleted file mode 100644 index e4429a16128e..000000000000 --- a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/splits-merges.md +++ /dev/null @@ -1,32 +0,0 @@ -# Excessive tablet splits and merges - -{% if oss == true and backend_name == "YDB" %} - -{% include [OLAP_not_allow_note](../../../../_includes/not_allow_for_olap_note.md) %} - -{% endif %} - -Each [row-oriented table](../../../../concepts/datamodel/table.md#row-oriented-tables) partition in {{ ydb-short-name }} is processed by a [data shard](../../../../concepts/glossary.md#data-shard) tablet. {{ ydb-short-name }} supports automatic [splitting and merging](../../../../concepts/datamodel/table.md#partitioning) of data shards which allows it to seamlessly adapt to changes in workloads. However, these operations are not free and might have a short-term negative impact on query latencies. - -When {{ ydb-short-name }} splits a partition, it replaces the original partition with two new partitions covering the same range of primary keys. Now, two data shards process the range of primary keys that was previously handled by a single data shard, thereby adding more computing resources for the table. - -By default, {{ ydb-short-name }} splits a table partition when it reaches 2 GB in size. However, it's recommended to also enable partitioning by load, allowing {{ ydb-short-name }} to split overloaded partitions even if they are smaller than 2 GB. - -A [scheme shard](../../../../concepts/glossary.md#scheme-shard) takes approximately 15 seconds to assess whether a data shard requires splitting. By default, the CPU usage threshold for splitting a data shard is set at 50%. 
- -When {{ ydb-short-name }} merges adjacent partitions in a row-oriented table, they are replaced with a single partition that covers their range of primary keys. The corresponding data shards are also consolidated into a single data shard to manage the new partition. - -For merging to occur, data shards must have existed for at least 10 minutes, and their CPU usage over the last hour must not exceed 35%. - -When configuring [table partitioning](../../../../concepts/datamodel/table.md#partitioning), you can also set limits for the [minimum](../../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) and [maximum number of partitions](../../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count). If the difference between the minimum and maximum limits exceeds 20% and the table load varies significantly over time, [Hive](../../../../concepts/glossary.md#hive) may start splitting overloaded tables and then merging them back during periods of low load. - -## Diagnostics - - -{% include notitle [#](_includes/splits-merges.md) %} - -## Recommendations - -If the user load on {{ ydb-short-name }} has not changed, consider adjusting the gap between the min and max limits for the number of table partitions to the recommended 20% difference. Use the [`ALTER TABLE table_name SET (key = value)`](../../../../yql/reference/syntax/alter_table/set.md) YQL statement to update the [`AUTO_PARTITIONING_MIN_PARTITIONS_COUNT`](../../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) and [`AUTO_PARTITIONING_MAX_PARTITIONS_COUNT`](../../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count) parameters. - -If you want to avoid splitting and merging data shards, you can set the min limit to the max limit value or disable partitioning by load. 
diff --git a/ydb/docs/en/core/toc_i.yaml b/ydb/docs/en/core/toc_i.yaml index adedd86f8346..7fac9d9efa4f 100644 --- a/ydb/docs/en/core/toc_i.yaml +++ b/ydb/docs/en/core/toc_i.yaml @@ -37,6 +37,11 @@ items: include: mode: link path: recipes/toc_p.yaml +- name: Troubleshooting + href: troubleshooting/index.md + include: + mode: link + path: troubleshooting/toc_p.yaml - name: Questions and answers href: faq/index.md include: diff --git a/ydb/docs/en/core/dev/troubleshooting/index.md b/ydb/docs/en/core/troubleshooting/index.md similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/index.md rename to ydb/docs/en/core/troubleshooting/index.md diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-batch-pool.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-batch-pool.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-batch-pool.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-batch-pool.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-by-pool.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-by-pool.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-by-pool.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-by-pool.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-ic-pool.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-ic-pool.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-ic-pool.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-ic-pool.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-io-pool.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-io-pool.png similarity 
index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-io-pool.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-io-pool.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-read-only-tx-latency.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-read-only-tx-latency.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-read-only-tx-latency.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-read-only-tx-latency.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-row-read-rows.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-row-read-rows.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-row-read-rows.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-row-read-rows.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-system-pool.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-system-pool.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-system-pool.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-system-pool.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-user-pool.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-user-pool.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/cpu-user-pool.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/cpu-user-pool.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/disk-time-available--disk-cost.png 
b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/disk-time-available--disk-cost.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/disk-time-available--disk-cost.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/disk-time-available--disk-cost.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/embedded-ui-cpu-system-pool.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/embedded-ui-cpu-system-pool.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/embedded-ui-cpu-system-pool.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/embedded-ui-cpu-system-pool.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/microbursts.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/microbursts.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/microbursts.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/microbursts.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/request-size.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/request-size.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/request-size.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/request-size.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/requests.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/requests.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/requests.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/requests.png diff --git 
a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/response-size.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/response-size.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/response-size.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/response-size.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/storage-groups-disk-space.png b/ydb/docs/en/core/troubleshooting/performance/hardware/_assets/storage-groups-disk-space.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_assets/storage-groups-disk-space.png rename to ydb/docs/en/core/troubleshooting/performance/hardware/_assets/storage-groups-disk-space.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md b/ydb/docs/en/core/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md similarity index 79% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md rename to ydb/docs/en/core/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md index 15a27e005e32..d2e89b0c5acb 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md +++ b/ydb/docs/en/core/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md @@ -1,6 +1,6 @@ -1. Use **Diagnostics** in the [Embedded UI](../../../../../reference/embedded-ui/index.md) to analyze CPU utilization in all pools: +1. Use **Diagnostics** in the [Embedded UI](../../../../reference/embedded-ui/index.md) to analyze CPU utilization in all pools: - 1. In the [Embedded UI](../../../../../reference/embedded-ui/index.md), go to the **Databases** tab and click on the database. + 1. In the [Embedded UI](../../../../reference/embedded-ui/index.md), go to the **Databases** tab and click on the database. 1. 
On the **Navigation** tab, ensure the required database is selected. @@ -12,7 +12,7 @@ 1. Use Grafana charts to analyze CPU utilization in all pools: - 1. Open the **[CPU](../../../../../reference/observability/metrics/grafana-dashboards.md#cpu)** dashboard in Grafana. + 1. Open the **[CPU](../../../../reference/observability/metrics/grafana-dashboards.md#cpu)** dashboard in Grafana. 1. See if the following charts show any spikes: diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_includes/io-bandwidth.md b/ydb/docs/en/core/troubleshooting/performance/hardware/_includes/io-bandwidth.md similarity index 86% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/_includes/io-bandwidth.md rename to ydb/docs/en/core/troubleshooting/performance/hardware/_includes/io-bandwidth.md index 9fc44a99178b..83fda0d37845 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/_includes/io-bandwidth.md +++ b/ydb/docs/en/core/troubleshooting/performance/hardware/_includes/io-bandwidth.md @@ -1,4 +1,4 @@ -1. Open the **[Distributed Storage Overview](../../../../../reference/observability/metrics/grafana-dashboards.md)** dashboard in Grafana. +1. Open the **[Distributed Storage Overview](../../../../reference/observability/metrics/grafana-dashboards.md)** dashboard in Grafana. 1. On the **DiskTimeAvailable and total Cost relation** chart, see if the **Total Cost** spikes cross the **DiskTimeAvailable** level. diff --git a/ydb/docs/en/core/troubleshooting/performance/hardware/cpu-bottleneck.md b/ydb/docs/en/core/troubleshooting/performance/hardware/cpu-bottleneck.md new file mode 100644 index 000000000000..41d5c9799c12 --- /dev/null +++ b/ydb/docs/en/core/troubleshooting/performance/hardware/cpu-bottleneck.md @@ -0,0 +1,14 @@ +# CPU bottleneck + +High CPU usage can lead to slow query processing and increased response times. 
When CPU resources are constrained, the database may have difficulty handling complex queries or large transaction volumes. + +{{ ydb-short-name }} nodes primarily consume CPU resources for running [actors](../../../concepts/glossary.md#actor). On each node, actors are executed using multiple [actor system pools](../../../concepts/glossary.md#actor-system-pools). The resource consumption of each pool is measured separately, which makes it possible to identify which kind of activity changed its behavior. + +## Diagnostics + + +{% include notitle [#](_includes/cpu-bottleneck.md) %} + +## Recommendation + +Add additional [database nodes](../../../concepts/glossary.md#database-node) to the cluster or allocate more CPU cores to the existing nodes. If that's not possible, consider distributing CPU cores between pools differently. diff --git a/ydb/docs/en/core/troubleshooting/performance/hardware/disk-space.md b/ydb/docs/en/core/troubleshooting/performance/hardware/disk-space.md new file mode 100644 index 000000000000..112c944c0e1b --- /dev/null +++ b/ydb/docs/en/core/troubleshooting/performance/hardware/disk-space.md @@ -0,0 +1,29 @@ +# Disk space + +A lack of available disk space can prevent the database from storing new data, resulting in the database becoming read-only. This can also cause slowdowns as the system tries to reclaim disk space by compacting existing data more aggressively. + +## Diagnostics + +1. See if the **[DB overview > Storage](../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** charts in Grafana show any spikes. + +1. In [Embedded UI](../../../reference/embedded-ui/index.md), on the **Storage** tab, analyze the list of available storage groups and nodes and their disk usage. + + {% note tip %} + + Use the **Out of Space** filter to list only the storage groups with full disks. 
+ + {% endnote %} + + ![](_assets/storage-groups-disk-space.png) + +{% note info %} + +It is also recommended to use the [Healthcheck API](../../../reference/ydb-sdk/health-check-api.md) to get this information. + +{% endnote %} + +## Recommendations + +Add more [storage groups](../../../concepts/glossary.md#storage-group) to the database. + +If the cluster doesn't have spare storage groups, configure them first. Add additional [storage nodes](../../../concepts/glossary.md#storage-node), if necessary. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/insufficient-memory.md b/ydb/docs/en/core/troubleshooting/performance/hardware/insufficient-memory.md similarity index 90% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/insufficient-memory.md rename to ydb/docs/en/core/troubleshooting/performance/hardware/insufficient-memory.md index 9325368f0127..59b19c1e380d 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/insufficient-memory.md +++ b/ydb/docs/en/core/troubleshooting/performance/hardware/insufficient-memory.md @@ -18,7 +18,7 @@ Additionally, which components within the {{ ydb-short-name }} process consume 1. Determine whether any {{ ydb-short-name }} nodes recently restarted for unknown reasons. Exclude cases of {{ ydb-short-name }} version upgrades and other planned maintenance. This could reveal nodes terminated by OOM killer and restarted by `systemd`. - 1. Open [Embedded UI](../../../../reference/embedded-ui/index.md). + 1. Open [Embedded UI](../../../reference/embedded-ui/index.md). 1. On the **Nodes** tab, look for nodes that have low uptime. @@ -36,11 +36,11 @@ Additionally, which components within the {{ ydb-short-name }} process consume 1. Determine whether memory usage reached 100% of capacity. - 1. Open the **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** dashboard in Grafana. + 1. 
Open the **[DB overview](../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** dashboard in Grafana. 1. Analyze the charts in the **Memory** section. -1. Determine whether the user load on {{ ydb-short-name }} has increased. Analyze the following charts on the **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** dashboard in Grafana: +1. Determine whether the user load on {{ ydb-short-name }} has increased. Analyze the following charts on the **[DB overview](../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** dashboard in Grafana: - **Requests** chart - **Request size** chart @@ -54,4 +54,4 @@ Consider the following solutions for addressing insufficient memory: - If the load on {{ ydb-short-name }} has increased due to new usage patterns or increased query rate, try optimizing the application to reduce the load on {{ ydb-short-name }} or add more {{ ydb-short-name }} nodes. -- If the load on {{ ydb-short-name }} has not changed but nodes are still restarting, consider adding more {{ ydb-short-name }} nodes or raising the hard memory limit for the nodes. For more information about memory management in {{ ydb-short-name }}, see [{#T}](../../../../reference/configuration/index.md#memory-controller). +- If the load on {{ ydb-short-name }} has not changed but nodes are still restarting, consider adding more {{ ydb-short-name }} nodes or raising the hard memory limit for the nodes. For more information about memory management in {{ ydb-short-name }}, see [{#T}](../../../reference/configuration/index.md#memory-controller). 
diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/io-bandwidth.md b/ydb/docs/en/core/troubleshooting/performance/hardware/io-bandwidth.md similarity index 84% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/io-bandwidth.md rename to ydb/docs/en/core/troubleshooting/performance/hardware/io-bandwidth.md index 4ffc98ba479f..a21a4238e24a 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/io-bandwidth.md +++ b/ydb/docs/en/core/troubleshooting/performance/hardware/io-bandwidth.md @@ -9,7 +9,7 @@ A high rate of read and write operations can overwhelm the disk subsystem, leadi ## Recommendations -Add more [storage groups](../../../../concepts/glossary.md#storage-group) to the database. +Add more [storage groups](../../../concepts/glossary.md#storage-group) to the database. In cases of high microburst rates, balancing the load across storage groups might help. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/hardware/toc_p.yaml b/ydb/docs/en/core/troubleshooting/performance/hardware/toc_p.yaml similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/hardware/toc_p.yaml rename to ydb/docs/en/core/troubleshooting/performance/hardware/toc_p.yaml diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/index.md b/ydb/docs/en/core/troubleshooting/performance/index.md similarity index 86% rename from ydb/docs/en/core/dev/troubleshooting/performance/index.md rename to ydb/docs/en/core/troubleshooting/performance/index.md index 1d16912ce186..71544d3acbd3 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/index.md +++ b/ydb/docs/en/core/troubleshooting/performance/index.md @@ -6,15 +6,15 @@ Addressing database performance issues often requires a holistic approach, which Troubleshooting performance issues in {{ ydb-short-name }} involves the following tools: -- [{{ ydb-short-name }} metrics](../../../reference/observability/metrics/index.md) +- [{{ ydb-short-name }} 
metrics](../../reference/observability/metrics/index.md) - Diagnistic steps for most performance issues involve analyzing [Grafana dashboards](../../../reference/observability/metrics/grafana-dashboards.md) that use {{ ydb-short-name }} metrics collected by Prometheus. For information on installing Grafana and Prometheus, see [{#T}](../../../devops/manual/monitoring.md). + Diagnostic steps for most performance issues involve analyzing [Grafana dashboards](../../reference/observability/metrics/grafana-dashboards.md) that use {{ ydb-short-name }} metrics collected by Prometheus. For information on installing Grafana and Prometheus, see [{#T}](../../devops/manual/monitoring.md). -- [{{ ydb-short-name }} logs](../../../devops/manual/logging.md) -- [Tracing](../../../reference/observability/tracing/setup.md) -- [{{ ydb-short-name }} CLI](../../../reference/ydb-cli/index.md) -- [Embedded UI](../../../reference/embedded-ui/index.md) -- [Query plans](../../query-plans-optimization.md) +- [{{ ydb-short-name }} logs](../../devops/manual/logging.md) +- [Tracing](../../reference/observability/tracing/setup.md) +- [{{ ydb-short-name }} CLI](../../reference/ydb-cli/index.md) +- [Embedded UI](../../reference/embedded-ui/index.md) +- [Query plans](../../dev/query-plans-optimization.md) - Third-party observability tools ## Classification of {{ ydb-short-name }} performance issues @@ -33,7 +33,7 @@ Database performance issues can be classified into several categories based on t ### Insufficient resource issues -These issues refer to situations when the workload demands more physical resources — such as CPU, memory, disk space, and network bandwidth — than allocated to a database. 
In some cases, suboptimal allocation of resources, for example misconfigured [control groups (cgroups)](https://en.wikipedia.org/wiki/Cgroups) or [actor system pools](../../../concepts/glossary.md#actor-system-pool), may also result in insufficient resources for {{ ydb-short-name }} and increase query latencies even though physical hardware resources are still available on the database server. +These issues refer to situations when the workload demands more physical resources — such as CPU, memory, disk space, and network bandwidth — than allocated to a database. In some cases, suboptimal allocation of resources, for example misconfigured [control groups (cgroups)](https://en.wikipedia.org/wiki/Cgroups) or [actor system pools](../../concepts/glossary.md#actor-system-pool), may also result in insufficient resources for {{ ydb-short-name }} and increase query latencies even though physical hardware resources are still available on the database server. - **[CPU bottlenecks](hardware/cpu-bottleneck.md)**. High CPU usage can result in slow query processing and increased response times. When CPU resources are limited, the database may struggle to handle complex queries or large transaction loads. @@ -41,7 +41,7 @@ These issues refer to situations when the workload demands more physical resourc - **[Insufficient memory (RAM)](hardware/insufficient-memory.md)**. Queries require memory to temporarily store various intermediate data during execution. A lack of available memory can negatively impact database performance in multiple ways. -- **[Insufficient disk I/O bandwidth](hardware/io-bandwidth.md)**. A high rate of read/write operations can overwhelm the disk subsystem, causing increased data access latencies. When the [distributed storage](../../../concepts/glossary.md#distributed-storage) cannot read or write data quickly enough, queries requiring disk access will take longer to execute. +- **[Insufficient disk I/O bandwidth](hardware/io-bandwidth.md)**. 
A high rate of read/write operations can overwhelm the disk subsystem, causing increased data access latencies. When the [distributed storage](../../concepts/glossary.md#distributed-storage) cannot read or write data quickly enough, queries requiring disk access will take longer to execute. ### Operating system issues diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_assets/cluster-nodes.png b/ydb/docs/en/core/troubleshooting/performance/infrastructure/_assets/cluster-nodes.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_assets/cluster-nodes.png rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/_assets/cluster-nodes.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_assets/diagnostics-network.png b/ydb/docs/en/core/troubleshooting/performance/infrastructure/_assets/diagnostics-network.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_assets/diagnostics-network.png rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/_assets/diagnostics-network.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_includes/dc-outage.md b/ydb/docs/en/core/troubleshooting/performance/infrastructure/_includes/dc-outage.md similarity index 76% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_includes/dc-outage.md rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/_includes/dc-outage.md index cb838ca971e1..a2dce5fe4322 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_includes/dc-outage.md +++ b/ydb/docs/en/core/troubleshooting/performance/infrastructure/_includes/dc-outage.md @@ -1,8 +1,8 @@ To determine if one of the data centers of the {{ ydb-short-name }} cluster is not available, follow these steps: -1. Open [Embedded UI](../../../../../reference/embedded-ui/index.md). +1. 
Open [Embedded UI](../../../../reference/embedded-ui/index.md). -1. On the **Nodes** tab, analyze the [health indicators](../../../../../reference/embedded-ui/ydb-monitoring.md#colored_indicator) in the **Host** and **DC** columns. +1. On the **Nodes** tab, analyze the [health indicators](../../../../reference/embedded-ui/ydb-monitoring.md#colored_indicator) in the **Host** and **DC** columns. ![](../_assets/cluster-nodes.png) diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_includes/network.md b/ydb/docs/en/core/troubleshooting/performance/infrastructure/_includes/network.md similarity index 80% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_includes/network.md rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/_includes/network.md index bdd7fecc7676..e150b13e652b 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/_includes/network.md +++ b/ydb/docs/en/core/troubleshooting/performance/infrastructure/_includes/network.md @@ -1,6 +1,6 @@ -To diagnose network issues, use the healthcheck in the [Embedded UI](../../../../../reference/embedded-ui/index.md): +To diagnose network issues, use the healthcheck in the [Embedded UI](../../../../reference/embedded-ui/index.md): -1. Open the [Embedded UI](../../../../../reference/embedded-ui/index.md): +1. Open the [Embedded UI](../../../../reference/embedded-ui/index.md): 1. Navigate to the **Databases** tab and click on the desired database. 
diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/dc-drills.md b/ydb/docs/en/core/troubleshooting/performance/infrastructure/dc-drills.md similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/dc-drills.md rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/dc-drills.md diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/dc-outage.md b/ydb/docs/en/core/troubleshooting/performance/infrastructure/dc-outage.md similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/dc-outage.md rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/dc-outage.md diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/hardware.md b/ydb/docs/en/core/troubleshooting/performance/infrastructure/hardware.md similarity index 89% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/hardware.md rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/hardware.md index 36414bcfdde0..ef7385ea77a1 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/hardware.md +++ b/ydb/docs/en/core/troubleshooting/performance/infrastructure/hardware.md @@ -6,7 +6,7 @@ Malfunctioning storage drives and network cards, until replaced, significantly i Use the hardware monitoring tools that your operating system and data center provide to diagnose hardware issues. 
-You can also use the **Healthcheck** in [Embedded UI](../../../../reference/embedded-ui/index.md) to diagnose some hardware issues: +You can also use the **Healthcheck** in [Embedded UI](../../../reference/embedded-ui/index.md) to diagnose some hardware issues: - **Storage issues** diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/network.md b/ydb/docs/en/core/troubleshooting/performance/infrastructure/network.md similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/network.md rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/network.md diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/toc_p.yaml b/ydb/docs/en/core/troubleshooting/performance/infrastructure/toc_p.yaml similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/infrastructure/toc_p.yaml rename to ydb/docs/en/core/troubleshooting/performance/infrastructure/toc_p.yaml diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/queries/_assets/soft-errors.png b/ydb/docs/en/core/troubleshooting/performance/queries/_assets/soft-errors.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/queries/_assets/soft-errors.png rename to ydb/docs/en/core/troubleshooting/performance/queries/_assets/soft-errors.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/queries/_assets/transactions-locks-invalidation.png b/ydb/docs/en/core/troubleshooting/performance/queries/_assets/transactions-locks-invalidation.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/queries/_assets/transactions-locks-invalidation.png rename to ydb/docs/en/core/troubleshooting/performance/queries/_assets/transactions-locks-invalidation.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/queries/_includes/overloaded-errors.md 
b/ydb/docs/en/core/troubleshooting/performance/queries/_includes/overloaded-errors.md similarity index 70% rename from ydb/docs/en/core/dev/troubleshooting/performance/queries/_includes/overloaded-errors.md rename to ydb/docs/en/core/troubleshooting/performance/queries/_includes/overloaded-errors.md index 21eaf89dc575..6c4874d0d63e 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/queries/_includes/overloaded-errors.md +++ b/ydb/docs/en/core/troubleshooting/performance/queries/_includes/overloaded-errors.md @@ -1,4 +1,4 @@ -1. Open the **[DB overview](../../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** Grafana dashboard. +1. Open the **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** Grafana dashboard. 1. In the **API details** section, see if the **Soft errors (retriable)** chart shows any spikes in the rate of queries with the `OVERLOADED` status. @@ -6,7 +6,7 @@ 1. To check if the spikes in overloaded errors were caused by exceeding the limit of 15000 queries in table partition queues: - 1. In the [Embedded UI](../../../../../reference/embedded-ui/index.md), go to the **Databases** tab and click on the database. + 1. In the [Embedded UI](../../../../reference/embedded-ui/index.md), go to the **Databases** tab and click on the database. 1. On the **Navigation** tab, ensure the required database is selected. @@ -18,6 +18,6 @@ 1. To check if the spikes in overloaded errors were caused by tablet splits and merges, see [{#T}](../../schemas/splits-merges.md). -1. To check if the spikes in overloaded errors were caused by exceeding the 1000 limit of open sessions, in the Grafana **[DB status](../../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)** dashboard, see the **Session count by host** chart. +1. 
To check if the spikes in overloaded errors were caused by exceeding the 1000 limit of open sessions, in the Grafana **[DB status](../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)** dashboard, see the **Session count by host** chart. 1. See the [overloaded shards](../../schemas/overloaded-shards.md) issue. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md b/ydb/docs/en/core/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md similarity index 65% rename from ydb/docs/en/core/dev/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md rename to ydb/docs/en/core/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md index b22fa0d40d73..44d61d08e29f 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md +++ b/ydb/docs/en/core/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md @@ -1,4 +1,4 @@ -1. Open the **[DB overview](../../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** Grafana dashboard. +1. Open the **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** Grafana dashboard. 1. See if the **Transaction Locks Invalidation** chart shows any spikes. 
diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/queries/overloaded-errors.md b/ydb/docs/en/core/troubleshooting/performance/queries/overloaded-errors.md similarity index 81% rename from ydb/docs/en/core/dev/troubleshooting/performance/queries/overloaded-errors.md rename to ydb/docs/en/core/troubleshooting/performance/queries/overloaded-errors.md index 4fe82a2dee8d..dfa241d7d547 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/queries/overloaded-errors.md +++ b/ydb/docs/en/core/troubleshooting/performance/queries/overloaded-errors.md @@ -4,7 +4,7 @@ * Overloaded table partitions with over 15000 queries in their queue. -* The outbound [CDC](../../../../concepts/glossary.md#cdc) queue exceeds the limit of 10000 elements or 125 MB. +* The outbound [CDC](../../../concepts/glossary.md#cdc) queue exceeds the limit of 10000 elements or 125 MB. * Table partitions in states other than normal, for example partitions in the process of splitting or merging. @@ -17,6 +17,6 @@ ## Recommendations -If a YQL query returns an `OVERLOADED` error, retry the query using a randomized exponential back-off strategy. The YDB SDK provides a built-in mechanism for handling temporary failures. For more information, see [{#T}](../../../../reference/ydb-sdk/error_handling.md). +If a YQL query returns an `OVERLOADED` error, retry the query using a randomized exponential back-off strategy. The YDB SDK provides a built-in mechanism for handling temporary failures. For more information, see [{#T}](../../../reference/ydb-sdk/error_handling.md). Exceeding the limit of open sessions per node may indicate a problem in the application logic. 
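The randomized exponential back-off recommended above for `OVERLOADED` errors can be sketched as follows. This is a generic illustration, not the YDB SDK API: `OverloadedError`, `backoff_delays`, `retry_overloaded`, and the default delay values are all hypothetical names and numbers chosen for this example. In real applications, the SDK's built-in retry mechanism mentioned above is likely the better choice.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for a query that failed with the OVERLOADED status."""


def backoff_delays(max_attempts, base_delay=0.05, max_delay=5.0, rng=random):
    """Yield one randomized delay (in seconds) for each retry.

    The upper bound doubles with every retry and is capped at max_delay;
    the actual delay is drawn uniformly between 0 and that bound so that
    many clients do not retry at the same moment.
    """
    for attempt in range(max_attempts - 1):
        ceiling = min(max_delay, base_delay * (2 ** attempt))
        yield rng.uniform(0.0, ceiling)


def retry_overloaded(operation, max_attempts=5, sleep=time.sleep, **backoff):
    """Call operation() until it succeeds or max_attempts is exhausted."""
    delays = backoff_delays(max_attempts, **backoff)
    while True:
        try:
            return operation()
        except OverloadedError:
            delay = next(delays, None)
            if delay is None:
                # No retries left: propagate the last error to the caller.
                raise
            sleep(delay)
```

The `sleep` parameter is injectable only so that the helper can be exercised without real waiting; the assumed defaults (50 ms base, 5 s cap, 5 attempts) should be tuned to the workload.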
diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/queries/toc_p.yaml b/ydb/docs/en/core/troubleshooting/performance/queries/toc_p.yaml similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/queries/toc_p.yaml rename to ydb/docs/en/core/troubleshooting/performance/queries/toc_p.yaml diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/queries/transaction-lock-invalidation.md b/ydb/docs/en/core/troubleshooting/performance/queries/transaction-lock-invalidation.md similarity index 81% rename from ydb/docs/en/core/dev/troubleshooting/performance/queries/transaction-lock-invalidation.md rename to ydb/docs/en/core/troubleshooting/performance/queries/transaction-lock-invalidation.md index 3d3383419a68..3daee078f591 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/queries/transaction-lock-invalidation.md +++ b/ydb/docs/en/core/troubleshooting/performance/queries/transaction-lock-invalidation.md @@ -4,7 +4,7 @@ {% note info %} -The YDB SDK provides a built-in mechanism for handling temporary failures. For more information, see [{#T}](../../../../reference/ydb-sdk/error_handling.md). +The YDB SDK provides a built-in mechanism for handling temporary failures. For more information, see [{#T}](../../../reference/ydb-sdk/error_handling.md). {% endnote %} @@ -20,7 +20,7 @@ Consider the following recommendations: - The longer a transaction lasts, the higher the likelihood of encountering a **transaction locks invalidated** error. - If possible, avoid [interactive transactions](../../../../concepts/glossary.md#interactive-transaction). A better approach is to use a single YQL query with `begin;` and `commit;` to select data, update data, and commit the transaction. + If possible, avoid [interactive transactions](../../../concepts/glossary.md#interactive-transaction). A better approach is to use a single YQL query with `begin;` and `commit;` to select data, update data, and commit the transaction. 
If you do need interactive transactions, perform `commit` in the last query in the transaction. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/describe.png b/ydb/docs/en/core/troubleshooting/performance/schemas/_assets/describe.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/describe.png rename to ydb/docs/en/core/troubleshooting/performance/schemas/_assets/describe.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/node-tablet-monitor-data-shard.png b/ydb/docs/en/core/troubleshooting/performance/schemas/_assets/node-tablet-monitor-data-shard.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/node-tablet-monitor-data-shard.png rename to ydb/docs/en/core/troubleshooting/performance/schemas/_assets/node-tablet-monitor-data-shard.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/overloaded-shards-dashboard.png b/ydb/docs/en/core/troubleshooting/performance/schemas/_assets/overloaded-shards-dashboard.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/overloaded-shards-dashboard.png rename to ydb/docs/en/core/troubleshooting/performance/schemas/_assets/overloaded-shards-dashboard.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/partitions-by-cpu.png b/ydb/docs/en/core/troubleshooting/performance/schemas/_assets/partitions-by-cpu.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/partitions-by-cpu.png rename to ydb/docs/en/core/troubleshooting/performance/schemas/_assets/partitions-by-cpu.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/splits-merges-tablets-devui.png b/ydb/docs/en/core/troubleshooting/performance/schemas/_assets/splits-merges-tablets-devui.png similarity index 100% rename from 
ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/splits-merges-tablets-devui.png rename to ydb/docs/en/core/troubleshooting/performance/schemas/_assets/splits-merges-tablets-devui.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/splits-merges.png b/ydb/docs/en/core/troubleshooting/performance/schemas/_assets/splits-merges.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/_assets/splits-merges.png rename to ydb/docs/en/core/troubleshooting/performance/schemas/_assets/splits-merges.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md b/ydb/docs/en/core/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md similarity index 83% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md rename to ydb/docs/en/core/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md index f142849a3017..2303fd268d24 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md +++ b/ydb/docs/en/core/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md @@ -1,6 +1,6 @@ 1. Use the Embedded UI or Grafana to see if the {{ ydb-short-name }} nodes are overloaded: - - In the **[DB overview](../../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** Grafana dashboard, analyze the **Overloaded shard count** chart. + - In the **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** Grafana dashboard, analyze the **Overloaded shard count** chart. ![](../_assets/overloaded-shards-dashboard.png) @@ -13,7 +13,7 @@ {% endnote %} - - In the [Embedded UI](../../../../../reference/embedded-ui/index.md): + - In the [Embedded UI](../../../../reference/embedded-ui/index.md): 1. 
Go to the **Databases** tab and click on the database. @@ -27,11 +27,11 @@ ![](../_assets/partitions-by-cpu.png) - Additionally, the information about overloaded shards is provided as a system table. For more information, see [{#T}](../../../../system-views.md#top-overload-partitions). + Additionally, the information about overloaded shards is provided as a system table. For more information, see [{#T}](../../../../dev/system-views.md#top-overload-partitions). -1. To pinpoint the schema issue, use the [Embedded UI](../../../../../reference/embedded-ui/index.md) or [{{ ydb-short-name }} CLI](../../../../../reference/ydb-cli/index.md): +1. To pinpoint the schema issue, use the [Embedded UI](../../../../reference/embedded-ui/index.md) or [{{ ydb-short-name }} CLI](../../../../reference/ydb-cli/index.md): - - In the [Embedded UI](../../../../../reference/embedded-ui/index.md): + - In the [Embedded UI](../../../../reference/embedded-ui/index.md): 1. On the **Databases** tab, click on the database. @@ -58,7 +58,7 @@ {% endnote %} - - In the [{{ ydb-short-name }} CLI](../../../../../reference/ydb-cli/index.md): + - In the [{{ ydb-short-name }} CLI](../../../../reference/ydb-cli/index.md): 1. To retrieve information about the problematic table, run the following command: diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_includes/splits-merges.md b/ydb/docs/en/core/troubleshooting/performance/schemas/_includes/splits-merges.md similarity index 83% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/_includes/splits-merges.md rename to ydb/docs/en/core/troubleshooting/performance/schemas/_includes/splits-merges.md index bcf05b765eae..b2b2a298a069 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/_includes/splits-merges.md +++ b/ydb/docs/en/core/troubleshooting/performance/schemas/_includes/splits-merges.md @@ -1,4 +1,4 @@ -1. 
See if the **Split / Merge partitions** chart in the **[DB status](../../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)** Grafana dashboard shows any spikes. +1. See if the **Split / Merge partitions** chart in the **[DB status](../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)** Grafana dashboard shows any spikes. ![](../_assets/splits-merges.png) @@ -17,7 +17,7 @@ 1. To identify recently split or merged tablets, follow these steps: - 1. In the [Embedded UI](../../../../../reference/embedded-ui/index.md), click the **Developer UI** link in the upper right corner. + 1. In the [Embedded UI](../../../../reference/embedded-ui/index.md), click the **Developer UI** link in the upper right corner. 1. Navigate to **Node Table Monitor** > **All tablets of the cluster**. @@ -35,7 +35,7 @@ 1. To pinpoint the schema issue, follow these steps: - 1. Retrieve information about the problematic table using the [{{ ydb-short-name }} CLI](../../../../../reference/ydb-cli/index.md). Run the following command: + 1. Retrieve information about the problematic table using the [{{ ydb-short-name }} CLI](../../../../reference/ydb-cli/index.md). 
Run the following command: ```bash ydb scheme describe diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/overloaded-shards.md b/ydb/docs/en/core/troubleshooting/performance/schemas/overloaded-shards.md similarity index 71% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/overloaded-shards.md rename to ydb/docs/en/core/troubleshooting/performance/schemas/overloaded-shards.md index 2fff992972e5..46ab26e67a25 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/overloaded-shards.md +++ b/ydb/docs/en/core/troubleshooting/performance/schemas/overloaded-shards.md @@ -1,8 +1,8 @@ # Overloaded shards -[Data shards](../../../../concepts/glossary.md#data-shard) serving [row-oriented tables](../../../../concepts/datamodel/table.md#row-oriented-tables) may become overloaded for the following reasons: +[Data shards](../../../concepts/glossary.md#data-shard) serving [row-oriented tables](../../../concepts/datamodel/table.md#row-oriented-tables) may become overloaded for the following reasons: -* A table is created without the [AUTO_PARTITIONING_BY_LOAD](../../../../concepts/datamodel/table.md#AUTO_PARTITIONING_BY_LOAD) clause. +* A table is created without the [AUTO_PARTITIONING_BY_LOAD](../../../concepts/datamodel/table.md#AUTO_PARTITIONING_BY_LOAD) clause. In this case, {{ ydb-short-name }} does not split overloaded shards. @@ -10,9 +10,9 @@ If a data shard already has 10000 operations in its queue, new queries will return an "overloaded" error. Retry such queries using a randomized exponential back-off strategy. For more information, see [{#T}](../queries/overloaded-errors.md). -* A table was created with the [AUTO_PARTITIONING_MAX_PARTITIONS_COUNT](../../../../concepts/datamodel/table.md#AUTO_PARTITIONING_MAX_PARTITIONS_COUNT) setting and has already reached its partition limit. 
+* A table was created with the [AUTO_PARTITIONING_MAX_PARTITIONS_COUNT](../../../concepts/datamodel/table.md#AUTO_PARTITIONING_MAX_PARTITIONS_COUNT) setting and has already reached its partition limit. -* An inefficient [primary key](../../../../concepts/glossary.md#primary-key) that causes an imbalance in the distribution of queries across shards. A typical example is ingestion with a monotonically increasing primary key, which may lead to the overloaded "last" partition. For example, this could occur with an autoincrementing primary key using the serial data type. +* An inefficient [primary key](../../../concepts/glossary.md#primary-key) that causes an imbalance in the distribution of queries across shards. A typical example is ingestion with a monotonically increasing primary key, which may lead to the overloaded "last" partition. For example, this could occur with an autoincrementing primary key using the serial data type. ## Diagnostics @@ -42,7 +42,7 @@ Consider the following solutions to address shard overload: {% endnote %} -Both operations can be performed by executing an [`ALTER TABLE ... SET`](../../../../yql/reference/syntax/alter_table/set.md) query. +Both operations can be performed by executing an [`ALTER TABLE ... SET`](../../../yql/reference/syntax/alter_table/set.md) query. 
### For the imbalanced primary key {#pk-recommendations} diff --git a/ydb/docs/en/core/troubleshooting/performance/schemas/splits-merges.md b/ydb/docs/en/core/troubleshooting/performance/schemas/splits-merges.md new file mode 100644 index 000000000000..06249046bd3f --- /dev/null +++ b/ydb/docs/en/core/troubleshooting/performance/schemas/splits-merges.md @@ -0,0 +1,32 @@ +# Excessive tablet splits and merges + +{% if oss == true and backend_name == "YDB" %} + +{% include [OLAP_not_allow_note](../../../_includes/not_allow_for_olap_note.md) %} + +{% endif %} + +Each [row-oriented table](../../../concepts/datamodel/table.md#row-oriented-tables) partition in {{ ydb-short-name }} is processed by a [data shard](../../../concepts/glossary.md#data-shard) tablet. {{ ydb-short-name }} supports automatic [splitting and merging](../../../concepts/datamodel/table.md#partitioning) of data shards which allows it to seamlessly adapt to changes in workloads. However, these operations are not free and might have a short-term negative impact on query latencies. + +When {{ ydb-short-name }} splits a partition, it replaces the original partition with two new partitions covering the same range of primary keys. Now, two data shards process the range of primary keys that was previously handled by a single data shard, thereby adding more computing resources for the table. + +By default, {{ ydb-short-name }} splits a table partition when it reaches 2 GB in size. However, it's recommended to also enable partitioning by load, allowing {{ ydb-short-name }} to split overloaded partitions even if they are smaller than 2 GB. + +A [scheme shard](../../../concepts/glossary.md#scheme-shard) takes approximately 15 seconds to assess whether a data shard requires splitting. By default, the CPU usage threshold for splitting a data shard is set at 50%. 
+ +When {{ ydb-short-name }} merges adjacent partitions in a row-oriented table, they are replaced with a single partition that covers their range of primary keys. The corresponding data shards are also consolidated into a single data shard to manage the new partition. + +For merging to occur, data shards must have existed for at least 10 minutes, and their CPU usage over the last hour must not exceed 35%. + +When configuring [table partitioning](../../../concepts/datamodel/table.md#partitioning), you can also set limits for the [minimum](../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) and [maximum number of partitions](../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count). If the difference between the minimum and maximum limits exceeds 20% and the table load varies significantly over time, [Hive](../../../concepts/glossary.md#hive) may start splitting overloaded tables and then merging them back during periods of low load. + +## Diagnostics + + +{% include notitle [#](_includes/splits-merges.md) %} + +## Recommendations + +If the user load on {{ ydb-short-name }} has not changed, consider adjusting the gap between the min and max limits for the number of table partitions to the recommended 20% difference. Use the [`ALTER TABLE table_name SET (key = value)`](../../../yql/reference/syntax/alter_table/set.md) YQL statement to update the [`AUTO_PARTITIONING_MIN_PARTITIONS_COUNT`](../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) and [`AUTO_PARTITIONING_MAX_PARTITIONS_COUNT`](../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count) parameters. + +If you want to avoid splitting and merging data shards, you can set the min limit to the max limit value or disable partitioning by load.
diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/schemas/toc_p.yaml b/ydb/docs/en/core/troubleshooting/performance/schemas/toc_p.yaml similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/schemas/toc_p.yaml rename to ydb/docs/en/core/troubleshooting/performance/schemas/toc_p.yaml diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/system/_assets/healthcheck-clock-drift.png b/ydb/docs/en/core/troubleshooting/performance/system/_assets/healthcheck-clock-drift.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/system/_assets/healthcheck-clock-drift.png rename to ydb/docs/en/core/troubleshooting/performance/system/_assets/healthcheck-clock-drift.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/system/system-clock-drift.md b/ydb/docs/en/core/troubleshooting/performance/system/system-clock-drift.md similarity index 66% rename from ydb/docs/en/core/dev/troubleshooting/performance/system/system-clock-drift.md rename to ydb/docs/en/core/troubleshooting/performance/system/system-clock-drift.md index db70cae51d02..2af1efe073af 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/system/system-clock-drift.md +++ b/ydb/docs/en/core/troubleshooting/performance/system/system-clock-drift.md @@ -9,17 +9,17 @@ It is important to keep system clocks on the {{ ydb-short-name }} servers in syn {% endnote %} -If the system clocks of the nodes running the [coordinator](../../../../concepts/glossary.md#coordinator) tablets differ, transaction latencies increase by the time difference between the fastest and slowest system clocks. This occurs because a transaction planned on a node with a faster system clock can only be executed once the coordinator with the slowest clock reaches the same time. 
+If the system clocks of the nodes running the [coordinator](../../../concepts/glossary.md#coordinator) tablets differ, transaction latencies increase by the time difference between the fastest and slowest system clocks. This occurs because a transaction planned on a node with a faster system clock can only be executed once the coordinator with the slowest clock reaches the same time. -Furthermore, if the system clock drift exceeds 30 seconds, {{ ydb-short-name }} will refuse to process distributed transactions. Before coordinators start planning a transaction, affected [Data shards](../../../../concepts/glossary.md#data-shard) determine an acceptable range of timestamps for the transaction. The start of this range is the current time of the mediator tablet's clock, while the 30-second planning timeout determines the end. If the coordinator's system clock exceeds this time range, it cannot plan a distributed transaction, resulting in errors for such queries. +Furthermore, if the system clock drift exceeds 30 seconds, {{ ydb-short-name }} will refuse to process distributed transactions. Before coordinators start planning a transaction, affected [Data shards](../../../concepts/glossary.md#data-shard) determine an acceptable range of timestamps for the transaction. The start of this range is the current time of the mediator tablet's clock, while the 30-second planning timeout determines the end. If the coordinator's system clock exceeds this time range, it cannot plan a distributed transaction, resulting in errors for such queries. ## Diagnostics To diagnose the system clock drift, use the following methods: -1. Use **Healthcheck** in the [Embedded UI](../../../../reference/embedded-ui/index.md): +1. Use **Healthcheck** in the [Embedded UI](../../../reference/embedded-ui/index.md): - 1. In the [Embedded UI](../../../../reference/embedded-ui/index.md), go to the **Databases** tab and click on the database. + 1. 
In the [Embedded UI](../../../reference/embedded-ui/index.md), go to the **Databases** tab and click on the database. 1. On the **Navigation** tab, ensure the required database is selected. @@ -37,12 +37,12 @@ To diagnose the system clock drift, use the following methods: {% note info %} - For more information, see [{#T}](../../../../reference/ydb-sdk/health-check-api.md) + For more information, see [{#T}](../../../reference/ydb-sdk/health-check-api.md) {% endnote %} -1. Open the [Interconnect overview](../../../../reference/embedded-ui/interconnect-overview.md) page of the [Embedded UI](../../../../reference/embedded-ui/index.md). +1. Open the [Interconnect overview](../../../reference/embedded-ui/interconnect-overview.md) page of the [Embedded UI](../../../reference/embedded-ui/index.md). 1. Use such tools as `pssh` or `ansible` to run the command (for example, `date +%s%N`) on all {{ ydb-short-name }} nodes to display the system clock value. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/system/toc_p.yaml b/ydb/docs/en/core/troubleshooting/performance/system/toc_p.yaml similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/system/toc_p.yaml rename to ydb/docs/en/core/troubleshooting/performance/system/toc_p.yaml diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/toc_p.yaml b/ydb/docs/en/core/troubleshooting/performance/toc_p.yaml similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/toc_p.yaml rename to ydb/docs/en/core/troubleshooting/performance/toc_p.yaml diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/_assets/cpu-balancer.jpg b/ydb/docs/en/core/troubleshooting/performance/ydb/_assets/cpu-balancer.jpg similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/ydb/_assets/cpu-balancer.jpg rename to ydb/docs/en/core/troubleshooting/performance/ydb/_assets/cpu-balancer.jpg diff --git 
a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/_assets/hive-app.png b/ydb/docs/en/core/troubleshooting/performance/ydb/_assets/hive-app.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/ydb/_assets/hive-app.png rename to ydb/docs/en/core/troubleshooting/performance/ydb/_assets/hive-app.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/_assets/tablets-moved.png b/ydb/docs/en/core/troubleshooting/performance/ydb/_assets/tablets-moved.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/ydb/_assets/tablets-moved.png rename to ydb/docs/en/core/troubleshooting/performance/ydb/_assets/tablets-moved.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/_assets/updates.png b/ydb/docs/en/core/troubleshooting/performance/ydb/_assets/updates.png similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/ydb/_assets/updates.png rename to ydb/docs/en/core/troubleshooting/performance/ydb/_assets/updates.png diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/_includes/tablets-moved.md b/ydb/docs/en/core/troubleshooting/performance/ydb/_includes/tablets-moved.md similarity index 78% rename from ydb/docs/en/core/dev/troubleshooting/performance/ydb/_includes/tablets-moved.md rename to ydb/docs/en/core/troubleshooting/performance/ydb/_includes/tablets-moved.md index d1bb87c34319..2cca37bfa090 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/_includes/tablets-moved.md +++ b/ydb/docs/en/core/troubleshooting/performance/ydb/_includes/tablets-moved.md @@ -1,4 +1,4 @@ -1. See if the **Tablets moved by Hive** chart in the **[DB status](../../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)** Grafana dashboard shows any spikes. +1. 
See if the **Tablets moved by Hive** chart in the **[DB status](../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)** Grafana dashboard shows any spikes. ![](../_assets/tablets-moved.png) @@ -6,7 +6,7 @@ 1. See the Hive balancer stats. - 1. Open [Embedded UI](../../../../../reference/embedded-ui/index.md). + 1. Open [Embedded UI](../../../../reference/embedded-ui/index.md). 1. Click **Developer UI** in the upper right corner of the Embedded UI. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/tablets-moved.md b/ydb/docs/en/core/troubleshooting/performance/ydb/tablets-moved.md similarity index 92% rename from ydb/docs/en/core/dev/troubleshooting/performance/ydb/tablets-moved.md rename to ydb/docs/en/core/troubleshooting/performance/ydb/tablets-moved.md index 892a47546903..26c483483ba1 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/tablets-moved.md +++ b/ydb/docs/en/core/troubleshooting/performance/ydb/tablets-moved.md @@ -1,6 +1,6 @@ # Frequent tablet moves between nodes -{{ ydb-short-name }} automatically balances the load by moving tablets from overloaded nodes to other nodes. This process is managed by [Hive](../../../../concepts/glossary.md#hive). When Hive moves tablets, queries affecting those tablets might experience increased latencies while they wait for the tablet to get initialized on the new node. +{{ ydb-short-name }} automatically balances the load by moving tablets from overloaded nodes to other nodes. This process is managed by [Hive](../../../concepts/glossary.md#hive). When Hive moves tablets, queries affecting those tablets might experience increased latencies while they wait for the tablet to get initialized on the new node. {{ ydb-short-name }} considers usage of the following hardware resources for balancing nodes: @@ -42,7 +42,7 @@ Autobalancing occurs in the following cases: Adjust Hive balancer settings: -1. Open [Embedded UI](../../../../reference/embedded-ui/index.md). +1. 
Open [Embedded UI](../../../reference/embedded-ui/index.md). 1. Click **Developer UI** in the upper right corner of the Embedded UI. diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/toc_p.yaml b/ydb/docs/en/core/troubleshooting/performance/ydb/toc_p.yaml similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/performance/ydb/toc_p.yaml rename to ydb/docs/en/core/troubleshooting/performance/ydb/toc_p.yaml diff --git a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/ydb-updates.md b/ydb/docs/en/core/troubleshooting/performance/ydb/ydb-updates.md similarity index 80% rename from ydb/docs/en/core/dev/troubleshooting/performance/ydb/ydb-updates.md rename to ydb/docs/en/core/troubleshooting/performance/ydb/ydb-updates.md index 19503db96b49..c52d433a651e 100644 --- a/ydb/docs/en/core/dev/troubleshooting/performance/ydb/ydb-updates.md +++ b/ydb/docs/en/core/troubleshooting/performance/ydb/ydb-updates.md @@ -1,8 +1,8 @@ # Rolling restart -{{ ydb-short-name }} clusters can be updated without downtime, which is possible because {{ ydb-short-name }} normally has redundant components and supports rolling restart procedure. To ensure continuous data availability, {{ ydb-short-name }} includes [Cluster Management System (CMS)](../../../../concepts/glossary.md#cms) that tracks all outages and nodes taken offline for maintenance, such as restarts. CMS halts new maintenance requests if they might risk data availability. +{{ ydb-short-name }} clusters can be updated without downtime, which is possible because {{ ydb-short-name }} normally has redundant components and supports rolling restart procedure. To ensure continuous data availability, {{ ydb-short-name }} includes [Cluster Management System (CMS)](../../../concepts/glossary.md#cms) that tracks all outages and nodes taken offline for maintenance, such as restarts. CMS halts new maintenance requests if they might risk data availability. 
-However, even if data is always available, the restart of all nodes in a relatively short period of time might have a noticeable impact on overall performance. Each [tablet](../../../../concepts/glossary.md#tablet) running on a restarted node is relaunched on a different node. Moving a tablet between nodes takes time and may affect latencies of queries involving it. See recommendations [for rolling restart](#rolling-restart). +However, even if data is always available, the restart of all nodes in a relatively short period of time might have a noticeable impact on overall performance. Each [tablet](../../../concepts/glossary.md#tablet) running on a restarted node is relaunched on a different node. Moving a tablet between nodes takes time and may affect latencies of queries involving it. See recommendations [for rolling restart](#rolling-restart). Furthermore, a new {{ ydb-short-name }} version may handle queries differently. While performance generally improves with each update, certain corner cases may occasionally end up with degraded performance. See recommendations [for new version performance](#version-performance). @@ -16,7 +16,7 @@ Diagnostics of {{ ydb-short-name }} rolling restarts and updates relies only on To check if the {{ ydb-short-name }} cluster is currently being updated: -1. Open [Embedded UI](../../../../reference/embedded-ui/index.md). +1. Open [Embedded UI](../../../reference/embedded-ui/index.md). 1. On the **Nodes** tab, see if {{ ydb-short-name }} versions of the nodes differ. @@ -45,7 +45,7 @@ If the ongoing {{ ydb-short-name }} cluster rolling restart significantly impact The goal is to detect any negative performance impacts from the new {{ ydb-short-name }} version on specific queries in your particular workload as early as possible: -1. Review the [{{ ydb-short-name }} server changelog](../../../../changelog-server.md) for any performance-related notes relevant to your workload. +1. 
Review the [{{ ydb-short-name }} server changelog](../../../changelog-server.md) for any performance-related notes relevant to your workload. 2. Use a dedicated pre-production and/or testing {{ ydb-short-name }} cluster to run a workload that closely mirrors your production workload. Always deploy the new {{ ydb-short-name }} version to these clusters first. Monitor both client-side latencies and server-side metrics to identify any potential performance issues. 3. Implement canary deployment by updating only one node initially to observe any changes in its behavior. If everything appears stable, gradually expand the update to more nodes, such as an entire server rack or data center, and repeat checks for anomalies. If any issues arise, immediately roll back to the previous version and attempt to reproduce the issue in a non-production environment. diff --git a/ydb/docs/en/core/dev/troubleshooting/toc_p.yaml b/ydb/docs/en/core/troubleshooting/toc_p.yaml similarity index 100% rename from ydb/docs/en/core/dev/troubleshooting/toc_p.yaml rename to ydb/docs/en/core/troubleshooting/toc_p.yaml diff --git a/ydb/docs/ru/core/dev/toc_p.yaml b/ydb/docs/ru/core/dev/toc_p.yaml index 6bf21c38a0df..c99721d2dd0c 100644 --- a/ydb/docs/ru/core/dev/toc_p.yaml +++ b/ydb/docs/ru/core/dev/toc_p.yaml @@ -18,11 +18,6 @@ items: path: primary-key/toc_p.yaml - name: Вторичные индексы href: secondary-indexes.md -- name: Диагностика проблем - href: troubleshooting/index.md - include: - mode: link - path: troubleshooting/toc_p.yaml - name: Оптимизация планов запросов href: query-plans-optimization.md - name: Пакетная загрузка diff --git a/ydb/docs/ru/core/reference/observability/metrics/grafana-dashboards.md b/ydb/docs/ru/core/reference/observability/metrics/grafana-dashboards.md index 5dbdc76a7a43..03c73b84c1bb 100644 --- a/ydb/docs/ru/core/reference/observability/metrics/grafana-dashboards.md +++ b/ydb/docs/ru/core/reference/observability/metrics/grafana-dashboards.md @@ -6,6 +6,21 @@ Общий 
дашборд базы данных. +## DB overview {#dboverview} + +Общий дашборд базы данных по категориям: + +- Health +- API +- API details +- CPU +- CPU pools +- Memory +- Storage +- DataShard +- DataShard details +- Latency + ## Actors {#actors} Потребление CPU в актор-системе. diff --git a/ydb/docs/ru/core/toc_i.yaml b/ydb/docs/ru/core/toc_i.yaml index 70b42f63a668..5bc9e08cbd55 100644 --- a/ydb/docs/ru/core/toc_i.yaml +++ b/ydb/docs/ru/core/toc_i.yaml @@ -37,6 +37,11 @@ items: include: mode: link path: recipes/toc_p.yaml +- name: Диагностика проблем + href: troubleshooting/index.md + include: + mode: link + path: troubleshooting/toc_p.yaml - name: Вопросы и ответы href: faq/index.md include: diff --git a/ydb/docs/ru/core/dev/troubleshooting/index.md b/ydb/docs/ru/core/troubleshooting/index.md similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/index.md rename to ydb/docs/ru/core/troubleshooting/index.md diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-batch-pool.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-batch-pool.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-batch-pool.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-batch-pool.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-by-pool.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-by-pool.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-by-pool.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-by-pool.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-ic-pool.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-ic-pool.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-ic-pool.png rename to 
ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-ic-pool.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-io-pool.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-io-pool.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-io-pool.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-io-pool.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-read-only-tx-latency.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-read-only-tx-latency.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-read-only-tx-latency.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-read-only-tx-latency.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-row-read-rows.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-row-read-rows.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-row-read-rows.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-row-read-rows.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-system-pool.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-system-pool.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-system-pool.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-system-pool.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-user-pool.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-user-pool.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/cpu-user-pool.png rename to 
ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/cpu-user-pool.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/disk-time-available--disk-cost.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/disk-time-available--disk-cost.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/disk-time-available--disk-cost.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/disk-time-available--disk-cost.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/embedded-ui-cpu-system-pool.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/embedded-ui-cpu-system-pool.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/embedded-ui-cpu-system-pool.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/embedded-ui-cpu-system-pool.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/microbursts.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/microbursts.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/microbursts.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/microbursts.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/request-size.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/request-size.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/request-size.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/request-size.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/requests.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/requests.png similarity index 100% rename from 
ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/requests.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/requests.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/response-size.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/response-size.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/response-size.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/response-size.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/storage-groups-disk-space.png b/ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/storage-groups-disk-space.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_assets/storage-groups-disk-space.png rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_assets/storage-groups-disk-space.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md b/ydb/docs/ru/core/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md similarity index 83% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md index 6566e34b0643..6decce4950ea 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md +++ b/ydb/docs/ru/core/troubleshooting/performance/hardware/_includes/cpu-bottleneck.md @@ -1,6 +1,6 @@ -1. Используйте вкладку **Diagnostics** во [встроенном UI](../../../../../reference/embedded-ui/index.md) для анализа загрузки процессора во всех пулах ресурсов: +1. Используйте вкладку **Diagnostics** во [встроенном UI](../../../../reference/embedded-ui/index.md) для анализа загрузки процессора во всех пулах ресурсов: - 1. 
Откройте [встроенный UI](../../../../../reference/embedded-ui/index.md), перейдите на вкладку **Databases** и нажмите на требуемую базу данных. + 1. Откройте [встроенный UI](../../../../reference/embedded-ui/index.md), перейдите на вкладку **Databases** и нажмите на требуемую базу данных. 1. На вкладке **Navigation** убедитесь, что требуемая база данных выбрана. @@ -11,7 +11,7 @@ ![](../_assets/embedded-ui-cpu-system-pool.png) 1. Проанализируйте загрузку процессора во всех пулах ресурсов на графиках Grafana: - 1. Откройте панель мониторинга **[CPU](../../../../../reference/observability/metrics/grafana-dashboards.md#cpu)** в Grafana. + 1. Откройте панель мониторинга **[CPU](../../../../reference/observability/metrics/grafana-dashboards.md#cpu)** в Grafana. 1. Проверьте наличие скачков на следующих графиках: diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_includes/io-bandwidth.md b/ydb/docs/ru/core/troubleshooting/performance/hardware/_includes/io-bandwidth.md similarity index 93% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_includes/io-bandwidth.md rename to ydb/docs/ru/core/troubleshooting/performance/hardware/_includes/io-bandwidth.md index 1a594ff31acd..45f129b0a3be 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/_includes/io-bandwidth.md +++ b/ydb/docs/ru/core/troubleshooting/performance/hardware/_includes/io-bandwidth.md @@ -1,4 +1,4 @@ -1. Откройте панель мониторинга **[Distributed Storage Overview](../../../../../reference/observability/metrics/grafana-dashboards.md)** в Grafana. +1. Откройте панель мониторинга **[Distributed Storage Overview](../../../../reference/observability/metrics/grafana-dashboards.md)** в Grafana. 1. На графике **DiskTimeAvailable and total Cost relation** проверьте, пересекают ли всплески **Total Cost** уровень **DiskTimeAvailable**. 
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/cpu-bottleneck.md b/ydb/docs/ru/core/troubleshooting/performance/hardware/cpu-bottleneck.md similarity index 51% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/cpu-bottleneck.md rename to ydb/docs/ru/core/troubleshooting/performance/hardware/cpu-bottleneck.md index 364ba59da4dd..04aec99331f8 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/cpu-bottleneck.md +++ b/ydb/docs/ru/core/troubleshooting/performance/hardware/cpu-bottleneck.md @@ -2,7 +2,7 @@ Высокая нагрузка на процессор может привести к медленному выполнению запросов и увеличению задержек. В условиях ограниченного ресурса процессора база данных может с трудом справляться со сложными запросами или высоконагруженными транзакционными сценариями использования. -Узлы {{ ydb-short-name }} в основном используют ресурсы процессора на выполнение [акторов](../../../../concepts/glossary.md#actor). На каждом узле акторы выполняются с использованием ресурсов одного из [пулов акторной системы](../../../../concepts/glossary.md#actor-system-pools). Потребление ресурсов каждого пула измеряется отдельно, что позволяет точнее отслеживать изменения в потреблении ресурсов. +Узлы {{ ydb-short-name }} в основном используют ресурсы процессора на выполнение [акторов](../../../concepts/glossary.md#actor). На каждом узле акторы выполняются с использованием ресурсов одного из [пулов акторной системы](../../../concepts/glossary.md#actor-system-pools). Потребление ресурсов каждого пула измеряется отдельно, что позволяет точнее отслеживать изменения в потреблении ресурсов. ## Диагностика @@ -11,4 +11,4 @@ ## Рекомендации -Добавьте дополнительные [узлы базы данных](../../../../concepts/glossary.md#database-node) в кластер или выделите больше процессорных ядер существующим узлам. Если это невозможно, рассмотрите возможность перераспределения ядер процессора между пулами ресурсов. 
+Добавьте дополнительные [узлы базы данных](../../../concepts/glossary.md#database-node) в кластер или выделите больше процессорных ядер существующим узлам. Если это невозможно, рассмотрите возможность перераспределения ядер процессора между пулами ресурсов. diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/disk-space.md b/ydb/docs/ru/core/troubleshooting/performance/hardware/disk-space.md similarity index 73% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/disk-space.md rename to ydb/docs/ru/core/troubleshooting/performance/hardware/disk-space.md index 901313927205..efba752ede4c 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/disk-space.md +++ b/ydb/docs/ru/core/troubleshooting/performance/hardware/disk-space.md @@ -4,9 +4,9 @@ ## Диагностика -1. Проверьте наличие скачков на графиках панели мониторинга **[DB overview > Storage](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** в Grafana. +1. Проверьте наличие скачков на графиках панели мониторинга **[DB overview > Storage](../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** в Grafana. -1. Во [встроенном UI](../../../../reference/embedded-ui/index.md) на вкладке **Storage** проанализируйте список доступных групп хранения и их потребление места на диске. +1. Во [встроенном UI](../../../reference/embedded-ui/index.md) на вкладке **Storage** проанализируйте список доступных групп хранения и их потребление места на диске. {% note tip %} @@ -18,12 +18,12 @@ {% note info %} -Чтобы получить эту информацию, можно также использовать [Healthcheck API](../../../../reference/ydb-sdk/health-check-api.md). +Чтобы получить эту информацию, можно также использовать [Healthcheck API](../../../reference/ydb-sdk/health-check-api.md). {% endnote %} ## Рекомендации -Добавьте больше [групп хранения](../../../../concepts/glossary.md#storage-group) в базу данных. 
+Добавьте больше [групп хранения](../../../concepts/glossary.md#storage-group) в базу данных. -Если у кластера нет свободных групп хранения, необходимо их предварительно сконфигурировать. При необходимости добавьте дополнительные [узлы хранения](../../../../concepts/glossary.md#storage-node). +Если у кластера нет свободных групп хранения, необходимо их предварительно сконфигурировать. При необходимости добавьте дополнительные [узлы хранения](../../../concepts/glossary.md#storage-node). diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/insufficient-memory.md b/ydb/docs/ru/core/troubleshooting/performance/hardware/insufficient-memory.md similarity index 94% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/insufficient-memory.md rename to ydb/docs/ru/core/troubleshooting/performance/hardware/insufficient-memory.md index f97df4bc90f1..46688ee8ea27 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/insufficient-memory.md +++ b/ydb/docs/ru/core/troubleshooting/performance/hardware/insufficient-memory.md @@ -18,7 +18,7 @@ 1. Определите, не перезапускались ли недавно по неизвестной причине какие-либо узлы {{ ydb-short-name }}. Исключите случаи обновления версии {{ ydb-short-name }} и другого планового обслуживания. Это может помочь обнаружить узлы, которые были завершены из-за нехватки памяти и перезапущены системой `systemd`. - 1. Откройте [встроенный UI](../../../../reference/embedded-ui/index.md). + 1. Откройте [встроенный UI](../../../reference/embedded-ui/index.md). 1. На вкладке **Nodes** обратите внимание на узлы с низким значением uptime. @@ -36,11 +36,11 @@ 1. Определите, используется ли память на 100%. - 1. Откройте панель мониторинга **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** в Grafana. + 1. Откройте панель мониторинга **[DB overview](../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** в Grafana. 1. 
Проанализируйте графики в секции **Memory**. -1. Определите, увеличилась ли пользовательская нагрузка на {{ ydb-short-name }}. Проанализируйте следующие графики на панели мониторинга **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** в Grafana: +1. Определите, увеличилась ли пользовательская нагрузка на {{ ydb-short-name }}. Проанализируйте следующие графики на панели мониторинга **[DB overview](../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** в Grafana: - **Requests**; - **Request size**; @@ -54,4 +54,4 @@ - Если нагрузка на {{ ydb-short-name }} увеличилась из-за новых методов использования или возросшей частоты запросов, попробуйте оптимизировать приложение, чтобы снизить нагрузку на {{ ydb-short-name }}, или добавьте больше узлов {{ ydb-short-name }}. -- Если нагрузка на {{ ydb-short-name }} не изменилась, но узлы всё равно перезапускаются, рассмотрите возможность добавления большего количества узлов {{ ydb-short-name }} или увеличения жёсткого лимита памяти для узлов. Дополнительную информацию об управлении памятью в {{ ydb-short-name }} см. в статье [{#T}](../../../../reference/configuration/index.md#memory-controller). +- Если нагрузка на {{ ydb-short-name }} не изменилась, но узлы всё равно перезапускаются, рассмотрите возможность добавления большего количества узлов {{ ydb-short-name }} или увеличения жёсткого лимита памяти для узлов. Дополнительную информацию об управлении памятью в {{ ydb-short-name }} см. в статье [{#T}](../../../reference/configuration/index.md#memory-controller). 
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/io-bandwidth.md b/ydb/docs/ru/core/troubleshooting/performance/hardware/io-bandwidth.md similarity index 92% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/io-bandwidth.md rename to ydb/docs/ru/core/troubleshooting/performance/hardware/io-bandwidth.md index dc5c4c5a7385..8cd961f708e4 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/io-bandwidth.md +++ b/ydb/docs/ru/core/troubleshooting/performance/hardware/io-bandwidth.md @@ -9,6 +9,6 @@ ## Рекомендации -Добавьте в базу данных дополнительные [группы хранения](../../../../concepts/glossary.md#storage-group). +Добавьте в базу данных дополнительные [группы хранения](../../../concepts/glossary.md#storage-group). В случае с частыми микровсплесками нагрузки может помочь балансировка нагрузки по группам хранения. diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/hardware/toc_p.yaml b/ydb/docs/ru/core/troubleshooting/performance/hardware/toc_p.yaml similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/hardware/toc_p.yaml rename to ydb/docs/ru/core/troubleshooting/performance/hardware/toc_p.yaml diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/index.md b/ydb/docs/ru/core/troubleshooting/performance/index.md similarity index 88% rename from ydb/docs/ru/core/dev/troubleshooting/performance/index.md rename to ydb/docs/ru/core/troubleshooting/performance/index.md index cf225b1b7db8..23d51fb3bb70 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/index.md +++ b/ydb/docs/ru/core/troubleshooting/performance/index.md @@ -6,15 +6,15 @@ Для диагностики проблем с производительностью {{ ydb-short-name }} используются следующие инструменты: -- [{{ ydb-short-name }} метрики](../../../reference/observability/metrics/index.md). +- [{{ ydb-short-name }} метрики](../../reference/observability/metrics/index.md). 
- Диагностика большинства проблем с производительностью включает анализ [дашбордов в Grafana](../../../reference/observability/metrics/grafana-dashboards.md), которые используют метрики {{ ydb-short-name }}, собираемые с помощью Prometheus. Описание установки Grafana и Prometheus см. в разделе [{#T}](../../../devops/manual/monitoring.md); + Диагностика большинства проблем с производительностью включает анализ [дашбордов в Grafana](../../reference/observability/metrics/grafana-dashboards.md), которые используют метрики {{ ydb-short-name }}, собираемые с помощью Prometheus. Описание установки Grafana и Prometheus см. в разделе [{#T}](../../devops/manual/monitoring.md); -- [Логи {{ ydb-short-name }}](../../../devops/manual/logging.md); -- [Трассировка](../../../reference/observability/tracing/setup.md); -- [{{ ydb-short-name }} CLI](../../../reference/ydb-cli/index.md); -- [Встроенный UI](../../../reference/embedded-ui/index.md); -- [Планы запросов](../../query-plans-optimization.md); +- [Логи {{ ydb-short-name }}](../../devops/manual/logging.md); +- [Трассировка](../../reference/observability/tracing/setup.md); +- [{{ ydb-short-name }} CLI](../../reference/ydb-cli/index.md); +- [Встроенный UI](../../reference/embedded-ui/index.md); +- [Планы запросов](../../dev/query-plans-optimization.md); - Сторонние инструменты мониторинга. ## Классификация проблем с производительностью {{ ydb-short-name }} @@ -33,7 +33,7 @@ ### Проблемы с нехваткой аппаратных ресурсов -Такие проблемы возникают, когда нагрузка на базу данных требует больше аппаратных ресурсов — таких как процессор, память, дисковое пространство или пропускная способность сети — чем было выделено. 
В некоторых случаях неоптимальное выделение ресурсов, например неправильная настройка [контрольных групп (cgroups)](https://ru.wikipedia.org/wiki/Контрольная_группа_(Linux)) или [пулов ресурсов акторной системы](../../../concepts/glossary.md#actor-system-pool), может привести к нехватке аппаратных ресурсов для {{ ydb-short-name }} и увеличить задержки запросов, даже если для сервера баз данных было выделено достаточно аппаратных ресурсов. +Такие проблемы возникают, когда нагрузка на базу данных требует больше аппаратных ресурсов — таких как процессор, память, дисковое пространство или пропускная способность сети — чем было выделено. В некоторых случаях неоптимальное выделение ресурсов, например неправильная настройка [контрольных групп (cgroups)](https://ru.wikipedia.org/wiki/Контрольная_группа_(Linux)) или [пулов ресурсов акторной системы](../../concepts/glossary.md#actor-system-pool), может привести к нехватке аппаратных ресурсов для {{ ydb-short-name }} и увеличить задержки запросов, даже если для сервера баз данных было выделено достаточно аппаратных ресурсов. - **[Недостаточное быстродействие процессора](hardware/cpu-bottleneck.md)**. Высокая нагрузка на процессор может привести к медленному выполнению запросов и увеличению задержек. В условиях ограниченного ресурса процессора база данных может с трудом справляться со сложными запросами или интенсивным потоком транзакционных запросов. @@ -41,7 +41,7 @@ - **[Недостаточный объём памяти (RAM)](hardware/insufficient-memory.md)**. Обработка запросов требует памяти для временного хранения промежуточных данных. Недостаток свободной памяти может негативно повлиять на производительность базы данных. -- **[Недостаточная пропускная способность](hardware/io-bandwidth.md)**. Высокая скорость операций чтения/записи может перегрузить дисковую систему и приводить к увеличению задержек доступа к данным. 
Когда [распределённое хранилище](../../../concepts/glossary.md#distributed-storage) не может читать или записывать данные с достаточной скоростью, запросы к базе данных, требующие доступа к диску, могут замедляться. +- **[Недостаточная пропускная способность](hardware/io-bandwidth.md)**. Высокая скорость операций чтения/записи может перегрузить дисковую систему и приводить к увеличению задержек доступа к данным. Когда [распределённое хранилище](../../concepts/glossary.md#distributed-storage) не может читать или записывать данные с достаточной скоростью, запросы к базе данных, требующие доступа к диску, могут замедляться. ### Проблемы на уровне операционной системы diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_assets/cluster-nodes.png b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/_assets/cluster-nodes.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_assets/cluster-nodes.png rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/_assets/cluster-nodes.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_assets/diagnostics-network.png b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/_assets/diagnostics-network.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_assets/diagnostics-network.png rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/_assets/diagnostics-network.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_includes/dc-outage.md b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/_includes/dc-outage.md similarity index 79% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_includes/dc-outage.md rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/_includes/dc-outage.md index 344467bd7539..4c7475e92cd8 100644 --- 
a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_includes/dc-outage.md +++ b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/_includes/dc-outage.md @@ -1,8 +1,8 @@ Чтобы установить недоступность одного из датацентров кластера {{ ydb-short-name }}, выполните следующие шаги: -1. Откройте [Встроенный UI](../../../../../reference/embedded-ui/index.md). +1. Откройте [Встроенный UI](../../../../reference/embedded-ui/index.md). -1. На вкладке **Nodes** проанализируйте [индикаторы состояния](../../../../../reference/embedded-ui/ydb-monitoring.md#colored_indicator) в колонках **Host** и **DC**. +1. На вкладке **Nodes** проанализируйте [индикаторы состояния](../../../../reference/embedded-ui/ydb-monitoring.md#colored_indicator) в колонках **Host** и **DC**. ![](../_assets/cluster-nodes.png) diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_includes/network.md b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/_includes/network.md similarity index 84% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_includes/network.md rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/_includes/network.md index 7d40cab9e8e1..5bfc56fbcb71 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/_includes/network.md +++ b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/_includes/network.md @@ -1,6 +1,6 @@ -Для диагностики сетевых проблем используйте опцию healthcheck во [Встроенном UI](../../../../../reference/embedded-ui/index.md): +Для диагностики сетевых проблем используйте опцию healthcheck во [Встроенном UI](../../../../reference/embedded-ui/index.md): -1. Откройте [Встроенный UI](../../../../../reference/embedded-ui/index.md): +1. Откройте [Встроенный UI](../../../../reference/embedded-ui/index.md): 1. Перейдите во вкладку **Databases** и выберите необходимую базу данных. 
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/dc-drills.md b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/dc-drills.md similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/dc-drills.md rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/dc-drills.md diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/dc-outage.md b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/dc-outage.md similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/dc-outage.md rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/dc-outage.md diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/hardware.md b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/hardware.md similarity index 92% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/hardware.md rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/hardware.md index bc02228c5c55..95cf8f35dd5f 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/hardware.md +++ b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/hardware.md @@ -6,7 +6,7 @@ Используйте инструменты аппаратного мониторинга, предоставляемые операционной системой и датацентром, для диагностики аппаратных неисправностей. 
-Также используйте опцию **Healthcheck** во [Встроенном UI](../../../../reference/embedded-ui/index.md) для диагностики некоторых аппаратных проблем: +Также используйте опцию **Healthcheck** во [Встроенном UI](../../../reference/embedded-ui/index.md) для диагностики некоторых аппаратных проблем: - **Проблемы с дисками** diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/network.md b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/network.md similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/network.md rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/network.md diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/toc_p.yaml b/ydb/docs/ru/core/troubleshooting/performance/infrastructure/toc_p.yaml similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/infrastructure/toc_p.yaml rename to ydb/docs/ru/core/troubleshooting/performance/infrastructure/toc_p.yaml diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/_assets/soft-errors.png b/ydb/docs/ru/core/troubleshooting/performance/queries/_assets/soft-errors.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/queries/_assets/soft-errors.png rename to ydb/docs/ru/core/troubleshooting/performance/queries/_assets/soft-errors.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/_assets/transactions-locks-invalidation.png b/ydb/docs/ru/core/troubleshooting/performance/queries/_assets/transactions-locks-invalidation.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/queries/_assets/transactions-locks-invalidation.png rename to ydb/docs/ru/core/troubleshooting/performance/queries/_assets/transactions-locks-invalidation.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/_includes/overloaded-errors.md 
b/ydb/docs/ru/core/troubleshooting/performance/queries/_includes/overloaded-errors.md similarity index 79% rename from ydb/docs/ru/core/dev/troubleshooting/performance/queries/_includes/overloaded-errors.md rename to ydb/docs/ru/core/troubleshooting/performance/queries/_includes/overloaded-errors.md index db06683dd1c8..50fef63d1ef9 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/_includes/overloaded-errors.md +++ b/ydb/docs/ru/core/troubleshooting/performance/queries/_includes/overloaded-errors.md @@ -1,4 +1,4 @@ -1. Откройте панель мониторинга Grafana **[DB overview](../../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)**. +1. Откройте панель мониторинга Grafana **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)**. 1. В разделе **API details** проверьте, есть ли всплески частоты запросов со статусом `OVERLOADED` на диаграмме **Soft errors (retriable)**. @@ -6,7 +6,7 @@ 1. Чтобы проверить, не связаны ли всплески ошибок `OVERLOADED` с превышением лимита в 15000 запросов на партицию таблицы: - 1. Во [Встроенном UI](../../../../../reference/embedded-ui/index.md) перейдите на вкладку **Databases** и нажмите на базу данных. + 1. Во [Встроенном UI](../../../../reference/embedded-ui/index.md) перейдите на вкладку **Databases** и нажмите на базу данных. 1. На вкладке **Navigation** убедитесь, что требуемая база данных выбрана. @@ -18,6 +18,6 @@ 1. Чтобы проверить, не связаны ли всплески ошибок `OVERLOADED` со слишком частыми слияниями и разделениями таблеток, см. [{#T}](../../schemas/splits-merges.md). -1. Чтобы проверить, не связаны ли всплески ошибок `OVERLOADED` с превышением лимита в 1000 открытых сессий, см. диаграмму **Session count by host** на панели мониторинга Grafana **[DB status](../../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)**. +1. Чтобы проверить, не связаны ли всплески ошибок `OVERLOADED` с превышением лимита в 1000 открытых сессий, см. 
диаграмму **Session count by host** на панели мониторинга Grafana **[DB status](../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)**. 1. См. статью [{#T}](../../schemas/overloaded-shards.md). diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md b/ydb/docs/ru/core/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md similarity index 81% rename from ydb/docs/ru/core/dev/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md rename to ydb/docs/ru/core/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md index 566c1a477043..80cdbc5ebbcb 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md +++ b/ydb/docs/ru/core/troubleshooting/performance/queries/_includes/transaction-lock-invalidation.md @@ -1,4 +1,4 @@ -1. Откройте панель мониторинга **[DB overview](../../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** в Grafana. +1. Откройте панель мониторинга **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** в Grafana. 1. Проверьте, есть ли всплески количества ошибок на диаграмме **Transaction Locks Invalidation**. diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/overloaded-errors.md b/ydb/docs/ru/core/troubleshooting/performance/queries/overloaded-errors.md similarity index 88% rename from ydb/docs/ru/core/dev/troubleshooting/performance/queries/overloaded-errors.md rename to ydb/docs/ru/core/troubleshooting/performance/queries/overloaded-errors.md index 71c4048a718c..4b5397d163e8 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/overloaded-errors.md +++ b/ydb/docs/ru/core/troubleshooting/performance/queries/overloaded-errors.md @@ -4,7 +4,7 @@ * Перегруженные партиции таблиц, у которых в очереди на выполнение более 15000 запросов. 
-* Превышен лимит размера выходной очереди [CDC](../../../../concepts/glossary.md#cdc) в 10000 элементов или 125 МБ. +* Превышен лимит размера выходной очереди [CDC](../../../concepts/glossary.md#cdc) в 10000 элементов или 125 МБ. * Партиции таблиц не находятся в нормальном состоянии, например, разделяются/объединяются. @@ -17,6 +17,6 @@ ## Рекомендации -Если YQL-запрос возвращает ошибку `OVERLOADED`, выполните запрос повторно с экспоненциальной задержкой. {{ ydb-short-name }} SDK предлагает встроенный механизм для обработки временных ошибок. Для получения дополнительной информации см. [{#T}](../../../../reference/ydb-sdk/error_handling.md). +Если YQL-запрос возвращает ошибку `OVERLOADED`, выполните запрос повторно с экспоненциальной задержкой. {{ ydb-short-name }} SDK предлагает встроенный механизм для обработки временных ошибок. Для получения дополнительной информации см. [{#T}](../../../reference/ydb-sdk/error_handling.md). Превышение лимита открытых сессий на узле может указывать на проблему в логике приложения. 
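Note on the recommendation in `overloaded-errors.md` above (retry `OVERLOADED` responses with an exponentially growing delay): the idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the {{ ydb-short-name }} SDK's built-in retry mechanism. `OverloadedError`, `retry_with_backoff`, and the default delays are hypothetical names and values chosen for the example.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for a retriable OVERLOADED status (not an SDK class)."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.05, max_delay=2.0,
                       sleep=time.sleep):
    """Call operation(); on OverloadedError, retry with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except OverloadedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles on each attempt (base, 2*base, 4*base, ...)
            # and is capped at max_delay; the actual pause is random below the ceiling.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists only so the pause can be replaced in tests. In a real application, the retry helpers shipped with the SDK are likely the better choice.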
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/toc_p.yaml b/ydb/docs/ru/core/troubleshooting/performance/queries/toc_p.yaml similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/queries/toc_p.yaml rename to ydb/docs/ru/core/troubleshooting/performance/queries/toc_p.yaml diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/transaction-lock-invalidation.md b/ydb/docs/ru/core/troubleshooting/performance/queries/transaction-lock-invalidation.md similarity index 83% rename from ydb/docs/ru/core/dev/troubleshooting/performance/queries/transaction-lock-invalidation.md rename to ydb/docs/ru/core/troubleshooting/performance/queries/transaction-lock-invalidation.md index 94e2fe6ff92e..c8a1a00c46fc 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/queries/transaction-lock-invalidation.md +++ b/ydb/docs/ru/core/troubleshooting/performance/queries/transaction-lock-invalidation.md @@ -4,7 +4,7 @@ {% note info %} -{{ ydb-short-name }} SDK предоставляет встроенный механизм обработки временных ошибок. Подробнее см. [{#T}](../../../../reference/ydb-sdk/error_handling.md). +{{ ydb-short-name }} SDK предоставляет встроенный механизм обработки временных ошибок. Подробнее см. [{#T}](../../../reference/ydb-sdk/error_handling.md). {% endnote %} @@ -20,7 +20,7 @@ - Чем дольше длится транзакция, тем выше вероятность возникновения ошибки `transaction locks invalidated`. - По возможности избегайте [интерактивных транзакций](../../../../concepts/glossary.md#interactive-transaction). Лучшим подходом является использование одного YQL-запроса с командами `BEGIN;` и `COMMIT;` для выбора данных, обновления данных и выполнения коммита транзакции. + По возможности избегайте [интерактивных транзакций](../../../concepts/glossary.md#interactive-transaction). Лучшим подходом является использование одного YQL-запроса с командами `BEGIN;` и `COMMIT;` для выбора данных, обновления данных и выполнения коммита транзакции. 
Если без интерактивных транзакций не обойтись, выполняйте коммит транзакции в последнем запросе.
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/describe.png b/ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/describe.png
similarity index 100%
rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/describe.png
rename to ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/describe.png
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/node-tablet-monitor-data-shard.png b/ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/node-tablet-monitor-data-shard.png
similarity index 100%
rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/node-tablet-monitor-data-shard.png
rename to ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/node-tablet-monitor-data-shard.png
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/overloaded-shards-dashboard.png b/ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/overloaded-shards-dashboard.png
similarity index 100%
rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/overloaded-shards-dashboard.png
rename to ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/overloaded-shards-dashboard.png
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/partitions-by-cpu.png b/ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/partitions-by-cpu.png
similarity index 100%
rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/partitions-by-cpu.png
rename to ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/partitions-by-cpu.png
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/splits-merges-tablets-devui.png b/ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/splits-merges-tablets-devui.png
similarity index 100%
rename from
ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/splits-merges-tablets-devui.png rename to ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/splits-merges-tablets-devui.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/splits-merges.png b/ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/splits-merges.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_assets/splits-merges.png rename to ydb/docs/ru/core/troubleshooting/performance/schemas/_assets/splits-merges.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md b/ydb/docs/ru/core/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md similarity index 88% rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md rename to ydb/docs/ru/core/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md index c83b8322acad..0ccc8150667c 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md +++ b/ydb/docs/ru/core/troubleshooting/performance/schemas/_includes/overloaded-shards-diagnostics.md @@ -1,6 +1,6 @@ 1. Используйте Встроенный UI или Grafana, чтобы проверить, не перегружены ли узлы {{ ydb-short-name }}: - - На панели мониторинга Grafana **[DB overview](../../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** проанализируйте диаграмму **Overloaded shard count**. + - На панели мониторинга Grafana **[DB overview](../../../../reference/observability/metrics/grafana-dashboards.md#dboverview)** проанализируйте диаграмму **Overloaded shard count**. ![](../_assets/overloaded-shards-dashboard.png) @@ -12,7 +12,7 @@ {% endnote %} - - Во [Встроенном UI](../../../../../reference/embedded-ui/index.md): + - Во [Встроенном UI](../../../../reference/embedded-ui/index.md): 1. 
Перейдите на вкладку **Databases** и выберите базу данных. @@ -26,11 +26,11 @@ ![](../_assets/partitions-by-cpu.png) - Кроме того, информация о перегруженных таблетках представлена в виде системной таблицы. Дополнительные сведения см. в разделе [{#T}](../../../../system-views.md#top-overload-partitions). + Кроме того, информация о перегруженных таблетках представлена в виде системной таблицы. Дополнительные сведения см. в разделе [{#T}](../../../../dev/system-views.md#top-overload-partitions). -1. Чтобы точно определить проблему со схемой, используйте [Встроенный UI](../../../../../reference/embedded-ui/index.md) или [{{ ydb-short-name }} CLI](../../../../../reference/ydb-cli/index.md): +1. Чтобы точно определить проблему со схемой, используйте [Встроенный UI](../../../../reference/embedded-ui/index.md) или [{{ ydb-short-name }} CLI](../../../../reference/ydb-cli/index.md): - - Во [Встроенном UI](../../../../../reference/embedded-ui/index.md): + - Во [Встроенном UI](../../../../reference/embedded-ui/index.md): 1. На вкладке **Databases** нажмите на базу данных. @@ -57,7 +57,7 @@ {% endnote %} - - В [{{ ydb-short-name }} CLI](../../../../../reference/ydb-cli/index.md): + - В [{{ ydb-short-name }} CLI](../../../../reference/ydb-cli/index.md): 1. Чтобы получить информацию о проблемной таблице, выполните следующую команду: diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_includes/splits-merges.md b/ydb/docs/ru/core/troubleshooting/performance/schemas/_includes/splits-merges.md similarity index 88% rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_includes/splits-merges.md rename to ydb/docs/ru/core/troubleshooting/performance/schemas/_includes/splits-merges.md index de8bbc67a6cb..f66dd486f06a 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/_includes/splits-merges.md +++ b/ydb/docs/ru/core/troubleshooting/performance/schemas/_includes/splits-merges.md @@ -1,4 +1,4 @@ -1. 
Посмотрите, есть ли всплески на графике **Split / Merge partitions** на панели мониторинга Grafana **[DB status](../../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)**. +1. Посмотрите, есть ли всплески на графике **Split / Merge partitions** на панели мониторинга Grafana **[DB status](../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)**. ![](../_assets/splits-merges.png) @@ -17,7 +17,7 @@ 1. Чтобы определить недавно разделённые или слитые таблетки, выполните следующие шаги: - 1. Во [Встроенном UI](../../../../../reference/embedded-ui/index.md) нажмите на ссылку **Developer UI** в правом верхнем углу. + 1. Во [Встроенном UI](../../../../reference/embedded-ui/index.md) нажмите на ссылку **Developer UI** в правом верхнем углу. 1. Перейдите на страницу **Node Table Monitor** > **All tablets of the cluster**. @@ -35,7 +35,7 @@ 1. Чтобы определить, связана ли проблема с неправильной схемой таблицы, выполните следующие шаги: - 1. Получите информацию о проблемной таблице с помощью [{{ ydb-short-name }} CLI](../../../../../reference/ydb-cli/index.md). Выполните следующую команду: + 1. Получите информацию о проблемной таблице с помощью [{{ ydb-short-name }} CLI](../../../../reference/ydb-cli/index.md). 
Выполните следующую команду: ```bash ydb scheme describe <имя_таблицы> diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/overloaded-shards.md b/ydb/docs/ru/core/troubleshooting/performance/schemas/overloaded-shards.md similarity index 72% rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/overloaded-shards.md rename to ydb/docs/ru/core/troubleshooting/performance/schemas/overloaded-shards.md index 66dcb3530c7c..65b025baf341 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/overloaded-shards.md +++ b/ydb/docs/ru/core/troubleshooting/performance/schemas/overloaded-shards.md @@ -1,8 +1,8 @@ # Перегруженные таблетки data shard -Таблетки [data shard](../../../../concepts/glossary.md#data-shard), обслуживающие [строковые таблицы](../../../../concepts/datamodel/table.md#row-oriented-tables), могут быть перегружены по следующим причинам: +Таблетки [data shard](../../../concepts/glossary.md#data-shard), обслуживающие [строковые таблицы](../../../concepts/datamodel/table.md#row-oriented-tables), могут быть перегружены по следующим причинам: -* Таблица создана без указания параметра [AUTO_PARTITIONING_BY_LOAD](../../../../concepts/datamodel/table.md#AUTO_PARTITIONING_BY_LOAD). +* Таблица создана без указания параметра [AUTO_PARTITIONING_BY_LOAD](../../../concepts/datamodel/table.md#AUTO_PARTITIONING_BY_LOAD). В этом случае {{ ydb-short-name }} не разбивает перегруженные таблетки data shard. @@ -10,9 +10,9 @@ Если таблетка data shard уже содержит 10000 операций в своей очереди, новые запросы будут возвращать ошибку «overloaded». Повторите такие запросы, используя экспоненциально растущий перерыв, см. [{#T}](../queries/overloaded-errors.md). -* Таблица создана с параметром [AUTO_PARTITIONING_MAX_PARTITIONS_COUNT](../../../../concepts/datamodel/table.md#AUTO_PARTITIONING_MAX_PARTITIONS_COUNT) и уже достигла лимита на число партиций. 
+* Таблица создана с параметром [AUTO_PARTITIONING_MAX_PARTITIONS_COUNT](../../../concepts/datamodel/table.md#AUTO_PARTITIONING_MAX_PARTITIONS_COUNT) и уже достигла лимита на число партиций. -* Неэффективный [первичный ключ](../../../../concepts/glossary.md#primary-key), который вызывает дисбаланс в распределении запросов по таблеткам data shard. Типичным примером является использование монотонно увеличивающегося первичного ключа при загрузке данных, что может привести к перегрузке «последней» партиции. Например, это может произойти с автоматически увеличивающимся первичным ключом, использующим тип данных [Serial](../../../../yql/reference/types/serial.md). +* Неэффективный [первичный ключ](../../../concepts/glossary.md#primary-key), который вызывает дисбаланс в распределении запросов по таблеткам data shard. Типичным примером является использование монотонно увеличивающегося первичного ключа при загрузке данных, что может привести к перегрузке «последней» партиции. Например, это может произойти с автоматически увеличивающимся первичным ключом, использующим тип данных [Serial](../../../yql/reference/types/serial.md). ## Диагностика @@ -41,7 +41,7 @@ {% endnote %} -Обе операции можно выполнить с помощью запроса [`ALTER TABLE ... SET`](../../../../yql/reference/syntax/alter_table/set.md). +Обе операции можно выполнить с помощью запроса [`ALTER TABLE ... SET`](../../../yql/reference/syntax/alter_table/set.md). 
### Для несбалансированного первичного ключа {#pk-recommendations} diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/splits-merges.md b/ydb/docs/ru/core/troubleshooting/performance/schemas/splits-merges.md similarity index 52% rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/splits-merges.md rename to ydb/docs/ru/core/troubleshooting/performance/schemas/splits-merges.md index 813408b12338..0c54a01ac596 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/splits-merges.md +++ b/ydb/docs/ru/core/troubleshooting/performance/schemas/splits-merges.md @@ -2,23 +2,23 @@ {% if oss == true and backend_name == "YDB" %} -{% include [OLAP_not_allow_note](../../../../_includes/not_allow_for_olap_note.md) %} +{% include [OLAP_not_allow_note](../../../_includes/not_allow_for_olap_note.md) %} {% endif %} -Каждая партиция [строковой таблицы](../../../../concepts/datamodel/table.md#row-oriented-tables) в {{ ydb-short-name }} обрабатывается таблеткой [data shard](../../../../concepts/glossary.md#data-shard). {{ ydb-short-name }} поддерживает автоматическое [разделение и слияние](../../../../concepts/datamodel/table.md#partitioning) таблеток data shard, что позволяет легко адаптироваться к изменениям в рабочих нагрузках. Однако эти операции не являются бесплатными и могут оказать кратковременное негативное влияние на задержки запросов. +Каждая партиция [строковой таблицы](../../../concepts/datamodel/table.md#row-oriented-tables) в {{ ydb-short-name }} обрабатывается таблеткой [data shard](../../../concepts/glossary.md#data-shard). {{ ydb-short-name }} поддерживает автоматическое [разделение и слияние](../../../concepts/datamodel/table.md#partitioning) таблеток data shard, что позволяет легко адаптироваться к изменениям в рабочих нагрузках. Однако эти операции не являются бесплатными и могут оказать кратковременное негативное влияние на задержки запросов. 
Когда {{ ydb-short-name }} разбивает партицию, исходная партиция заменяется двумя новыми партициями, охватывающими тот же диапазон первичных ключей. Теперь две таблетки data shard обрабатывают диапазон первичных ключей, который ранее обрабатывался одной таблеткой data shard, тем самым добавляя больше вычислительных ресурсов для таблицы. По умолчанию {{ ydb-short-name }} разделяет партицию таблицы, когда её размер достигает 2 ГБ. Однако рекомендуется также включить разделение по загрузке, что позволит {{ ydb-short-name }} разделять перегруженные партиции, даже если их размер меньше 2 ГБ. -У [scheme shard](../../../../concepts/glossary.md#scheme-shard) уходит примерно 15 секунд на принятие решения о разделении таблетки data shard. По умолчанию пороговое значение потребления процессора для разделения таблетки data shard установлено в 50%. +У [scheme shard](../../../concepts/glossary.md#scheme-shard) уходит примерно 15 секунд на принятие решения о разделении таблетки data shard. По умолчанию пороговое значение потребления процессора для разделения таблетки data shard установлено в 50%. Когда {{ ydb-short-name }} объединяет соседние партиции в строковой таблице, они заменяются одной партицией, которая охватывает их диапазон первичных ключей. Соответствующие таблетки data shard также объединяются в одну таблетку для управления новой партицией. Для того чтобы произошло слияние, таблетки data shard должны существовать не менее 10 минут, а их загрузка процессора за последний час не должна превышать 35%. -При настройке [партиционирования таблицы](../../../../concepts/datamodel/table.md#partitioning) вы можете также установить лимиты на [минимальное](../../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) и [максимальное количество партиций](../../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count). 
Если разница между минимальным и максимальным пределами превышает 20%, а загрузка таблицы значительно меняется с течением времени, [Hive](../../../../concepts/glossary.md#hive) может начать разделять перегруженные таблицы, а затем объединять их обратно в периоды низкой загрузки. +При настройке [партиционирования таблицы](../../../concepts/datamodel/table.md#partitioning) вы можете также установить лимиты на [минимальное](../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) и [максимальное количество партиций](../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count). Если разница между минимальным и максимальным пределами превышает 20%, а загрузка таблицы значительно меняется с течением времени, [Hive](../../../concepts/glossary.md#hive) может начать разделять перегруженные таблицы, а затем объединять их обратно в периоды низкой загрузки. ## Диагностика @@ -27,6 +27,6 @@ ## Рекомендации -Если пользовательская нагрузка на {{ ydb-short-name }} не изменилась, рассмотрите возможность изменения интервала между минимальным и максимальным лимитами на количество партиций таблицы до рекомендуемой разницы в 20%. Используйте инструкцию YQL [`ALTER TABLE имя_таблицы SET (ключ = значение)`](../../../../yql/reference/syntax/alter_table/set.md) для обновления параметров [`AUTO_PARTITIONING_MIN_PARTITIONS_COUNT`](../../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) и [`AUTO_PARTITIONING_MAX_PARTITIONS_COUNT`](../../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count). +Если пользовательская нагрузка на {{ ydb-short-name }} не изменилась, рассмотрите возможность изменения интервала между минимальным и максимальным лимитами на количество партиций таблицы до рекомендуемой разницы в 20%. 
Используйте инструкцию YQL [`ALTER TABLE имя_таблицы SET (ключ = значение)`](../../../yql/reference/syntax/alter_table/set.md) для обновления параметров [`AUTO_PARTITIONING_MIN_PARTITIONS_COUNT`](../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) и [`AUTO_PARTITIONING_MAX_PARTITIONS_COUNT`](../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count). -Если вы хотите избежать разделения и слияния таблеток data shard, вы можете выставить одинаковые значения для параметров [`AUTO_PARTITIONING_MIN_PARTITIONS_COUNT`](../../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) и [`AUTO_PARTITIONING_MAX_PARTITIONS_COUNT`](../../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count) или отключить разделение по загрузке. +Если вы хотите избежать разделения и слияния таблеток data shard, вы можете выставить одинаковые значения для параметров [`AUTO_PARTITIONING_MIN_PARTITIONS_COUNT`](../../../concepts/datamodel/table.md#auto_partitioning_min_partitions_count) и [`AUTO_PARTITIONING_MAX_PARTITIONS_COUNT`](../../../concepts/datamodel/table.md#auto_partitioning_max_partitions_count) или отключить разделение по загрузке. 
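Note on the thresholds described in `splits-merges.md` above: the split and merge conditions can be restated as a small Python model. This is a simplified sketch of the rules as the text words them, not the actual logic inside {{ ydb-short-name }}. `PartitionStats`, `should_split`, and `can_merge` are hypothetical names, reading "2 GB" as 2 GiB is an assumption, and so is the exact treatment of boundary values.

```python
from dataclasses import dataclass

SPLIT_SIZE_BYTES = 2 * 1024**3   # default size threshold; "2 GB" read as 2 GiB (assumption)
SPLIT_CPU_PERCENT = 50.0         # default CPU threshold for splitting by load
MERGE_CPU_PERCENT = 35.0         # CPU usage over the last hour must not exceed this
MERGE_MIN_AGE_MINUTES = 10.0     # data shard tablets must exist at least this long


@dataclass
class PartitionStats:
    size_bytes: int
    cpu_percent: float          # current CPU usage of the data shard tablet
    peak_cpu_last_hour: float   # highest CPU usage seen during the last hour
    age_minutes: float          # how long the data shard tablet has existed


def should_split(p: PartitionStats, split_by_load: bool = True) -> bool:
    """Split on size always; split on CPU load only when splitting by load is enabled."""
    if p.size_bytes >= SPLIT_SIZE_BYTES:
        return True
    return split_by_load and p.cpu_percent > SPLIT_CPU_PERCENT


def can_merge(left: PartitionStats, right: PartitionStats) -> bool:
    """Both neighbours must be old enough and lightly loaded over the last hour."""
    return all(
        p.age_minutes >= MERGE_MIN_AGE_MINUTES
        and p.peak_cpu_last_hour <= MERGE_CPU_PERCENT
        for p in (left, right)
    )
```

With these numbers, a partition that oscillates between roughly 35% and 50% CPU is neither split nor merged, which is the stable zone the recommendations aim for.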
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/schemas/toc_p.yaml b/ydb/docs/ru/core/troubleshooting/performance/schemas/toc_p.yaml similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/schemas/toc_p.yaml rename to ydb/docs/ru/core/troubleshooting/performance/schemas/toc_p.yaml diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/system/_assets/healthcheck-clock-drift.png b/ydb/docs/ru/core/troubleshooting/performance/system/_assets/healthcheck-clock-drift.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/system/_assets/healthcheck-clock-drift.png rename to ydb/docs/ru/core/troubleshooting/performance/system/_assets/healthcheck-clock-drift.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/system/system-clock-drift.md b/ydb/docs/ru/core/troubleshooting/performance/system/system-clock-drift.md similarity index 69% rename from ydb/docs/ru/core/dev/troubleshooting/performance/system/system-clock-drift.md rename to ydb/docs/ru/core/troubleshooting/performance/system/system-clock-drift.md index 5574ed1f58d2..0fe45469e859 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/system/system-clock-drift.md +++ b/ydb/docs/ru/core/troubleshooting/performance/system/system-clock-drift.md @@ -8,17 +8,17 @@ {% endnote %} -Если системное время узлов, на которых запущены [таблетки-координаторы](../../../../concepts/glossary.md#coordinator), отличается друг от друга, задержки транзакций увеличиваются на величину разницы во времени между самыми быстрыми и самыми отстающими системными часами. Это происходит потому, что транзакция, запланированная на узле с более быстрыми системными часами, может быть выполнена только после того, как координатор с самыми отстающими часами достигнет того же времени. 
+Если системное время узлов, на которых запущены [таблетки-координаторы](../../../concepts/glossary.md#coordinator), отличается друг от друга, задержки транзакций увеличиваются на величину разницы во времени между самыми быстрыми и самыми отстающими системными часами. Это происходит потому, что транзакция, запланированная на узле с более быстрыми системными часами, может быть выполнена только после того, как координатор с самыми отстающими часами достигнет того же времени. -Более того, если отклонение во времени превысит 30 секунд, система {{ ydb-short-name }} откажется обрабатывать распределённые транзакции. Перед тем как координаторы приступят к планированию транзакции, задействованные [data shards](../../../../concepts/glossary.md#data-shard) определяют допустимый диапазон временных меток (timestamps) для транзакции. Начало этого диапазона — текущее системное время таблетки-медиатора, а конец определяет тайм-аут планирования в 30 секунд. Если системное время координатора выходит за пределы этого временного диапазона, он не может запланировать распределённую транзакцию, что приводит к ошибкам в таких запросах. +Более того, если отклонение во времени превысит 30 секунд, система {{ ydb-short-name }} откажется обрабатывать распределённые транзакции. Перед тем как координаторы приступят к планированию транзакции, задействованные [data shards](../../../concepts/glossary.md#data-shard) определяют допустимый диапазон временных меток (timestamps) для транзакции. Начало этого диапазона — текущее системное время таблетки-медиатора, а конец определяет тайм-аут планирования в 30 секунд. Если системное время координатора выходит за пределы этого временного диапазона, он не может запланировать распределённую транзакцию, что приводит к ошибкам в таких запросах. ## Диагностика Чтобы диагностировать расхождение в системном времени серверов {{ ydb-short-name }}, используйте следующие методы: -1. 
Используйте **Healthcheck** во [Встроенном UI](../../../../reference/embedded-ui/index.md): +1. Используйте **Healthcheck** во [Встроенном UI](../../../reference/embedded-ui/index.md): - 1. Во [Встроенном UI](../../../../reference/embedded-ui/index.md) перейдите на вкладку **Databases** и нажмите на наименование базы данных. + 1. Во [Встроенном UI](../../../reference/embedded-ui/index.md) перейдите на вкладку **Databases** и нажмите на наименование базы данных. 1. На вкладке **Navigation** убедитесь, что требуемая база данных выбрана. @@ -36,12 +36,12 @@ {% note info %} - Для получения дополнительной информации см. [{#T}](../../../../reference/ydb-sdk/health-check-api.md) + Для получения дополнительной информации см. [{#T}](../../../reference/ydb-sdk/health-check-api.md) {% endnote %} -1. Откройте страницу [Interconnect overview](../../../../reference/embedded-ui/interconnect-overview.md) во [Встроенном UI](../../../../reference/embedded-ui/index.md). +1. Откройте страницу [Interconnect overview](../../../reference/embedded-ui/interconnect-overview.md) во [Встроенном UI](../../../reference/embedded-ui/index.md). 1. Используйте такие инструменты, как `pssh` или `ansible`, чтобы выполнить команду (например, `date +%s%N`) на всех узлах {{ ydb-short-name }} и отобразить значение системных часов. 
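Note on the last diagnostic step in `system-clock-drift.md` above: once `date +%s%N` has been collected from every node (for example with `pssh` or `ansible`), the readings can be compared with a short Python helper. This is a rough sketch. `check_drift` and the input dictionary are hypothetical, and the readings are assumed to have been taken at nearly the same moment, which parallel SSH only approximates, so treat small differences as noise.

```python
DRIFT_LIMIT_SECONDS = 30.0  # beyond this, distributed transactions are refused


def check_drift(readings_ns):
    """Compare clock readings; readings_ns maps host -> `date +%s%N` output as int."""
    if not readings_ns:
        raise ValueError("no clock readings supplied")
    slowest = min(readings_ns, key=readings_ns.get)
    fastest = max(readings_ns, key=readings_ns.get)
    # Transaction latency grows by roughly this spread between the fastest
    # and the slowest system clocks.
    spread = (readings_ns[fastest] - readings_ns[slowest]) / 1e9
    return {
        "slowest": slowest,
        "fastest": fastest,
        "spread_seconds": spread,
        "exceeds_limit": spread > DRIFT_LIMIT_SECONDS,
    }
```

A usage example with made-up host names: `check_drift({"node-1": 1700000000000000000, "node-2": 1700000000250000000})` reports a spread of a quarter of a second between `node-1` and `node-2`.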
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/system/toc_p.yaml b/ydb/docs/ru/core/troubleshooting/performance/system/toc_p.yaml similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/system/toc_p.yaml rename to ydb/docs/ru/core/troubleshooting/performance/system/toc_p.yaml diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/toc_p.yaml b/ydb/docs/ru/core/troubleshooting/performance/toc_p.yaml similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/toc_p.yaml rename to ydb/docs/ru/core/troubleshooting/performance/toc_p.yaml diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_assets/cpu-balancer.jpg b/ydb/docs/ru/core/troubleshooting/performance/ydb/_assets/cpu-balancer.jpg similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_assets/cpu-balancer.jpg rename to ydb/docs/ru/core/troubleshooting/performance/ydb/_assets/cpu-balancer.jpg diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_assets/hive-app.png b/ydb/docs/ru/core/troubleshooting/performance/ydb/_assets/hive-app.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_assets/hive-app.png rename to ydb/docs/ru/core/troubleshooting/performance/ydb/_assets/hive-app.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_assets/tablets-moved.png b/ydb/docs/ru/core/troubleshooting/performance/ydb/_assets/tablets-moved.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_assets/tablets-moved.png rename to ydb/docs/ru/core/troubleshooting/performance/ydb/_assets/tablets-moved.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_assets/updates.png b/ydb/docs/ru/core/troubleshooting/performance/ydb/_assets/updates.png similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_assets/updates.png rename to 
ydb/docs/ru/core/troubleshooting/performance/ydb/_assets/updates.png diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_includes/tablets-moved.md b/ydb/docs/ru/core/troubleshooting/performance/ydb/_includes/tablets-moved.md similarity index 88% rename from ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_includes/tablets-moved.md rename to ydb/docs/ru/core/troubleshooting/performance/ydb/_includes/tablets-moved.md index 72bcafd457fe..ddb82937b3fd 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/_includes/tablets-moved.md +++ b/ydb/docs/ru/core/troubleshooting/performance/ydb/_includes/tablets-moved.md @@ -1,4 +1,4 @@ -1. Check whether the **Tablets moved by Hive** chart on the **[DB status](../../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)** Grafana dashboard shows any spikes. +1. Check whether the **Tablets moved by Hive** chart on the **[DB status](../../../../reference/observability/metrics/grafana-dashboards.md#dbstatus)** Grafana dashboard shows any spikes. ![](../_assets/tablets-moved.png) @@ -6,7 +6,7 @@ 1. Analyze the Hive balancer statistics. - 1. Open the [Embedded UI](../../../../../reference/embedded-ui/index.md). + 1. Open the [Embedded UI](../../../../reference/embedded-ui/index.md). 1. Click the **Developer UI** link in the upper-right corner of the Embedded UI. 
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/tablets-moved.md b/ydb/docs/ru/core/troubleshooting/performance/ydb/tablets-moved.md similarity index 93% rename from ydb/docs/ru/core/dev/troubleshooting/performance/ydb/tablets-moved.md rename to ydb/docs/ru/core/troubleshooting/performance/ydb/tablets-moved.md index 7d8a1b73408d..da606bc66382 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/tablets-moved.md +++ b/ydb/docs/ru/core/troubleshooting/performance/ydb/tablets-moved.md @@ -1,6 +1,6 @@ # Frequent tablet moves between nodes -{{ ydb-short-name }} automatically balances the load by moving tablets from overloaded nodes to other nodes. This process is managed by [Hive](../../../../concepts/glossary.md#hive). When Hive moves tablets, queries that affect those tablets may take longer because they wait for the tablet to initialize on the new node. +{{ ydb-short-name }} automatically balances the load by moving tablets from overloaded nodes to other nodes. This process is managed by [Hive](../../../concepts/glossary.md#hive). When Hive moves tablets, queries that affect those tablets may take longer because they wait for the tablet to initialize on the new node. To balance the load between nodes, {{ ydb-short-name }} takes into account the usage of the following hardware resources: @@ -41,7 +41,7 @@ Adjust the Hive balancer settings: -1. Open the [Embedded UI](../../../../reference/embedded-ui/index.md). +1. Open the [Embedded UI](../../../reference/embedded-ui/index.md). 1. Click the **Developer UI** link in the upper-right corner of the Embedded UI. 
diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/toc_p.yaml b/ydb/docs/ru/core/troubleshooting/performance/ydb/toc_p.yaml similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/performance/ydb/toc_p.yaml rename to ydb/docs/ru/core/troubleshooting/performance/ydb/toc_p.yaml diff --git a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/ydb-updates.md b/ydb/docs/ru/core/troubleshooting/performance/ydb/ydb-updates.md similarity index 82% rename from ydb/docs/ru/core/dev/troubleshooting/performance/ydb/ydb-updates.md rename to ydb/docs/ru/core/troubleshooting/performance/ydb/ydb-updates.md index 57791f5dcd1a..a1a46c8d3300 100644 --- a/ydb/docs/ru/core/dev/troubleshooting/performance/ydb/ydb-updates.md +++ b/ydb/docs/ru/core/troubleshooting/performance/ydb/ydb-updates.md @@ -1,8 +1,8 @@ # Rolling restart procedure -{{ ydb-short-name }} clusters can be updated without downtime because they typically contain redundant components and support a rolling restart procedure. To ensure continuous data availability, {{ ydb-short-name }} includes a [Cluster Management System (CMS)](../../../../concepts/glossary.md#cms) that tracks all outages and nodes taken offline for maintenance, for example during a restart. The CMS halts new maintenance requests if they might put data availability at risk. +{{ ydb-short-name }} clusters can be updated without downtime because they typically contain redundant components and support a rolling restart procedure. To ensure continuous data availability, {{ ydb-short-name }} includes a [Cluster Management System (CMS)](../../../concepts/glossary.md#cms) that tracks all outages and nodes taken offline for maintenance, for example during a restart. The CMS halts new maintenance requests if they might put data availability at risk. 
-However, even if data is always available, restarting all nodes within a relatively short period can noticeably affect overall performance. Each [tablet](../../../../concepts/glossary.md#tablet) that was running on a restarted node is started again on a different node. Moving a tablet between nodes takes time and may affect the latency of queries that involve it. See the recommendations [for the rolling restart procedure](#rolling-restart). +However, even if data is always available, restarting all nodes within a relatively short period can noticeably affect overall performance. Each [tablet](../../../concepts/glossary.md#tablet) that was running on a restarted node is started again on a different node. Moving a tablet between nodes takes time and may affect the latency of queries that involve it. See the recommendations [for the rolling restart procedure](#rolling-restart). In addition, a new {{ ydb-short-name }} version may handle queries differently. Although performance generally improves with each update, it may degrade in some corner cases. See the recommendations [on performance across versions](#version-performance). @@ -16,7 +16,7 @@ To check whether the {{ ydb-short-name }} cluster is being updated: -1. Open the [Embedded UI](../../../../reference/embedded-ui/index.md). +1. Open the [Embedded UI](../../../reference/embedded-ui/index.md). 1. On the **Nodes** tab, check whether the {{ ydb-short-name }} versions differ across nodes. @@ -46,7 +46,7 @@ The goal is to detect as early as possible any negative impact of a new {{ ydb-short-name }} version on the execution speed of specific queries in a particular user workload: -1. 
Review the [{{ ydb-short-name }} server changelog](../../../../changelog-server.md), paying particular attention to performance-related changes that are relevant to your workload. +1. Review the [{{ ydb-short-name }} server changelog](../../../changelog-server.md), paying particular attention to performance-related changes that are relevant to your workload. 1. Use a dedicated cluster to test {{ ydb-short-name }} under a load that matches your workload as closely as possible. Always deploy a new {{ ydb-short-name }} version to such clusters first to assess its impact in a test environment. Monitor both client-side and server-side latencies to identify any potential performance issues. diff --git a/ydb/docs/ru/core/dev/troubleshooting/toc_p.yaml b/ydb/docs/ru/core/troubleshooting/toc_p.yaml similarity index 100% rename from ydb/docs/ru/core/dev/troubleshooting/toc_p.yaml rename to ydb/docs/ru/core/troubleshooting/toc_p.yaml