Add English version of CMS docs #4272

Merged · 10 commits · May 15, 2024

Changes from 7 commits
2 changes: 2 additions & 0 deletions ydb/docs/en/core/devops/manual/toc_p.yaml
@@ -23,4 +23,6 @@ items:
href: ../../maintenance/manual/cms.md
- name: System views
href: system-views.md
- name: Maintenance without downtime
href: ../../maintenance/manual/maintenance-without-downtime.md

@@ -0,0 +1,98 @@
# Maintenance without downtime

Periodically, the {{ ydb-short-name }} cluster needs maintenance, such as upgrading its version or replacing failed disks. Maintenance can cause the cluster or its databases to become unavailable due to:
- Exceeding the failure model of the affected [storage groups](../../concepts/databases.md#storage-groups).
- Exceeding the [State Storage](../../deploy/configuration/config.md#domains-state) failure model.
- Lack of computational resources due to stopping too many [dynamic nodes](../../concepts/cluster/common_scheme_ydb.md#nodes).

To avoid such situations, {{ ydb-short-name }} has a system [tablet](../../concepts/cluster/common_scheme_ydb.md#tablets) that monitors the state of the cluster: the *Cluster Management System (CMS)*. The CMS answers the question of whether a {{ ydb-short-name }} node, or a host running {{ ydb-short-name }} nodes, can be safely taken out for maintenance. To do this, create a [maintenance task](#maintenance-task) in the CMS and specify in it the exclusive locks to acquire on the nodes or hosts involved in the maintenance. The cluster components on which the locks are acquired are considered unavailable from the CMS perspective and can be safely maintained. The CMS will [check](#checking-algorithm) the current state of the cluster and acquire the locks only if the maintenance complies with the [availability mode](#availability-mode) and the [unavailable node limits](#unavailable-node-limits).

{% note warning "Faults during maintenance" %}

During maintenance activities whose safety is guaranteed by the CMS, faults unrelated to those activities may still occur in the cluster. If such faults threaten the availability of the cluster, completing the maintenance urgently can help mitigate the risk of losing availability.

{% endnote %}

## Maintenance task {#maintenance-task}

A *maintenance task* is a set of *actions* that the user asks the CMS to perform for safe maintenance.

Supported actions:
- Acquiring an exclusive lock on a cluster component (a node or a host).

In a task, actions are divided into groups. Actions from the same group are performed atomically. Currently, a group can contain only one action.

If it's not possible to perform an action at the time of the request, the CMS reports the reason and the time at which it is worth *refreshing* the task, and sets the action status to *pending*. When the task is refreshed, the CMS attempts to perform the pending actions again.

*Performed* actions have a deadline after which they are considered *completed* and stop affecting the cluster; for example, an exclusive lock is released. An action can also be completed early.

{% note info "Protracted maintenance" %}

If maintenance continues after the actions that were performed to make it safe have been completed, this is considered a fault in the cluster.

{% endnote %}

Completed actions are automatically removed from the task.

### Availability mode {#availability-mode}

In a maintenance task, you need to specify the cluster availability mode that must be complied with when checking whether actions can be performed. The following modes are supported:
- **Strong** - a mode that minimizes the risk of availability loss.
  - No more than one unavailable [VDisk](../../concepts/cluster/distributed_storage.md#storage-groups) is allowed in each affected storage group.
  - No more than one unavailable State Storage ring is allowed.
- **Weak** - a mode that does not allow exceeding the failure model.
  - No more than two unavailable VDisks are allowed for affected storage groups with the [block-4-2](../../deploy/configuration/config.md#reliability) scheme.
  - No more than four unavailable VDisks, three of which must be in the same data center, are allowed for affected storage groups with the [mirror-3-dc](../../deploy/configuration/config.md#reliability) scheme.
  - No more than `(nto_select - 1) / 2` unavailable State Storage rings are allowed (see the worked example after this list).
- **Force** - forced mode; the failure model is ignored. Not recommended for use.

### Priority {#priority}

You can specify the priority of a maintenance task. A lower value means a higher priority.

The actions of a task cannot be performed until all conflicting actions from higher-priority tasks are completed. Tasks with the same priority have no advantage over each other.

## Unavailable node limits {#unavailable-node-limits}

In the CMS configuration, you can set limits on the number of unavailable nodes, either per database (tenant) or for the cluster as a whole. Both relative and absolute limits are supported.

By default, no more than 10% of nodes may be unavailable in each database and in the cluster as a whole.

## Checking algorithm {#checking-algorithm}

To check whether the actions of a maintenance task can be performed, the CMS sequentially goes through each action group in the task and checks the action from the group:
- If the object of the action is a host, the CMS checks whether the action can be performed with all nodes running on the host.
- If the object of the action is a node, the CMS checks:
  - Whether there is already a lock on the node.
  - Whether it's possible to lock the node according to the unavailable node limits.
  - Whether it's possible to lock all VDisks of the node according to the availability mode.
  - Whether it's possible to lock the State Storage ring of the node according to the availability mode.
  - Whether it's possible to lock the node according to the limit of unavailable nodes on which cluster system tablets can run.

If the checks are successful, the action can be performed, and temporary locks are acquired on the checked nodes. The CMS then considers the next group of actions. Temporary locks help to determine whether the actions requested in different groups conflict with each other. Once the check is complete, the temporary locks are released.

## Examples {#examples}

The [ydbops](https://github.com/ydb-platform/ydbops) utility uses the CMS for cluster maintenance without downtime. You can also use the CMS directly through the [gRPC API](https://github.com/ydb-platform/ydb/blob/main/ydb/public/api/grpc/draft/ydb_maintenance_v1.proto).

### Take out a node for maintenance {#node-maintenance}

{% note info "Functionality in development" %}

This functionality is expected in upcoming versions of ydbops.

{% endnote %}

To take out a node for maintenance, you can use the command:
```
$ ydbops node maintenance --host <node_fqdn>
```
When executing this command, ydbops acquires an exclusive lock on the node in the CMS.
Member: This paragraph sounds weird: typically, multiple YDB nodes are on a single host, and such a command probably needs to acquire locks for all of them. Or maybe this implies only static/storage nodes, but that would need to be explicitly clarified, too.

Member Author: Changed the parameter to `<node_id>` to make it clearer. I just took this command from ydb-platform/ydbops#2. It is under active development; when the final variant of the command is chosen, the text in this article will be changed.

Member: @pixcc I don't think this change fixes the problem: in this article, we shouldn't use "node" and "host" interchangeably. This command sounds more like maintenance of all (or some) nodes on a given host, not of a single node. Meanwhile, the surrounding text sounds like it is about a single node.

Member Author: I removed the command to eliminate ambiguity. It's not 100% clear whether the final version of the command will take out a host for maintenance (with all its nodes) or individual nodes, so I've settled on the simple node case for now.


### Rolling restart {#rolling-restart}

To perform a rolling restart of the entire cluster, you can use the following command:
```
$ ydbops restart --endpoint grpc://<cluster-fqdn> --availability-mode strong
```
Member: Doesn't this have some additional requirements? For instance, the last time I tried (a while ago), it couldn't restart a cluster deployed with Ansible because it relied on some different systemd unit naming.

Collaborator: Things have changed a bit. By default the ydb-server-storage.service systemd name will be used, but it is possible to specify a different systemd unit name with the --systemd-unit flag (here is relevant --help output which contains this flag: https://pastebin.com/Jqcx31Jn).

Just for context: the fact that we have two different default unit names for deploying with Ansible and for everything else (cloud environments etc.) is horrible, and we just have to live with it for a while.

It is probably a good idea to include a mention of the --systemd-unit flag in the docs, but not in too much detail, maybe one sentence only. E.g. "If your systemd unit name is different from the default one, you may need to override it with the --systemd-unit flag."

Member Author: Added additional requirements.
The ydbops utility automatically creates a maintenance task to restart the entire cluster in accordance with the given availability mode. As it progresses, ydbops refreshes the maintenance task and acquires exclusive locks on the nodes in the CMS until all nodes have been restarted.
2 changes: 2 additions & 0 deletions ydb/docs/en/core/maintenance/toc_i.yaml
@@ -11,5 +11,7 @@ items:
include: { mode: link, path: manual/toc_p.yaml }
- name: Changing an actor system's configuration
href: manual/change_actorsystem_configs.md
- name: Maintenance without downtime
href: manual/maintenance-without-downtime.md
- name: Updating configurations via CMS
href: manual/cms.md
2 changes: 2 additions & 0 deletions ydb/docs/ru/core/devops/manual/toc_p.yaml
@@ -35,3 +35,5 @@ items:
href: ../../maintenance/manual/cms.md
- name: System views
href: system-views.md
- name: Maintenance without downtime
href: ../../maintenance/manual/maintenance-without-downtime.md
@@ -1,11 +1,11 @@
# Cluster maintenance without downtime

Periodically, the {{ ydb-short-name }} cluster needs maintenance, such as upgrading its version or replacing failed disks. Maintenance can cause the cluster or its databases to become unavailable due to:
-- Exceeding the failure model of the affected [storage groups](../concepts/databases.md#storage-groups).
-- Exceeding the [State Storage](../deploy/configuration/config.md#domains-state) failure model.
-- Lack of computational resources due to stopping too many [dynamic nodes](../concepts/cluster/common_scheme_ydb.md#nodes).
+- Exceeding the failure model of the affected [storage groups](../../concepts/databases.md#storage-groups).
+- Exceeding the [State Storage](../../deploy/configuration/config.md#domains-state) failure model.
+- Lack of computational resources due to stopping too many [dynamic nodes](../../concepts/cluster/common_scheme_ydb.md#nodes).

-To avoid such situations, {{ ydb-short-name }} has a system [tablet](../concepts/cluster/common_scheme_ydb.md#tablets) that monitors the state of the cluster: the *Cluster Management System (CMS)*. The CMS answers the question of whether a {{ ydb-short-name }} node, or a host running {{ ydb-short-name }} nodes, can be safely taken out for maintenance. To do this, create a [maintenance task](#maintenance-task) in the CMS and specify in it the exclusive locks to acquire on the nodes or hosts involved in the maintenance. The cluster components on which the locks are acquired are considered unavailable from the CMS perspective and can be safely maintained. The CMS will [check](#check-task-actions-algorithm) the current state of the cluster and acquire the locks only if the maintenance complies with the [availability mode](#availability-mode) and the [unavailable node limits](#unavailable-node-limits).
+To avoid such situations, {{ ydb-short-name }} has a system [tablet](../../concepts/cluster/common_scheme_ydb.md#tablets) that monitors the state of the cluster: the *Cluster Management System (CMS)*. The CMS answers the question of whether a {{ ydb-short-name }} node, or a host running {{ ydb-short-name }} nodes, can be safely taken out for maintenance. To do this, create a [maintenance task](#maintenance-task) in the CMS and specify in it the exclusive locks to acquire on the nodes or hosts involved in the maintenance. The cluster components on which the locks are acquired are considered unavailable from the CMS perspective and can be safely maintained. The CMS will [check](#checking-algorithm) the current state of the cluster and acquire the locks only if the maintenance complies with the [availability mode](#availability-mode) and the [unavailable node limits](#unavailable-node-limits).

{% note warning "Faults during maintenance" %}

@@ -38,11 +38,11 @@

In a maintenance task, you need to specify the cluster availability mode that must be complied with when checking whether actions can be performed. The following modes are supported:
- **Strong** - a mode that minimizes the risk of availability loss.
-  - No more than one unavailable [VDisk](../concepts/cluster/distributed_storage.md#storage-groups) is allowed in each affected storage group.
+  - No more than one unavailable [VDisk](../../concepts/cluster/distributed_storage.md#storage-groups) is allowed in each affected storage group.
  - No more than one unavailable State Storage ring is allowed.
- **Weak** - a mode that does not allow exceeding the failure model.
-  - No more than two unavailable VDisks are allowed for affected storage groups with the [block-4-2](../administration/production-storage-config.md#reliability) scheme.
-  - No more than four unavailable VDisks, three of which must be in the same data center, are allowed for affected storage groups with the [mirror-3-dc](../administration/production-storage-config.md#reliability) scheme.
+  - No more than two unavailable VDisks are allowed for affected storage groups with the [block-4-2](../../deploy/configuration/config.md#reliability) scheme.
+  - No more than four unavailable VDisks, three of which must be in the same data center, are allowed for affected storage groups with the [mirror-3-dc](../../deploy/configuration/config.md#reliability) scheme.
  - No more than `(nto_select - 1) / 2` unavailable State Storage rings are allowed.
- **Force** - forced mode; the failure model is ignored. Not recommended for use.

@@ -58,7 +58,7 @@

By default, no more than 10% of nodes may be unavailable in each database and in the cluster as a whole.

-## Task action checking algorithm {#check-task-actions-algorithm}
+## Checking algorithm {#checking-algorithm}

To check whether the actions of a maintenance task can be performed, the CMS sequentially goes through each action group in the task and checks the action from the group:
- If the object of the action is a host, the CMS checks whether the action can be performed with all nodes running on the host.
2 changes: 2 additions & 0 deletions ydb/docs/ru/core/maintenance/manual/node_restarting.md
@@ -1,3 +1,5 @@
# Safe restart and shutdown of nodes

## Stopping/restarting the ydb process on a node {#restart_process}

To make sure that the process can be stopped, perform the following steps.
2 changes: 1 addition & 1 deletion ydb/docs/ru/core/maintenance/toc_i.yaml
@@ -12,7 +12,7 @@ items:
- name: Changing an actor system's configuration
href: manual/change_actorsystem_configs.md
- name: Cluster maintenance without downtime
-href: maintenance-without-outages.md
+href: manual/maintenance-without-downtime.md
- name: Managing cluster configuration
items:
- name: Configuration overview