From f18a5e3c5fe30840c6aeef8b458a21328a29028c Mon Sep 17 00:00:00 2001 From: Ashish Singh Date: Mon, 9 Sep 2024 11:40:41 +0530 Subject: [PATCH 1/7] Create documentation for snapshots with hashed prefix path type Signed-off-by: Ashish Singh --- _api-reference/snapshots/create-repository.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/_api-reference/snapshots/create-repository.md b/_api-reference/snapshots/create-repository.md index ca4c04114c..39ea77bc9c 100644 --- a/_api-reference/snapshots/create-repository.md +++ b/_api-reference/snapshots/create-repository.md @@ -43,6 +43,14 @@ The following table lists parameters that can be used with both the `fs` and `s3 Request field | Description :--- | :--- `prefix_mode_verification` | When enabled, adds a hashed value of a random seed to the prefix for repository verification. For remote-store-enabled clusters, you can add the `setting.prefix_mode_verification` setting to the node attributes for the supplied repository. This field works with both new and existing repositories. Optional. +`shard_path_type` | This setting is used to control the path stucture of shard level blobs. Defaults to fixed. There are 2 other possible values - 1. `hashed_prefix` 2. `hashed_infix`. More details are shared below. Optional. + +##### shard_path_type values +1. FIXED - This keep the path structure in the existing hierarchical manner. eg - `//indices//0/` +2. HASHED_PREFIX - This prepends a hashed prefix at the start of path for each unique shard id. eg - `///indices//0/` +3. HASHED_INFIX - This appends a hashed prefix after the base path for each unique shard id. eg - `///indices//0/` + +Note - The hash method used is `fnv_1a_composite_1`. It uses the FNV1a hash function and generates a custom encoded hash value that scales well with most remote store options. The FNV1a function generates 64-bit value. 
The custom encoding uses the most significant 6 bits to create a url-safe base64 character and the next 14 bits to create a binary string. ### fs repository From 656459a5b2ee5153f79caca19824566f0ab55828 Mon Sep 17 00:00:00 2001 From: Ashish Singh Date: Mon, 9 Sep 2024 13:22:19 +0530 Subject: [PATCH 2/7] Add documentation on new cluster settings for fixed prefix Signed-off-by: Ashish Singh --- .../configuring-opensearch/index-settings.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/_install-and-configure/configuring-opensearch/index-settings.md b/_install-and-configure/configuring-opensearch/index-settings.md index a1894a0d2c..141c030903 100644 --- a/_install-and-configure/configuring-opensearch/index-settings.md +++ b/_install-and-configure/configuring-opensearch/index-settings.md @@ -73,6 +73,12 @@ OpenSearch supports the following dynamic cluster-level index settings: - `cluster.remote_store.segment.transfer_timeout` (Time unit): Controls the maximum amount of time to wait for all new segments to update after refresh to the remote store. If the upload does not complete within a specified amount of time, it throws a `SegmentUploadFailedException` error. Default is `30m`. It has a minimum constraint of `10m`. +- `cluster.remote_store.translog.path.prefix` (String): Controls the fixed path prefix for translog data on a remote store enabled cluster. This setting is effective when `cluster.remote_store.index.path.type` setting is either `hashed_prefix` or `hashed_infix`. This defaults to `""` (empty string). + +- `cluster.remote_store.segments.path.prefix` (String): Controls the fixed path prefix for segments data on a remote store enabled cluster. This setting is effective when `cluster.remote_store.index.path.type` setting is either `hashed_prefix` or `hashed_infix`. This defaults to `""` (empty string). + +- `cluster.snapshot.shard.path.prefix` (String): Controls the fixed path prefix for snapshot shard level blobs. 
This setting is effective when repository `shard_path_type` setting is either `hashed_prefix` or `hashed_infix`. This defaults to `""` (empty string). + ## Index-level index settings You can specify index settings at index creation. There are two types of index settings: From ce762a03a2c748dbd7551317825d8fa2b7b02aac Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Tue, 10 Sep 2024 10:46:19 -0500 Subject: [PATCH 3/7] Update create-repository.md --- _api-reference/snapshots/create-repository.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/_api-reference/snapshots/create-repository.md b/_api-reference/snapshots/create-repository.md index 39ea77bc9c..cf813052d7 100644 --- a/_api-reference/snapshots/create-repository.md +++ b/_api-reference/snapshots/create-repository.md @@ -43,14 +43,15 @@ The following table lists parameters that can be used with both the `fs` and `s3 Request field | Description :--- | :--- `prefix_mode_verification` | When enabled, adds a hashed value of a random seed to the prefix for repository verification. For remote-store-enabled clusters, you can add the `setting.prefix_mode_verification` setting to the node attributes for the supplied repository. This field works with both new and existing repositories. Optional. -`shard_path_type` | This setting is used to control the path stucture of shard level blobs. Defaults to fixed. There are 2 other possible values - 1. `hashed_prefix` 2. `hashed_infix`. More details are shared below. Optional. +`shard_path_type` | Controls the path stucture of shard-level blobs. Supported values are `FIXED`, `HASHED_PREFIX`, or `HASHED_INFIX`. For more information about each value, see [shard_path_type values](#shard_path_type-values)/. Default is `FIXED`. Optional. -##### shard_path_type values -1. FIXED - This keep the path structure in the existing hierarchical manner. eg - `//indices//0/` -2. 
HASHED_PREFIX - This prepends a hashed prefix at the start of path for each unique shard id. eg - `///indices//0/` -3. HASHED_INFIX - This appends a hashed prefix after the base path for each unique shard id. eg - `///indices//0/` +#### shard_path_type values -Note - The hash method used is `fnv_1a_composite_1`. It uses the FNV1a hash function and generates a custom encoded hash value that scales well with most remote store options. The FNV1a function generates 64-bit value. The custom encoding uses the most significant 6 bits to create a url-safe base64 character and the next 14 bits to create a binary string. +The following values are supported in the `shard_path_type` setting: + +- `FIXED`: Keeps the path structure in the existing hierarchical manner, such as, `//indices//0/` +- `HASHED_PREFIX`: Prepends a hashed prefix at the start of path for each unique shard ID, for example, `///indices//0/`. +- `HASHED_INFIX` - Appends a hashed prefix after the base path for each unique shard ID, for example, `///indices//0/`. The hash method used is `fnv_1a_composite_1, which uses the `FNV1a` hash function and generates a custom-encoded 64-bit hash value that scales well with most remote store options. `FNV1a` takes the most significant 6 bits to create a url-safe base64 character and the next 14 bits to create a binary string. 
### fs repository From 2d5e231e77db9e1ec5c12e761765b6aabdeca6d1 Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Tue, 10 Sep 2024 11:16:59 -0500 Subject: [PATCH 4/7] Update create-repository.md --- _api-reference/snapshots/create-repository.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_api-reference/snapshots/create-repository.md b/_api-reference/snapshots/create-repository.md index cf813052d7..9c8de84888 100644 --- a/_api-reference/snapshots/create-repository.md +++ b/_api-reference/snapshots/create-repository.md @@ -51,7 +51,7 @@ The following values are supported in the `shard_path_type` setting: - `FIXED`: Keeps the path structure in the existing hierarchical manner, such as, `//indices//0/` - `HASHED_PREFIX`: Prepends a hashed prefix at the start of path for each unique shard ID, for example, `///indices//0/`. -- `HASHED_INFIX` - Appends a hashed prefix after the base path for each unique shard ID, for example, `///indices//0/`. The hash method used is `fnv_1a_composite_1, which uses the `FNV1a` hash function and generates a custom-encoded 64-bit hash value that scales well with most remote store options. `FNV1a` takes the most significant 6 bits to create a url-safe base64 character and the next 14 bits to create a binary string. +- `HASHED_INFIX`: Appends a hashed prefix after the base path for each unique shard ID, for example, `///indices//0/`. The hash method used is `fnv_1a_composite_1, which uses the `FNV1a` hash function and generates a custom-encoded 64-bit hash value that scales well with most remote store options. `FNV1a` takes the most significant 6 bits to create a url-safe base64 character and the next 14 bits to create a binary string. 
### fs repository From 6e975008a757c7ac3a264c266c22702d9a6531af Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Tue, 10 Sep 2024 11:22:15 -0500 Subject: [PATCH 5/7] Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- _api-reference/snapshots/create-repository.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_api-reference/snapshots/create-repository.md b/_api-reference/snapshots/create-repository.md index 9c8de84888..11a9a9582b 100644 --- a/_api-reference/snapshots/create-repository.md +++ b/_api-reference/snapshots/create-repository.md @@ -43,7 +43,7 @@ The following table lists parameters that can be used with both the `fs` and `s3 Request field | Description :--- | :--- `prefix_mode_verification` | When enabled, adds a hashed value of a random seed to the prefix for repository verification. For remote-store-enabled clusters, you can add the `setting.prefix_mode_verification` setting to the node attributes for the supplied repository. This field works with both new and existing repositories. Optional. -`shard_path_type` | Controls the path stucture of shard-level blobs. Supported values are `FIXED`, `HASHED_PREFIX`, or `HASHED_INFIX`. For more information about each value, see [shard_path_type values](#shard_path_type-values)/. Default is `FIXED`. Optional. +`shard_path_type` | Controls the path structure of shard-level blobs. Supported values are `FIXED`, `HASHED_PREFIX`, or `HASHED_INFIX`. For more information about each value, see [shard_path_type values](#shard_path_type-values)/. Default is `FIXED`. Optional. 
#### shard_path_type values From 5e9e2aabf49af02d1d22b0de0f57c7a8f20c7906 Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Tue, 10 Sep 2024 14:21:15 -0500 Subject: [PATCH 6/7] Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- .../configuring-opensearch/index-settings.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_install-and-configure/configuring-opensearch/index-settings.md b/_install-and-configure/configuring-opensearch/index-settings.md index 141c030903..8754247a22 100644 --- a/_install-and-configure/configuring-opensearch/index-settings.md +++ b/_install-and-configure/configuring-opensearch/index-settings.md @@ -73,11 +73,11 @@ OpenSearch supports the following dynamic cluster-level index settings: - `cluster.remote_store.segment.transfer_timeout` (Time unit): Controls the maximum amount of time to wait for all new segments to update after refresh to the remote store. If the upload does not complete within a specified amount of time, it throws a `SegmentUploadFailedException` error. Default is `30m`. It has a minimum constraint of `10m`. -- `cluster.remote_store.translog.path.prefix` (String): Controls the fixed path prefix for translog data on a remote store enabled cluster. This setting is effective when `cluster.remote_store.index.path.type` setting is either `hashed_prefix` or `hashed_infix`. This defaults to `""` (empty string). +- `cluster.remote_store.translog.path.prefix` (String): Controls the fixed path prefix for translog data on a remote store enabled cluster. This setting only applies when the `cluster.remote_store.index.path.type` setting is either `hashed_prefix` or `hashed_infix`. Default is an empty string, `""`. -- `cluster.remote_store.segments.path.prefix` (String): Controls the fixed path prefix for segments data on a remote store enabled cluster. 
This setting is effective when `cluster.remote_store.index.path.type` setting is either `hashed_prefix` or `hashed_infix`. This defaults to `""` (empty string). +- `cluster.remote_store.segments.path.prefix` (String): Controls the fixed path prefix for segments data on a remote store enabled cluster. This setting only applies when the `cluster.remote_store.index.path.type` setting is either `hashed_prefix` or `hashed_infix`. Default is an empty string, `""`. -- `cluster.snapshot.shard.path.prefix` (String): Controls the fixed path prefix for snapshot shard level blobs. This setting is effective when repository `shard_path_type` setting is either `hashed_prefix` or `hashed_infix`. This defaults to `""` (empty string). +- `cluster.snapshot.shard.path.prefix` (String): Controls the fixed path prefix for snapshot shard level blobs. This setting only applies when the repository `shard_path_type` setting is either `hashed_prefix` or `hashed_infix`. Default is an empty string, `""`. ## Index-level index settings From 5e7ca6685719e441fed1ac332bc03fa2f7bdc7f2 Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Wed, 11 Sep 2024 12:05:15 -0500 Subject: [PATCH 7/7] Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- _api-reference/snapshots/create-repository.md | 8 ++++---- .../configuring-opensearch/index-settings.md | 6 +++--- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/_api-reference/snapshots/create-repository.md b/_api-reference/snapshots/create-repository.md index 11a9a9582b..96ae9f6b1e 100644 --- a/_api-reference/snapshots/create-repository.md +++ b/_api-reference/snapshots/create-repository.md @@ -43,15 +43,15 @@ The following table lists parameters that can be used with both the `fs` and `s3 Request field | Description :--- | :--- `prefix_mode_verification` | When enabled, adds a hashed value of a random seed to the 
prefix for repository verification. For remote-store-enabled clusters, you can add the `setting.prefix_mode_verification` setting to the node attributes for the supplied repository. This field works with both new and existing repositories. Optional. -`shard_path_type` | Controls the path structure of shard-level blobs. Supported values are `FIXED`, `HASHED_PREFIX`, or `HASHED_INFIX`. For more information about each value, see [shard_path_type values](#shard_path_type-values)/. Default is `FIXED`. Optional. +`shard_path_type` | Controls the path structure of shard-level blobs. Supported values are `FIXED`, `HASHED_PREFIX`, and `HASHED_INFIX`. For more information about each value, see [shard_path_type values](#shard_path_type-values). Default is `FIXED`. Optional. #### shard_path_type values The following values are supported in the `shard_path_type` setting: -- `FIXED`: Keeps the path structure in the existing hierarchical manner, such as, `//indices//0/` -- `HASHED_PREFIX`: Prepends a hashed prefix at the start of path for each unique shard ID, for example, `///indices//0/`. -- `HASHED_INFIX`: Appends a hashed prefix after the base path for each unique shard ID, for example, `///indices//0/`. The hash method used is `fnv_1a_composite_1, which uses the `FNV1a` hash function and generates a custom-encoded 64-bit hash value that scales well with most remote store options. `FNV1a` takes the most significant 6 bits to create a url-safe base64 character and the next 14 bits to create a binary string. +- `FIXED`: Keeps the path structure in the existing hierarchical manner, such as `//indices//0/`. +- `HASHED_PREFIX`: Prepends a hashed prefix at the start of the path for each unique shard ID, for example, `///indices//0/`. +- `HASHED_INFIX`: Appends a hashed prefix after the base path for each unique shard ID, for example, `///indices//0/`. 
The hash method used is `FNV_1A_COMPOSITE_1`, which uses the `FNV1a` hash function to generate a 64-bit value and then custom-encodes that value into a hash that scales well with most remote store options. The custom encoding takes the most significant 6 bits to create a URL-safe Base64 character and the next 14 bits to create a binary string. ### fs repository diff --git a/_install-and-configure/configuring-opensearch/index-settings.md b/_install-and-configure/configuring-opensearch/index-settings.md index 8754247a22..bd9b9651aa 100644 --- a/_install-and-configure/configuring-opensearch/index-settings.md +++ b/_install-and-configure/configuring-opensearch/index-settings.md @@ -73,11 +73,11 @@ OpenSearch supports the following dynamic cluster-level index settings: - `cluster.remote_store.segment.transfer_timeout` (Time unit): Controls the maximum amount of time to wait for all new segments to update after refresh to the remote store. If the upload does not complete within a specified amount of time, it throws a `SegmentUploadFailedException` error. Default is `30m`. It has a minimum constraint of `10m`. -- `cluster.remote_store.translog.path.prefix` (String): Controls the fixed path prefix for translog data on a remote store enabled cluster. This setting only applies when the `cluster.remote_store.index.path.type` setting is either `hashed_prefix` or `hashed_infix`. Default is an empty string, `""`. +- `cluster.remote_store.translog.path.prefix` (String): Controls the fixed path prefix for translog data on a remote-store-enabled cluster. This setting only applies when the `cluster.remote_store.index.path.type` setting is either `HASHED_PREFIX` or `HASHED_INFIX`. Default is an empty string, `""`. -- `cluster.remote_store.segments.path.prefix` (String): Controls the fixed path prefix for segments data on a remote store enabled cluster. This setting only applies when the `cluster.remote_store.index.path.type` setting is either `hashed_prefix` or `hashed_infix`. Default is an empty string, `""`. 
+- `cluster.remote_store.segments.path.prefix` (String): Controls the fixed path prefix for segment data on a remote-store-enabled cluster. This setting only applies when the `cluster.remote_store.index.path.type` setting is either `HASHED_PREFIX` or `HASHED_INFIX`. Default is an empty string, `""`. -- `cluster.snapshot.shard.path.prefix` (String): Controls the fixed path prefix for snapshot shard level blobs. This setting only applies when the repository `shard_path_type` setting is either `hashed_prefix` or `hashed_infix`. Default is an empty string, `""`. +- `cluster.snapshot.shard.path.prefix` (String): Controls the fixed path prefix for snapshot shard-level blobs. This setting only applies when the repository `shard_path_type` setting is either `HASHED_PREFIX` or `HASHED_INFIX`. Default is an empty string, `""`. ## Index-level index settings
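Reviewer sketch (not part of the patch): the `FNV_1A_COMPOSITE_1` description in `create-repository.md` can be illustrated roughly as follows. This is a minimal sketch that follows only the wording in the docs: a 64-bit FNV-1a hash, the top 6 bits mapped to one URL-safe Base64 character, and the next 14 bits rendered as a binary string. The function names and the input key `"example-index-uuid/0"` are hypothetical; the exact input that OpenSearch hashes and the exact output layout are not stated in this series, so treat this as an assumption rather than the actual implementation.

```python
import string

# Standard 64-bit FNV-1a parameters.
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF

# URL-safe Base64 alphabet: A-Z, a-z, 0-9, '-', '_'.
URL_SAFE_B64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def fnv1a_64(data: bytes) -> int:
    """Return the 64-bit FNV-1a hash of data."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte                        # XOR the byte in first (the "1a" variant)
        h = (h * FNV64_PRIME) & MASK64   # then multiply, keeping 64 bits
    return h


def composite_prefix(key: str) -> str:
    """Hypothetical helper: encode the top 20 bits of the hash as described in the docs."""
    h = fnv1a_64(key.encode("utf-8"))
    top6 = h >> 58                  # most significant 6 bits, in the range 0..63
    next14 = (h >> 44) & 0x3FFF     # the following 14 bits
    return URL_SAFE_B64[top6] + format(next14, "014b")


if __name__ == "__main__":
    # Assumed input; the real key composition is not specified in the patch.
    print(composite_prefix("example-index-uuid/0"))
```

Under these assumptions, every key yields a 15-character string: one Base64 character followed by 14 binary digits. Spreading shard paths across many such prefixes is presumably what lets the layout scale across remote store backends.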