
Add documentation changes for disk-based k-NN #8246

Merged

Conversation

jmazanec15
Member

@jmazanec15 jmazanec15 commented Sep 12, 2024

Description

Part of #8075, this PR adds documentation for the disk-based feature for OpenSearch k-NN. See opensearch-project/k-NN#1779.

First, to support this project, we had to allow `space_type` in the k-NN mapping to be configured at the root level of the `knn_vector` field mapping. So the space type can now be specified in one of two ways:

      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "method": {
        "space_type": "l2",
        ...
        }
      }

or

      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "l2",
        "method": {
        ...
        }
      }

I updated this.

Next, we added functionality to execute a rescore phase during the k-NN search to improve search quality on quantized indices. To use it:

GET my-vector-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 5.5,1.5, 5.5,1.5, 5.5,1.5, 5.5,1.5, 5.5],
        "k": 10,
        "rescore": {
           "oversample_factor": 1.2
        }
      }
    }
  }
}

I updated this.

Lastly, we introduced new parameters in the k-NN vector field mapping called `mode` and `compression_level`. When set, these two parameters configure the field's default parameter resolution, which lets us provide a strong out-of-the-box experience for several different workload skews. `in_memory` is the default mode and maps to our current defaults. `on_disk` is a new mode that adds default quantization and rescoring so that k-NN can run with strong recall performance in low-memory environments.
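To illustrate, here's a minimal sketch of a mapping that sets `mode`, reusing the index and field names from the examples above (the `index.knn` setting is shown only for context):

PUT my-vector-index
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "l2",
        "mode": "on_disk"
      }
    }
  }
}

With `mode` set to `on_disk` and nothing else specified, the field picks up that mode's defaults (`32x` binary quantization plus rescoring, as described in the docs changes below) without further tuning.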

As we are close to the release, I wanted to get this PR up.

Issues Resolved

closes #8075

Version

2.17 and beyond

Frontend features

N/A

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.


Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

@kolchfa-aws kolchfa-aws added release-notes PR: Include this PR in the automated release notes v2.17.0 labels Sep 12, 2024
Signed-off-by: John Mazanec <jmazane@amazon.com>

Right now, 2 modes are supported:
* `in_memory` (default) - the `in_memory` mode represents the current default for vector search in OpenSearch. By default, it will use the `nmslib` engine and not configure any `compression_level`. This mode should be preferred if low latency is required for your application.
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 2.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.
Contributor

@shatejas shatejas Sep 13, 2024


nit: A table would be nice and consistent with existing documentation. The headings can be mode, engines supported (highlight the default here), compression supported (highlight the default here), and then guidance.

Suggested change
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 2.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 3.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.


The `oversample_factor` is a floating-point number between 0.0 and 100.0. `oversample_factor * k` will always be greater than or equal to 100 and less than or equal to 10,000.
Contributor

nit: Worth mentioning the defaults again here, just in case someone is skimming through and jumps directly to this section.

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 3,
        "space_type": "l2",
        "mode": "in_memory",
        "compression_level": "2x",
Contributor

Why are we putting `2x` as the default compression here?

Member Author

I figured I'd show all the parameters and what they look like.

Comment on lines 37 to 40
"engine": "lucene",
"parameters": {
"ef_construction": 128,
"m": 24
Contributor

[See if you want to update this]: We can reduce these hyperparameter values to 100 and 16.

Member Author

Should I just not specify them?

    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 3,
        "space_type": "l2",
Contributor

I think in this example we should show the best default experience, which is no mode and no compression: just the space type, dimension, and type attributes. What do you think?

Member Author

Sure. The only thing is that I believe the defaults will be picked up from index_settings in this case.

Comment on lines 60 to 61
`compression_level` is a string-based mapping parameter that selects a quantization encoder that will reduce the memory consumption of the vectors by the given factor. Valid values are:
- `1x` (supported by nmslib, lucene and faiss engines)
Contributor

Should we put this in a table?


For example, if a `32x` `compression_level` is passed for a `float32` index of 768-dimensional vectors, the per-vector memory should drop from `4*768 = 3072` bytes to `3072/32 = 96` bytes. Internally, binary quantization (which maps a float to a bit) may be used to achieve this.

If the `compression_level` parameter is set, an `encoder` cannot be specified in the `method` mapping. Compression levels greater than `1x` are only supported for `float` vector types.
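As a minimal sketch of this constraint (reusing the field name from the examples above, with an assumed 768-dimensional field), a mapping that sets `compression_level` directly contains no `encoder` in its `method` definition:

"my_vector_field": {
  "type": "knn_vector",
  "dimension": 768,
  "space_type": "l2",
  "compression_level": "32x"
}

The quantization encoder is then chosen internally based on the compression level rather than being specified explicitly.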
Contributor

Let's put this as a note.

@@ -47,6 +47,28 @@ PUT test-index
```
{% include copy-curl.html %}

## Vector workload modes
Contributor

Can we have a table of mode, compression, and which engine will be used in the docs?

kolchfa-aws and others added 3 commits September 13, 2024 15:00
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Collaborator

@natebower natebower left a comment


@jmazanec15 @kolchfa-aws Please see my comments and changes and let me know if you have any questions. I'd like to reread lines 237 and 354 in api.md and line 86 in knn-index.md before approving. Thanks!

_field-types/supported-field-types/knn-vector.md
_query-dsl/specialized/neural.md
_search-plugins/knn/approximate-knn.md
| `4x` | No default rescoring |
| `2x` | No default rescoring |

To explicitly apply rescoring, provide the `rescore` parameter in a query on a quantized index and specify the `oversample_factor`:
Collaborator

Suggested change
To explicitly apply rescoring, provide the `rescore` parameter in a query on a quantized index and specify the `oversample_factor`:
To explicitly apply rescoring, provide the `rescore` parameter in a quantized index query and specify the `oversample_factor`:

_search-plugins/knn/approximate-knn.md
_search-plugins/knn/knn-index.md
@natebower natebower added the 5 - Editorial review PR: Editorial review in progress label Sep 16, 2024
kolchfa-aws and others added 2 commits September 16, 2024 09:52
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Collaborator

@natebower natebower left a comment


LGTM

@kolchfa-aws kolchfa-aws merged commit 967f257 into opensearch-project:main Sep 16, 2024
6 checks passed
@jmazanec15
Member Author

Thanks @natebower and @kolchfa-aws!

Labels
5 - Editorial review PR: Editorial review in progress release-notes PR: Include this PR in the automated release notes v2.17.0

Successfully merging this pull request may close these issues.

[DOC] k-NN Disk Based Feature documentation
5 participants