
Add documentation changes for disk-based k-NN #8246

Merged

Conversation

jmazanec15
Member

@jmazanec15 jmazanec15 commented Sep 12, 2024

Description

Part of #8075, this PR adds documentation for the disk-based feature for OpenSearch k-NN. See opensearch-project/k-NN#1779.

First, to support this project, we had to allow `space_type` in the k-NN mapping to be configured at the root level of the `knn_vector` field mapping. So the space type can now be specified in one of two ways:

      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "method": {
        "space_type": "l2",
        ...
        }
      }

or

      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "l2",
        "method": {
        ...
        }
      }

I updated this.

Next, we added functionality to execute a rescore phase during the k-NN search to improve search quality on quantized indices. To use it:

GET my-vector-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 5.5,1.5, 5.5,1.5, 5.5,1.5, 5.5,1.5, 5.5],
        "k": 10,
        "rescore": {
           "oversample_factor": 1.2
        }
      }
    }
  }
}

I updated this.

Lastly, we introduced new parameters in the k-NN vector field mapping called `mode` and `compression_level`. When set, these two parameters configure the field's default parameter resolution, which lets us provide a strong out-of-the-box experience for several different workload skews. `in_memory` is the default mode and maps to our current defaults. `on_disk` is a new mode that adds default quantization and rescoring so that k-NN can run with strong recall performance in low-memory environments.
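To illustrate, here's a minimal sketch of a mapping that sets `mode`, reusing the index and field names from the examples above (the `index.knn` setting is shown only for context):

PUT my-vector-index
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "l2",
        "mode": "on_disk"
      }
    }
  }
}

With `mode` set to `on_disk` and nothing else specified, the field picks up that mode's defaults (`32x` binary quantization plus rescoring, as described in the docs changes below) without further tuning.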

As we are close to the release, I wanted to get this PR up.

Issues Resolved

closes #8075

Version

2.17 and beyond

Frontend features

N/A

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.


Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

@kolchfa-aws kolchfa-aws added release-notes PR: Include this PR in the automated release notes v2.17.0 labels Sep 12, 2024
Signed-off-by: John Mazanec <jmazane@amazon.com>

Right now, 2 modes are supported:
* `in_memory` (default) - the `in_memory` mode represents the current default for vector search in OpenSearch. By default, it will use the `nmslib` engine and not configure any `compression_level`. This mode should be preferred if low latency is required for your application.
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 2.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.
Contributor

@shatejas shatejas Sep 13, 2024


nit: A table would be nice and consistent with existing documentation. The headings can be mode, engines supported (highlight the default here), compression supported (highlight the default here), and then guidance.

Suggested change
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 2.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 3.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.


The `oversample_factor` is a floating-point number between 0.0 and 100.0. `oversample_factor * k` will always be greater than or equal to 100 and less than or equal to 10,000.
Contributor

nit: Worth mentioning the defaults again here, just in case someone is skimming through and jumps directly to this section.

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 3,
        "space_type": "l2",
        "mode": "in_memory",
        "compression_level": "2x",
Contributor

Why are we putting `2x` as the default compression here?

Member Author

I figured I'd show all the parameters and what they look like.

Comment on lines 37 to 40
"engine": "lucene",
"parameters": {
"ef_construction": 128,
"m": 24
Contributor

[See if you want to update this]: We can reduce these hyperparameter values to 100 and 16.

Member Author

Should I just not specify them?

    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 3,
        "space_type": "l2",
Contributor

I think in this example we should show the best default experience, which is no mode and no compression: just the space type, dimension, and type attributes. What do you think?

Member Author

Sure. The only thing is that I believe the defaults will be picked up from index_settings in this case.

Comment on lines 60 to 61
`compression_level` is a string-based mapping parameter that selects a quantization encoder that will reduce the memory consumption of the vectors by the given factor. Valid values are:
- `1x` (supported by nmslib, lucene and faiss engines)
Contributor

Should we put this in a table?


For example, if a `32x` `compression_level` is passed for a `float32` index of 768-dimensional vectors, the per-vector memory should drop from `4*768 = 3072` bytes to `3072/32 = 96` bytes. Internally, binary quantization (which maps a float to a bit) may be used to achieve this.

If the `compression_level` parameter is set, an `encoder` cannot be specified in the `method` mapping. Compression levels greater than `1x` are only supported for `float` vector types.
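As a minimal sketch of this constraint (reusing the field name from the examples above, with an assumed 768-dimensional field), a mapping that sets `compression_level` directly contains no `encoder` in its `method` definition:

"my_vector_field": {
  "type": "knn_vector",
  "dimension": 768,
  "space_type": "l2",
  "compression_level": "32x"
}

The quantization encoder is then chosen internally based on the compression level rather than being specified explicitly.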
Contributor

Let's put this as a note.

@@ -47,6 +47,28 @@ PUT test-index
```
{% include copy-curl.html %}

## Vector workload modes
Contributor

Can we have a table of mode, compression, and which engine will be used in the docs?

kolchfa-aws and others added 3 commits September 13, 2024 15:00
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Collaborator

@natebower natebower left a comment


@jmazanec15 @kolchfa-aws Please see my comments and changes and let me know if you have any questions. I'd like to reread lines 237 and 354 in api.md and line 86 in knn-index.md before approving. Thanks!

_field-types/supported-field-types/knn-vector.md
_query-dsl/specialized/neural.md
_search-plugins/knn/approximate-knn.md
| `4x` | No default rescoring |
| `2x` | No default rescoring |

To explicitly apply rescoring, provide the `rescore` parameter in a query on a quantized index and specify the `oversample_factor`:
Collaborator

Suggested change
To explicitly apply rescoring, provide the `rescore` parameter in a query on a quantized index and specify the `oversample_factor`:
To explicitly apply rescoring, provide the `rescore` parameter in a quantized index query and specify the `oversample_factor`:

_search-plugins/knn/approximate-knn.md
_search-plugins/knn/knn-index.md
@natebower natebower added the 5 - Editorial review PR: Editorial review in progress label Sep 16, 2024
kolchfa-aws and others added 2 commits September 16, 2024 09:52
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Collaborator

@natebower natebower left a comment


LGTM

@kolchfa-aws kolchfa-aws merged commit 967f257 into opensearch-project:main Sep 16, 2024
6 checks passed
@jmazanec15
Member Author

Thanks @natebower and @kolchfa-aws!

Labels
5 - Editorial review PR: Editorial review in progress release-notes PR: Include this PR in the automated release notes v2.17.0

Successfully merging this pull request may close these issues.

[DOC] k-NN Disk Based Feature documentation
5 participants