Adds note on reindexing existing data for semantic_text usage (#113590)
* Adds note on reindexing existing data for semantic_text usage

* Adds note about full crawl and full sync

* Style guide related fix

* Update docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

kosabogi and leemthompo authored Oct 8, 2024
1 parent bb9d612 commit 4af241b
Showing 1 changed file with 17 additions and 0 deletions.
@@ -89,6 +89,16 @@ PUT semantic-embeddings
It will be used to generate the embeddings based on the input text.
Every time you ingest data into the related `semantic_text` field, this endpoint will be used for creating the vector representation of the text.

[NOTE]
====
If you're using web crawlers or connectors to generate indices, you have to
<<indices-put-mapping,update the index mappings>> for these indices to
include the `semantic_text` field. Once the mapping is updated, you'll need to run
a full web crawl or a full connector sync. This ensures that all existing
documents are reprocessed and updated with the new semantic embeddings,
enabling semantic search on the updated data.
====
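
The mapping update described in the note above can be sketched as a request builder. This is a minimal illustration, not part of the commit: the index name `my-crawler-index` and the {infer} endpoint ID `my-elser-endpoint` are hypothetical placeholders, and the sketch assumes the `content` field is not already mapped with a different type in the target index.

```python
import json


def build_semantic_text_mapping_request(index, field, inference_id):
    """Describe an update-mapping request that adds a semantic_text field.

    Nothing is sent over the network; the function only returns the
    method, path, and JSON body that the request would carry.
    """
    body = {
        "properties": {
            field: {
                "type": "semantic_text",
                # ID of an existing inference endpoint (assumed, hypothetical name)
                "inference_id": inference_id,
            }
        }
    }
    return {"method": "PUT", "path": f"/{index}/_mapping", "body": body}


# Hypothetical index and endpoint names, chosen for illustration only
request = build_semantic_text_mapping_request(
    "my-crawler-index", "content", "my-elser-endpoint"
)
print(request["method"], request["path"])
print(json.dumps(request["body"], indent=2))
```

After a mapping change like this, the note's second step still applies: a full crawl or full sync is what reprocesses the existing documents.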


[discrete]
[[semantic-text-load-data]]
@@ -118,6 +128,13 @@ Create the embeddings from the text by reindexing the data from the `test-data`
The data in the `content` field will be reindexed into the `content` semantic text field of the destination index.
The reindexed data will be processed by the {infer} endpoint associated with the `content` semantic text field.

[NOTE]
====
This step uses the reindex API to simulate data ingestion. If you are working with data that has already been indexed
rather than with the `test-data` set, you must reindex it so that the data is processed by the {infer} endpoint
and the necessary embeddings are generated.
====

[source,console]
------------------------------------------------------------
POST _reindex?wait_for_completion=false