From dbc377fd3aabdd1f772bb59fdec05934700bb783 Mon Sep 17 00:00:00 2001
From: "Mark J. Hoy"
Date: Mon, 8 Jul 2024 11:09:14 -0400
Subject: [PATCH 1/5] Add Amazon Bedrock Inference API to docs

---
 .../inference/inference-apis.asciidoc | 1 +
 .../inference/put-inference.asciidoc | 1 +
 .../inference/service-amazon-bedrock.asciidoc | 173 ++++++++++++++++++
 3 files changed, 175 insertions(+)
 create mode 100644 docs/reference/inference/service-amazon-bedrock.asciidoc

diff --git a/docs/reference/inference/inference-apis.asciidoc b/docs/reference/inference/inference-apis.asciidoc
index 896cb02a9e699..21b361bf55390 100644
--- a/docs/reference/inference/inference-apis.asciidoc
+++ b/docs/reference/inference/inference-apis.asciidoc
@@ -25,6 +25,7 @@ include::delete-inference.asciidoc[]
 include::get-inference.asciidoc[]
 include::post-inference.asciidoc[]
 include::put-inference.asciidoc[]
+include::service-amazon-bedrock.asciidoc[]
 include::service-azure-ai-studio.asciidoc[]
 include::service-azure-openai.asciidoc[]
 include::service-cohere.asciidoc[]
diff --git a/docs/reference/inference/put-inference.asciidoc b/docs/reference/inference/put-inference.asciidoc
index 101c0a24b66b7..8122d93b74033 100644
--- a/docs/reference/inference/put-inference.asciidoc
+++ b/docs/reference/inference/put-inference.asciidoc
@@ -34,6 +34,7 @@ The create {infer} API enables you to create an {infer} endpoint and configure a
 The following services are available through the {infer} API, click the links to review the configuration details of the services:
 
+* <<infer-service-amazon-bedrock,Amazon Bedrock>>
 * <<infer-service-azure-ai-studio,Azure AI Studio>>
 * <<infer-service-azure-openai,Azure OpenAI>>
 * <<infer-service-cohere,Cohere>>
diff --git a/docs/reference/inference/service-amazon-bedrock.asciidoc b/docs/reference/inference/service-amazon-bedrock.asciidoc
new file mode 100644
index 0000000000000..787e1fcd635de
--- /dev/null
+++ b/docs/reference/inference/service-amazon-bedrock.asciidoc
@@ -0,0 +1,173 @@
+[[infer-service-amazon-bedrock]]
+=== Amazon Bedrock {infer} service
+
+Creates an {infer} endpoint to perform an {infer} task with the `amazonbedrock` service.
+
+[discrete]
+[[infer-service-azure-ai-studio-api-request]]
+==== {api-request-title}
+
+`PUT /_inference/<task_type>/<inference_id>`
+
+[discrete]
+[[infer-service-azure-ai-studio-api-path-params]]
+==== {api-path-parms-title}
+
+`<inference_id>`::
+(Required, string)
+include::inference-shared.asciidoc[tag=inference-id]
+
+`<task_type>`::
+(Required, string)
+include::inference-shared.asciidoc[tag=task-type]
++
+--
+Available task types:
+
+* `completion`,
+* `text_embedding`.
+--
+
+[discrete]
+[[infer-service-amazon-bedrock-api-request-body]]
+==== {api-request-body-title}
+
+`service`::
+(Required, string) The type of service supported for the specified task type.
+In this case,
+`amazonbedrock`.
+
+`service_settings`::
+(Required, object)
+include::inference-shared.asciidoc[tag=service-settings]
++
+--
+These settings are specific to the `amazonbedrock` service.
+--
+
+`access_key`:::
+(Required, string)
+A valid AWS access key that has permissions to use Amazon Bedrock and access to models for inference requests.
+
+`secret_key`:::
+(Required, string)
+A valid AWS secret key that is paired with the `access_key`.
+To create or manage access and secret keys, see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html[Managing access keys for IAM users] in the AWS documentation.
+
+IMPORTANT: You need to provide the access and secret keys only once, during the {infer} model creation.
+The <<get-inference-api>> does not retrieve your access or secret keys.
+After creating the {infer} model, you cannot change the associated key pairs.
+If you want to use a different access and secret key pair, delete the {infer} model and recreate it with the same name and the updated keys.
+
+`provider`:::
+(Required, string)
+The model provider for your deployment.
+Note that some providers may support only certain task types.
+Supported providers include:
+
+* `amazontitan` - available for `text_embedding` and `completion` task types
+* `anthropic` - available for `completion` task type only
+* `ai21labs` - available for `completion` task type only
+* `cohere` - available for `text_embedding` and `completion` task types
+* `meta` - available for `completion` task type only
+* `mistral` - available for `completion` task type only
+
+`model`:::
+(Required, string)
+The base model ID or an ARN to a custom model based on a foundational model.
+The base model IDs can be found in the https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock model IDs] documentation.
+Note that the model ID must be available for the provider chosen, and your IAM user must have access to the model.
+
+`region`:::
+(Required, string)
+The region that your model or ARN is deployed in.
+The list of available regions per model can be found in the https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html[Model support by AWS region] documentation.
+
+`rate_limit`:::
+(Optional, object)
+By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
+This helps to minimize the number of rate limit errors returned from Azure AI Studio.
+To modify this, set the `requests_per_minute` setting of this object in your service settings:
++
+--
+include::inference-shared.asciidoc[tag=request-per-minute-example]
+--
+
+`task_settings`::
+(Optional, object)
+include::inference-shared.asciidoc[tag=task-settings]
++
+.`task_settings` for the `completion` task type
+[%collapsible%closed]
+=====
+
+`max_new_tokens`:::
+(Optional, integer)
+Provides a hint for the maximum number of output tokens to be generated.
+Defaults to 64.
+
+`temperature`:::
+(Optional, float)
+A number in the range of 0.0 to 1.0 that specifies the sampling temperature to use that controls the apparent creativity of generated completions.
+Should not be used if `top_p` or `top_k` is specified.
+
+`top_p`:::
+(Optional, float)
+A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
+Should not be used if `temperature` or `top_k` is specified.
+
+`top_p`:::
+(Optional, float)
+Only available for `anthropic`, `cohere`, and `mistral` providers.
+A number in the range of 0.0 to 1.0 that is an alternative value to temperature or top_p that causes the model to consider the results of the tokens with nucleus sampling probability.
+Should not be used if `temperature` or `top_p` is specified.
+
+=====
++
+.`task_settings` for the `text_embedding` task type
+[%collapsible%closed]
+=====
+
+There are no `task_settings` available for the `text_embedding` task type.
+
+[discrete]
+[[inference-example-amazonbedrock]]
+==== Amazon Bedrock service example
+
+The following example shows how to create an {infer} endpoint called `amazon_bedrock_embeddings` to perform a `text_embedding` task type.
+
+The list of chat completion and embeddings models that you can choose from should be a https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base model] you have access to.
+
+[source,console]
+------------------------------------------------------------
+PUT _inference/text_embedding/amazon_bedrock_embeddings
+{
+    "service": "amazonbedrock",
+    "service_settings": {
+        "access_key": "<aws_access_key>",
+        "secret_key": "<aws_secret_key>",
+        "region": "us-east-1",
+        "provider": "amazontitan",
+        "model": "amazon.titan-embed-text-v2:0",
+    }
+}
+------------------------------------------------------------
+// TEST[skip:TBD]
+
+The next example shows how to create an {infer} endpoint called `amazon_bedrock_completion` to perform a `completion` task type.
+
+[source,console]
+------------------------------------------------------------
+PUT _inference/completion/amazon_bedrock_completion
+{
+    "service": "amazonbedrock",
+    "service_settings": {
+        "access_key": "<aws_access_key>",
+        "secret_key": "<aws_secret_key>",
+        "region": "us-east-1",
+        "provider": "amazontitan",
+        "model": "amazon.titan-text-premier-v1:0",
+    }
+}
+------------------------------------------------------------
+// TEST[skip:TBD]

From ca82238ccaa27a0f7ebc9ac695628e87fabe7090 Mon Sep 17 00:00:00 2001
From: "Mark J. Hoy"
Date: Mon, 8 Jul 2024 11:24:03 -0400
Subject: [PATCH 2/5] fix example errors

---
 docs/reference/inference/service-amazon-bedrock.asciidoc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/reference/inference/service-amazon-bedrock.asciidoc b/docs/reference/inference/service-amazon-bedrock.asciidoc
index 787e1fcd635de..c81d6cc7912de 100644
--- a/docs/reference/inference/service-amazon-bedrock.asciidoc
+++ b/docs/reference/inference/service-amazon-bedrock.asciidoc
@@ -148,7 +148,7 @@ PUT _inference/text_embedding/amazon_bedrock_embeddings
         "secret_key": "<aws_secret_key>",
         "region": "us-east-1",
         "provider": "amazontitan",
-        "model": "amazon.titan-embed-text-v2:0",
+        "model": "amazon.titan-embed-text-v2:0"
     }
 }
 ------------------------------------------------------------
@@ -166,7 +166,7 @@ PUT _inference/completion/amazon_bedrock_completion
         "secret_key": "<aws_secret_key>",
         "region": "us-east-1",
         "provider": "amazontitan",
-        "model": "amazon.titan-text-premier-v1:0",
+        "model": "amazon.titan-text-premier-v1:0"
     }
 }
 ------------------------------------------------------------

From 9fb47751620e240bcd947933985de504ec900679 Mon Sep 17 00:00:00 2001
From: "Mark J. Hoy"
Hoy" Date: Mon, 8 Jul 2024 12:36:19 -0400 Subject: [PATCH 3/5] update semantic search tutorial; add changelog --- docs/changelog/110248.yaml | 5 ++ .../inference/service-amazon-bedrock.asciidoc | 8 +-- .../semantic-search-inference.asciidoc | 1 + .../infer-api-ingest-pipeline-widget.asciidoc | 17 +++++ .../infer-api-ingest-pipeline.asciidoc | 26 ++++++++ .../infer-api-mapping-widget.asciidoc | 17 +++++ .../inference-api/infer-api-mapping.asciidoc | 35 ++++++++++ .../infer-api-reindex-widget.asciidoc | 17 +++++ .../inference-api/infer-api-reindex.asciidoc | 23 +++++++ .../infer-api-requirements-widget.asciidoc | 17 +++++ .../infer-api-requirements.asciidoc | 6 ++ .../infer-api-search-widget.asciidoc | 17 +++++ .../inference-api/infer-api-search.asciidoc | 65 +++++++++++++++++++ .../infer-api-task-widget.asciidoc | 17 +++++ .../inference-api/infer-api-task.asciidoc | 26 ++++++++ 15 files changed, 293 insertions(+), 4 deletions(-) create mode 100644 docs/changelog/110248.yaml diff --git a/docs/changelog/110248.yaml b/docs/changelog/110248.yaml new file mode 100644 index 0000000000000..85739528b69c6 --- /dev/null +++ b/docs/changelog/110248.yaml @@ -0,0 +1,5 @@ +pr: 110248 +summary: "[Inference API] Add Amazon Bedrock Support to Inference API" +area: Machine Learning +type: enhancement +issues: [ ] diff --git a/docs/reference/inference/service-amazon-bedrock.asciidoc b/docs/reference/inference/service-amazon-bedrock.asciidoc index c81d6cc7912de..4cbc3d864350e 100644 --- a/docs/reference/inference/service-amazon-bedrock.asciidoc +++ b/docs/reference/inference/service-amazon-bedrock.asciidoc @@ -4,13 +4,13 @@ Creates an {infer} endpoint to perform an {infer} task with the `amazonbedrock` service. [discrete] -[[infer-service-azure-ai-studio-api-request]] +[[infer-service-amazon-bedrock-api-request]] ==== {api-request-title} `PUT /_inference//` [discrete] -[[infer-service-azure-ai-studio-api-path-params]] +[[infer-service-amazon-bedrock-api-path-params]] ==== {api-path-parms-title} ``:: @@ -85,8 +85,8 @@ The list of available regions per model can be found in the https://docs.aws.ama `rate_limit`::: (Optional, object) -By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`. -This helps to minimize the number of rate limit errors returned from Azure AI Studio. +By default, the `amazonbedrock` service sets the number of requests allowed per minute to `240`. +This helps to minimize the number of rate limit errors returned from Amazon Bedrock. To modify this, set the `requests_per_minute` setting of this object in your service settings: + -- diff --git a/docs/reference/search/search-your-data/semantic-search-inference.asciidoc b/docs/reference/search/search-your-data/semantic-search-inference.asciidoc index 6ecfea0a02dbc..f89181595ae3c 100644 --- a/docs/reference/search/search-your-data/semantic-search-inference.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search-inference.asciidoc @@ -17,6 +17,7 @@ For a list of supported models available on HuggingFace, refer to Azure based examples use models available through https://ai.azure.com/explore/models?selectedTask=embeddings[Azure AI Studio] or https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models[Azure OpenAI]. Mistral examples use the `mistral-embed` model from https://docs.mistral.ai/getting-started/models/[the Mistral API]. 
+Amazon Bedrock examples use the `amazon.titan-embed-text-v1` model fom https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[the Amazon Bedrock base models].
 
 Click the name of the service you want to use on any of the widgets below to review the corresponding instructions.
 
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline-widget.asciidoc
index c8a42c4d0585a..6039d1de5345b 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline-widget.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline-widget.asciidoc
@@ -37,6 +37,12 @@
         id="infer-api-ingest-mistral">
   Mistral
 </button>
+<button role="tab"
+        aria-selected="false"
+        aria-controls="infer-api-ingest-amazon-bedrock-tab"
+        id="infer-api-ingest-amazon-bedrock">
+  Amazon Bedrock
+</button>
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline.asciidoc
index a239c79e5a6d1..f95c4a6dbc8c8 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline.asciidoc
@@ -164,3 +164,29 @@ PUT _ingest/pipeline/mistral_embeddings
 and the `output_field` that will contain the {infer} results.
 
 // end::mistral[]
+
+// tag::amazon-bedrock[]
+
+[source,console]
+--------------------------------------------------
+PUT _ingest/pipeline/amazon_bedrock_embeddings
+{
+  "processors": [
+    {
+      "inference": {
+        "model_id": "amazon_bedrock_embeddings", <1>
+        "input_output": { <2>
+          "input_field": "content",
+          "output_field": "content_embedding"
+        }
+      }
+    }
+  ]
+}
+--------------------------------------------------
+<1> The name of the inference endpoint you created by using the
+<>, it's referred to as `inference_id` in that step.
+<2> Configuration object that defines the `input_field` for the {infer} process
+and the `output_field` that will contain the {infer} results.
+
+// end::amazon-bedrock[]
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-mapping-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-mapping-widget.asciidoc
index 80c7c7ef23ee3..66b0cde549545 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-mapping-widget.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-mapping-widget.asciidoc
@@ -37,6 +37,12 @@
         id="infer-api-mapping-mistral">
   Mistral
 </button>
+<button role="tab"
+        aria-selected="false"
+        aria-controls="infer-api-mapping-amazon-bedrock-tab"
+        id="infer-api-mapping-amazon-bedrock">
+  Amazon Bedrock
+</button>
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-mapping.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-mapping.asciidoc
index a1bce38a02ad2..72c648e63871d 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-mapping.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-mapping.asciidoc
@@ -207,3 +207,38 @@ the {infer} pipeline configuration in the next step.
 <6> The field type which is text in this example.
 
 // end::mistral[]
+
+// tag::amazon-bedrock[]
+
+[source,console]
+--------------------------------------------------
+PUT amazon-bedrock-embeddings
+{
+  "mappings": {
+    "properties": {
+      "content_embedding": { <1>
+        "type": "dense_vector", <2>
+        "dims": 1024, <3>
+        "element_type": "float",
+        "similarity": "dot_product" <4>
+      },
+      "content": { <5>
+        "type": "text" <6>
+      }
+    }
+  }
+}
+--------------------------------------------------
+<1> The name of the field to contain the generated tokens. It must be referenced
+in the {infer} pipeline configuration in the next step.
+<2> The field to contain the tokens is a `dense_vector` field.
+<3> The output dimensions of the model. This value may be different depending on the underlying model used.
+See the https://docs.aws.amazon.com/bedrock/latest/userguide/titan-multiemb-models.html[Amazon Titan model] or the https://docs.cohere.com/reference/embed[Cohere Embeddings model] documentation.
+<4> For Amazon Bedrock embeddings, the `dot_product` function should be used to
+calculate similarity for Amazon Titan models, or `cosine` for Cohere models.
+<5> The name of the field from which to create the dense vector representation.
+In this example, the name of the field is `content`. It must be referenced in
+the {infer} pipeline configuration in the next step.
+<6> The field type which is text in this example.
+
+// end::amazon-bedrock[]
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-reindex-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-reindex-widget.asciidoc
index 4face6a105819..9a8028e2b3c6c 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-reindex-widget.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-reindex-widget.asciidoc
@@ -37,6 +37,12 @@
         id="infer-api-reindex-mistral">
   Mistral
 </button>
+<button role="tab"
+        aria-selected="false"
+        aria-controls="infer-api-reindex-amazon-bedrock-tab"
+        id="infer-api-reindex-amazon-bedrock">
+  Amazon Bedrock
+</button>
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-reindex.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-reindex.asciidoc
index 927e47ea4d67c..995189f1309aa 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-reindex.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-reindex.asciidoc
@@ -154,3 +154,26 @@ number makes the update of the reindexing process quicker which enables you to
 follow the progress closely and detect errors early.
 
 // end::mistral[]
+
+// tag::amazon-bedrock[]
+
+[source,console]
+----
+POST _reindex?wait_for_completion=false
+{
+  "source": {
+    "index": "test-data",
+    "size": 50 <1>
+  },
+  "dest": {
+    "index": "amazon-bedrock-embeddings",
+    "pipeline": "amazon_bedrock_embeddings"
+  }
+}
+----
+// TEST[skip:TBD]
+<1> The default batch size for reindexing is 1000. Reducing `size` to a smaller
+number makes the update of the reindexing process quicker which enables you to
+follow the progress closely and detect errors early.
+
+// end::amazon-bedrock[]
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-requirements-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-requirements-widget.asciidoc
index 9981eb90d4929..cf2e4994279d9 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-requirements-widget.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-requirements-widget.asciidoc
@@ -37,6 +37,12 @@
         id="infer-api-requirements-mistral">
   Mistral
 </button>
+<button role="tab"
+        aria-selected="false"
+        aria-controls="infer-api-requirements-amazon-bedrock-tab"
+        id="infer-api-requirements-amazon-bedrock">
+  Amazon Bedrock
+</button>
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-requirements.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-requirements.asciidoc
index 435e53bbc0bc0..856e4d5f0fe47 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-requirements.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-requirements.asciidoc
@@ -39,3 +39,9 @@ You can apply for access to Azure OpenAI by completing the form at https://aka.m
 * An API key generated for your account
 
 // end::mistral[]
+
+// tag::amazon-bedrock[]
+* An AWS Account with https://aws.amazon.com/bedrock/[Amazon Bedrock] access
+* A pair of access and secret keys used to access Amazon Bedrock
+
+// end::amazon-bedrock[]
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-search-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-search-widget.asciidoc
index 6a67b28f91601..52cf65c4a1509 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-search-widget.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-search-widget.asciidoc
@@ -37,6 +37,12 @@
         id="infer-api-search-mistral">
   Mistral
 </button>
+<button role="tab"
+        aria-selected="false"
+        aria-controls="infer-api-search-amazon-bedrock-tab"
+        id="infer-api-search-amazon-bedrock">
+  Amazon Bedrock
+</button>
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-search.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-search.asciidoc
index 523c2301e75ff..5e23afeb19a9f 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-search.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-search.asciidoc
@@ -405,3 +405,68 @@ query from the `mistral-embeddings` index sorted by their proximity to the query
 // NOTCONSOLE
 
 // end::mistral[]
+
+// tag::amazon-bedrock[]
+
+[source,console]
+--------------------------------------------------
+GET amazon-bedrock-embeddings/_search
+{
+  "knn": {
+    "field": "content_embedding",
+    "query_vector_builder": {
+      "text_embedding": {
+        "model_id": "amazon_bedrock_embeddings",
+        "model_text": "Calculate fuel cost"
+      }
+    },
+    "k": 10,
+    "num_candidates": 100
+  },
+  "_source": [
+    "id",
+    "content"
+  ]
+}
+--------------------------------------------------
+// TEST[skip:TBD]
+
+As a result, you receive the top 10 documents that are closest in meaning to the
+query from the `amazon-bedrock-embeddings` index sorted by their proximity to the query:
+
+[source,console-result]
+--------------------------------------------------
+"hits": [
+  {
+    "_index": "amazon-bedrock-embeddings",
+    "_id": "DDd5OowBHxQKHyc3TDSC",
+    "_score": 0.83704096,
+    "_source": {
+      "id": 862114,
+      "body": "How to calculate fuel cost for a road trip. By Tara Baukus Mello • Bankrate.com. Dear Driving for Dollars, My family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost.It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes.y family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost. It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes."
+    }
+  },
+  {
+    "_index": "amazon-bedrock-embeddings",
+    "_id": "ajd5OowBHxQKHyc3TDSC",
+    "_score": 0.8345704,
+    "_source": {
+      "id": 820622,
+      "body": "Home Heating Calculator. Typically, approximately 50% of the energy consumed in a home annually is for space heating. When deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important.This calculator can help you estimate the cost of fuel for different heating appliances.hen deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important. This calculator can help you estimate the cost of fuel for different heating appliances."
+    }
+  },
+  {
+    "_index": "amazon-bedrock-embeddings",
+    "_id": "Djd5OowBHxQKHyc3TDSC",
+    "_score": 0.8327426,
+    "_source": {
+      "id": 8202683,
+      "body": "Fuel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel.If you are paying $4 per gallon, the trip would cost you $200.Most boats have much larger gas tanks than cars.uel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel."
+    }
+  },
+  (...)
+ ] +-------------------------------------------------- +// NOTCONSOLE + +// end::amazon-bedrock[] diff --git a/docs/reference/tab-widgets/inference-api/infer-api-task-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-task-widget.asciidoc index 1f3ad645d7c29..d13301b64a871 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-task-widget.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-task-widget.asciidoc @@ -37,6 +37,12 @@ id="infer-api-task-mistral"> Mistral +
+<button role="tab"
+        aria-selected="false"
+        aria-controls="infer-api-task-amazon-bedrock-tab"
+        id="infer-api-task-amazon-bedrock">
+  Amazon Bedrock
+</button>
diff --git a/docs/reference/tab-widgets/inference-api/infer-api-task.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-task.asciidoc
index 18fa3ba541bff..c6ef2a46a8731 100644
--- a/docs/reference/tab-widgets/inference-api/infer-api-task.asciidoc
+++ b/docs/reference/tab-widgets/inference-api/infer-api-task.asciidoc
@@ -177,3 +177,29 @@ PUT _inference/text_embedding/mistral_embeddings <1>
 <3> The Mistral embeddings model name, for example `mistral-embed`.
 
 // end::mistral[]
+
+// tag::amazon-bedrock[]
+
+[source,console]
+------------------------------------------------------------
+PUT _inference/text_embedding/amazon_bedrock_embeddings <1>
+{
+    "service": "amazonbedrock",
+    "service_settings": {
+        "access_key": "<aws_access_key>", <2>
+        "secret_key": "<aws_secret_key>", <3>
+        "region": "<region>", <4>
+        "provider": "<provider>", <5>
+        "model": "<model_id>" <6>
+    }
+}
+------------------------------------------------------------
+// TEST[skip:TBD]
+<1> The task type is `text_embedding` in the path and the `inference_id` which is the unique identifier of the {infer} endpoint is `amazon_bedrock_embeddings`.
+<2> The access key can be found on your AWS IAM management page for the user account to access Amazon Bedrock.
+<3> The secret key should be the paired key for the specified access key.
+<4> Specify the region that your model is hosted in.
+<5> Specify the model provider.
+<6> The model ID or ARN of the model to use.
+
+// end::amazon-bedrock[]

From 1e7f832b0675075d60337f97d56b3cb5f00a0dca Mon Sep 17 00:00:00 2001
From: "Mark J. Hoy"
Date: Mon, 8 Jul 2024 13:21:47 -0400
Subject: [PATCH 4/5] fix typo

---
 .../search/search-your-data/semantic-search-inference.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/search/search-your-data/semantic-search-inference.asciidoc b/docs/reference/search/search-your-data/semantic-search-inference.asciidoc
index f89181595ae3c..ae27b46d4b876 100644
--- a/docs/reference/search/search-your-data/semantic-search-inference.asciidoc
+++ b/docs/reference/search/search-your-data/semantic-search-inference.asciidoc
@@ -17,7 +17,7 @@ For a list of supported models available on HuggingFace, refer to
 Azure based examples use models available through https://ai.azure.com/explore/models?selectedTask=embeddings[Azure AI Studio]
 or https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models[Azure OpenAI].
 Mistral examples use the `mistral-embed` model from https://docs.mistral.ai/getting-started/models/[the Mistral API].
-Amazon Bedrock examples use the `amazon.titan-embed-text-v1` model fom https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[the Amazon Bedrock base models].
+Amazon Bedrock examples use the `amazon.titan-embed-text-v1` model from https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[the Amazon Bedrock base models].

From 07afa0ee78b797d3d9f037627f833cc3c20a1276 Mon Sep 17 00:00:00 2001
From: "Mark J. Hoy"
Hoy" Date: Thu, 11 Jul 2024 13:06:17 -0400 Subject: [PATCH 5/5] fix error; accept suggestions --- .../inference/service-amazon-bedrock.asciidoc | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/docs/reference/inference/service-amazon-bedrock.asciidoc b/docs/reference/inference/service-amazon-bedrock.asciidoc index 4cbc3d864350e..4ffa368613a0e 100644 --- a/docs/reference/inference/service-amazon-bedrock.asciidoc +++ b/docs/reference/inference/service-amazon-bedrock.asciidoc @@ -103,24 +103,24 @@ include::inference-shared.asciidoc[tag=task-settings] `max_new_tokens`::: (Optional, integer) -Provides a hint for the maximum number of output tokens to be generated. +Sets the maximum number for the output tokens to be generated. Defaults to 64. `temperature`::: (Optional, float) -A number in the range of 0.0 to 1.0 that specifies the sampling temperature to use that controls the apparent creativity of generated completions. +A number between 0.0 and 1.0 that controls the apparent creativity of the results. At temperature 0.0 the model is most deterministic, at temperature 1.0 most random. Should not be used if `top_p` or `top_k` is specified. `top_p`::: (Optional, float) -A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability. -Should not be used if `temperature` or `top_k` is specified. +Alternative to `temperature`. A number in the range of 0.0 to 1.0, to eliminate low-probability tokens. Top-p uses nucleus sampling to select top tokens whose sum of likelihoods does not exceed a certain value, ensuring both variety and coherence. +Should not be used if `temperature` is specified. -`top_p`::: +`top_k`::: (Optional, float) Only available for `anthropic`, `cohere`, and `mistral` providers. -A number in the range of 0.0 to 1.0 that is an alternative value to temperature or top_p that causes the model to consider the results of the tokens with nucleus sampling probability. -Should not be used if `temperature` or `top_p` is specified. +Alternative to `temperature`. Limits samples to the top-K most likely words, balancing coherence and variability. +Should not be used if `temperature` is specified. ===== + @@ -130,13 +130,15 @@ Should not be used if `temperature` or `top_p` is specified. There are no `task_settings` available for the `text_embedding` task type. +===== + [discrete] [[inference-example-amazonbedrock]] ==== Amazon Bedrock service example The following example shows how to create an {infer} endpoint called `amazon_bedrock_embeddings` to perform a `text_embedding` task type. -The list of chat completion and embeddings models that you can choose from should be a https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base model] you have access to. +Choose chat completion and embeddings models that you have access to from the https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base models]. [source,console] ------------------------------------------------------------