[Inference API] Add Docs for Amazon Bedrock Support for the Inference API #110594
Conversation
Documentation preview:
Just highlighting some Azure AI Studio references. (Sorry, I just saw that this is a draft, but I'm leaving the comments here so we don't forget them :) )
Creates an {infer} endpoint to perform an {infer} task with the `amazonbedrock` service.

[discrete]
[[infer-service-azure-ai-studio-api-request]]
Suggested change:
- [[infer-service-azure-ai-studio-api-request]]
+ [[infer-service-amazon-bedrock-api-request]]
`PUT /_inference/<task_type>/<inference_id>`

[discrete]
[[infer-service-azure-ai-studio-api-path-params]]
Suggested change:
- [[infer-service-azure-ai-studio-api-path-params]]
+ [[infer-service-amazon-bedrock-path-params]]
`rate_limit`:::
(Optional, object)
By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
Suggested change:
- By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
+ By default, the `amazonbedrock` service sets the number of requests allowed per minute to `240`.
`rate_limit`:::
(Optional, object)
By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
This helps to minimize the number of rate limit errors returned from Azure AI Studio.
Suggested change:
- This helps to minimize the number of rate limit errors returned from Azure AI Studio.
+ This helps to minimize the number of rate limit errors returned from Amazon Bedrock.
Argh - great catches ;) That's what I get for copy / pasting
@elasticmachine run docs build
Pinging @elastic/es-docs (Team:Docs)
+
.`task_settings` for the `text_embedding` task type
[%collapsible%closed]
=====
@markjhoy I think this unclosed `====` block might be breaking your build :)
Ah thanks! I could not figure out for the life of me where that error was coming from!
This is looking good. I found a few minor errors and suggested some rephrasings. I've also hopefully identified the formatting issue that's failing the docs build. Once these updates are made and we can preview the formatting for tabs, this will be ready for final review!
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Should not be used if `temperature` or `top_k` is specified.

`top_p`:::
Suggested change:
- `top_p`:::
+ `top_k`:::

Assuming the first top_p is the correct one 😉
`max_new_tokens`:::
(Optional, integer)
Provides a hint for the maximum number of output tokens to be generated.
Suggested change:
- Provides a hint for the maximum number of output tokens to be generated.
+ Sets a maximum number for the output tokens to be generated.

Not sure what "hint" means here; the rewording tries to clarify.
`temperature`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that specifies the sampling temperature to use that controls the apparent creativity of generated completions.
Suggested change:
- A number in the range of 0.0 to 1.0 that specifies the sampling temperature to use that controls the apparent creativity of generated completions.
+ A number between 0.0 and 1.0 that controls the apparent creativity of the results. At temperature 0.0 the model is most deterministic, at temperature 1.0 most random.
`top_p`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Suggested change:
- A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
+ Alternative to `temperature`. A number in the range of 0.0 to 1.0, to eliminate low-probability tokens. Top-p uses nucleus sampling to select top tokens whose sum of likelihoods does not exceed a certain value, ensuring both variety and coherence.
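The nucleus-sampling behavior this suggestion describes can be sketched in a few lines (a generic illustration of top-p filtering, not Bedrock's implementation; the probability values are made up):

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p (nucleus sampling); zero out the rest and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    filtered = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

probs = [0.5, 0.25, 0.15, 0.10]
nucleus = top_p_filter(probs, 0.7)  # keeps the top two tokens (0.5 + 0.25 >= 0.7)
```

Low-probability tail tokens are eliminated, while the surviving head of the distribution keeps its relative proportions, which is the "variety and coherence" trade-off the suggested wording mentions.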
`top_p`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Should not be used if `temperature` or `top_k` is specified.
Reading around it looks like top-p and top-k can be used in combination?
FYI - you're correct here... theoretically, you can use all three, but you shouldn't use `temperature` and `top_p` at the same time. For reference, see the parameters in Amazon's Anthropic docs.
`top_p`:::
(Optional, float)
Only available for `anthropic`, `cohere`, and `mistral` providers.
A number in the range of 0.0 to 1.0 that is an alternative value to temperature or top_p that causes the model to consider the results of the tokens with nucleus sampling probability.
Suggested change:
- A number in the range of 0.0 to 1.0 that is an alternative value to temperature or top_p that causes the model to consider the results of the tokens with nucleus sampling probability.
+ Alternative to `temperature`. Limits samples to the top-K most likely words, balancing coherence and variability.
+ A number in the range of 0.0 to 1.0.
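For contrast with top-p, top-k keeps a fixed count of candidate tokens rather than a probability mass. A minimal sketch of the generic technique (note: `k` is an integer here, which is an assumption on my part, since most top-k implementations take a token count even though the quoted text gives a 0.0 to 1.0 range):

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens, zero the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set(order[:k])
    filtered = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

probs = [0.5, 0.25, 0.15, 0.10]
limited = top_k_filter(probs, 2)  # only the two most likely tokens survive
```

This is why the two settings compose in theory: top-k first caps the candidate count, and top-p (or temperature) then shapes sampling within that set, as discussed above.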
The following example shows how to create an {infer} endpoint called `amazon_bedrock_embeddings` to perform a `text_embedding` task type.

The list of chat completion and embeddings models that you can choose from should be a https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base model] you have access to.
Suggested change:
- The list of chat completion and embeddings models that you can choose from should be a https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base model] you have access to.
+ Choose chat completion and embeddings models you have access to from the https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base models].

nit: keep sentence short
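Given the `PUT /_inference/<task_type>/<inference_id>` form quoted earlier, the `amazon_bedrock_embeddings` example the docs refer to would look roughly like this (a hedged sketch only; the `service_settings` field names, region, provider, and model ID shown here are assumptions, not confirmed in this thread):

```
PUT _inference/text_embedding/amazon_bedrock_embeddings
{
    "service": "amazonbedrock",
    "service_settings": {
        "access_key": "<aws-access-key>",
        "secret_key": "<aws-secret-key>",
        "region": "us-east-1",
        "provider": "amazontitan",
        "model": "amazon.titan-embed-text-v1"
    }
}
```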
LGTM from writing perspective!
… API (elastic#110594)
* Add Amazon Bedrock Inference API to docs
* fix example errors
* update semantic search tutorial; add changelog
* fix typo
* fix error; accept suggestions
💚 Backport successful
Add docs in support of Amazon Bedrock support in the Inference API: #110248