
[Inference API] Add Docs for Amazon Bedrock Support for the Inference API #110594

Merged

Conversation

markjhoy
Contributor

@markjhoy markjhoy commented Jul 8, 2024

Add docs in support of Amazon Bedrock support in the Inference API: #110248

Contributor

github-actions bot commented Jul 8, 2024

Documentation preview:

Contributor

@timgrein timgrein left a comment


Just highlighting some Azure AI Studio references. (Sorry, I just saw that this is a draft, but I'm leaving the comments here so we don't forget them :) )

Creates an {infer} endpoint to perform an {infer} task with the `amazonbedrock` service.

[discrete]
[[infer-service-azure-ai-studio-api-request]]
Contributor


Suggested change
[[infer-service-azure-ai-studio-api-request]]
[[infer-service-amazon-bedrock-api-request]]

`PUT /_inference/<task_type>/<inference_id>`
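For readers following along, the address pattern above can be sketched as follows. The task type and endpoint name here are illustrative placeholders, not values from this PR:

```python
# A reader's sketch of the PUT /_inference/<task_type>/<inference_id>
# pattern quoted above; both values below are hypothetical.
task_type = "text_embedding"          # Bedrock also supports "completion"
inference_id = "my-bedrock-endpoint"  # hypothetical endpoint name

url = f"/_inference/{task_type}/{inference_id}"
print("PUT", url)  # PUT /_inference/text_embedding/my-bedrock-endpoint
```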

[discrete]
[[infer-service-azure-ai-studio-api-path-params]]
Contributor


Suggested change
[[infer-service-azure-ai-studio-api-path-params]]
[[infer-service-amazon-bedrock-path-params]]


`rate_limit`:::
(Optional, object)
By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
Contributor


Suggested change
By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
By default, the `amazonbedrock` service sets the number of requests allowed per minute to `240`.

`rate_limit`:::
(Optional, object)
By default, the `azureaistudio` service sets the number of requests allowed per minute to `240`.
This helps to minimize the number of rate limit errors returned from Azure AI Studio.
Contributor


Suggested change
This helps to minimize the number of rate limit errors returned from Azure AI Studio.
This helps to minimize the number of rate limit errors returned from Amazon Bedrock.

Contributor Author


Argh - great catches ;) That's what I get for copy/pasting.
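As a concrete illustration of the `rate_limit` object discussed above: overriding the default of 240 requests per minute might look roughly like this. The placement of `rate_limit` inside `service_settings` and the `requests_per_minute` field name follow the pattern of other inference services and are assumptions here; the credential fields are placeholders, not confirmed by this PR:

```python
import json

# Sketch of service_settings with a rate_limit override (assumed shape).
service_settings = {
    "access_key": "<aws-access-key>",  # placeholder credential
    "secret_key": "<aws-secret-key>",  # placeholder credential
    "region": "us-east-1",
    "rate_limit": {"requests_per_minute": 120},  # default is 240 per the docs
}
print(json.dumps(service_settings["rate_limit"]))
```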

@markjhoy
Contributor Author

markjhoy commented Jul 8, 2024

@elasticmachine run docs build

@markjhoy marked this pull request as ready for review July 8, 2024 17:21
@elasticsearchmachine added the `needs:triage` label Jul 8, 2024
@markjhoy added the `>docs`, `>non-issue`, `Team:Docs`, `Team:ML`, `auto-backport-and-merge`, and `v8.15.0` labels Jul 8, 2024
@elasticsearchmachine removed the `Team:ML` and `needs:triage` labels Jul 8, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

+
.`task_settings` for the `text_embedding` task type
[%collapsible%closed]
=====
Contributor


@markjhoy I think this unclosed ==== block might be breaking your build :)

Contributor Author


Ah thanks! I could not figure out for the life of me where that error was coming from!

Contributor

@leemthompo leemthompo left a comment


This is looking good. I found a few minor errors and suggested some rephrasings. I also hopefully identified the formatting issue that's failing the docs build. Once these updates are made and we can preview the formatting for the tabs, this will be ready for final review!

A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Should not be used if `temperature` or `top_k` is specified.

`top_p`:::
Contributor


Suggested change
`top_p`:::
`top_k`:::

Assuming the first top_p is the correct one 😉


`max_new_tokens`:::
(Optional, integer)
Provides a hint for the maximum number of output tokens to be generated.
Contributor


Suggested change
Provides a hint for the maximum number of output tokens to be generated.
Sets a maximum number for the output tokens to be generated.

Not sure what "hint" means here, rewording tries to clarify
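To make the suggested wording concrete: `max_new_tokens` caps how many output tokens are generated. A toy sketch of the effect (this is a reader's aid with arbitrary values, not the server-side implementation):

```python
# Toy illustration of an output-token cap like max_new_tokens.
def cap_output(tokens, max_new_tokens=None):
    """Return at most max_new_tokens generated tokens."""
    if max_new_tokens is None:
        return tokens
    return tokens[:max_new_tokens]

generated = ["tok"] * 1000  # pretend model output
capped = cap_output(generated, max_new_tokens=64)
print(len(capped))  # 64
```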


`temperature`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that specifies the sampling temperature to use that controls the apparent creativity of generated completions.
Contributor


Suggested change
A number in the range of 0.0 to 1.0 that specifies the sampling temperature to use that controls the apparent creativity of generated completions.
A number between 0.0 and 1.0 that controls the apparent creativity of the results. At temperature 0.0 the model is most deterministic, at temperature 1.0 most random.


`top_p`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Contributor


Suggested change
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Alternative to `temperature`. A number in the range of 0.0 to 1.0, to eliminate low-probability tokens. Top-p uses nucleus sampling to select top tokens whose sum of likelihoods does not exceed a certain value, ensuring both variety and coherence.

`top_p`:::
(Optional, float)
A number in the range of 0.0 to 1.0 that is an alternative value to temperature that causes the model to consider the results of the tokens with nucleus sampling probability.
Should not be used if `temperature` or `top_k` is specified.
Contributor


Reading around it looks like top-p and top-k can be used in combination?

Contributor Author


FYI - you're correct here... theoretically you can use all three, but you shouldn't use `temperature` and `top_p` at the same time. For reference, see the parameters in Amazon's Anthropic docs.

`top_p`:::
(Optional, float)
Only available for `anthropic`, `cohere`, and `mistral` providers.
A number in the range of 0.0 to 1.0 that is an alternative value to temperature or top_p that causes the model to consider the results of the tokens with nucleus sampling probability.
Contributor


Suggested change
A number in the range of 0.0 to 1.0 that is an alternative value to temperature or top_p that causes the model to consider the results of the tokens with nucleus sampling probability.
Alternative to `temperature`. Limits samples to the top-K most likely words, balancing coherence and variability.
A number in the range of 0.0 to 1.0.
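Summing up this thread: `temperature`, `top_p`, and `top_k` are all optional, and per the author's note `temperature` and `top_p` should not be set together. A small validation sketch of those constraints as a reader's aid (ranges are taken from the snippets above; this is not the actual server-side validation):

```python
def validate_task_settings(settings):
    """Check the sampling constraints discussed in this thread."""
    # temperature and top_p are floats in [0.0, 1.0] per the docs above.
    for key in ("temperature", "top_p"):
        if key in settings and not 0.0 <= settings[key] <= 1.0:
            raise ValueError(f"{key} must be between 0.0 and 1.0")
    # Per the author's comment, don't combine temperature with top_p.
    if "temperature" in settings and "top_p" in settings:
        raise ValueError("temperature and top_p should not be combined")
    return settings

validate_task_settings({"temperature": 0.2, "top_k": 40})  # fine
try:
    validate_task_settings({"temperature": 0.2, "top_p": 0.9})
except ValueError as err:
    print(err)  # temperature and top_p should not be combined
```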


The following example shows how to create an {infer} endpoint called `amazon_bedrock_embeddings` to perform a `text_embedding` task type.

The list of chat completion and embeddings models that you can choose from should be a https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base model] you have access to.
Contributor

@leemthompo leemthompo Jul 11, 2024


Suggested change
The list of chat completion and embeddings models that you can choose from should be a https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base model] you have access to.
Choose chat completion and embeddings models you have access to from the https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html[Amazon Bedrock base models].

nit: keep sentence short
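Putting the pieces together, the `amazon_bedrock_embeddings` request mentioned above might be assembled like this. The model ID is an illustrative Bedrock base-model identifier (check the linked AWS page and your account's model access), and the field names are assumptions pieced together from this PR's snippets, not a confirmed request body:

```python
import json

# Hypothetical request for the text_embedding example discussed above.
endpoint = "/_inference/text_embedding/amazon_bedrock_embeddings"
body = {
    "service": "amazonbedrock",
    "service_settings": {
        "access_key": "<aws-access-key>",       # placeholder credential
        "secret_key": "<aws-secret-key>",       # placeholder credential
        "region": "us-east-1",
        "provider": "amazontitan",              # assumed provider value
        "model": "amazon.titan-embed-text-v1",  # illustrative base model ID
    },
}
print("PUT", endpoint)
print(json.dumps(body, indent=2))
```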

@markjhoy markjhoy requested a review from leemthompo July 11, 2024 17:06
Contributor

@leemthompo leemthompo left a comment


LGTM from writing perspective!

@markjhoy markjhoy merged commit 560d404 into elastic:main Jul 12, 2024
5 checks passed
markjhoy added a commit to markjhoy/elasticsearch that referenced this pull request Jul 12, 2024
… API (elastic#110594)

* Add Amazon Bedrock Inference API to docs

* fix example errors

* update semantic search tutorial; add changelog

* fix typo

* fix error; accept suggestions
@elasticsearchmachine
Collaborator

💚 Backport successful

Branch: 8.15

elasticsearchmachine pushed a commit that referenced this pull request Jul 12, 2024
… API (#110594) (#110832)

* Add Amazon Bedrock Inference API to docs

* fix example errors

* update semantic search tutorial; add changelog

* fix typo

* fix error; accept suggestions
Labels
`>docs` (General docs changes), `>non-issue`, `Team:Docs` (Meta label for docs team), `v8.15.0`, `v8.16.0`