Merge pull request #253 from jbyrne-redhat/DS-3011
DS-3011 Add upstream modules for authorization
jbyrne-redhat authored Apr 17, 2024
2 parents 037bf44 + 0d504e6 commit 74fdf72
Showing 4 changed files with 170 additions and 0 deletions.
@@ -0,0 +1,34 @@
:_module-type: PROCEDURE

[id="accessing-authorization-token-for-deployed-model_{context}"]
= Accessing the authorization token for a deployed model

[role='_abstract']
If you secured your model inference endpoint by enabling token authorization, you must know how to access your authorization token so that you can specify it in your inference requests.

.Prerequisites
* You have logged in to {productname-long}.
ifndef::upstream[]
* If you are using specialized {productname-short} groups, you are part of the user group or admin group (for example, {oai-user-group} or {oai-admin-group}) in OpenShift.
endif::[]
ifdef::upstream[]
* If you are using specialized {productname-short} groups, you are part of the user group or admin group (for example, {odh-user-group} or {odh-admin-group}) in OpenShift.
endif::[]
* You have deployed a model by using the single-model serving platform.

.Procedure

. From the {productname-short} dashboard, click *Data Science Projects*.
+
The *Data Science Projects* page opens.
. Click the name of the project that contains your deployed model.
+
A project details page opens.
. Click the *Models* tab.
. In the *Models and model servers* list, expand the section for your model.
+
Your authorization token is shown in the *Token authorization* section, in the *Token secret* field.
. Optional: To copy the authorization token for use in an inference request, click the *Copy* button (image:images/osd-copy.png[]) next to the token value.
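
Alternatively, if you have command-line access to the cluster, you can retrieve the same token value with the OpenShift CLI. The following is a minimal sketch; the secret name and the `token` data key are assumptions about how the service account token is stored in your project:

[source]
----
oc get secret <token_secret_name> -n <project_name> \
  -o jsonpath='{.data.token}' | base64 --decode
----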

// [role='_additional-resources']
// .Additional resources
@@ -0,0 +1,120 @@
:_module-type: PROCEDURE

[id="accessing-inference-endpoint-for-deployed-model_{context}"]
= Accessing the inference endpoint for a deployed model

[role='_abstract']
To make inference requests to your deployed model, you must know how to access its inference endpoint.

.Prerequisites
* You have logged in to {productname-long}.
ifndef::upstream[]
* If you are using specialized {productname-short} groups, you are part of the user group or admin group (for example, {oai-user-group} or {oai-admin-group}) in OpenShift.
endif::[]
ifdef::upstream[]
* If you are using specialized {productname-short} groups, you are part of the user group or admin group (for example, {odh-user-group} or {odh-admin-group}) in OpenShift.
endif::[]
* You have deployed a model by using the single-model serving platform.
* If you enabled token authorization for your deployed model, you have the associated token value.

.Procedure
. From the {productname-short} dashboard, click *Data Science Projects*.
+
The *Data Science Projects* page opens.
. Click the name of the project that contains your deployed model.
+
A project details page opens.
. Click the *Models* tab.
. In the *Models and model servers* list, expand the section for your model.
+
The inference endpoint for the model is shown in the *Inference endpoint* field.
. Depending on the action that you want to perform with the model (and whether the model supports that action), copy the inference endpoint shown and then add one of the following paths to the end of the URL:
+
--
*Caikit TGIS ServingRuntime for KServe*

* `:443/api/v1/task/text-generation`
* `:443/api/v1/task/server-streaming-text-generation`
// * `:443/api/v1/task/text-classification`
// * `:443/api/v1/task/token-classification`

*TGIS Standalone ServingRuntime for KServe*

* `:443 fmaas.GenerationService/Generate`
* `:443 fmaas.GenerationService/GenerateStream`
+
NOTE: To query the endpoint for the TGIS standalone runtime, you must also download the files in the `proto` directory of the IBM link:https://github.com/IBM/text-generation-inference[text-generation-inference^] repository, as shown in the example that follows this list.

*OpenVINO Model Server*

* `/v2/models/<model_name>/infer`

As indicated by the paths shown, the single-model serving platform uses the HTTPS port of your OpenShift router (usually port 443) to serve external API requests.
--
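+
One way to obtain the `proto` directory that the preceding note refers to is to clone the repository:
+
[source]
----
git clone https://github.com/IBM/text-generation-inference.git
----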

. Use the endpoint to make API requests to your deployed model, as shown in the following example commands.

ifdef::upstream[]
+
--
*Caikit TGIS ServingRuntime for KServe*
[source,subs="+quotes"]
----
curl --json '{"model_id": "<model_name>", "inputs": "<text>"}' \
https://<inference_endpoint_url>:443/api/v1/task/server-streaming-text-generation \
-H 'Authorization: Bearer __<token>__' <1>
----
<1> You need to add the HTTP `Authorization` header and specify a token value _only_ if you added Authorino as an authorization provider to secure your model inference endpoint.

*TGIS Standalone ServingRuntime for KServe*
[source,subs="+quotes"]
----
grpcurl -proto text-generation-inference/proto/generation.proto -d \
'{"requests": [{"text":"<text>"}]}' \
-H 'mm-model-id: <model_name>' -H 'Authorization: Bearer __<token>__' \
-insecure <inference_endpoint_url>:443 fmaas.GenerationService/Generate <1>
----
<1> You need to add the HTTP `Authorization` header and specify a token value _only_ if you added Authorino as an authorization provider to secure your model inference endpoint.

*OpenVINO Model Server*
[source,subs="+quotes"]
----
curl -ks <inference_endpoint_url>/v2/models/<model_name>/infer -d \
'{ "model_name": "<model_name>", \
"inputs": [{ "name": "<name_of_model_input>", "shape": [<shape>], "datatype": "<data_type>", "data": [<data>] }]}' \
-H 'Authorization: Bearer __<token>__' <1>
----
<1> You need to add the HTTP `Authorization` header and specify a token value _only_ if you added Authorino as an authorization provider to secure your model inference endpoint.
--
endif::[]
ifdef::self-managed,cloud-service[]
+
--
*Caikit TGIS ServingRuntime for KServe*
[source,subs="+quotes"]
----
curl --json '{"model_id": "<model_name>", "inputs": "<text>"}' https://<inference_endpoint_url>:443/api/v1/task/server-streaming-text-generation -H 'Authorization: Bearer __<token>__' <1>
----
<1> You need to add the HTTP `Authorization` header and specify a token value _only_ if you added Authorino as an authorization provider to secure your model inference endpoint.

*TGIS Standalone ServingRuntime for KServe*
[source,subs="+quotes"]
----
grpcurl -proto text-generation-inference/proto/generation.proto -d '{"requests": [{"text":"<text>"}]}' -H 'mm-model-id: <model_name>' -H 'Authorization: Bearer __<token>__' -insecure <inference_endpoint_url>:443 fmaas.GenerationService/Generate <1>
----
<1> You need to add the HTTP `Authorization` header and specify a token value _only_ if you added Authorino as an authorization provider to secure your model inference endpoint.

*OpenVINO Model Server*
[source,subs="+quotes"]
----
curl -ks <inference_endpoint_url>/v2/models/<model_name>/infer -d '{ "model_name": "<model_name>", "inputs": [{ "name": "<name_of_model_input>", "shape": [<shape>], "datatype": "<data_type>", "data": [<data>] }]}' -H 'Authorization: Bearer __<token>__' <1>
----
<1> You need to add the HTTP `Authorization` header and specify a token value _only_ if you added Authorino as an authorization provider to secure your model inference endpoint.
--
endif::[]
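+
As an illustration only, the following sketch shows the OpenVINO request with the placeholders filled in for a hypothetical model named `my-model` that takes a single FP32 input named `input_1` with shape `[1, 4]`. The model name, input name, shape, and data values are assumptions for the example, not values from your deployment. Omit the `Authorization` header if you did not enable token authorization:
+
[source]
----
curl -ks <inference_endpoint_url>/v2/models/my-model/infer -d \
'{ "model_name": "my-model",
"inputs": [{ "name": "input_1", "shape": [1, 4], "datatype": "FP32", "data": [0.1, 0.2, 0.3, 0.4] }]}' \
-H 'Authorization: Bearer <token>'
----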

[role='_additional-resources']
.Additional resources
* link:https://github.com/IBM/text-generation-inference[Text Generation Inference Server (TGIS)^]
* link:https://caikit.readthedocs.io/en/latest/autoapi/caikit/index.html[Caikit API documentation^]
* link:https://docs.openvino.ai/2023.3/ovms_docs_rest_api_kfs.html[OpenVINO KServe-compatible REST API documentation^]
@@ -52,6 +52,9 @@ The *Deploy model* dialog opens.
.. From the *Model framework* list, select a value.
.. In the *Number of model replicas to deploy* field, specify a value.
.. From the *Model server size* list, select a value.
.. To require token authorization for inference requests to the deployed model, perform the following actions:
... Select *Require token authorization*.
... In the *Service account name* field, enter the service account name that the token will be generated for.
.. To specify the location of your model, perform one of the following sets of actions:
+
--
@@ -0,0 +1,13 @@
:_module-type: CONCEPT

[id="making-inference-requests-to-models-deployed-on-single-model-serving-platform_{context}"]
= Making inference requests to models deployed on the single-model serving platform

[role='_abstract']
When you deploy a model by using the single-model serving platform, the model is available as a service that you can access by using API requests. This enables you to submit data inputs and receive predictions. To use API requests to interact with your deployed model, you must know the inference endpoint for the model.

In addition, if you secured your inference endpoint by enabling token authorization, you must know how to access your authorization token so that you can specify it in your inference requests.
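
For example, an inference request generally takes the following form, where the URL comes from the *Inference endpoint* field for your deployment, the path and request body depend on the serving runtime, and the `Authorization` header is required only if you enabled token authorization. This is a sketch with placeholder values rather than a runnable command for a specific runtime:

[source]
----
curl https://<inference_endpoint_url>:443/<runtime_specific_path> \
  -d '<request_body_for_your_runtime>' \
  -H 'Authorization: Bearer <token>'
----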

// [role='_additional-resources']
// .Additional resources
