diff --git a/assemblies/using-explainability.adoc b/assemblies/using-explainability.adoc new file mode 100644 index 00000000..479f916e --- /dev/null +++ b/assemblies/using-explainability.adoc @@ -0,0 +1,24 @@ +:_module-type: ASSEMBLY + +ifdef::context[:parent-context: {context}] + +:productname-long: Open Data Hub +:productname-short: Open Data Hub + +:context: explainers + +[id="using-explainability_{context}"] += Using explainability + +As a data scientist, you can learn how your machine learning model makes its predictions and decisions. You can use explainers from TrustyAI to provide saliency explanations for model inferences in {productname-long}. + +For information about the specific explainers, see link:{odhdocshome}/monitoring-data-science-models/#supported-explainers_explainers[Supported explainers]. + +include::modules/requesting-a-lime-explanation.adoc[leveloffset=+1] + +include::modules/requesting-a-shap-explanation.adoc[leveloffset=+1] + +include::modules/supported-explainers.adoc[leveloffset=+1] + +ifdef::parent-context[:context: {parent-context}] +ifndef::parent-context[:!context:] diff --git a/modules/requesting-a-lime-explanation-using-cli.adoc b/modules/requesting-a-lime-explanation-using-cli.adoc new file mode 100644 index 00000000..0d2b910b --- /dev/null +++ b/modules/requesting-a-lime-explanation-using-cli.adoc @@ -0,0 +1,147 @@ +:_module-type: PROCEDURE + +[id='requesting-a-lime-explanation-using-CLI_{context}'] += Requesting a LIME explanation by using the CLI + +[role='_abstract'] +You can use the OpenShift command-line interface (CLI) to request a LIME explanation. + +.Prerequisites + +* Your OpenShift cluster administrator added you as a user to the {openshift-platform} cluster and has installed the TrustyAI service for the data science project that contains the deployed models. + +* You authenticated the TrustyAI service, as described in link:{odhdocshome}/monitoring-data-science-models/#authenticating-trustyai-service_monitor[Authenticating the TrustyAI service]. + +* You have real-world data from the deployed models. + +ifdef::upstream,self-managed[] +* You installed the OpenShift command line interface (`oc`) as described in link:https://docs.openshift.com/container-platform/{ocp-latest-version}/cli_reference/openshift_cli/getting-started-cli.html[Get Started with the CLI]. +endif::[] +ifdef::cloud-service[] +* You installed the OpenShift command line interface (`oc`) as described in link:https://docs.openshift.com/dedicated/cli_reference/openshift_cli/getting-started-cli.html[Getting started with the CLI] (OpenShift Dedicated) or link:https://docs.openshift.com/rosa/cli_reference/openshift_cli/getting-started-cli.html[Getting started with the CLI] (Red Hat OpenShift Service on AWS) +endif::[] + +.Procedure + +. Open a new terminal window. +. Follow these steps to log in to your {openshift-platform} cluster: +.. In the upper-right corner of the OpenShift web console, click your user name and select *Copy login command*. +.. After you have logged in, click *Display token*. +.. Copy the *Log in with this token* command and paste it in the OpenShift command-line interface (CLI). ++ +[source,subs="+quotes"] +---- +$ oc login --token=____ --server=____ +---- + +. Set an environment variable to define the external route for the TrustyAI service pod. ++ +---- +export TRUSTY_ROUTE=$(oc get route trustyai-service -n $NAMESPACE -o jsonpath='{.spec.host}') +---- + +. Set an environment variable to define the name of your model. 
++
+----
+export MODEL="model-name"
+----
+
+. Use `GET /info/inference/ids/${MODEL}` to get a list of all inference IDs within your model inference data set.
++
+[source]
+----
+curl -skv -H "Authorization: Bearer ${TOKEN}" \
+   https://${TRUSTY_ROUTE}/info/inference/ids/${MODEL}?type=organic
+----
++
+You see output similar to the following:
++
+[source]
+----
+[
+  {
+   "id":"a3d3d4a2-93f6-4a23-aedb-051416ecf84f",
+   "timestamp":"2024-06-25T09:06:28.75701201"
+  }
+]
+----
+
+. Set environment variables to define the two latest inference IDs (highest and lowest predictions).
++
+[source]
+----
+export ID_LOWEST=$(curl -s ${TRUSTY_ROUTE}/info/inference/ids/${MODEL}?type=organic | jq -r '.[-1].id')
+
+export ID_HIGHEST=$(curl -s ${TRUSTY_ROUTE}/info/inference/ids/${MODEL}?type=organic | jq -r '.[-2].id')
+----
+
+. Use `POST /explainers/local/lime` to request the LIME explanation with the following syntax and payload structure:
++
+*Syntax*:
++
+----
+curl -sk -H "Authorization: Bearer ${TOKEN}" -X POST \
+  -H "Content-Type: application/json" \
+  -d <payload> \
+  https://${TRUSTY_ROUTE}/explainers/local/lime
+----
++
+*Payload structure*:
+
+`predictionId`:: The inference ID.
+`config`:: The configuration for the LIME explanation, including `model` and `explainer` parameters. For more information, see link:https://trustyai-explainability.github.io/trustyai-site/main/trustyai-service-api-reference.html#ModelConfig[Model configuration parameters] and link:https://trustyai-explainability.github.io/trustyai-site/main/trustyai-service-api-reference.html#LimeExplainerConfig[LIME explainer configuration parameters].
+
+For example:
+
+[source]
+----
+echo "Requesting LIME for lowest"
+curl -sk -H "Authorization: Bearer ${TOKEN}" -X POST \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"predictionId\": \"$ID_LOWEST\",
+    \"config\": {
+      \"model\": { <1>
+        \"target\": \"modelmesh-serving:8033\", <2>
+        \"name\": \"${MODEL}\",
+        \"version\": \"v1\"
+      },
+      \"explainer\": { <3>
+        \"n_samples\": 50,
+        \"normalize_weights\": \"false\",
+        \"feature_selection\": \"false\"
+      }
+    }
+  }" \
+  https://${TRUSTY_ROUTE}/explainers/local/lime
+----
+
+[source]
+----
+echo "Requesting LIME for highest"
+curl -sk -H "Authorization: Bearer ${TOKEN}" -X POST \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"predictionId\": \"$ID_HIGHEST\",
+    \"config\": {
+      \"model\": { <1>
+        \"target\": \"modelmesh-serving:8033\", <2>
+        \"name\": \"${MODEL}\",
+        \"version\": \"v1\"
+      },
+      \"explainer\": { <3>
+        \"n_samples\": 50,
+        \"normalize_weights\": \"false\",
+        \"feature_selection\": \"false\"
+      }
+    }
+  }" \
+  https://${TRUSTY_ROUTE}/explainers/local/lime
+----
+<1> Specifies the configuration for the model. For more information about the model configuration options, see link:https://trustyai-explainability.github.io/trustyai-site/main/trustyai-service-api-reference.html#ModelConfig[Model configuration parameters].
+<2> Specifies the model server service URL. This field accepts only model servers in the same namespace as the TrustyAI service, with or without a protocol or port number, for example:
++
+* `http[s]://service[:port]`
+* `service[:port]`
+<3> Specifies the configuration for the explainer. For more information about the explainer configuration parameters, see link:https://trustyai-explainability.github.io/trustyai-site/main/trustyai-service-api-reference.html#LimeExplainerConfig[LIME explainer configuration parameters].
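+
+The two example requests differ only in the inference ID, so you can optionally wrap the call in a small shell helper. The following is a minimal sketch, not part of the documented procedure: it assumes the `TOKEN`, `MODEL`, `TRUSTY_ROUTE`, `ID_LOWEST`, and `ID_HIGHEST` variables from the previous steps, and the function name `request_lime` is illustrative only.
+
+[source,bash]
+----
+# Illustrative helper (assumption): request a LIME explanation for one inference ID.
+# Requires TOKEN, MODEL, and TRUSTY_ROUTE to be exported, as in the previous steps.
+request_lime() {
+  local prediction_id="$1"
+  curl -sk -H "Authorization: Bearer ${TOKEN}" -X POST \
+    -H "Content-Type: application/json" \
+    -d "{
+      \"predictionId\": \"${prediction_id}\",
+      \"config\": {
+        \"model\": {
+          \"target\": \"modelmesh-serving:8033\",
+          \"name\": \"${MODEL}\",
+          \"version\": \"v1\"
+        },
+        \"explainer\": {
+          \"n_samples\": 50,
+          \"normalize_weights\": \"false\",
+          \"feature_selection\": \"false\"
+        }
+      }
+    }" \
+    https://${TRUSTY_ROUTE}/explainers/local/lime
+}
+
+# Example usage with the inference IDs captured earlier:
+request_lime "${ID_LOWEST}"
+request_lime "${ID_HIGHEST}"
+----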
+ +//.Verification \ No newline at end of file diff --git a/modules/requesting-a-lime-explanation.adoc b/modules/requesting-a-lime-explanation.adoc new file mode 100644 index 00000000..5d6a123c --- /dev/null +++ b/modules/requesting-a-lime-explanation.adoc @@ -0,0 +1,13 @@ +:_module-type: CONCEPT + +[id='requesting-a-lime-explanation_{context}'] += Requesting a LIME explanation + +[role='_abstract'] +To understand how a model makes its predictions and decisions, you can use a _Local Interpretable Model-agnostic Explanations_ (LIME) explainer. LIME explains a model's predictions by showing how much each feature affected the outcome. For example, for a model predicting not to target a user for a marketing campaign, LIME provides a list of weights, both positive and negative, indicating how each feature influenced the model's outcome. + +For more information, see link:{odhdocshome}/monitoring-data-science-models/#supported-explainers_explainers[Supported explainers]. + +//You can request a LIME explanation by using the {productname-short} dashboard or by using the OpenShift command-line interface (CLI). + +include::requesting-a-lime-explanation-using-cli.adoc[leveloffset=+1] diff --git a/modules/requesting-a-shap-explanation-using-cli.adoc b/modules/requesting-a-shap-explanation-using-cli.adoc new file mode 100644 index 00000000..6c081e77 --- /dev/null +++ b/modules/requesting-a-shap-explanation-using-cli.adoc @@ -0,0 +1,144 @@ +:_module-type: PROCEDURE + +[id='requesting-a-shap-explanation-using-CLI_{context}'] += Requesting a SHAP explanation by using the CLI + +[role='_abstract'] +You can use the OpenShift command-line interface (CLI) to request a SHAP explanation. + +.Prerequisites + +* Your OpenShift cluster administrator added you as a user to the {openshift-platform} cluster and has installed the TrustyAI service for the data science project that contains the deployed models. + +* You authenticated the TrustyAI service, as described in link:{odhdocshome}/monitoring-data-science-models/#authenticating-trustyai-service_monitor[Authenticating the TrustyAI service]. + +* You have real-world data from the deployed models. + +ifdef::upstream,self-managed[] +* You installed the OpenShift command line interface (`oc`) as described in link:https://docs.openshift.com/container-platform/{ocp-latest-version}/cli_reference/openshift_cli/getting-started-cli.html[Get Started with the CLI]. +endif::[] +ifdef::cloud-service[] +* You installed the OpenShift command line interface (`oc`) as described in link:https://docs.openshift.com/dedicated/cli_reference/openshift_cli/getting-started-cli.html[Getting started with the CLI] (OpenShift Dedicated) or link:https://docs.openshift.com/rosa/cli_reference/openshift_cli/getting-started-cli.html[Getting started with the CLI] (Red Hat OpenShift Service on AWS) +endif::[] + +.Procedure + +. Open a new terminal window. +. Follow these steps to log in to your {openshift-platform} cluster: +.. In the upper-right corner of the OpenShift web console, click your user name and select *Copy login command*. +.. After you have logged in, click *Display token*. +.. Copy the *Log in with this token* command and paste it in the OpenShift command-line interface (CLI). ++ +[source,subs="+quotes"] +---- +$ oc login --token=____ --server=____ +---- + +. Set an environment variable to define the external route for the TrustyAI service pod. ++ +---- +export TRUSTY_ROUTE=$(oc get route trustyai-service -n $NAMESPACE -o jsonpath='{.spec.host}') +---- + +. 
Set an environment variable to define the name of your model.
++
+----
+export MODEL="model-name"
+----
+
+. Use `GET /info/inference/ids/${MODEL}` to get a list of all inference IDs within your model inference data set.
++
+[source]
+----
+curl -skv -H "Authorization: Bearer ${TOKEN}" \
+   https://${TRUSTY_ROUTE}/info/inference/ids/${MODEL}?type=organic
+----
++
+You see output similar to the following:
++
+[source]
+----
+[
+  {
+   "id":"a3d3d4a2-93f6-4a23-aedb-051416ecf84f",
+   "timestamp":"2024-06-25T09:06:28.75701201"
+  }
+]
+----
+
+. Set environment variables to define the two latest inference IDs (highest and lowest predictions).
++
+[source]
+----
+export ID_LOWEST=$(curl -s ${TRUSTY_ROUTE}/info/inference/ids/${MODEL}?type=organic | jq -r '.[-1].id')
+
+export ID_HIGHEST=$(curl -s ${TRUSTY_ROUTE}/info/inference/ids/${MODEL}?type=organic | jq -r '.[-2].id')
+----
+
+. Use `POST /explainers/local/shap` to request the SHAP explanation with the following syntax and payload structure:
++
+*Syntax*:
++
+----
+curl -sk -H "Authorization: Bearer ${TOKEN}" -X POST \
+  -H "Content-Type: application/json" \
+  -d <payload> \
+  https://${TRUSTY_ROUTE}/explainers/local/shap
+----
++
+*Payload structure*:
+
+`predictionId`:: The inference ID.
+`config`:: The configuration for the SHAP explanation, including `model` and `explainer` parameters. For more information, see link:https://trustyai-explainability.github.io/trustyai-site/main/trustyai-service-api-reference.html#ModelConfig[Model configuration parameters] and link:https://trustyai-explainability.github.io/trustyai-site/main/trustyai-service-api-reference.html#SHAPExplainerConfig[SHAP explainer configuration parameters].
+
+For example:
+
+[source]
+----
+echo "Requesting SHAP for lowest"
+curl -sk -H "Authorization: Bearer ${TOKEN}" -X POST \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"predictionId\": \"$ID_LOWEST\",
+    \"config\": {
+      \"model\": { <1>
+        \"target\": \"modelmesh-serving:8033\", <2>
+        \"name\": \"${MODEL}\",
+        \"version\": \"v1\"
+      },
+      \"explainer\": { <3>
+        \"n_samples\": 75
+      }
+    }
+  }" \
+  https://${TRUSTY_ROUTE}/explainers/local/shap
+----
+
+[source]
+----
+echo "Requesting SHAP for highest"
+curl -sk -H "Authorization: Bearer ${TOKEN}" -X POST \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"predictionId\": \"$ID_HIGHEST\",
+    \"config\": {
+      \"model\": { <1>
+        \"target\": \"modelmesh-serving:8033\", <2>
+        \"name\": \"${MODEL}\",
+        \"version\": \"v1\"
+      },
+      \"explainer\": { <3>
+        \"n_samples\": 75
+      }
+    }
+  }" \
+  https://${TRUSTY_ROUTE}/explainers/local/shap
+----
+<1> Specifies the configuration for the model. For more information about the model configuration options, see link:https://trustyai-explainability.github.io/trustyai-site/main/trustyai-service-api-reference.html#ModelConfig[Model configuration parameters].
+<2> Specifies the model server service URL. This field accepts only model servers in the same namespace as the TrustyAI service, with or without a protocol or port number, for example:
++
+* `http[s]://service[:port]`
+* `service[:port]`
+<3> Specifies the configuration for the explainer. For more information about the explainer configuration parameters, see link:https://trustyai-explainability.github.io/trustyai-site/main/trustyai-service-api-reference.html#SHAPExplainerConfig[SHAP explainer configuration parameters].
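+
+Because the request body is identical apart from the inference ID, you can optionally loop over the IDs captured earlier and save each response for later inspection. The following is a minimal sketch, not part of the documented procedure: it assumes the `TOKEN`, `MODEL`, `TRUSTY_ROUTE`, `ID_LOWEST`, and `ID_HIGHEST` variables from the previous steps, and the output file names are illustrative only.
+
+[source,bash]
+----
+# Illustrative loop (assumption): request a SHAP explanation for each captured
+# inference ID and write the response to a local JSON file for review.
+# Requires TOKEN, MODEL, and TRUSTY_ROUTE to be exported, as in the previous steps.
+for PREDICTION_ID in "${ID_LOWEST}" "${ID_HIGHEST}"; do
+  curl -sk -H "Authorization: Bearer ${TOKEN}" -X POST \
+    -H "Content-Type: application/json" \
+    -d "{
+      \"predictionId\": \"${PREDICTION_ID}\",
+      \"config\": {
+        \"model\": {
+          \"target\": \"modelmesh-serving:8033\",
+          \"name\": \"${MODEL}\",
+          \"version\": \"v1\"
+        },
+        \"explainer\": {
+          \"n_samples\": 75
+        }
+      }
+    }" \
+    -o "shap-${PREDICTION_ID}.json" \
+    https://${TRUSTY_ROUTE}/explainers/local/shap
+done
+----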
+
+//.Verification
\ No newline at end of file
diff --git a/modules/requesting-a-shap-explanation.adoc b/modules/requesting-a-shap-explanation.adoc
new file mode 100644
index 00000000..800e5903
--- /dev/null
+++ b/modules/requesting-a-shap-explanation.adoc
@@ -0,0 +1,13 @@
+:_module-type: CONCEPT
+
+[id='requesting-a-shap-explanation_{context}']
+= Requesting a SHAP explanation
+
+[role='_abstract']
+To understand how a model makes its predictions and decisions, you can use a _SHapley Additive exPlanations_ (SHAP) explainer. SHAP explains a model's prediction by showing a detailed breakdown of each feature's contribution to the final outcome. For example, for a model predicting the price of a house, SHAP provides a list of how much each feature contributed (in monetary value) to the final price.
+
+For more information, see link:{odhdocshome}/monitoring-data-science-models/#supported-explainers_explainers[Supported explainers].
+
+//You can request a SHAP explanation by using the {productname-short} dashboard or by using the OpenShift command-line interface (CLI).
+
+include::requesting-a-shap-explanation-using-cli.adoc[leveloffset=+1]
diff --git a/modules/sending-training-data-to-trustyai.adoc b/modules/sending-training-data-to-trustyai.adoc
index 3b939a43..12529cc5 100644
--- a/modules/sending-training-data-to-trustyai.adoc
+++ b/modules/sending-training-data-to-trustyai.adoc
@@ -1,6 +1,6 @@
 :_module-type: PROCEDURE
 
-[id="sending-training-data-to-trustyai{context}"]
+[id="sending-training-data-to-trustyai_{context}"]
 = Sending training data to TrustyAI
 
 [role='_abstract']
diff --git a/modules/supported-explainers.adoc b/modules/supported-explainers.adoc
new file mode 100644
index 00000000..5459e902
--- /dev/null
+++ b/modules/supported-explainers.adoc
@@ -0,0 +1,63 @@
+:_module-type: REFERENCE
+:stem: latexmath
+
+[id="supported-explainers_{context}"]
+= Supported explainers
+
+{productname-long} supports the following explainers:
+
+*LIME*
+
+_Local Interpretable Model-agnostic Explanations_ (LIME) footnote:1[Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. _Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining_, 2016, pages 1135-1144.] is a saliency explanation method. LIME aims to explain a prediction stem:[p = (x, y)] (an input-output pair) generated by a black-box model stem:[f : \mathbb{R}^d \rightarrow \mathbb{R}]. The explanations come in the form of a "saliency" stem:[w_i] attached to each feature stem:[x_i] in the prediction input stem:[x]. LIME generates a local explanation stem:[\xi(x)] according to the following model:
+
+[stem]
+++++
+\xi(x) = \arg\min_{g \in G}L(f, g, \pi_x) + \Omega(g)
+++++
+
+* stem:[\pi_x] is a proximity function
+* stem:[G] is the family of interpretable models
+* stem:[\Omega(g)] is a measure of the complexity of an explanation stem:[g \in G]
+* stem:[L(f, g, \pi_x)] is a measure of how unfaithful stem:[g] is in approximating stem:[f] in the locality defined by stem:[\pi_x]
+
+In the original paper, stem:[G] is the class of linear models and stem:[\pi_x] is an exponential kernel on a distance function stem:[D] (for example, cosine distance). LIME converts samples stem:[x_i] from the original domain into interpretable samples as binary vectors stem:[x^{\prime}_i \in \{0, 1\}^d]. An encoded data set stem:[E] is built by taking the nonzero elements of stem:[x^{\prime}_i], recovering the original representation stem:[z \in \mathbb{R}^d], and then computing stem:[f(z)].
A weighted linear model stem:[g] (with weights provided via stem:[\pi_x]) is then trained on the generated sparse data set stem:[E], and the model weights stem:[w] are used as feature weights for the final explanation stem:[\xi(x)].
+
+*SHAP*
+
+_SHapley Additive exPlanations_ (SHAP) footnote:[Scott Lundberg, Su-In Lee. "A Unified Approach to Interpreting Model Predictions." _Advances in Neural Information Processing Systems_, 2017.] seeks to unify several common explanation methods, notably LIME footnote:1[] and DeepLIFT footnote:[Avanti Shrikumar, Peyton Greenside, Anshul Kundaje. "Learning Important Features Through Propagating Activation Differences." _CoRR abs/1704.02685_, 2017.], under a common umbrella of additive feature attributions. These methods explain how an input stem:[x = \lbrack x_1, x_2, \ldots, x_M \rbrack] affects the output of some model stem:[f] by transforming stem:[x \in \mathbb{R}^M] into simplified inputs stem:[z^{\prime} \in \{0, 1\}^M], such that stem:[z^{\prime}_i] indicates the inclusion or exclusion of feature stem:[i]. The simplified inputs are then passed to an explanatory model stem:[g] that takes the following form:
+
+[stem]
+++++
+x = h_x(z^{\prime})
+++++
+
+[stem]
+++++
+g(z^{\prime}) = \phi_0 + \sum_{i=1}^M \phi_i z_i^{\prime}
+++++
+
+[stem]
+++++
+\textbf{s.t.}\quad g(z^{\prime}) \approx f(h_x(z^{\prime}))
+++++
+
+In that form, each value stem:[\phi_i] marks the contribution that feature stem:[i] had on the model output (called the attribution), and stem:[\phi_0] marks the null output of the model, that is, the model output when every feature is excluded. Therefore, this presents an easily interpretable explanation of the importance of each feature and a framework for permuting the various input features to establish their collective contributions.
+
+The final output of the algorithm is the set of Shapley values for each feature, which gives an itemized "receipt" of all the contributing factors to the decision. For example, a SHAP explanation of a loan application might be as follows:
+
+[%autowidth]
+|===
+|Feature |Shapley value (φ)
+
+|Null Output | 50%
+|Income | +10%
+|# Children | -15%
+|Age | +22%
+|Own Home? | -30%
+|Acceptance probability | 37%
+|Denial probability | 63%
+|===
+
+From this, the applicant can see that the biggest contributor to their denial was their home ownership status, which reduced their acceptance probability by 30 percentage points. Meanwhile, their age was of particular benefit, increasing their probability by 22 percentage points.
+
+
diff --git a/monitoring-data-science-models.adoc b/monitoring-data-science-models.adoc
index 6e6d8593..bd930db0 100644
--- a/monitoring-data-science-models.adoc
+++ b/monitoring-data-science-models.adoc
@@ -24,3 +24,5 @@ include::assemblies/setting-up-trustyai-for-your-project.adoc[leveloffset=+1]
 include::assemblies/monitoring-model-bias.adoc[leveloffset=+1]
 
 include::assemblies/monitoring-data-drift.adoc[leveloffset=+1]
+
+include::assemblies/using-explainability.adoc[leveloffset=+1]