diff --git a/docs/demos/lm.html b/docs/demos/lm.html
index 9b305159..ff17ae2c 100644
--- a/docs/demos/lm.html
+++ b/docs/demos/lm.html
@@ -2,4 +2,4 @@
-
+
\ No newline at end of file
diff --git a/lit_nlp/components/umap_test.py b/lit_nlp/components/umap_test.py
index a5d5c877..550be257 100644
--- a/lit_nlp/components/umap_test.py
+++ b/lit_nlp/components/umap_test.py
@@ -37,7 +37,7 @@ def test_fit_transform(self):
     # Check that the _fitted flag has been flipped.
     self.assertTrue(umap_model._fitted)
 
-    # Check correctness of the output shape.
+    # Check that the output shape is correct.
     output_np = np.array([o['z'] for o in outputs_list])
     shape = output_np.shape
     expected_shape = (n, 3)
diff --git a/website/sphinx_src/components.md b/website/sphinx_src/components.md
index 29bd8a9e..c7d97708 100644
--- a/website/sphinx_src/components.md
+++ b/website/sphinx_src/components.md
@@ -433,36 +433,56 @@ You don't have to call the field "label", and it's okay
 if this field isn't present in the *dataset* - as long as it's something that
 the model will recognize and use as the target to derive gradients.
 
 
-### Sequence salience
+## Sequence salience
 
-Sequence salience generalizes the salience methods mentioned above to
-text-to-text generative models and explains the impact of the preceding tokens
-on the generated tokens. Currently, we support sequence salience computation for
-various OSS modeling frameworks, including KerasNLP and Hugging Face
-Transformers.
+Sequence salience generalizes token-based salience to text-to-text models,
+allowing you to explain the impact of the prompt tokens on parts of the model
+output.
 
-Sequence salience in the LIT UI provides multiple options for analysis,
-including:
+LIT has a general-purpose sequence salience visualization designed for
+left-to-right ("causal") language models. Currently, this works out-of-the-box
+with
+[GPT-2 models](https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/lm_salience_demo.py)
+and with the new Gemma LMs via
+[this Colab](https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/lit_gemma.ipynb).
 
-* running the salience methods on the text from the dataset (target) or from
-  the model (response).
-* computing the sequence salience through [Gradient Norm](#gradient-norm) or
-  [Gradient-dot-Input](#gradient-dot-input).
-* selecting different granularity levels for salience analysis, from the
-  smallest possible level of tokens, to more interpretable larger spans, such
-  as words, sentences, lines, or paragraphs.
+![Sequence salience - sequence selection](./images/components/sequence-salience-1.png){w=650px align=center}
 
-(a) Options for sequence salience. | (b) Sequence salience visualization.
--------------------------------------------------------------------------------------------------- | ------------------------------------
-![Sequence salience selections](./images/components/sequence-salience-selections.png){w=650px align=center} | ![Sequence salience vis](./images/components/sequence-salience-vis.png){w=650px align=center}
+![Sequence salience - visualization](./images/components/sequence-salience-2.png){w=650px align=center}
 
+The UI supports multiple options for analysis, including:
 
-**Code:**
+* Select from pre-defined target sequences, or explain generations from the
+  model.
+* Different salience methods, including [Gradient Norm](#gradient-norm) and
+  [Gradient-dot-Input](#gradient-dot-input).
+* Multiple granularity levels for analysis, from individual sub-word tokens up
+  to words, sentences, lines, or paragraphs. Quickly switch between different
+  views to focus your analysis on different parts of a prompt.
+* Display density options to enable working with longer sequences, such as
+  document text, few-shot examples, or chain-of-thought prompts.
 
-* Demo: [`lm_salience_demo.py`](https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/lm_salience_demo.py)
-* KerasNLP model wrappers: [`instrumented_keras_lms.py`](https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/models/instrumented_keras_lms.py)
-* Transformers model wrappers: [`pretrained_lms.py`](https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/models/pretrained_lms.py)
+For a walkthrough of how to use sequence salience to debug LLMs, check out the
+Responsible Generative AI Toolkit at
+https://ai.google.dev/responsible/model_behavior.
+
+**Code:**
+* Demo:
+  [`lm_salience_demo.py`](https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/lm_salience_demo.py)
+* KerasNLP model wrappers:
+  [`instrumented_keras_lms.py`](https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/models/instrumented_keras_lms.py)
+* Transformers model wrappers:
+  [`pretrained_lms.py`](https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/models/pretrained_lms.py)
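+
+As a quick start, the demo can also be launched from a source checkout. This
+is a minimal sketch rather than a documented invocation; the supported flags
+are defined in `lm_salience_demo.py`, and the port value is illustrative:
+
+```sh
+# Hypothetical invocation: serve the sequence salience demo locally.
+python -m lit_nlp.examples.lm_salience_demo --port=5432
+```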
 
 
 ## Salience Clustering
diff --git a/website/sphinx_src/images/components/sequence-salience-1.png b/website/sphinx_src/images/components/sequence-salience-1.png
new file mode 100644
index 00000000..1ee7012b
Binary files /dev/null and b/website/sphinx_src/images/components/sequence-salience-1.png differ
diff --git a/website/sphinx_src/images/components/sequence-salience-2.png b/website/sphinx_src/images/components/sequence-salience-2.png
new file mode 100644
index 00000000..e4bfc4cc
Binary files /dev/null and b/website/sphinx_src/images/components/sequence-salience-2.png differ
diff --git a/website/sphinx_src/images/components/sequence-salience-selections.png b/website/sphinx_src/images/components/sequence-salience-selections.png
deleted file mode 100644
index 10fa5386..00000000
Binary files a/website/sphinx_src/images/components/sequence-salience-selections.png and /dev/null differ
diff --git a/website/sphinx_src/images/components/sequence-salience-vis.png b/website/sphinx_src/images/components/sequence-salience-vis.png
deleted file mode 100644
index 50df3de3..00000000
Binary files a/website/sphinx_src/images/components/sequence-salience-vis.png and /dev/null differ