treescope visualization for attribute_context #284

Merged
merged 4 commits into from
Aug 14, 2024

gsarti

@gsarti gsarti commented Aug 12, 2024

Description

This PR introduces a new visualization based on the treescope library for the attribute_context CLI method.

Currently, the visualization is rendered with rich, using the same format for console and notebook environments:

image

The main limitations of this mode of visualization are:

  • The context needs to be repeated for every attributed token in output_current_text, which can produce very long sequences.
  • Although attribution (CCI) scores are collected for all context tokens, only the most salient ones are shown, as filtered by the attribution_std_threshold and attribution_topk arguments of AttributeContextArgs.
  • rich renders strings and numbers in color by default, which produces confusing visualizations.

The new proposed visualization is the following:

image

It includes:

  • A foldable parameters list that is shown upon clicking (expanded in figure).
  • Long contexts that are collapsed by default and can be expanded upon clicking (collapsed in figure).
  • A single output sequence with color gradients representing context sensitivity (CTI) scores. Salient tokens, as selected by AttributeContextArgs.context_sensitivity_std_threshold and AttributeContextArgs.context_sensitivity_topk, can be expanded to visualize the attribution scores for that step (setting context_sensitivity_std_threshold = None performs attribution for every step).
  • CTI and CCI scores shown upon hovering over context-sensitive generated tokens.
  • The non-contextual alternative for the current context-sensitive token.
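
For readers unfamiliar with the two selection arguments, here is a minimal sketch of how a standard-deviation threshold followed by a top-k cut could pick the salient tokens. It is not the library's code: `select_salient` is a hypothetical helper, and the "mean plus N standard deviations" rule and the use of the population standard deviation are assumptions made for illustration. Only the behavior for a threshold of None (every step is selected) comes from the description above.

```python
from statistics import mean, pstdev


def select_salient(scores, std_threshold=1.0, topk=None):
    """Return the indices of tokens treated as salient (hypothetical helper).

    Assumed rule: keep scores at least `std_threshold` standard deviations
    above the mean, then keep only the `topk` highest of those.
    """
    if std_threshold is None:
        # No threshold: every generation step is selected.
        indices = list(range(len(scores)))
    else:
        cutoff = mean(scores) + std_threshold * pstdev(scores)
        indices = [i for i, s in enumerate(scores) if s >= cutoff]
    if topk is not None:
        # Keep the k highest-scoring candidates among those that passed.
        indices = sorted(indices, key=lambda i: scores[i], reverse=True)[:topk]
    # Report indices in generation order.
    return sorted(indices)


# Made-up CTI scores for a six-token output.
cti_scores = [0.1, 0.2, 0.1, 2.0, 0.15, 1.8]
salient = select_salient(cti_scores, std_threshold=1.0)
```

In this toy example only the two high-scoring tokens (positions 3 and 5) pass the threshold, so only they would be expandable in the visualization.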

Details about usage:

  • Shown as default representation of AttributeContextOutput objects in notebooks thanks to __treescope_repr__.
  • Used as the default method to show outputs for the CLI in notebook environments with visualize_attribute_context.
  • Used as the default method to save outputs to HTML in all environments if visualization is turned off.
  • rich remains the default for showing the visualization in console environments. If HTML serialization is also requested in that case, the rich output is what gets saved.
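
The rules above can be summarized as a small decision function. This is only a sketch of the behavior described in this PR, not the actual implementation: `pick_renderers` and its boolean parameters are hypothetical names, and the notebook case with both display and HTML saving enabled is assumed to use treescope for both.

```python
from typing import Optional, Tuple


def pick_renderers(
    in_notebook: bool, show_viz: bool, save_html: bool
) -> Tuple[Optional[str], Optional[str]]:
    """Return (display_renderer, html_renderer); None means "not used".

    Hypothetical helper mirroring the usage rules listed above.
    """
    display = None
    if show_viz:
        # Notebooks show the treescope view, consoles keep rich.
        display = "treescope" if in_notebook else "rich"

    html = None
    if save_html:
        # Console runs that also display the output save the rich rendering;
        # every other case (including visualization turned off) saves treescope HTML.
        html = "rich" if (show_viz and not in_notebook) else "treescope"
    return display, html


console_case = pick_renderers(in_notebook=False, show_viz=True, save_html=True)
notebook_case = pick_renderers(in_notebook=True, show_viz=True, save_html=False)
headless_case = pick_renderers(in_notebook=False, show_viz=False, save_html=True)
```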

@gsarti gsarti merged commit 9de007e into main Aug 14, 2024
3 checks passed
@gsarti gsarti deleted the treescope-attribute-context branch August 14, 2024 07:03