Transformer models are improving at a rapid pace, making it increasingly important to develop methods to explain, reverse-engineer, and visualize their inner workings. In this project, we study the interpretability of transformer models through a series of experiments divided into two parts:
- Visualizing Transformer Attention
  - Results published in the paper AttentionViz: A Global View of Transformer Attention.
- Exploring Induction Heads in BERT
This research was conducted as part of an independent study at the Harvard Insight and Interaction Lab under the mentorship of Professor Martin Wattenberg, Professor Fernanda Viégas, and Catherine Yeh. The full write-up of this project can be found here.