This project contains code for the following paper:
Nora Hollenstein and Lisa Beinborn (2021): Relative Importance in Sentence Processing. To appear in the Proceedings of ACL 2021.
In order to re-run our analyses, simply run `python analyze_all.py`. Some calculations take a bit more time; feel free to comment them out.
If you want to reproduce the data extraction, run `python extract_all.py`. Note that this takes quite long. Make sure that you have downloaded the eye-tracking corpora first (see `extract_human_fixations/README.md` for details).
If you want to run the code for another model, you need to modify `extract_all.py`. You can also add another eye-tracking dataset there, but you would need to implement a new data extractor. Note that aligning the tokenization of the eye-tracking corpora with the language model tokenizers can be tricky; a sketch of one possible alignment strategy follows below.
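As an illustration of the alignment problem, here is a minimal sketch (not the repository's actual extractor) that aggregates subword-level importance scores back to the word-level tokens of an eye-tracking corpus. The function `align_and_aggregate`, the BERT tokenizer, and the example scores are assumptions for illustration only, and summing over subwords is just one possible aggregation strategy.

```python
# Illustrative sketch only: aggregate one importance score per subword back to
# one score per word, assuming the eye-tracking corpus provides word-level tokens.
import numpy as np
from transformers import AutoTokenizer  # hypothetical model choice for this example

def align_and_aggregate(words, subword_scores, tokenizer, aggregate=np.sum):
    """Map per-subword importance scores back to the word level."""
    word_scores = []
    idx = 0
    for word in words:
        # number of subword pieces this word is split into
        n_pieces = len(tokenizer.tokenize(word))
        word_scores.append(aggregate(subword_scores[idx:idx + n_pieces]))
        idx += n_pieces
    return word_scores

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    words = ["The", "neuroscientist", "smiled", "."]
    # dummy scores, one per subword ("neuroscientist" is split into several pieces)
    subwords = [piece for word in words for piece in tokenizer.tokenize(word)]
    scores = np.random.rand(len(subwords))
    print(align_and_aggregate(words, scores, tokenizer))
```

Keep in mind that tokenizing words individually does not match full-sentence tokenization for every tokenizer (e.g. tokenizers that encode preceding whitespace), which is exactly why the alignment can be tricky.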
The repository is organized as follows:

- `extract_human_fixations`: code to extract the relative fixation duration from two eye-tracking corpora (GECO and ZUCO) and average it over all subjects.
- `extract_model_importance`: code to extract saliency-based and attention-based importance from transformer-based language models.
- `analysis`: code to compare and analyze patterns of importance in the human fixation durations and the model data. Also contains code to replicate the plots in the paper (a minimal comparison sketch follows this list).
- `plots`: contains all plots.
- `results`: contains intermediate results.
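To give an idea of the kind of comparison performed in `analysis`, here is a minimal, hypothetical sketch: both the human and the model importance vectors are normalized to sum to 1 and compared with Spearman correlation for a single sentence. All values and variable names are made up; the repository's own analysis code is more extensive.

```python
# Hypothetical single-sentence comparison of human and model importance.
import numpy as np
from scipy.stats import spearmanr

def to_relative(scores):
    """Normalize raw scores so that they sum to 1 (relative importance)."""
    scores = np.asarray(scores, dtype=float)
    return scores / scores.sum()

# made-up word-aligned values for one sentence
human_fixations = to_relative([120, 310, 95, 240])        # e.g. total fixation durations (ms)
model_importance = to_relative([0.10, 0.45, 0.05, 0.40])  # e.g. saliency or attention scores

correlation, p_value = spearmanr(human_fixations, model_importance)
print(f"Spearman correlation: {correlation:.3f} (p = {p_value:.3f})")
```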
The Python version should be <= 3.8.
We use the following packages (see `requirements.txt`): numpy (1.19.5), tensorflow (2.4.1), transformers (4.2.2), scikit-learn (0.22.2), spacy (2.3.5), wordfreq (2.3.2), scipy (1.4.1).
Note that later versions of `transformers` might lead to errors.
To install, create and activate a virtual environment and run:
`pip3 install -r requirements.txt`
For the more fine-grained analyses (POS tags, word frequencies), you need to download the English spaCy model `en_core_web_md` into your virtual environment:
`python -m spacy download en_core_web_md`
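For reference, here is a small sketch of how POS tags and word frequencies can be obtained with spaCy and wordfreq. The example sentence is made up, and the repository's own feature extraction may differ in its details.

```python
# Illustrative only: POS tags via spaCy, word frequencies via wordfreq.
import spacy
from wordfreq import zipf_frequency

nlp = spacy.load("en_core_web_md")
doc = nlp("The neuroscientist smiled at the participant.")

for token in doc:
    # zipf_frequency returns a log-scale frequency (roughly 1 = rare, 7 = very frequent)
    print(token.text, token.pos_, zipf_frequency(token.text.lower(), "en"))
```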