NLP_CTF

Carleton College ML/NLP CS Comps 2023

Thomas Zeng, Teagan Johnson, Jared Chen, Nathan Hedgecock

A reimplementation of:

Garg, Sahaj, et al. "Counterfactual fairness in text classification through robustness." Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. 2019.

Installation

To install dependencies, run the following command in the root directory:

pip install -r requirements.txt

We also require three files that must be downloaded manually because of their size.

Word2Vec embeddings (specifically GoogleNews-vectors-negative300), to be placed in the data subdirectory:

wget -O GoogleNews-vectors-negative300.bin  'https://www.dropbox.com/s/mlg71vsawice3xd/GoogleNews-vectors-negative300.bin?dl=1'

GloVe embeddings (specifically glove_840B_300d), to be placed in the data subdirectory. These are only necessary if you want to run tests using GloVe instead of Word2Vec:

wget -O glove_840B_300d.txt 'https://www.dropbox.com/s/a3meyi58v0jy4tv/glove_840B_300d.txt?dl=1'

Civil Comments dataset (we use a modified version from the paper WILDS: A Benchmark of in-the-Wild Distribution Shifts), to be placed in the data/civil_comments subdirectory:

wget -O civil_comments.csv 'https://www.dropbox.com/s/xv8zkmcmg74n0ak/civil_comments.csv?dl=1'

(NOTE: The wget links may go stale because they are linked to my school address. If they do, you will need to find and download the files manually.)
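Since the downloads are large and the links may break, it can help to confirm that the files ended up where the steps above say they should be. The following is a minimal sketch, not part of the repository: the helper name missing_files is hypothetical, and the paths are simply the file names from the wget commands joined with the subdirectories named above.

```python
from pathlib import Path

# Locations taken from the installation steps, relative to the repo root.
REQUIRED = [
    "data/GoogleNews-vectors-negative300.bin",
    "data/civil_comments/civil_comments.csv",
]
# Only needed when running tests with GloVe instead of Word2Vec.
OPTIONAL = ["data/glove_840B_300d.txt"]


def missing_files(root=".", include_optional=False):
    """Return the expected data files that are not present under root.

    Hypothetical helper for illustration; it only checks that each file
    exists, not that the download completed without corruption.
    """
    wanted = REQUIRED + (OPTIONAL if include_optional else [])
    return [p for p in wanted if not (Path(root) / p).is_file()]


if __name__ == "__main__":
    # Run from the repository root to list anything still missing.
    for path in missing_files(include_optional=True):
        print(f"missing: {path}")
```

Running the script from the root directory prints one line per missing file and nothing if everything is in place.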

Code Usage

For a quick demonstration of our experiments, open the run.ipynb Jupyter notebook in the notebooks subdirectory. The notebook has an "Open in Colab" button at the top that lets you run all the experiments in Colab. (Note: a premium Colab instance may be required due to memory limitations of the free tier.)

To run our tests, use run.py.

As an example, the following command runs 10 trials of our baseline model and saves the results to baseline_experiment.csv.

python run.py baseline -v -n baseline_experiment

Miscellaneous

run.py currently defaults to the mps device on Apple silicon. Because of bugs in the PyTorch mps implementation, the tests do not work on mps. We therefore suggest manually setting the flag -d cpu when using run.py on an Apple silicon device.
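As a rough illustration of this workaround, the sketch below builds the example command from the Code Usage section and appends -d cpu when it detects an Apple silicon machine. It is not part of the repository: build_run_command is a hypothetical helper, and the detection assumes that Apple silicon reports Darwin and arm64 through Python's platform module (a Python interpreter running under Rosetta reports x86_64 instead, so the check would miss it).

```python
import platform


def build_run_command(experiment="baseline", name="baseline_experiment",
                      system=None, machine=None):
    """Build the run.py argument list, adding `-d cpu` on Apple silicon.

    Hypothetical helper for illustration. `system` and `machine` default
    to the values reported by the platform module and can be overridden.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    cmd = ["python", "run.py", experiment, "-v", "-n", name]
    # Assumption: Apple silicon Macs report Darwin/arm64.
    if system == "Darwin" and machine == "arm64":
        cmd += ["-d", "cpu"]
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(build_run_command()))
```

The printed command can be pasted into a shell, or the list can be passed to subprocess.run from the repository root.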

License

This source code is released under the MIT license, included in the LICENSE file.
