Author: Zhecheng Li && Professor: Ndapa Nakashole
Install the dependencies with:
pip install -r requirements.txt
All datasets are already in the GitHub repo.
- If you want to train with traditional attention and mean embedding output, use:
python main.py --run "encoder_classic_mean"
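The mean embedding output averages the encoder's token representations into a single sequence vector. Below is a minimal sketch of masked mean pooling, assuming a PyTorch encoder that returns hidden states of shape (batch, seq_len, d_model); the tensor names are illustrative, not the repo's actual variables.

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average encoder outputs over real (non-padding) tokens.

    token_embeddings: (batch, seq_len, d_model) hidden states from the encoder
    attention_mask:   (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).float()      # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)    # (batch, d_model)
    counts = mask.sum(dim=1).clamp(min=1e-9)         # avoid division by zero
    return summed / counts                           # (batch, d_model)
```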
- If you want to train with sliding window attention and mean embedding output, use:
python main.py --run "encoder_window_attention"
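Sliding window attention restricts each token to attend only to neighbors within a fixed window. A minimal sketch of how such a mask can be built, assuming PyTorch; the window size and mask convention are illustrative, not the repo's defaults.

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask of shape (seq_len, seq_len): True where attention is allowed,
    i.e. query i may attend to key j only when |i - j| <= window."""
    idx = torch.arange(seq_len)
    return (idx.unsqueeze(0) - idx.unsqueeze(1)).abs() <= window

# Typical use: block disallowed positions before the softmax.
scores = torch.randn(1, 8, 128, 128)   # (batch, heads, q_len, k_len)
scores = scores.masked_fill(~sliding_window_mask(128, window=16), float("-inf"))
```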
-
If you want to train with
alibi relative positional embedding
andmean embedding output
, use:python main.py --run "encoder_alibi"
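ALiBi replaces learned positional embeddings with a head-specific linear penalty on query-key distance that is added to the attention scores. A minimal sketch, assuming PyTorch and a number of heads that is a power of two; the symmetric (bidirectional) distance used here for an encoder is an assumption about what the repo does.

```python
import torch

def alibi_bias(seq_len: int, num_heads: int) -> torch.Tensor:
    """Return a (num_heads, seq_len, seq_len) bias added to attention scores.

    Head h gets slope 2**(-8*(h+1)/num_heads); the bias is -slope * |i - j|,
    so more distant tokens are penalized more strongly.
    """
    slopes = torch.tensor([2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)])
    pos = torch.arange(seq_len)
    distance = (pos.unsqueeze(0) - pos.unsqueeze(1)).abs()   # (seq_len, seq_len)
    return -slopes.view(-1, 1, 1) * distance                 # broadcast over heads

# scores: (batch, heads, seq_len, seq_len); the bias is added before the softmax.
scores = torch.randn(2, 8, 64, 64) + alibi_bias(64, 8)
```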
- If you want to train with disentangled attention patterns (as in DeBERTa) and mean embedding output, use:
python main.py --run "encoder_deberta"
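Disentangled attention splits the attention score into content-to-content, content-to-position, and position-to-content terms computed from separate content and relative-position projections. A simplified single-head sketch, assuming PyTorch; the bucketing, projections, and scaling here are illustrative and will differ from the repo's implementation.

```python
import torch
import torch.nn as nn

class DisentangledAttention(nn.Module):
    def __init__(self, d_model: int, max_rel: int = 32):
        super().__init__()
        self.max_rel = max_rel
        self.q_c = nn.Linear(d_model, d_model)   # content query projection
        self.k_c = nn.Linear(d_model, d_model)   # content key projection
        self.q_r = nn.Linear(d_model, d_model)   # projection of relative-position embeddings (query side)
        self.k_r = nn.Linear(d_model, d_model)   # projection of relative-position embeddings (key side)
        self.rel_emb = nn.Embedding(2 * max_rel + 1, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        seq_len = x.size(1)
        pos = torch.arange(seq_len, device=x.device)
        rel = (pos.unsqueeze(0) - pos.unsqueeze(1)).clamp(-self.max_rel, self.max_rel) + self.max_rel
        r = self.rel_emb(rel)                                 # (seq_len, seq_len, d_model)

        qc, kc = self.q_c(x), self.k_c(x)
        c2c = torch.einsum("bid,bjd->bij", qc, kc)            # content-to-content
        c2p = torch.einsum("bid,ijd->bij", qc, self.k_r(r))   # content-to-position
        p2c = torch.einsum("bjd,ijd->bij", kc, self.q_r(r))   # position-to-content
        scores = (c2c + c2p + p2c) / (3 * x.size(-1)) ** 0.5
        return torch.softmax(scores, dim=-1) @ x              # values = x here, for brevity
```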
- If you want to train with an extra [CLS] token to represent the final embedding output, use:
python main.py --run "encoder_cls_token"
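In this variant a learnable [CLS] embedding is prepended to the sequence and its final hidden state serves as the sequence representation. A minimal sketch, assuming PyTorch; `encoder` is a placeholder for the repo's encoder stack, not its actual module name.

```python
import torch
import torch.nn as nn

class ClsPooler(nn.Module):
    def __init__(self, encoder: nn.Module, d_model: int):
        super().__init__()
        self.encoder = encoder
        self.cls = nn.Parameter(torch.randn(1, 1, d_model))   # learnable [CLS] embedding

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, d_model)
        cls = self.cls.expand(token_embeddings.size(0), -1, -1)
        hidden = self.encoder(torch.cat([cls, token_embeddings], dim=1))
        return hidden[:, 0]                                    # final state at the [CLS] position
```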
You can change the parameters in main.py, but you should be able to get around 86-87% accuracy with the default values.
- If you want to train the traditional decoder-only model for text generation, use:
python main.py --run "decoder"
You can also change the parameters in main.py, but you should be able to get a loss of around 4.8 with the default values.
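Decoder-only training combines a causal attention mask with a next-token cross-entropy objective. A minimal sketch, assuming PyTorch; `logits` and `tokens` are placeholders, not the repo's actual interfaces.

```python
import torch
import torch.nn.functional as F

def causal_mask(seq_len: int) -> torch.Tensor:
    """True where attention is allowed: position i may attend only to positions <= i."""
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq_len, vocab); tokens: (batch, seq_len).
    Each position predicts the *next* token, so targets are shifted by one."""
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),   # predictions for positions 0..T-2
        tokens[:, 1:].reshape(-1),                     # targets are positions 1..T-1
    )
```

For reference, a cross-entropy loss of about 4.8 corresponds to a perplexity of roughly exp(4.8) ≈ 120.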
You are welcome to discuss any issues you encounter while running the code in this repository. Feel free to open an issue or contact me directly at zhl186@ucsd.edu.