Requirements can be installed by running:
pip install -r requirements.txt
The training script is `train_models.py`, which loads a jsonlines dataset, splits it 80/10/10 into train, validation, and test sets, and trains for up to 10 epochs with early stopping. It supports training on multiple GPUs using PyTorch's `DataParallel` wrapper and mini-batch training with gradient accumulation. It outputs a trained model and a JSON file containing the indexes of the validation set used and the held-out test set.
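For orientation, the overall loop looks roughly like the sketch below. This is illustrative only: names such as `train_sketch`, `accumulation_steps`, and `splits.json` are placeholders rather than the script's actual identifiers, the loss function is assumed, and early stopping is omitted.

```python
import json
import torch
from torch.utils.data import DataLoader, random_split

def train_sketch(dataset, model, lr, batch_size, max_grad_norm,
                 accumulation_steps=4, num_epochs=10):
    """Illustrative sketch: 80/10/10 split, multi-GPU training with
    gradient accumulation, and saving of the held-out indexes."""
    n = len(dataset)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    train_set, val_set, test_set = random_split(
        dataset, [n_train, n_val, n - n_train - n_val])

    model = torch.nn.DataParallel(model).cuda()     # replicate across GPUs
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)

    for epoch in range(num_epochs):                 # early stopping omitted
        for step, (inputs, labels) in enumerate(loader):
            loss = torch.nn.functional.cross_entropy(model(inputs), labels)
            (loss / accumulation_steps).backward()  # accumulate gradients
            if (step + 1) % accumulation_steps == 0:
                torch.nn.utils.clip_grad_norm_(model.parameters(),
                                               max_grad_norm)
                optimizer.step()
                optimizer.zero_grad()

    # Record which examples were held out so evaluation can reuse them.
    with open("splits.json", "w") as f:
        json.dump({"val_indexes": list(val_set.indices),
                   "test_indexes": list(test_set.indices)}, f)
```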
Training parameters can be passed either via command-line arguments or via a JSON file. If command-line arguments are used, the script automatically saves the training parameters to a JSON file.
Current training parameters include the following:
- `datasets` - The list of datasets to train on.
- `base_model` - The pre-trained model to use as the base. Our code currently supports `bert-base-uncased` and `roberta-base`.
- `output_layers` - The number of layers to use in the output head (can be 1, 2, or 3).
- `weighted_loss` - Whether to use a weighted loss function.
- `lr` - The Adam learning rate to use.
- `max_grad_norm` - The threshold used for gradient clipping.
- `weight_decay` - The weight decay value to use.
- `dropout_rate` - The dropout value to use before the linear layer.
- `batch_size` - The batch size to use between optimizer steps.
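An example config file covering these parameters might look like the following (the file name and all values are purely illustrative):

```json
{
  "datasets": ["dataset1.jsonl", "dataset2.jsonl"],
  "base_model": "bert-base-uncased",
  "output_layers": 2,
  "weighted_loss": true,
  "lr": 2e-5,
  "max_grad_norm": 1.0,
  "weight_decay": 0.01,
  "dropout_rate": 0.1,
  "batch_size": 16
}
```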
To load training parameters from a JSON file, run
python train_model.py --config [JSON FILE PATH]
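Command-line usage might look something like the following; the flag names here are assumed to mirror the parameter names above and may differ from the script's actual interface:
python train_model.py --datasets dataset1.jsonl dataset2.jsonl --base_model bert-base-uncased --lr 2e-5 --batch_size 16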
We provide two options for evaluating models: one on GPUs and one on Google Colab TPUs. Both methods can evaluate a model against multiple datasets. If a dataset was used to train the model, the model is evaluated against the test set held out during training; otherwise, it is evaluated against the entire dataset.
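Conceptually, the split handling during evaluation can be sketched as follows. This is a rough illustration that reuses the hypothetical `splits.json` file and key names from the training sketch above, not the scripts' actual file layout:

```python
import json
from torch.utils.data import Subset

def build_eval_set(dataset, used_for_training, splits_path="splits.json"):
    """Return the held-out test subset if the dataset was seen during
    training; otherwise evaluate on the full dataset."""
    if not used_for_training:
        return dataset
    with open(splits_path) as f:
        splits = json.load(f)
    return Subset(dataset, splits["test_indexes"])
```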
To test models on GPUs, use the `test_model.py` script as follows:
test_model.py --model model_name --datasets dataset1.jsonl dataset2.jsonl
To test models on TPUs, import the `test_xla_mp` function as follows:
import os
from glob import glob

# Change to the repository's location in the Colab filesystem so the
# test module can be imported.
os.chdir("/content/deep learning")
print(os.getcwd())

from test_model_xla import test_xla_mp
We provide a full evaluation notebook here.