- Introduction
- Natural Language Inference (NLI) task
- The data (SNLI dataset)
- Command lines (How to use this repository)
First of all, make sure to use the virtual environment. The path stored in `$VENV` should be saved in your `~/.bashrc`.
```bash
# Specify the path to the venv
export VENV=path/to/venv
echo $VENV

# Create the venv
python -m venv $VENV/bert

# Activate the venv
source $VENV/bert/bin/activate

# Replicate on CPU
pip install -r python_env/requirements.cpu.txt --no-cache-dir

# Replicate on GPU
pip install -r python_env/requirements.gpu.txt --no-cache-dir

# Exit the venv
deactivate
```
If you are using conda, you can use one of the two following alternatives:

```bash
# From the exported environment file
conda env create -f python_env/environment.yml
conda activate nlp

# Or from the requirements file
conda create --name nlp --file requirements.txt
conda activate nlp
```
WARNING: All the environments were exported on Windows 11 (64-bit).
To download the SNLI and e-SNLI data, run:

```bash
python data_download.py
```

All the data downloaded in this step is stored in the folder `.cache/raw_data`.
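For reference, a minimal sketch of what such a download step might look like, assuming the Hugging Face `datasets` library is used (the actual implementation of `data_download.py` may differ):

```python
# Hypothetical sketch, NOT the actual data_download.py:
# fetch SNLI and e-SNLI with the Hugging Face `datasets` library.
from datasets import load_dataset

CACHE_DIR = ".cache/raw_data"  # storage folder mentioned above

# Download both corpora into the local cache folder.
snli = load_dataset("snli", cache_dir=CACHE_DIR)
esnli = load_dataset("esnli", cache_dir=CACHE_DIR)

# Each split is a table of premise/hypothesis pairs with a label.
print(snli["train"][0])
```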
To run `training_bert.py` for some quick tests, we used the following command:

```bash
python training_bert.py --epoch 3 --batch_size 4 --nb_data 16 --experiment bert --version 0

# Or using the shorthand flags
python training_bert.py -e 3 -b 4 -n 16 --experiment bert --version 0
```

The objective was only to check the behaviour of the training on a small amount of data (to spot mistakes and observe how the loss evolves).
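For reference, these flags suggest an argument parser along the following lines (a sketch: the defaults and help strings are assumptions, only the flag names and shorthands come from the commands above):

```python
# Sketch of the CLI expected by training_bert.py; defaults are illustrative.
import argparse

parser = argparse.ArgumentParser(description="Fine-tune BERT on (e-)SNLI")
parser.add_argument("-e", "--epoch", type=int, default=3,
                    help="number of training epochs")
parser.add_argument("-b", "--batch_size", type=int, default=4,
                    help="training batch size")
parser.add_argument("-n", "--nb_data", type=int, default=-1,
                    help="number of training examples (-1 = full dataset)")
parser.add_argument("--experiment", type=str, default="bert",
                    help="experiment name, used as the log subfolder")
parser.add_argument("--version", type=str, default="0",
                    help="version tag of the run")
args = parser.parse_args()
```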
To visualize our training performance we used TensorBoard. The default log directory is `.cache/logs/$EXPERIMENT`, where `$EXPERIMENT` is the value passed to `--experiment`. The log directory can be changed using the `--logdir` flag (shorthand `-s`).

```bash
tensorboard --logdir .cache/logs/$EXPERIMENT
```
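As a hedged illustration of where those logs come from, the training script could emit scalars with PyTorch's `SummaryWriter`; the tag name and loss values below are purely illustrative, not taken from the actual code:

```python
# Sketch: write scalars into the directory TensorBoard reads from.
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir=".cache/logs/bert")  # matches --experiment bert
for step, loss in enumerate([0.9, 0.7, 0.5]):       # dummy loss values
    writer.add_scalar("train/loss", loss, global_step=step)
writer.close()
```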