This repository contains the code for a foundational Question and Answer (Q&A) system intended as a base for further development. The primary focus is a Q&A system that can process basic user queries.
The first model employs DistilBERT, imported from Hugging Face. Functioning as an 'extractive' model, it extracts answer spans directly from a provided context.
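As a quick illustration of extractive QA, the snippet below loads a SQuAD-fine-tuned DistilBERT checkpoint through the `transformers` pipeline API. The checkpoint name and example text are illustrative only, not necessarily what this repository uses:

```python
from transformers import pipeline

# Load a DistilBERT checkpoint fine-tuned on SQuAD for extractive QA.
# (Checkpoint name is an example; substitute the model this project uses.)
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

context = "DistilBERT is a smaller, faster distillation of BERT released by Hugging Face."
result = qa(question="Who released DistilBERT?", context=context)

# The pipeline returns the extracted answer span, its character offsets,
# and a confidence score.
print(result["answer"], result["score"])
```

Being extractive, the model can only return text that appears verbatim in the supplied context; it does not generate free-form answers.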
DistilBERT depends heavily on the `transformers` library, and Hugging Face documents a recommended setup for running Transformers offline.
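For reference, Hugging Face's documented way to run Transformers fully offline is through environment variables, set before launching Python:

```shell
# Force transformers / huggingface_hub to use only locally cached files
# instead of reaching out to the Hugging Face Hub.
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
```

With these set, any model or dataset must already be present in the local cache, or loading will fail.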
As the developer of this code, I've listed the specific environment configurations required to replicate my setup and ensure the code runs smoothly.
- python 3.9.18
- pytorch 1.13.1
- pytorch-cuda 11.7
- transformers 4.39.2
- datasets 2.18
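These pins can also be captured in a conda environment file; the sketch below is one way to do so (the file name `environment.yml` and the channel choices are my assumptions, not part of the repository):

```yaml
name: qa-bert
channels:
  - pytorch
  - nvidia
  - conda-forge
dependencies:
  - python=3.9.18
  - pytorch=1.13.1
  - torchvision=0.14.1
  - torchaudio=0.13.1
  - pytorch-cuda=11.7
  - pip
  - pip:
      - transformers==4.39.2
      - datasets==2.18.0
```

Such a file could then be used with `conda env create -f environment.yml` as an alternative to the manual install command below.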
I use a conda environment to set up PyTorch with GPU support, which is essential for this project. This configuration has proven particularly reliable for me, given the challenges of setting up such environments on Windows 10.
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
Hope it works.
On a Unix Cluster (JHPCE approach)
I also wrote bash scripts to set up a stable conda environment for running the code. Run them from this directory.
A tutorial explaining what the code in distilBERT.py does is available under the tutorial directory.
Again, I recommend looking through the Bash Description before this section.
First of all, I assume you are in the ~/QA-System-BERT directory.
You need to make ./bash/install_conda_environment.sh executable:
chmod +x ./bash/install_conda_environment.sh
Then run it to install the conda environment:
./bash/install_conda_environment.sh
After installing the conda environment, you can submit jobs to SLURM, for example to initialize a model:
sbatch ./bash/initial_model_script.sh
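For orientation, a SLURM submission script of this kind typically looks like the sketch below. The partition name, resource numbers, and script body here are placeholders I am assuming; they are not the actual contents of initial_model_script.sh:

```shell
#!/bin/bash
#SBATCH --job-name=qa-init
#SBATCH --partition=gpu          # partition name is cluster-specific (assumption)
#SBATCH --gres=gpu:1             # request one GPU
#SBATCH --mem=16G
#SBATCH --time=01:00:00
#SBATCH --output=logs/init_%j.out

# Activate the conda environment created earlier, then initialize the model.
source ~/miniconda3/etc/profile.d/conda.sh
conda activate qa-bert
python distilBERT.py
```

You can monitor the submitted job with `squeue -u $USER` and inspect the log file named in `--output` once it finishes.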