Doctor-Assistant

Setup

To set up this project, you need Python 3.12 or higher. You can install dependencies either from requirements.txt or with Poetry, using the provided poetry.lock file.

Additionally, NVIDIA CUDA version 12.1 must be installed on the system.

Server

Create a .env file following the structure shown in .env.example. FAISS_STORAGE_DIR is the directory where the index files for the FAISS vector store are saved. KB_DIR is the directory where you place the text files you want indexed. This project used the textbooks provided in this repository.

After creating the directories and updating the .env file, run python -m scripts.create_faiss_index (make sure your Python environment is activated first).

Lastly, start the server using fastapi dev .\server.py --no-reload --port 8000. A separate repository provides a React-based web UI that can be used to interface with this server.

Scripts

The scripts directory contains several scripts for setup and for evaluation of the various RAG implementations. To use them, create and configure a .env.scripts file. All required fields are documented in the .env.scripts.example file.

Scripts should be run in module mode (python -m ...). The scripts are:

scripts.create_faiss_index.py creates a FAISS index from the files in the KB_DIR and stores the data in the FAISS_STORAGE_DIR. The index is built using the CHUNK_SIZE and CHUNK_OVERLAP parameters in the .env.scripts file.
scripts.create_test_output.py runs all RAG methods over the question-answer test set (a subset of the MedQuAD dataset) from the TEST_INPUT_DIR. The generated outputs are stored in the TEST_OUTPUT_DIR.
scripts.create_test_score.py calculates BERTScore and cosine similarity between the generated response of each RAG method and the expected response. This script reads the generated output from the TEST_OUTPUT_DIR and stores the scores in the TEST_OUTPUT_DIR.
scripts.create_text_mult_output.py reads the multiple-choice test set (a subset of the MedQA dataset) from the TEST_INPUT_DIR and runs all RAG methods over it. Results are stored in the TEST_OUTPUT_DIR.
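
Two mechanics mentioned in the script descriptions above, chunking with overlap and cosine similarity, can be sketched in plain Python. This is an illustration only, not the repository's implementation: it assumes CHUNK_SIZE and CHUNK_OVERLAP are measured in characters (the actual scripts may split by tokens or use a library splitter), and it computes cosine similarity on plain lists of floats rather than on real embeddings. The function names are hypothetical.

```python
import math


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters (assumption: sizes
    are in characters, which may differ from the project's own splitter).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as having no similarity to anything.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With a chunk size of 4 and an overlap of 2, each chunk repeats the last two characters of the previous one, so content near a chunk boundary is still retrievable from at least one chunk.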

RishabD/Doctor-Assistant
