This project is inspired by the YouTube video by Alejandro AO - Software & Ai, which uses the OpenAI API to create a question-answering chatbot that responds to user queries based on their documents.
My implementation deviates from the original video in the following ways:
- I am using locally installed LLMs, specifically GPT4All models for inference and Hugging Face embedding models (both through LangChain).
- I have incorporated a Chroma vector store to persist the document texts and their embeddings, so the documents do not have to be re-ingested on every run.
- I have tried out multiple GPT4All models as well as multiple Hugging Face text embedding models.
- I have also kept (commented out) code for doing the same through the OpenAI API.
I have tried two approaches to implement this solution: a conversation chain with buffer memory, as shown in the original video, and a Retrieval QA chain.
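For illustration, here is a minimal sketch of the two approaches using the legacy LangChain API. The model file name, embedding model, and directory paths are placeholder assumptions, not the repository's exact configuration:

```python
from langchain.llms import GPT4All
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain, RetrievalQA

# Placeholder local model file; any GPT4All-compatible model would work here.
llm = GPT4All(model="./model_cache/ggml-gpt4all-j-v1.3-groovy.bin")

# Reopen the persisted Chroma store (assumed directory; see the pipeline sketch below).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = Chroma(persist_directory="./documents_cache", embedding_function=embeddings)
retriever = vectorstore.as_retriever()

# Approach 1: conversation chain with buffer memory (as in the original video).
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
conversation_chain = ConversationalRetrievalChain.from_llm(
    llm=llm, retriever=retriever, memory=memory
)

# Approach 2: a stateless Retrieval QA chain.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)
```

The buffer-memory chain carries chat history between turns, while the Retrieval QA chain treats every question independently; the latter is simpler when follow-up questions are not needed.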
The application follows these steps to provide responses to your questions (a minimal code sketch follows the list):
- PDF Loading: The app reads multiple PDF documents and extracts their text content.
- Text Chunking: The extracted text is divided into smaller chunks that can be processed effectively.
- Embedding Model: The application uses an embedding model to generate vector representations (embeddings) of the text chunks.
- Similarity Matching: When you ask a question, the app compares it with the text chunks and identifies the most semantically similar ones.
- Response Generation: The selected chunks are passed to the language model, which generates a response based on the relevant content of the PDFs.
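A minimal sketch of that pipeline, again using the legacy LangChain API; the file path, chunk sizes, and model name are illustrative assumptions:

```python
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

# 1. PDF loading: extract the raw text from each uploaded PDF.
raw_text = ""
for pdf_path in ["./tmp_documents/sample.pdf"]:  # hypothetical file
    for page in PdfReader(pdf_path).pages:
        raw_text += page.extract_text() or ""

# 2. Text chunking: split the text into overlapping chunks.
splitter = CharacterTextSplitter(separator="\n", chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(raw_text)

# 3. Embeddings: a local Hugging Face sentence-transformer model.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# 4. Persist chunks and embeddings in Chroma so re-ingestion is unnecessary.
vectorstore = Chroma.from_texts(chunks, embeddings, persist_directory="./documents_cache")
vectorstore.persist()

# 5. Similarity matching: retrieve the chunks most relevant to a question.
docs = vectorstore.similarity_search("What is this document about?", k=4)
```

The retrieved chunks are then passed to one of the chains sketched above to generate the final answer.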
To install the application, please follow these steps:
- Clone the repository to your local machine.
- Install the required dependencies by running the following command:
  `pip install -r requirements.txt`
- Obtain an API key from OpenAI and/or Hugging Face (if you are using either hosted service) and add it to the `.env` file in the project directory (example below).
- Create the following directories in your local environment for storing the database and the LLM models:
  ```
  ./model_cache
  ./documents_cache
  ./tmp_documents
  ./logs
  ```
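For reference, a `.env` file for the hosted APIs might look like the following; the variable names are the ones conventionally read by LangChain, and the values are placeholders:

```
OPENAI_API_KEY=your-openai-api-key
HUGGINGFACEHUB_API_TOKEN=your-huggingface-api-token
```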
To use the application, follow these steps:
- Ensure that you have installed the required dependencies and added the API keys to the `.env` file (as required).
- Run the `document_chat_retrieval_qa.py` file using the Streamlit CLI by executing the following command:
  `streamlit run document_chat_retrieval_qa.py`
- The application will launch in your default web browser, displaying the user interface.
- Load multiple PDF documents into the app by following the provided instructions.
- Ask questions in natural language about the loaded PDFs using the chat interface.
This repository is intended for educational purposes and does not accept further contributions. Feel free to utilize and enhance the app based on your own requirements.
Please note that running the GPT4All and Hugging Face embedding models on a local computer is very slow without a GPU with CUDA support. To take advantage of GPU acceleration, you will need to install the CUDA Toolkit from NVIDIA on a supported machine and install a CUDA-enabled build of PyTorch.
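Once PyTorch is installed, a quick sanity check using the standard PyTorch API confirms whether a GPU is actually visible:

```python
import torch

# True only if a CUDA-enabled PyTorch build can see a supported NVIDIA GPU.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the detected GPU
```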