# Chat with your documents
This repo implements a locally hosted chatbot focused on question answering over documents in a variety of formats, built with LangChain.
The app leverages LangChain's streaming support and async API to update the page in real time for multiple users.
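The app itself is a Next.js server, but the streaming idea is easy to see in a minimal Python sketch of LangChain's callback API (`send_token_to_client` here is a hypothetical stand-in for the app's real websocket/SSE push, not code from this repo):

```python
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chat_models import ChatOpenAI

def send_token_to_client(token: str) -> None:
    # Hypothetical stand-in for the app's real client push
    print(token, end="", flush=True)

class StreamToClientHandler(BaseCallbackHandler):
    """Forward each generated token to the client as soon as it arrives."""

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        send_token_to_client(token)

# streaming=True makes the model emit tokens incrementally via the callback
llm = ChatOpenAI(streaming=True, callbacks=[StreamToClientHandler()])
```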
First, create a new `.env` file from `.env.example` and add your OpenAI API key (you can find it in your OpenAI account settings):
```bash
cp .env.example .env
```
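Check `.env.example` for the exact variable names it expects; at minimum it should boil down to an OpenAI key along these lines (the variable name is an assumption based on the standard OpenAI/LangChain setup):

```bash
OPENAI_API_KEY=your-api-key-here
```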
Next, we'll need to load our data source. Data ingestion happens in two steps.
First, save your documents in the `source_documents` folder.
Different formats are supported:
- csv
- doc/docx
- enex (Evernote)
- eml (e-mail)
- epub
- html
- md
- odt
- ppt/pptx
- txt
Next, install dependencies and run the ingestion script:
```bash
yarn
cd ingest
pip install -r requirements.txt
python ingest_docs.py
```
Note: if you are on Node v16, use `NODE_OPTIONS='--experimental-fetch' yarn ingest`.
This will parse the data, split the text, create embeddings, store them in a vectorstore, and save it to the `db/` directory.
We save it to a directory because we only want to run the (expensive) data ingestion process once.
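In outline, that pipeline corresponds to something like the following Python sketch (a hedged approximation of what the ingestion script does, not the literal contents of `ingest_docs.py`; the loader, splitter parameters, and the choice of Chroma as the vectorstore are assumptions):

```python
from langchain.document_loaders import DirectoryLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Load every supported file from the source_documents folder
documents = DirectoryLoader("source_documents").load()

# Split long documents into overlapping chunks suitable for embedding
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Embed each chunk and persist the resulting vectorstore to db/
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="db")
vectorstore.persist()
```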
The Next.js server relies on the presence of the `db/` directory, so please make sure to run this step before moving on.
Then, run the development server:
```bash
yarn dev
```
Open http://localhost:3000 with your browser to see the result.
The production version of this repo is hosted on Fly.io. To deploy your own server on Fly, you can use the provided `fly.toml` and `Dockerfile` as a starting point.
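If you have the `flyctl` CLI installed and are authenticated, deploying from a checkout containing the `fly.toml` is typically just (app name and region come from your own Fly configuration):

```bash
fly deploy
```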
This repo borrows heavily from:

- ChatLangChain - for the backend and data ingestion logic
- LangChain Chat NextJS - for the frontend