Commit: upd: better onboarding doc
AlexisVLRT committed Jan 5, 2024
1 parent d491134 commit 7881acd
Showing 9 changed files with 2,821 additions and 76 deletions.
15 changes: 13 additions & 2 deletions README.md
@@ -23,24 +23,35 @@ pip install -r requirements.txt

You will need to set some environment variables, either in a `.env` file at the project root or by exporting them like so:
```shell
export PYTHONPATH=.
export OPENAI_API_KEY="xxx" # API key used to query the LLM
export EMBEDDING_API_KEY="xxx" # API key used to query the embedding model
export DATABASE_URL="sqlite:///$(pwd)/database/db.sqlite3" # For local development only. You will need a real, cloud-based SQL database URL for prod.
```
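If you go the `.env` route, the backend needs to read those key/value pairs into the process environment at startup. In practice a library like `python-dotenv` handles this; below is a stdlib-only sketch of the same idea (the function name and parsing rules are illustrative, not the repo's actual loader):

```python
import os
from pathlib import Path

def load_env_file(path: str = ".env") -> None:
    """Export KEY=value pairs from a .env file into os.environ."""
    env_file = Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # Variables already exported in the shell take precedence.
        os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

Letting already-exported variables win matches the usual dotenv convention, so a shell `export` always overrides the file.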

Start the backend server locally
```shell
python backend/main.py
python -m uvicorn backend.main:app
```

Start the frontend demo
```shell
streamlit run frontend/app.py
python -m streamlit run frontend/app.py
```

You should then be able to log in and chat with the bot:

![](docs/login_and_chat.gif)

Right now the RAG does not have any documents loaded; let's add a sample:
```shell
python data_sample/add_data_sample_to_rag.py
```

The RAG now has access to the information from your loaded documents:

![](docs/query_with_knowledge.gif)

## Documentation

To dive deeper under the hood, take a look at the documentation
15 changes: 10 additions & 5 deletions backend/config.yaml
@@ -1,7 +1,12 @@
LLMConfig: &LLMConfig
source: ChatOllama
source: AzureChatOpenAI
source_config:
model: llama2
openai_api_type: azure
openai_api_key: {{ OPENAI_API_KEY }}
openai_api_base: https://genai-ds.openai.azure.com/
openai_api_version: 2023-07-01-preview
deployment_name: gpt4
temperature: 0.1

VectorStoreConfig: &VectorStoreConfig
source: Chroma
@@ -10,12 +15,12 @@ VectorStoreConfig: &VectorStoreConfig
collection_metadata:
hnsw:space: cosine

retriever_search_type: similarity
retriever_search_type: similarity_score_threshold
retriever_config:
top_k: 20
k: 20
score_threshold: 0.5

insertion_mode: full
insertion_mode: null

EmbeddingModelConfig: &EmbeddingModelConfig
source: OpenAIEmbeddings
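The `{{ OPENAI_API_KEY }}` placeholder in the new config implies the YAML is rendered against the environment before it is parsed. A sketch of how such substitution could work (`render_config` is an illustrative name; the repo's loader may use Jinja or another templating mechanism):

```python
import os
import re

def render_config(raw: str) -> str:
    """Replace {{ VAR }} placeholders with values from the environment."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: os.environ.get(match.group(1), ""),
        raw,
    )

os.environ["OPENAI_API_KEY"] = "xxx"
print(render_config("openai_api_key: {{ OPENAI_API_KEY }}"))
# openai_api_key: xxx
```

Rendering before parsing keeps secrets out of the YAML file itself, which is why the earlier env-var setup step matters.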
10 changes: 10 additions & 0 deletions data_sample/add_data_sample_to_rag.py
@@ -0,0 +1,10 @@
from pathlib import Path

from backend.rag_components.rag import RAG


config_directory = Path("backend/config.yaml")
rag = RAG(config_directory)

data_sample_path = Path("data_sample/billionaires_csv.csv")
print(rag.load_file(data_sample_path))
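Conceptually, `rag.load_file` has to split the document into chunks, embed each chunk, and write the vectors to the store. A framework-free sketch of the chunking step only (names and sizes are illustrative; the real pipeline lives in `backend/rag_components`):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

# A 1000-character document yields windows starting at 0, 450, and 900.
print(len(chunk_text("a" * 1000)))  # 3
```

The overlap ensures sentences that straddle a chunk boundary still appear intact in at least one chunk.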
2,641 changes: 2,641 additions & 0 deletions data_sample/billionaires_csv.csv


Binary file modified docs/login_and_chat.gif
Binary file added docs/query_with_knowledge.gif
10 changes: 5 additions & 5 deletions docs/recipe_vector_stores_configs.md
@@ -8,12 +8,12 @@ VectorStoreConfig: &VectorStoreConfig
collection_metadata:
hnsw:space: cosine

retreiver_search_type: similarity
retreiver_config:
top_k: 20
retriever_search_type: similarity_score_threshold
retriever_config:
k: 20
score_threshold: 0.5

insertion_mode: full
insertion_mode: null
```
`persist_directory`: where the Chroma database will be persisted locally.
@@ -24,4 +24,4 @@ VectorStoreConfig: &VectorStoreConfig

`score_threshold`: score below which a document is deemed irrelevant and not fetched.

`insertion_mode`: `null` | `full` | `incremental`. [How document insertion in the vector store is handled.](https://python.langchain.com/docs/modules/data_connection/indexing#deletion-modes)
`insertion_mode`: `null` | `full` | `incremental`. [How document indexing and insertion in the vector store is handled.](https://python.langchain.com/docs/modules/data_connection/indexing#deletion-modes)
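Conceptually, `similarity_score_threshold` retrieval first drops documents whose similarity score falls below the cutoff, then returns at most `k` of the survivors, best first. A framework-free sketch mirroring the `k: 20` / `score_threshold: 0.5` settings above (the function and document names are illustrative, not the library's API):

```python
def retrieve(scored_docs, k=20, score_threshold=0.5):
    """Keep the k best documents whose similarity score clears the threshold."""
    relevant = [(doc, score) for doc, score in scored_docs if score >= score_threshold]
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in relevant[:k]]

docs = [("intro.md", 0.91), ("faq.md", 0.48), ("pricing.md", 0.62)]
print(retrieve(docs, k=2))  # ['intro.md', 'pricing.md']
```

Unlike plain `similarity` search, this can return fewer than `k` documents, or none, when nothing in the store is relevant enough.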
2 changes: 1 addition & 1 deletion requirements.in
@@ -12,7 +12,7 @@ streamlit
extra_streamlit_components
google-cloud-storage
openai==0.28.1
langchain==0.0.316
langchain==0.0.354
chromadb==0.4.14
tiktoken
gcsfs