1 parent d36f22d, commit 0eeef24. Showing 8 changed files with 298 additions and 2 deletions.
@@ -0,0 +1,43 @@
The database config is the "easiest" as it only requires a database URL.

So far, `sqlite`, `mysql`, and `postgresql` are supported.

Cloud SQL on GCP, RDS on AWS, or Azure Database let you deploy `mysql` and `postgresql` database instances.

!!! warning
    If using `mysql` or `postgresql`, you will also need to create a database, typically named `rag`, to be able to use it.

    You will also need to create a user and get its password. Make sure there are no special characters in the password.


As the database URL contains a username and password, we don't want to have it directly in the `config.yaml`.

Instead, we have:
```yaml
# backend/config.yaml
DatabaseConfig: &DatabaseConfig
  database_url: {{ DATABASE_URL }}
```
The value of `DATABASE_URL` comes from an environment variable.
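
To make the placeholder mechanism concrete, here is a minimal sketch of how a `{{ NAME }}` placeholder could be filled in from environment variables before the YAML is parsed. The helper name `render_placeholders` and the regular expression are hypothetical; the starter kit may well use a templating library instead, so treat this as an illustration rather than its actual implementation.

```python
import re

# Matches placeholders such as "{{ DATABASE_URL }}".
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_placeholders(text, env):
    """Replace every {{ NAME }} with env["NAME"]; fail loudly if NAME is unset."""

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return PLACEHOLDER.sub(substitute, text)


raw_config = "database_url: {{ DATABASE_URL }}"
# A plain dict stands in for os.environ so the sketch has no side effects.
fake_env = {"DATABASE_URL": "sqlite:///database/rag.sqlite3"}
rendered = render_placeholders(raw_config, fake_env)
print(rendered)
```

In real use, `os.environ` would be passed instead of `fake_env`. Raising on a missing variable is a design choice of this sketch: it surfaces a forgotten `export` early instead of producing a broken connection string.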

The connection strings are formatted as follows:

- **SQLite:** `sqlite:///database/rag.sqlite3`
```shell
export DATABASE_URL=sqlite:///database/rag.sqlite3
```

- **MySQL:** `mysql://<username>:<password>@<host>:<port>/rag`
```shell
# The typical port is 3306 for MySQL
export DATABASE_URL=mysql://username:abcdef12345@123.45.67.89:3306/rag
```

- **PostgreSQL:** `postgresql://<username>:<password>@<host>:<port>/rag`
```shell
# The typical port is 5432 for PostgreSQL
export DATABASE_URL=postgresql://username:abcdef12345@123.45.67.89:5432/rag
```

When first testing the RAG locally, `sqlite` is the best choice since it requires no setup: the database is just a file on your machine. However, if you're working as part of a team, or looking to industrialize, you will need to deploy a `mysql` or `postgresql` instance.
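
The connection string formats above can be assembled programmatically. The sketch below uses a hypothetical helper, `build_database_url`, that percent-encodes the credentials so that characters such as `@` or `/` cannot be mistaken for URL delimiters. Whether the backend's database driver decodes percent-encoded passwords is an assumption, so avoiding special characters altogether, as recommended above, remains the safest option.

```python
from urllib.parse import quote, urlsplit


def build_database_url(scheme, username, password, host, port, database="rag"):
    """Assemble <scheme>://<username>:<password>@<host>:<port>/<database>."""
    # safe="" forces every reserved character (including "/") to be encoded.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}/{database}"


url = build_database_url("postgresql", "username", "p@ss/word", "123.45.67.89", 5432)
print(url)

# Sanity check: the host, port, and database are still parsed correctly.
parts = urlsplit(url)
print(parts.hostname, parts.port, parts.path)
```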
@@ -0,0 +1,46 @@
## Locally hosted embedding model from Hugging Face

This downloads the selected model from the Hugging Face Hub and computes embeddings on the machine the backend is running on.
```shell
pip install sentence_transformers
```

```yaml
# backend/config.yaml
EmbeddingModelConfig: &EmbeddingModelConfig
  source: HuggingFaceEmbeddings
  source_config:
    model_name: 'BAAI/bge-base-en-v1.5'
```
## Artefact Azure-hosted embedding model
```yaml
# backend/config.yaml
EmbeddingModelConfig: &EmbeddingModelConfig
  source: OpenAIEmbeddings
  source_config:
    openai_api_type: azure
    openai_api_key: {{ EMBEDDING_API_KEY }}
    openai_api_base: https://poc-openai-artefact.openai.azure.com/
    deployment: embeddings
    chunk_size: 500
```
## AWS Bedrock
!!! info "You will first need to log in to AWS"
    ```shell
    pip install boto3
    ```
    [Follow this guide to authenticate your machine](https://docs.aws.amazon.com/cli/latest/userguide/cli-authentication-user.html)

```yaml
# backend/config.yaml
EmbeddingModelConfig: &EmbeddingModelConfig
  source: BedrockEmbeddings
  source_config:
    model_id: 'amazon.titan-embed-text-v1'
```
@@ -0,0 +1,62 @@
## Artefact Azure-hosted GPT4-turbo

```yaml
# backend/config.yaml
LLMConfig: &LLMConfig
  source: AzureChatOpenAI
  source_config:
    openai_api_type: azure
    openai_api_key: {{ OPENAI_API_KEY }}
    openai_api_base: https://genai-ds.openai.azure.com/
    openai_api_version: 2023-07-01-preview
    deployment_name: gpt4
    temperature: 0.1
```
## Local llama2
!!! info "You will first need to install and run Ollama"
    [Download the Ollama application here](https://ollama.ai/download)
    Ollama will automatically use the GPU on Apple devices.
    ```shell
    ollama run llama2
    ```

```yaml
# backend/config.yaml
LLMConfig: &LLMConfig
  source: ChatOllama
  source_config:
    model: llama2
```

## Vertex AI gemini-pro

!!! warning

    Right now, Gemini models' safety settings are **very** sensitive, and it is not possible to disable them. This makes the model pretty much useless for the time being, as it blocks most requests and/or responses.

    GitHub issue to follow: https://github.com/langchain-ai/langchain/pull/15344#issuecomment-1888597151

!!! info "You will first need to log in to GCP"

    ```shell
    export PROJECT_ID=<gcp_project_id>
    gcloud config set project $PROJECT_ID
    gcloud auth login
    gcloud auth application-default login
    ```

!!! info ""
    <a href="https://console.cloud.google.com/vertex-ai" target="_blank">Activate the Vertex APIs in your project</a>

```yaml
# backend/config.yaml
LLMConfig: &LLMConfig
  source: ChatVertexAI
  source_config:
    model_name: gemini-pro
    temperature: 0.1
```
@@ -0,0 +1,60 @@
## PostgreSQL

As we need a backend SQL database to store conversation history and other info, using Postgres as a vector store is very attractive for us. Implementing all these functionalities with the same technology reduces deployment overhead and complexity.

[See the recipes for database configs here](databases_configs.md)

```shell
pip install psycopg2-binary pgvector
```

```yaml
# backend/config.yaml
VectorStoreConfig: &VectorStoreConfig
  source: PGVector
  source_config:
    connection_string: {{ DATABASE_URL }}

  retriever_search_type: similarity_score_threshold
  retriever_config:
    k: 20
    score_threshold: 0.5

  insertion_mode: null
```
`k`: maximum number of documents to fetch.

`score_threshold`: score below which a document is deemed irrelevant and not fetched.

`insertion_mode`: `null` | `full` | `incremental`. [How document indexing and insertion in the vector store is handled.](https://python.langchain.com/docs/modules/data_connection/indexing#deletion-modes)
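
The interplay between `k` and `score_threshold` can be sketched in a few lines of plain Python. This is only an illustration of the selection logic, not the vector store's actual code: the function name `select_documents` is hypothetical, the scores are made up, and treating the threshold as inclusive is an assumption.

```python
def select_documents(scored_docs, k=20, score_threshold=0.5):
    """Keep at most k documents whose score reaches score_threshold, best first."""
    relevant = [(doc, score) for doc, score in scored_docs if score >= score_threshold]
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return relevant[:k]


# Made-up (document, relevance score) pairs, as a retriever might produce them.
candidates = [("doc_a", 0.91), ("doc_b", 0.42), ("doc_c", 0.67), ("doc_d", 0.55)]
selected = select_documents(candidates, k=2, score_threshold=0.5)
print(selected)
```

With these values, `doc_b` is dropped by the threshold and `doc_d` by the `k` limit, so a low `k` can discard documents that the threshold alone would have kept.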


## Local Chroma

```yaml
# backend/config.yaml
VectorStoreConfig: &VectorStoreConfig
  source: Chroma
  source_config:
    persist_directory: vector_database/
    collection_metadata:
      hnsw:space: cosine
  retriever_search_type: similarity_score_threshold
  retriever_config:
    k: 20
    score_threshold: 0.5
  insertion_mode: null
```

`persist_directory`: local directory where the Chroma database is persisted.

`hnsw:space: cosine`: [distance function used. Default is `l2`.](https://docs.trychroma.com/usage-guide#changing-the-distance-function) Cosine is bounded [0, 1], making it easier to set a score threshold for retrieval.

`k`: maximum number of documents to fetch.

`score_threshold`: score below which a document is deemed irrelevant and not fetched.

`insertion_mode`: `null` | `full` | `incremental`. [How document indexing and insertion in the vector store is handled.](https://python.langchain.com/docs/modules/data_connection/indexing#deletion-modes)
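
To illustrate why the distance function matters when picking a threshold, the sketch below compares a cosine distance with a Euclidean distance on two vectors that point in the same direction but differ in magnitude. The formulas are the textbook ones, written in plain Python for illustration; Chroma's internal computation (for example, whether its `l2` space is squared) may differ.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: grows with the magnitude of the vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity: depends only on the angle between the vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 2.0, 3.0]
same_direction = [10.0, 20.0, 30.0]  # the query vector scaled by 10

print(cosine_distance(query, same_direction))  # approximately 0
print(l2_distance(query, same_direction))      # large, despite identical direction
```

Because the cosine distance ignores magnitude, a single threshold value tends to behave consistently across documents, whereas an unbounded Euclidean distance has no natural cut-off.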
@@ -0,0 +1,6 @@
Here you will find a repository of configurations that have proven to work.

- [LLM Configuration samples](configs/llms_configs.md)
- [Embedding model Configuration samples](configs/embedding_models_configs.md)
- [Vector Store Configuration samples](configs/vector_stores_configs.md)
- [Database Configuration samples](configs/databases_configs.md)
@@ -0,0 +1,42 @@
As you tune this starter kit to your needs, you may need to add specific configuration that your RAG will use.

For example, let's say you want to add the `foo` configuration parameter to your vector store configuration.

First, add it to `config.py` in the part relevant to the vector store:

```python
# ...

@dataclass
class VectorStoreConfig:
    # ... rest of the VectorStoreConfig ...

    foo: str = "bar"  # We add foo param, of type str, with the default value "bar"

# ...
```

This parameter will now be available in your `RAG` object configuration.

```python
from pathlib import Path
from backend.rag_components.rag import RAG

config_directory = Path("backend/config.yaml")
rag = RAG(config_directory)

print(rag.config.vector_store.foo)
# > bar
```

If you want to override its default value, you can do that in your `config.yaml`:
```yaml
VectorStoreConfig: &VectorStoreConfig
  # ... rest of the VectorStoreConfig ...
  foo: baz
```
```python
print(rag.config.vector_store.foo)
# > baz
```
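
The default-then-override behaviour can be reproduced in isolation with a plain dataclass. In this sketch, the reduced `VectorStoreConfig` (only `source` and `foo`) and the two dictionaries standing in for the parsed YAML section are assumptions made for illustration; how the starter kit actually builds its dataclasses from `config.yaml` is not shown here.

```python
from dataclasses import dataclass


@dataclass
class VectorStoreConfig:
    # Reduced, illustrative version of the real class.
    source: str = "Chroma"
    foo: str = "bar"  # default used when config.yaml does not mention foo


# Stand-ins for the mapping parsed from the VectorStoreConfig section of config.yaml.
without_override = {"source": "Chroma"}
with_override = {"source": "Chroma", "foo": "baz"}

default_config = VectorStoreConfig(**without_override)
custom_config = VectorStoreConfig(**with_override)

print(default_config.foo)  # bar
print(custom_config.foo)   # baz
```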
@@ -0,0 +1,29 @@
## Loading documents

The easiest but least flexible way to load documents into your RAG is to use the `RAG.load_file` method. It will semi-intelligently try to pick the best LangChain loader and parameters for your file.

```python
from pathlib import Path

from backend.rag_components.rag import RAG


data_directory = Path("data")

config_directory = Path("backend/config.yaml")
rag = RAG(config_directory)

# Load every regular file found directly in the data directory.
for file in data_directory.iterdir():
    if file.is_file():
        rag.load_file(file)
```

If you want more flexibility, you can use the `rag.load_documents` method, which expects a list of `langchain.docstore.document.Document` objects.

**TODO: example**
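
A minimal sketch of preparing such a list is given below. To keep it runnable without LangChain installed, it defines a stand-in `Document` dataclass with the same two fields (`page_content` and `metadata`) as LangChain's class; the helper `paragraphs_to_documents` and the metadata keys are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in with the same two fields as LangChain's Document class."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def paragraphs_to_documents(text, source):
    """Turn each non-empty paragraph of text into one Document tagged with its source."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [
        Document(page_content=p, metadata={"source": source, "paragraph": i})
        for i, p in enumerate(paragraphs)
    ]


documents = paragraphs_to_documents("First paragraph.\n\nSecond paragraph.", "notes.txt")
print(len(documents))

# With the starter kit and LangChain installed, the stand-in class would be
# replaced by `from langchain.docstore.document import Document`, and the
# list would then be passed on with: rag.load_documents(documents)
```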

## Document indexing

The document loader maintains an index of the loaded documents. You can change how this index is maintained by setting `vector_store.insertion_mode` in your RAG configuration to `None`, `incremental`, or `full`.

[Details of what that means here.](https://python.langchain.com/docs/modules/data_connection/indexing)
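
As a rough intuition for the three modes, here is a toy model written in plain Python. It reflects one reading of the linked LangChain documentation and is not LangChain's implementation: the `index` function, the `store` dictionary (content hash mapped to source), and the `(source, text)` batch format are all hypothetical.

```python
import hashlib


def index(batch, store, insertion_mode=None):
    """Toy model: store maps content hash -> source; batch is a list of (source, text)."""
    batch_hashes = {
        hashlib.sha256(f"{source}\n{text}".encode()).hexdigest(): source
        for source, text in batch
    }

    # Every mode skips content that is already indexed (de-duplication).
    for digest, source in batch_hashes.items():
        store.setdefault(digest, source)

    if insertion_mode == "incremental":
        # Drop stale versions, but only for sources present in this batch.
        touched = set(batch_hashes.values())
        stale = [d for d, s in store.items() if s in touched and d not in batch_hashes]
    elif insertion_mode == "full":
        # Drop everything that is not part of this batch.
        stale = [d for d in store if d not in batch_hashes]
    else:
        stale = []  # None: nothing is deleted automatically

    for digest in stale:
        del store[digest]
    return store


store = {}
index([("a.txt", "v1"), ("b.txt", "hello")], store)
index([("a.txt", "v2")], store, insertion_mode="incremental")
print(sorted(store.values()))  # the old version of a.txt is gone, b.txt is kept
```

In this model, `incremental` suits loading a few changed files at a time, while `full` assumes each batch is the complete corpus and removes anything absent from it.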