feat(docs): update documentation and fix preview-docs (#2000)

* docs: add missing configurations
* docs: change HF embeddings by ollama
* docs: add disclaimer about Gradio UI
* docs: improve readability in concepts
* docs: reorder `Fully Local Setups`
* docs: improve setup instructions
* docs: prevent have duplicate documentation and use table to show different options
* docs: rename privateGpt to PrivateGPT
* docs: update ui image
* docs: remove useless header
* docs: convert to alerts ingestion disclaimers
* docs: add UI alternatives
* docs: reference UI alternatives in disclaimers
* docs: fix table
* chore: update doc preview version
* chore: add permissions
* chore: remove useless line
* docs: fixes ...
zylon-ai · Jul 18, 2024 · 4523a30
1 parent 01b7ccd
commit 4523a30
Showing 13 changed files with 162 additions and 101 deletions.
diff --git a/.github/workflows/preview-docs.yml b/.github/workflows/preview-docs.yml
@@ -11,13 +11,17 @@ jobs:
   preview-docs:
     runs-on: ubuntu-latest
 
+    permissions:
+      contents: read
+      pull-requests: write
+
     steps:
       - name: Checkout repository
         uses: actions/checkout@v4
         with:
           ref: refs/pull/${{ github.event.pull_request.number }}/merge
 
-      - name: Setup Node.js 
+      - name: Setup Node.js
         uses: actions/setup-node@v4
         with:
           node-version: "18"
@@ -37,14 +41,14 @@ jobs:
           # Set the output for the step
           echo "::set-output name=preview_url::$preview_url"
       - name: Comment PR with URL using github-actions bot
-        uses: actions/github-script@v4
+        uses: actions/github-script@v7
         if: ${{ steps.generate_docs.outputs.preview_url }}
         with:
           script: |
             const preview_url = '${{ steps.generate_docs.outputs.preview_url }}';
-            const issue_number = context.issue.number;
-            github.issues.createComment({
-              ...context.repo,
-              issue_number: issue_number,
+            github.rest.issues.createComment({
+              issue_number: context.issue.number,
+              owner: context.repo.owner,
+              repo: context.repo.repo,
               body: `Published docs preview URL: ${preview_url}`
             })
diff --git a/fern/README.md b/fern/README.md
@@ -1,4 +1,4 @@
-# Documentation of privateGPT
+# Documentation of PrivateGPT
 
 The documentation of this project is being rendered thanks to [fern](https://github.com/fern-api/fern).
 

diff --git a/fern/docs.yml b/fern/docs.yml
@@ -32,7 +32,7 @@ navigation:
         contents:
           - page: Introduction
             path: ./docs/pages/overview/welcome.mdx
-  # How to install privateGPT, with FAQ and troubleshooting
+  # How to install PrivateGPT, with FAQ and troubleshooting
   - tab: installation
     layout:
       - section: Getting started
@@ -43,7 +43,7 @@ navigation:
             path: ./docs/pages/installation/installation.mdx
           - page: Troubleshooting
             path: ./docs/pages/installation/troubleshooting.mdx
-  # Manual of privateGPT: how to use it and configure it
+  # Manual of PrivateGPT: how to use it and configure it
   - tab: manual
     layout:
       - section: General configuration
@@ -70,8 +70,10 @@ navigation:
             path: ./docs/pages/manual/reranker.mdx
       - section: User Interface
         contents:
-          - page: User interface (Gradio) Manual
-            path: ./docs/pages/manual/ui.mdx
+          - page: Gradio Manual
+            path: ./docs/pages/ui/gradio.mdx
+          - page: Alternatives
+            path: ./docs/pages/ui/alternatives.mdx
   # Small code snippet or example of usage to help users
   - tab: recipes
     layout:
@@ -80,7 +82,7 @@ navigation:
           # TODO: add recipes
           - page: List of LLMs
             path: ./docs/pages/recipes/list-llm.mdx
-  # More advanced usage of privateGPT, by API
+  # More advanced usage of PrivateGPT, by API
   - tab: api-reference
     layout:
       - section: Overview

diff --git a/fern/docs/pages/installation/concepts.mdx b/fern/docs/pages/installation/concepts.mdx
@@ -8,18 +8,25 @@ It supports a variety of LLM providers, embeddings providers, and vector stores,
 
 ## Setup configurations available
 You get to decide the setup for these 3 main components:
-- LLM: the large language model provider used for inference. It can be local, or remote, or even OpenAI.
-- Embeddings: the embeddings provider used to encode the input, the documents and the users' queries. Same as the LLM, it can be local, or remote, or even OpenAI.
-- Vector store: the store used to index and retrieve the documents.
+- **LLM**: the large language model provider used for inference. It can be local, or remote, or even OpenAI.
+- **Embeddings**: the embeddings provider used to encode the input, the documents and the users' queries. Same as the LLM, it can be local, or remote, or even OpenAI.
+- **Vector store**: the store used to index and retrieve the documents.
 
 There is an extra component that can be enabled or disabled: the UI. It is a Gradio UI that allows to interact with the API in a more user-friendly way.
 
+<Callout intent = "warning">
+A working **Gradio UI client** is provided to test the API, together with a set of useful tools such as bulk
+model download script, ingestion script, documents folder watch, etc. Please refer to the [UI alternatives](/manual/user-interface/alternatives) page for more UI alternatives.
+</Callout>
+
 ### Setups and Dependencies
 Your setup will be the combination of the different options available. You'll find recommended setups in the [installation](./installation) section.
 PrivateGPT uses poetry to manage its dependencies. You can install the dependencies for the different setups by running `poetry install --extras "<extra1> <extra2>..."`.
-Extras are the different options available for each component. For example, to install the dependencies for a a local setup with UI and qdrant as vector database, Ollama as LLM and HuggingFace as local embeddings, you would run
+Extras are the different options available for each component. For example, to install the dependencies for a a local setup with UI and qdrant as vector database, Ollama as LLM and local embeddings, you would run:
 
-`poetry install --extras "ui vector-stores-qdrant llms-ollama embeddings-huggingface"`.
+```bash
+poetry install --extras "ui vector-stores-qdrant llms-ollama embeddings-ollama"
+```
 
 Refer to the [installation](./installation) section for more details.
 
@@ -37,24 +44,23 @@ will load the configuration from `settings.yaml` and `settings-ollama.yaml`.
 
 ## About Fully Local Setups
 In order to run PrivateGPT in a fully local setup, you will need to run the LLM, Embeddings and Vector Store locally.
-### Vector stores
-The vector stores supported (Qdrant, ChromaDB and Postgres) run locally by default.
-### Embeddings
-For local Embeddings there are two options:
+### LLM
+For local LLM there are two options:
 * (Recommended) You can use the 'ollama' option in PrivateGPT, which will connect to your local Ollama instance. Ollama simplifies a lot the installation of local LLMs.
-* You can use the 'embeddings-huggingface' option in PrivateGPT, which will use HuggingFace.
+* You can use the 'llms-llama-cpp' option in PrivateGPT, which will use LlamaCPP. It works great on Mac with Metal most of the times (leverages Metal GPU), but it can be tricky in certain Linux and Windows distributions, depending on the GPU. In the installation document you'll find guides and troubleshooting.
 
-In order for HuggingFace LLM to work (the second option), you need to download the embeddings model to the `models` folder. You can do so by running the `setup` script:
+In order for LlamaCPP powered LLM to work (the second option), you need to download the LLM model to the `models` folder. You can do so by running the `setup` script:
 ```bash
 poetry run python scripts/setup
 ```
-
-### LLM
-For local LLM there are two options:
+### Embeddings
+For local Embeddings there are two options:
 * (Recommended) You can use the 'ollama' option in PrivateGPT, which will connect to your local Ollama instance. Ollama simplifies a lot the installation of local LLMs.
-* You can use the 'llms-llama-cpp' option in PrivateGPT, which will use LlamaCPP. It works great on Mac with Metal most of the times (leverages Metal GPU), but it can be tricky in certain Linux and Windows distributions, depending on the GPU. In the installation document you'll find guides and troubleshooting.
+* You can use the 'embeddings-huggingface' option in PrivateGPT, which will use HuggingFace.
 
-In order for LlamaCPP powered LLM to work (the second option), you need to download the LLM model to the `models` folder. You can do so by running the `setup` script:
+In order for HuggingFace LLM to work (the second option), you need to download the embeddings model to the `models` folder. You can do so by running the `setup` script:
 ```bash
 poetry run python scripts/setup
 ```
+### Vector stores
+The vector stores supported (Qdrant, ChromaDB and Postgres) run locally by default.
diff --git a/fern/docs/pages/installation/installation.mdx b/fern/docs/pages/installation/installation.mdx
@@ -1,63 +1,101 @@
-It is important that you review the Main Concepts before you start the installation process.
+It is important that you review the [Main Concepts](../concepts) section to understand the different components of PrivateGPT and how they interact with each other.
 
 ## Base requirements to run PrivateGPT
 
-* Clone PrivateGPT repository, and navigate to it:
-
+### 1. Clone the PrivateGPT Repository
+Clone the repository and navigate to it:
 ```bash
-  git clone https://github.com/zylon-ai/private-gpt
-  cd private-gpt
+git clone https://github.com/zylon-ai/private-gpt
+cd private-gpt
 ```
 
-* Install Python `3.11` (*if you do not have it already*). Ideally through a python version manager like `pyenv`.
-  Earlier python versions are not supported.
-    * osx/linux: [pyenv](https://github.com/pyenv/pyenv)
-    * windows: [pyenv-win](https://github.com/pyenv-win/pyenv-win)
-
+### 2. Install Python 3.11
+If you do not have Python 3.11 installed, install it using a Python version manager like `pyenv`. Earlier Python versions are not supported.
+#### macOS/Linux
+Install and set Python 3.11 using [pyenv](https://github.com/pyenv/pyenv):
+```bash
+pyenv install 3.11
+pyenv local 3.11
+```
+#### Windows
+Install and set Python 3.11 using [pyenv-win](https://github.com/pyenv-win/pyenv-win):
 ```bash
 pyenv install 3.11
 pyenv local 3.11
 ```
 
-* Install [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) for dependency management:
-
-* Install `make` to be able to run the different scripts:
-    * osx: (Using homebrew): `brew install make`
-    * windows: (Using chocolatey) `choco install make`
-
-## Install and run your desired setup
+### 3. Install `Poetry`
+Install [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) for dependency management:
+Follow the instructions on the official Poetry website to install it.
 
-PrivateGPT allows to customize the setup -from fully local to cloud based- by deciding the modules to use.
-Here are the different options available:
+### 4. Optional: Install `make`
+To run various scripts, you need to install `make`. Follow the instructions for your operating system:
+#### macOS
+(Using Homebrew):
+```bash
+brew install make
+```
+#### Windows
+(Using Chocolatey):
+```bash
+choco install make
+```
 
-- LLM: "llama-cpp", "ollama", "sagemaker", "openai", "openailike", "azopenai"
-- Embeddings: "huggingface", "openai", "sagemaker", "azopenai"
-- Vector stores: "qdrant", "chroma", "postgres"
-- UI: whether or not to enable UI (Gradio) or just go with the API
+## Install and Run Your Desired Setup
 
-In order to only install the required dependencies, PrivateGPT offers different `extras` that can be combined during the installation process:
+PrivateGPT allows customization of the setup, from fully local to cloud-based, by deciding the modules to use. To install only the required dependencies, PrivateGPT offers different `extras` that can be combined during the installation process:
 
 ```bash
 poetry install --extras "<extra1> <extra2>..."
 ```
-
-Where `<extra>` can be any of the following:
-
-- ui: adds support for UI using Gradio
-- llms-ollama: adds support for Ollama LLM, the easiest way to get a local LLM running, requires Ollama running locally
-- llms-llama-cpp: adds support for local LLM using LlamaCPP - expect a messy installation process on some platforms
-- llms-sagemaker: adds support for Amazon Sagemaker LLM, requires Sagemaker inference endpoints
-- llms-openai: adds support for OpenAI LLM, requires OpenAI API key
-- llms-openai-like: adds support for 3rd party LLM providers that are compatible with OpenAI's API
-- llms-azopenai: adds support for Azure OpenAI LLM, requires Azure OpenAI inference endpoints
-- embeddings-ollama: adds support for Ollama Embeddings, requires Ollama running locally
-- embeddings-huggingface: adds support for local Embeddings using HuggingFace
-- embeddings-sagemaker: adds support for Amazon Sagemaker Embeddings, requires Sagemaker inference endpoints
-- embeddings-openai = adds support for OpenAI Embeddings, requires OpenAI API key
-- embeddings-azopenai = adds support for Azure OpenAI Embeddings, requires Azure OpenAI inference endpoints
-- vector-stores-qdrant: adds support for Qdrant vector store
-- vector-stores-chroma: adds support for Chroma DB vector store
-- vector-stores-postgres: adds support for Postgres vector store
+Where `<extra>` can be any of the following options described below.
+
+### Available Modules
+
+You need to choose one option per category (LLM, Embeddings, Vector Stores, UI). Below are the tables listing the available options for each category.
+
+#### LLM
+
+| **Option**   | **Description**                                                        | **Extra**           |
+|--------------|------------------------------------------------------------------------|---------------------|
+| **ollama**   | Adds support for Ollama LLM, requires Ollama running locally           | llms-ollama         |
+| llama-cpp    | Adds support for local LLM using LlamaCPP                              | llms-llama-cpp      |
+| sagemaker    | Adds support for Amazon Sagemaker LLM, requires Sagemaker endpoints    | llms-sagemaker      |
+| openai       | Adds support for OpenAI LLM, requires OpenAI API key                   | llms-openai         |
+| openailike   | Adds support for 3rd party LLM providers compatible with OpenAI's API  | llms-openai-like    |
+| azopenai     | Adds support for Azure OpenAI LLM, requires Azure endpoints            | llms-azopenai       |
+| gemini       | Adds support for Gemini LLM, requires Gemini API key                   | llms-gemini         |
+
+#### Embeddings
+
+| **Option**       | **Description**                                                                | **Extra**               |
+|------------------|--------------------------------------------------------------------------------|-------------------------|
+| **ollama**       | Adds support for Ollama Embeddings, requires Ollama running locally            | embeddings-ollama       |
+| huggingface      | Adds support for local Embeddings using HuggingFace                            | embeddings-huggingface  |
+| openai           | Adds support for OpenAI Embeddings, requires OpenAI API key                    | embeddings-openai       |
+| sagemaker        | Adds support for Amazon Sagemaker Embeddings, requires Sagemaker endpoints     | embeddings-sagemaker    |
+| azopenai         | Adds support for Azure OpenAI Embeddings, requires Azure endpoints             | embeddings-azopenai     |
+| gemini           | Adds support for Gemini Embeddings, requires Gemini API key                    | embeddings-gemini       |
+
+#### Vector Stores
+
+| **Option**       | **Description**                         | **Extra**               |
+|------------------|-----------------------------------------|-------------------------|
+| **qdrant**       | Adds support for Qdrant vector store    | vector-stores-qdrant    |
+| chroma           | Adds support for Chroma DB vector store | vector-stores-chroma    |
+| postgres         | Adds support for Postgres vector store  | vector-stores-postgres  |
+| clickhouse       | Adds support for Clickhouse vector store| vector-stores-clickhouse|
+
+#### UI
+
+| **Option**   | **Description**                          | **Extra** |
+|--------------|------------------------------------------|-----------|
+| Gradio       | Adds support for UI using Gradio         | ui        |
+
+<Callout intent = "warning">
+A working **Gradio UI client** is provided to test the API, together with a set of useful tools such as bulk
+model download script, ingestion script, documents folder watch, etc. Please refer to the [UI alternatives](/manual/user-interface/alternatives) page for more UI alternatives.
+</Callout>
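
Reviewer note: the tables added above pair each option with its Poetry extra, and the text asks for one option per category. As a minimal sketch of that rule, the helper below assembles the `poetry install --extras` command from one choice per category. The function name `build_install_command` and the `EXTRAS` dictionary are hypothetical, written only for illustration; they are not part of PrivateGPT. The mapping is copied from the tables in this diff.

```python
# Hypothetical helper (not part of PrivateGPT): builds the install command
# from one option per category, using the option -> extra mapping in the tables.
EXTRAS = {
    "llm": {
        "ollama": "llms-ollama",
        "llama-cpp": "llms-llama-cpp",
        "sagemaker": "llms-sagemaker",
        "openai": "llms-openai",
        "openailike": "llms-openai-like",
        "azopenai": "llms-azopenai",
        "gemini": "llms-gemini",
    },
    "embeddings": {
        "ollama": "embeddings-ollama",
        "huggingface": "embeddings-huggingface",
        "openai": "embeddings-openai",
        "sagemaker": "embeddings-sagemaker",
        "azopenai": "embeddings-azopenai",
        "gemini": "embeddings-gemini",
    },
    "vector_store": {
        "qdrant": "vector-stores-qdrant",
        "chroma": "vector-stores-chroma",
        "postgres": "vector-stores-postgres",
        "clickhouse": "vector-stores-clickhouse",
    },
}
UI_EXTRA = "ui"  # the Gradio UI is the only UI option listed


def build_install_command(llm, embeddings, vector_store, ui=True):
    """Return the `poetry install` command for one option per category."""
    chosen = [UI_EXTRA] if ui else []
    selections = (
        ("vector_store", vector_store),
        ("llm", llm),
        ("embeddings", embeddings),
    )
    for category, option in selections:
        if option not in EXTRAS[category]:
            valid = ", ".join(sorted(EXTRAS[category]))
            raise ValueError(f"unknown {category} option {option!r}; expected one of: {valid}")
        chosen.append(EXTRAS[category][option])
    # Extra names contain only letters and hyphens, so plain double quotes are safe.
    return f'poetry install --extras "{" ".join(chosen)}"'


if __name__ == "__main__":
    # Fully local setup from concepts.mdx: UI, Qdrant, Ollama LLM and embeddings.
    print(build_install_command("ollama", "ollama", "qdrant"))
```

With the choices in the usage line, the helper reproduces the command shown in the `concepts.mdx` hunk of this commit. An unknown option raises `ValueError` instead of silently producing a command that Poetry would reject.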
 
 ## Recommended Setups
 

diff --git a/fern/docs/pages/installation/troubleshooting.mdx b/fern/docs/pages/installation/troubleshooting.mdx
@@ -1,44 +1,31 @@
 # Downloading Gated and Private Models
-
 Many models are gated or private, requiring special access to use them. Follow these steps to gain access and set up your environment for using these models.
-
 ## Accessing Gated Models
-
 1. **Request Access:**
    Follow the instructions provided [here](https://huggingface.co/docs/hub/en/models-gated) to request access to the gated model.
-
 2. **Generate a Token:**
    Once you have access, generate a token by following the instructions [here](https://huggingface.co/docs/hub/en/security-tokens).
-
 3. **Set the Token:**
    Add the generated token to your `settings.yaml` file:
-
    ```yaml
    huggingface:
      access_token: <your-token>
    ```
-
    Alternatively, set the `HF_TOKEN` environment variable:
-
    ```bash
    export HF_TOKEN=<your-token>
    ```
 
 # Tokenizer Setup
-
 PrivateGPT uses the `AutoTokenizer` library to tokenize input text accurately. It connects to HuggingFace's API to download the appropriate tokenizer for the specified model.
 
 ## Configuring the Tokenizer
-
 1. **Specify the Model:**
    In your `settings.yaml` file, specify the model you want to use:
-
    ```yaml
    llm:
      tokenizer: mistralai/Mistral-7B-Instruct-v0.2
    ```
-
 2. **Set Access Token for Gated Models:**
    If you are using a gated model, ensure the `access_token` is set as mentioned in the previous section.
-
 This configuration ensures that PrivateGPT can download and use the correct tokenizer for the model you are working with.
diff --git a/fern/docs/pages/manual/ingestion.mdx b/fern/docs/pages/manual/ingestion.mdx
@@ -93,7 +93,7 @@ time PGPT_PROFILES=mock python ./scripts/ingest_folder.py ~/my-dir/to-ingest/
 
 ## Supported file formats
 
-privateGPT by default supports all the file formats that contains clear text (for example, `.txt` files, `.html`, etc.).
+PrivateGPT by default supports all the file formats that contains clear text (for example, `.txt` files, `.html`, etc.).
 However, these text based file formats as only considered as text files, and are not pre-processed in any other way.
 
 It also supports the following file formats:
@@ -115,11 +115,15 @@ It also supports the following file formats:
 * `.ipynb`
 * `.json`
 
-**Please note the following nuance**: while `privateGPT` supports these file formats, it **might** require additional
+<Callout intent = "info">
+While `PrivateGPT` supports these file formats, it **might** require additional
 dependencies to be installed in your python's virtual environment.
-For example, if you try to ingest `.epub` files, `privateGPT` might fail to do it, and will instead display an
+For example, if you try to ingest `.epub` files, `PrivateGPT` might fail to do it, and will instead display an
 explanatory error asking you to download the necessary dependencies to install this file format.
+</Callout>
 
-
+<Callout intent = "info">
 **Other file formats might work**, but they will be considered as plain text
-files (in other words, they will be ingested as `.txt` files).
+files (in other words, they will be ingested as `.txt` files).
+</Callout>
+
diff --git a/fern/docs/pages/manual/settings.mdx b/fern/docs/pages/manual/settings.mdx
@@ -3,8 +3,8 @@
 The configuration of your private GPT server is done thanks to `settings` files (more precisely `settings.yaml`).
 These text files are written using the [YAML](https://en.wikipedia.org/wiki/YAML) syntax.
 
-While privateGPT is distributing safe and universal configuration files, you might want to quickly customize your
-privateGPT, and this can be done using the `settings` files.
+While PrivateGPT is distributing safe and universal configuration files, you might want to quickly customize your
+PrivateGPT, and this can be done using the `settings` files.
 
 This project is defining the concept of **profiles** (or configuration profiles).
 This mechanism, using your environment variables, is giving you the ability to easily switch between
@@ -43,7 +43,7 @@ If the above is not working, you might want to try other ways to set an env vari
 
 ---
 
-Once you've set this environment variable to the desired profile, you can simply launch your privateGPT,
+Once you've set this environment variable to the desired profile, you can simply launch your PrivateGPT,
 and it will run using your profile on top of the default configuration.
 
 ## Reference