fix: make tokenizer optional and include a troubleshooting doc #1998

Merged · 5 commits · Jul 17, 2024
2 changes: 2 additions & 0 deletions fern/docs.yml
@@ -41,6 +41,8 @@ navigation:
path: ./docs/pages/installation/concepts.mdx
- page: Installation
path: ./docs/pages/installation/installation.mdx
- page: Troubleshooting
path: ./docs/pages/installation/troubleshooting.mdx
# Manual of privateGPT: how to use it and configure it
- tab: manual
layout:
2 changes: 2 additions & 0 deletions fern/docs/pages/installation/installation.mdx
@@ -81,6 +81,8 @@ set PGPT_PROFILES=ollama
make run
```

Refer to the [troubleshooting](./troubleshooting) section for specific issues you might encounter.

### Local, Ollama-powered setup - RECOMMENDED

**The easiest way to run PrivateGPT fully locally** is to depend on Ollama for the LLM. Ollama makes local LLMs and embeddings very easy to install and use, abstracting away the complexity of GPU support. It's the recommended setup for local development.
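
As a rough orientation before the detailed steps, the flow looks roughly like this (a minimal sketch; the model names are assumptions about the default Ollama profile, so check your Ollama settings profile for the ones actually used):

```bash
# Minimal sketch of the recommended local flow (model names are assumptions)
ollama pull mistral             # LLM used by the Ollama profile
ollama pull nomic-embed-text    # embedding model used by the Ollama profile
PGPT_PROFILES=ollama make run
```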
44 changes: 44 additions & 0 deletions fern/docs/pages/installation/troubleshooting.mdx
@@ -0,0 +1,44 @@
# Downloading Gated and Private Models

Many models are gated or private, requiring special access to use them. Follow these steps to gain access and set up your environment for using these models.

## Accessing Gated Models

1. **Request Access:**
Follow the instructions provided [here](https://huggingface.co/docs/hub/en/models-gated) to request access to the gated model.

2. **Generate a Token:**
Once you have access, generate a token by following the instructions [here](https://huggingface.co/docs/hub/en/security-tokens).

3. **Set the Token:**
Add the generated token to your `settings.yaml` file:

```yaml
huggingface:
access_token: <your-token>
```

Alternatively, set the `HF_TOKEN` environment variable:

```bash
export HF_TOKEN=<your-token>
```

# Tokenizer Setup

PrivateGPT uses the `AutoTokenizer` class from Hugging Face's `transformers` library to tokenize input text accurately. It connects to the Hugging Face Hub to download the appropriate tokenizer for the specified model.
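
For illustration, the download PrivateGPT performs is roughly equivalent to the following (a minimal sketch, not the project's exact code; the model name is just an example):

```python
from transformers import AutoTokenizer

# Downloads (or loads from cache) the tokenizer files from the Hugging Face Hub.
# Pass a token when the model repository is gated.
tokenizer = AutoTokenizer.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    token="<your-token>",  # or rely on the HF_TOKEN environment variable
)
print(tokenizer.encode("Hello, PrivateGPT!"))
```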

## Configuring the Tokenizer

1. **Specify the Model:**
In your `settings.yaml` file, specify the model you want to use:

```yaml
llm:
tokenizer: mistralai/Mistral-7B-Instruct-v0.2
```

2. **Set Access Token for Gated Models:**
If you are using a gated model, ensure the `access_token` is set as mentioned in the previous section.

This configuration ensures that PrivateGPT can download and use the correct tokenizer for the model you are working with.
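
Putting the two pieces together, a complete configuration for a gated tokenizer might look like this (a sketch combining the options shown above):

```yaml
llm:
  tokenizer: mistralai/Mistral-7B-Instruct-v0.2

huggingface:
  access_token: ${HF_TOKEN:}  # resolved from the HF_TOKEN environment variable
```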
8 changes: 4 additions & 4 deletions private_gpt/components/llm/llm_component.py
@@ -35,10 +35,10 @@ def __init__(self, settings: Settings) -> None:
)
except Exception as e:
logger.warning(
"Failed to download tokenizer %s. Falling back to "
"default tokenizer.",
settings.llm.tokenizer,
e,
f"Failed to download tokenizer {settings.llm.tokenizer}: {e!s}"
f"Please follow the instructions in the documentation to download it if needed: "
f"https://docs.privategpt.dev/installation/getting-started/troubleshooting#tokenizer-setup."
f"Falling back to default tokenizer."
)

logger.info("Initializing the LLM in mode=%s", llm_mode)
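
For context, this warning sits inside a try/except around the tokenizer download; a simplified, self-contained sketch of the pattern (function and variable names are illustrative, not PrivateGPT's actual ones) is:

```python
from transformers import AutoTokenizer


def load_optional_tokenizer(tokenizer_name: str | None, access_token: str | None):
    """Try to download the configured tokenizer; return None to keep the default one."""
    if not tokenizer_name:
        return None  # tokenizer is optional: nothing configured, keep the default
    try:
        return AutoTokenizer.from_pretrained(tokenizer_name, token=access_token)
    except Exception as e:  # network errors, missing access, unknown repo, ...
        print(
            f"Failed to download tokenizer {tokenizer_name}: {e}. "
            "Falling back to default tokenizer."
        )
        return None
```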
16 changes: 10 additions & 6 deletions scripts/setup
@@ -24,6 +24,7 @@ snapshot_download(
repo_id=settings().huggingface.embedding_hf_model_name,
cache_dir=models_cache_path,
local_dir=embedding_path,
token=settings().huggingface.access_token,
)
print("Embedding model downloaded!")

@@ -35,15 +36,18 @@ hf_hub_download(
cache_dir=models_cache_path,
local_dir=models_path,
resume_download=resume_download,
token=settings().huggingface.access_token,
)
print("LLM model downloaded!")

# Download Tokenizer
print(f"Downloading tokenizer {settings().llm.tokenizer}")
AutoTokenizer.from_pretrained(
pretrained_model_name_or_path=settings().llm.tokenizer,
cache_dir=models_cache_path,
)
print("Tokenizer downloaded!")
if settings().llm.tokenizer:
print(f"Downloading tokenizer {settings().llm.tokenizer}")
AutoTokenizer.from_pretrained(
pretrained_model_name_or_path=settings().llm.tokenizer,
cache_dir=models_cache_path,
token=settings().huggingface.access_token,
)
print("Tokenizer downloaded!")

print("Setup done")
5 changes: 3 additions & 2 deletions settings.yaml
@@ -40,7 +40,8 @@ llm:
# Should be matching the selected model
max_new_tokens: 512
context_window: 3900
tokenizer: mistralai/Mistral-7B-Instruct-v0.2
# Select your tokenizer. Llama-index tokenizer is the default.
# tokenizer: mistralai/Mistral-7B-Instruct-v0.2
temperature: 0.1 # The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual. (Default: 0.1)

rag:
@@ -76,7 +77,7 @@ embedding:

huggingface:
embedding_hf_model_name: BAAI/bge-small-en-v1.5
access_token: ${HUGGINGFACE_TOKEN:}
access_token: ${HF_TOKEN:}

vectorstore:
database: qdrant