
Pydantic validation error with ['default', 'docker'] #1756

Open
mrepetto-certx opened this issue Mar 18, 2024 · 15 comments

Comments

@mrepetto-certx (Contributor)

I tried to run docker compose run --rm --entrypoint="bash -c '[ -f scripts/setup ] && scripts/setup'" private-gpt

with a compose file somewhat similar to the one in the repo:

version: '3'
services:
  private-gpt:
    image: marcorepettocertx/privategpt:0.4.0
    volumes:
      - ./private-gpt/local_data/:/home/worker/app/local_data
      - ./private-gpt/models/:/home/worker/app/models
    ports:
      - 8001:8080
    environment:
      PORT: 8080
      PGPT_PROFILES: docker
      PGPT_MODE: local

But I got the following error in return:

10:25:57.921 [INFO    ] private_gpt.settings.settings_loader - Starting application with profiles=['default', 'docker']
Traceback (most recent call last):
  File "/home/worker/app/scripts/setup", line 8, in <module>
    from private_gpt.paths import models_path, models_cache_path
  File "/home/worker/app/private_gpt/paths.py", line 4, in <module>
    from private_gpt.settings.settings import settings
  File "/home/worker/app/private_gpt/settings/settings.py", line 392, in <module>
    unsafe_typed_settings = Settings(**unsafe_settings)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/worker/app/.venv/lib/python3.11/site-packages/pydantic/main.py", line 164, in __init__
    __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
pydantic_core._pydantic_core.ValidationError: 2 validation errors for Settings
llm.mode
  Input should be 'llamacpp', 'openai', 'openailike', 'azopenai', 'sagemaker', 'mock' or 'ollama' [type=literal_error, input_value='local', input_type=str]
    For further information visit https://errors.pydantic.dev/2.5/v/literal_error
embedding.mode
  Input should be 'huggingface', 'openai', 'azopenai', 'sagemaker', 'ollama' or 'mock' [type=literal_error, input_value='local', input_type=str]
    For further information visit https://errors.pydantic.dev/2.5/v/literal_error
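
For context, the failure is a plain pydantic Literal mismatch: the settings model only accepts the mode strings listed in the error, so any profile that still sets local is rejected at startup. A minimal sketch of the same class of error (not the project's actual model, just an illustration):

from typing import Literal
from pydantic import BaseModel, ValidationError

class LLMSettings(BaseModel):
    # Hypothetical stand-in for the llm section of private_gpt's Settings
    mode: Literal["llamacpp", "openai", "openailike", "azopenai", "sagemaker", "mock", "ollama"]

try:
    LLMSettings(mode="local")  # the value the old profile still passes
except ValidationError as e:
    print(e)  # literal_error, exactly like the traceback above
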
@mrepetto-certx (Contributor Author) commented Mar 18, 2024

Repeating the same steps by simply cloning the repo and following #1445 causes the same problem, plus the following:

There was a problem when trying to write in your cache folder (/nonexistent/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.

@Alexander1177

@mrepetto-certx, can you be more specific please? I have the same issue

@mrepetto-certx (Contributor Author) commented Mar 18, 2024

> @mrepetto-certx, can you be more specific please? I have the same issue

Well. To reproduce:

git clone https://github.com/imartinez/privateGPT
cd privateGPT
docker compose build
docker compose run --rm --entrypoint="bash -c '[ -f scripts/setup ] && scripts/setup'" private-gpt

I do not know how to be more specific than that.

@mrepetto-certx (Contributor Author)

I think local should be substituted with ollama (see 45f0571).

#1445 does not take this last change into account.

@mrepetto-certx (Contributor Author) commented Mar 18, 2024

Indeed, the following compose file:

services:
  private-gpt:
    build:
      dockerfile: Dockerfile.local
    volumes:
      - ./local_data/:/home/worker/app/local_data
      - ./models/:/home/worker/app/models
    ports:
      - 8001:8080
    environment:
      PORT: 8080
      PGPT_PROFILES: docker
      PGPT_MODE: llamacpp

mostly works, but it still requires an embedding mode, which is different from llamacpp.
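
For reference, a minimal sketch of what a profile would need after that rename; the mode names come from the validation error above, and pairing llamacpp with huggingface for local embeddings is my assumption:

server:
  env_name: ${APP_ENV:docker}

llm:
  mode: llamacpp

embedding:
  mode: huggingface   # 'local' is gone here too; the embedding mode has to be set separately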

@clipod commented Mar 22, 2024

I am still getting the same error even after changing to llamacpp. Are there any prerequisites before running docker-compose build, such as setting environment variables, downloading any modules, etc.?

@mrepetto-certx (Contributor Author) commented Mar 23, 2024 via email

@makeSmartio commented Mar 29, 2024

I think I can help a little. If you are trying to use Ollama, you will need to get it installed and running first. Then, in settings.yaml, change the api_base from localhost to host.docker.internal here:

ollama:
  llm_model: llama2
  embedding_model: nomic-embed-text
  api_base: http://host.docker.internal:11434

In docker-compose.yaml, change dockerfile: Dockerfile.local to dockerfile: Dockerfile.external.

In Dockerfile.external, add these extras: RUN poetry install --extras "ui vector-stores-qdrant llms-ollama embeddings-ollama"

Then run:

docker compose build
docker compose up

You will probably need to run ollama pull nomic-embed-text if you get an error about not having nomic-embed-text.

I hope this helps. I was able to finally get it running on my M2 MacBook Air.

@yeetesh commented Mar 29, 2024

I made these changes:

  1. RUN poetry install --extras "ui embeddings-huggingface llms-llama-cpp vector-stores-qdrant llms-ollama embeddings-ollama" in my Dockerfile.local

  2. set PGPT_MODE: ollama in my docker-compose.

  3. downloaded ollama docker image and ran it separately

  4. ran ollama pull nomic-embed-text in my ollama docker container.

I am still facing this issue:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xffff4cd571d0>: Failed to establish a new connection: [Errno 111] Connection refused'))

My Ollama server is running; however, when I GET http://localhost:11434/api/embeddings, I get a 404. Any ideas on this? @makeSmartio

@makeSmartio commented Mar 29, 2024

What about step 1, changing localhost to api_base: http://host.docker.internal:11434/ in settings.yaml?
The problem with localhost is that the docker container resolves it to itself; host.docker.internal is the host's address from the container's point of view.

I also get a 404 for http://localhost:11434/api/embeddings, so no issue there.
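
One caveat: host.docker.internal is not defined by default on Linux Docker Engine (it is on Docker Desktop for Mac/Windows). If that is your setup, a sketch of the usual workaround is to map it to the host gateway in docker-compose.yaml:

services:
  private-gpt:
    extra_hosts:
      - "host.docker.internal:host-gateway"   # make the hostname resolve to the host on Linux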

@mrepetto-certx (Contributor Author)

What is your take on decoupling it so that ollama is used as a microservice? Something like:

services:
  private-gpt:
    build:
      dockerfile: Dockerfile.local
    volumes:
      - ./local_data/:/home/worker/app/local_data
    ports:
      - 8001:8080
    environment:
      PORT: 8080
      PGPT_PROFILES: docker
      PGPT_MODE: ollama
  ollama:
    image: ollama/ollama
    command: ollama pull nomic-embed-text

With the settings-ollama.yaml:

server:
  env_name: ${APP_ENV:ollama}

llm:
  mode: ollama
  max_new_tokens: 512
  context_window: 3900
  temperature: 0.1     #The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual. (Default: 0.1)

embedding:
  mode: ollama

ollama:
  llm_model: mistral
  embedding_model: nomic-embed-text
  api_base: http://ollama:11434
  tfs_z: 1.0              # Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting.
  top_k: 40               # Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
  top_p: 0.9              # Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
  repeat_last_n: 64       # Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
  repeat_penalty: 1.2     # Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
  request_timeout: 120.0  # Time elapsed until ollama times out the request. Default is 120s. Format is float.

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant
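
One caveat with my own sketch above (my reading, not yet tested): command: ollama pull nomic-embed-text overrides the image's default serve command, so that container would pull the model and then stop instead of serving the API. A variant that keeps Ollama as a long-running service and has private-gpt wait for it could look roughly like:

services:
  private-gpt:
    # ... as above ...
    depends_on:
      - ollama   # start the Ollama service before private-gpt
  ollama:
    image: ollama/ollama:latest   # the default command keeps the API server running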

@makeSmartio commented Mar 30, 2024

@mrepetto-certx Makes sense to me. Even if people already have Ollama installed this would just be another instance. You'd still need to tackle the addressing problem, though - it would either need to be http://host.docker.internal:11434/ for host installations or http://ollama:11434/ for Dockerized.

Edit: It would also take quite a bit of testing to add the LLM and embedding models for the dockerized method.

@mrepetto-certx (Contributor Author)

Thanks @makeSmartio. I'm experimenting now with the caveat of having:

  ollama:
    image: ollama/ollama:latest
    volumes:
      - ./ollama:/root/.ollama

This avoids having to pull the models again on every docker compose up. I'll keep you posted.
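
With ./ollama mounted, the one-time pull can then be done against the running service, roughly like this (the model names are just whatever the profile references):

docker compose up -d ollama
docker compose exec ollama ollama pull mistral
docker compose exec ollama ollama pull nomic-embed-text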

@mrepetto-certx (Contributor Author) commented Mar 30, 2024

No luck; I keep getting:

[WARNING ] llama_index.core.chat_engine.types - Encountered exception writing response to history: [Errno 99] Cannot assign requested address

What is puzzling is that running:

from llama_index.llms.ollama import Ollama
model = Ollama(model="mistral", base_url="http://ollama:11434", request_timeout=120.0)
resp = model.complete("Who is Paul Graham?")
print(resp)

inside the container works.
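
If it helps to narrow it down: my guess is that some client inside the app is still pointing at localhost rather than the ollama service (the earlier traceback in this thread was on /api/embeddings). A quick probe from inside the container, using the service name and model assumed in the settings above:

import requests

# Hit the Ollama embeddings endpoint directly; if this succeeds while the app
# still fails with Errno 99, the failing client is likely not using api_base.
resp = requests.post(
    "http://ollama:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "hello world"},
    timeout=30,
)
print(resp.status_code)
print(list(resp.json().keys()))  # expect an "embedding" key on success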

@mrepetto-certx (Contributor Author)

OK, I managed to make it work and pushed pull request #1812. The only thing to remember is to run ollama pull the first time to load the models; after that they will stay in the host environment, similar to the previous behavior.
