
Docker Image Permission Issues #651

Closed
ffalkenberg opened this issue Aug 11, 2023 · 12 comments
@ffalkenberg
Contributor

ffalkenberg commented Aug 11, 2023

Hello maintainers,

I've encountered a problem when trying to deploy the h2ogpt Docker image on a Kubernetes/OpenShift cluster. The deployment fails with a Permission denied error. I believe the root cause is the way the Docker image is set up.

Issue Details:

  1. Python is installed in, and run from, the /root directory. This is problematic because many Kubernetes environments forbid running containers as root for security reasons (for illustration, see the securityContext sketch after this list).
  2. The Docker image does not seem to be set up in a manner conducive to professional deployments, especially in environments with strict security policies.
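
For illustration, this is the kind of restricted pod-level setting such environments enforce. A minimal sketch; the UID value is an assumption (OpenShift assigns an arbitrary one per project):

# Hypothetical restricted securityContext; an image that depends on /root
# fails with "Permission denied" under settings like these.
securityContext:
  runAsNonRoot: true
  runAsUser: 1000770000   # arbitrary non-root UID, e.g. as assigned by OpenShift
  runAsGroup: 0           # GID 0, which is why granting group 0 owner permissions helps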

Deployment Configuration:

We are using the following deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: h2ogpt-deployment
  labels:
    app: h2ogpt
spec:
  replicas: 1
  selector:
    matchLabels:
      app: h2ogpt
  template:
    metadata:
      labels:
        app: h2ogpt
    spec:
      containers:
      - name: h2ogpt-container
        image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
        ports:
        - containerPort: 7860
        env:
        - name: HOME
          value: /workspace/
        volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
        - name: save-volume
          mountPath: /workspace/save
        command:
          - tail
          - '-f'
          - /dev/null
      volumes:
      - name: cache-volume
        persistentVolumeClaim:
          claimName: cache-pvc
      - name: save-volume
        persistentVolumeClaim:
          claimName: save-pvc

Temporary Solution:

As a quick fix, we built an additional layer on top of the existing image:

FROM gcr.io/vorvan/h2oai/h2ogpt-runtime:latest

# OpenShift runs containers with an arbitrary UID that belongs to the root
# group (GID 0), so give the root group the same permissions as the owner.
RUN chgrp -R 0 /root && \
    chmod -R g=u /root

RUN chgrp -R 0 /workspace && \
    chmod -R g=u /workspace

However, this is not a sustainable long-term solution.

Request:

I kindly request that the Docker image be restructured to avoid such permission issues, especially when deploying in environments like Kubernetes. Ideally, the image should not rely on the /root directory for operations and should be set up in a way that respects common security practices.

Thank you for your attention to this matter.

Dockerfile: Link
Docker image: Link

@pseudotensor
Collaborator

Thanks, we will take a look at it ASAP

@pseudotensor
Collaborator

Please take a look @achraf-mer and @ChathurindaRanasinghe

@parkeraddison
Contributor

Over at https://github.com/h2oai/h2o-llmstudio/blob/main/Dockerfile they're using an "llmstudio" user in the container. Perhaps it would help to do the same here?

@ffalkenberg
Contributor Author

ffalkenberg commented Aug 12, 2023

I don't think such a user alone would help; this is from the llmstudio container:

  • The user is llmstudio:

    llmstudio@h2o-llmstudio1:/workspace$ whoami
    llmstudio
    
  • Directory permissions under /workspace:

    llmstudio@h2o-llmstudio1:/workspace$ ls -la
    total 328
    drwxr-xr-x 1 root root    598 Aug  2 22:31 .
    drwxr-xr-x 1 root root    240 Aug 12 05:41 ..
    ...
    [Trimmed for brevity]
    ...
    -rw-r--r-- 1 root root   5576 Aug  2 22:23 train_wave.py 
    

@ffalkenberg
Contributor Author

However, a new user + changing dir + changing owner could work. Something like this:

# ... (existing Dockerfile content up to the point of Miniconda installation)

# Create a new user 'llmstudio'
RUN useradd -m llmstudio

# Install Miniconda into /usr/bin/miniconda3
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py310_23.1.0-1-Linux-x86_64.sh && \
    bash ./Miniconda3-py310_23.1.0-1-Linux-x86_64.sh -b -p /usr/bin/miniconda3 && \
    rm ./Miniconda3-py310_23.1.0-1-Linux-x86_64.sh

# Update PATH environment variable
ENV PATH="/usr/bin/miniconda3/bin:${PATH}"

# Change the ownership of the directories to 'llmstudio'
RUN chown -R llmstudio:llmstudio /workspace && \
    chown -R llmstudio:llmstudio /usr/bin/miniconda3

# Set the working directory to /workspace
WORKDIR /workspace

# Switch to 'llmstudio'
USER llmstudio

# ... (rest of your Dockerfile content)

By making these changes, you ensure that the container runs as the user llmstudio and that this user has the necessary permissions to execute the application and its dependencies from the /workspace directory.
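
For a quick check (the image tag here is hypothetical), build and confirm that the container no longer starts as root:

docker build -t h2ogpt-nonroot .
# should print "llmstudio" rather than "root"
docker run --rm h2ogpt-nonroot whoami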

@parkeraddison
Contributor

Nice. I'd recommend setting chmod a+rwX so that we avoid permission errors when the Docker container is run as an existing host user, too!
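
For example, a sketch against the directories used above:

# The capital X sets the execute bit only on directories and on files that
# are already executable, so data files are not made executable.
RUN chmod -R a+rwX /workspace /usr/bin/miniconda3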

@lamw

lamw commented Sep 27, 2023

@ffalkenberg Can you share the manifest files you used to get this working on k8s?

@ffalkenberg
Contributor Author

Hello @lamw,

Certainly! Below is an example of the Kubernetes manifest I used to deploy h2ogpt:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cache-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: save-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: h2ogpt-deployment
  labels:
    app: h2ogpt
spec:
  replicas: 1
  selector:
    matchLabels:
      app: h2ogpt
  template:
    metadata:
      labels:
        app: h2ogpt
    spec:
      containers:
      - name: h2ogpt-container
        image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
        ports:
        - containerPort: 7860
        env:
        # proxy address, adjust or remove as necessary
        - name: HTTPS_PROXY
          value: http://your-proxy-address:8080
        - name: HOME
          value: /workspace/
        volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
        - name: save-volume
          mountPath: /workspace/save
        # This command is used to keep the pod running. Replace with h2ogpt command when needed.
        command:
          - tail
          - '-f'
          - /dev/null
        resources:
          limits:
            cpu: '4'
            memory: 25G
            # The NVIDIA MIG configuration specifies how GPU resources should be allocated
            # This depends on the Kubernetes cluster environment and can be skipped if not relevant.
            nvidia.com/mig-7g.40gb: '1'
          requests:
            cpu: '4'
            memory: 25G
            nvidia.com/mig-7g.40gb: '1'
      volumes:
      - name: cache-volume
        persistentVolumeClaim:
          claimName: cache-pvc
      - name: save-volume
        persistentVolumeClaim:
          claimName: save-pvc
---
kind: Service
apiVersion: v1
metadata:
  name: h2ogpt-service
spec:
  ports:
    - protocol: TCP
      port: 80
      targetPort: 7860
  selector:
    app: h2ogpt
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    haproxy.router.openshift.io/balance: roundrobin
    haproxy.router.openshift.io/disable_cookies: 'true'
  labels:
    app: h2ogpt-route
  name: h2ogpt-route
spec:
  to:
    kind: Service
    name: h2ogpt-service
    weight: 100
  port:
    targetPort: 7860
  wildcardPolicy: None

You might need to adjust it based on your specific requirements and environment.

Additionally, there's now a Helm chart available as part of the repository. It can greatly simplify the deployment process on Kubernetes. I'd highly recommend checking it out if you haven't already.
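
For instance, something like this (a sketch; the exact chart path and name in the repo may differ):

helm install h2ogpt ./helm/h2ogpt-chart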

I hope this helps! Let me know if you have any further questions or need additional details.

@lamw

lamw commented Sep 28, 2023

Thank you @ffalkenberg - I was close in my YAML, but had used args rather than command.

I'm able to get the basics up and running, but what is the strategy for pre-downloading the models and simply referencing them? The same goes for the generated DB files, which I've created manually and want to include. Since there are no k8s examples, it's been a bit challenging to figure out the right arguments/flags ...

@ffalkenberg
Contributor Author

Hello @lamw,

Persistent Volume Claims (PVCs) in Kubernetes ensure that once the models and DB files are downloaded, they persist across pod restarts and evictions; they won't be downloaded again in subsequent deployments.

However, if you want to avoid downloading the models and DB files entirely within the Kubernetes environment, you can use an initContainer. The initContainer can be set up to copy the files from a predefined location (e.g., mounted storage or another container) to the PVC before the main application starts.

Here's a basic approach:

  1. Create the PVC: This PVC will store your models and DB files.
  2. Deploy with initContainer:
    • The initContainer accesses the PVC.
    • It copies the models and DB files from a predefined location to the PVC.
  3. Run the Main Application: With the data already on the PVC, the main container(s) start without needing to fetch or download the models and DB files.

This approach ensures that your models and DB files are made available to your application without any download operations within the Kubernetes environment.

As for where the data gets stored: typically in directories like /workspace/.cache or /workspace/save, depending on your application's configuration.
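
A minimal sketch of the initContainer pattern described above; the seed image name and its /seed path are assumptions, not part of h2ogpt:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: h2ogpt-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: h2ogpt
  template:
    metadata:
      labels:
        app: h2ogpt
    spec:
      # Hypothetical seed container: copies pre-downloaded models/DB files
      # onto the PVC before the main container starts.
      initContainers:
      - name: seed-data
        image: your-registry/h2ogpt-seed:latest
        command: ['sh', '-c', 'cp -r /seed/. /workspace/.cache/']
        volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
      containers:
      - name: h2ogpt-container
        image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
        volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
      volumes:
      - name: cache-volume
        persistentVolumeClaim:
          claimName: cache-pvc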

@umairmalik904

Hello @ffalkenberg
I am trying to run h2ogpt on OpenShift but am getting the error "exec /workspace/generate.py: exec format error". This error usually indicates an architecture mismatch, but I built the image on amd64 and tried to run it on an amd64-based minikube. I can run the image with docker but not with k8s or OpenShift. What might be causing this error?

@pseudotensor
Collaborator

The only arch-specific binary wheels we have are llama_cpp_python, auto-awq, auto-gptq, and exllama. If you are using llama_cpp_python, you can do this kind of thing and rebuild the Docker image, and it should work: #1440 (comment)

But instead of the CUDA line you should use the AMD one from here:

https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#supported-backends

CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
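
As a sketch of such a rebuild layer (the base image is the one used earlier in this thread; whether a plain reinstall of the wheel is sufficient is an assumption):

FROM gcr.io/vorvan/h2oai/h2ogpt-runtime:latest

# Rebuild llama-cpp-python from source with the AMD (hipBLAS) backend enabled
RUN CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install --force-reinstall --no-cache-dir llama-cpp-python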
