
Docker Image Permission Issues #651

Closed
ffalkenberg opened this issue Aug 11, 2023 · 12 comments
@ffalkenberg
Contributor

ffalkenberg commented Aug 11, 2023

Hello maintainers,

I've encountered a problem when trying to deploy the h2ogpt Docker image on a Kubernetes/OpenShift cluster. The deployment fails with a Permission denied error. I believe the root cause is the way the Docker image is set up.

Issue Details:

  1. Python is installed in, and run from, the /root directory. This is problematic because many Kubernetes environments forbid running containers as root for security reasons (for illustration, see the securityContext sketch after this list).
  2. The Docker image does not seem to be set up in a manner conducive to professional deployments, especially in environments with strict security policies.
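
For illustration, this is the kind of restricted pod-level setting such environments enforce. A minimal sketch; the UID value is an assumption (OpenShift assigns an arbitrary one per project):

# Hypothetical restricted securityContext; an image that depends on /root
# fails with "Permission denied" under settings like these.
securityContext:
  runAsNonRoot: true
  runAsUser: 1000770000   # arbitrary non-root UID, e.g. as assigned by OpenShift
  runAsGroup: 0           # GID 0, which is why granting group 0 owner permissions helps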

Deployment Configuration:

We are using the following deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: h2ogpt-deployment
  labels:
    app: h2ogpt
spec:
  replicas: 1
  selector:
    matchLabels:
      app: h2ogpt
  template:
    metadata:
      labels:
        app: h2ogpt
    spec:
      containers:
      - name: h2ogpt-container
        image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
        ports:
        - containerPort: 7860
        env:
        - name: HOME
          value: /workspace/
        volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
        - name: save-volume
          mountPath: /workspace/save
        command:
          - tail
          - '-f'
          - /dev/null
      volumes:
      - name: cache-volume
        persistentVolumeClaim:
          claimName: cache-pvc
      - name: save-volume
        persistentVolumeClaim:
          claimName: save-pvc

Temporary Solution:

As a quick fix, we built an additional layer on top of the existing image:

FROM gcr.io/vorvan/h2oai/h2ogpt-runtime:latest

# OpenShift runs containers with an arbitrary UID that belongs to the root
# group (GID 0), so give the root group the same permissions as the owner.
RUN chgrp -R 0 /root && \
    chmod -R g=u /root

RUN chgrp -R 0 /workspace && \
    chmod -R g=u /workspace

However, this is not a sustainable long-term solution.

Request:

I kindly request that the Docker image be restructured to avoid such permission issues, especially when deploying in environments like Kubernetes. Ideally, the image should not rely on the /root directory for operations and should be set up in a way that respects common security practices.

Thank you for your attention to this matter.

Dockerfile: Link
Docker image: Link

@pseudotensor
Collaborator

Thanks, we will take a look at it ASAP

@pseudotensor
Collaborator

Please take a look @achraf-mer and @ChathurindaRanasinghe

@parkeraddison
Contributor

Over at https://github.com/h2oai/h2o-llmstudio/blob/main/Dockerfile they're using an "llmstudio" user in the container. Perhaps it would help to do the same here?

@ffalkenberg
Contributor Author

ffalkenberg commented Aug 12, 2023

I don't think such a user alone would help; this is from the llmstudio container:

  • The user is llmstudio:

    llmstudio@h2o-llmstudio1:/workspace$ whoami
    llmstudio
    
  • Directory permissions under /workspace:

    llmstudio@h2o-llmstudio1:/workspace$ ls -la
    total 328
    drwxr-xr-x 1 root root    598 Aug  2 22:31 .
    drwxr-xr-x 1 root root    240 Aug 12 05:41 ..
    ...
    [Trimmed for brevity]
    ...
    -rw-r--r-- 1 root root   5576 Aug  2 22:23 train_wave.py 
    

@ffalkenberg
Contributor Author

However, a new user + changing dir + changing owner could work. Something like this:

# ... (existing Dockerfile content up to the point of Miniconda installation)

# Create a new user 'llmstudio'
RUN useradd -m llmstudio

# Install Miniconda into /usr/bin/miniconda3
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py310_23.1.0-1-Linux-x86_64.sh && \
    bash ./Miniconda3-py310_23.1.0-1-Linux-x86_64.sh -b -p /usr/bin/miniconda3 && \
    rm ./Miniconda3-py310_23.1.0-1-Linux-x86_64.sh

# Update PATH environment variable
ENV PATH="/usr/bin/miniconda3/bin:${PATH}"

# Change the ownership of the directories to 'llmstudio'
RUN chown -R llmstudio:llmstudio /workspace && \
    chown -R llmstudio:llmstudio /usr/bin/miniconda3

# Set the working directory to /workspace
WORKDIR /workspace

# Switch to 'llmstudio'
USER llmstudio

# ... (rest of your Dockerfile content)

By making these changes, you ensure that the container runs as the user llmstudio and that this user has the necessary permissions to execute the application and its dependencies from the /workspace directory.
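
For a quick check (the image tag here is hypothetical), build and confirm that the container no longer starts as root:

docker build -t h2ogpt-nonroot .
# should print "llmstudio" rather than "root"
docker run --rm h2ogpt-nonroot whoami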

@parkeraddison
Contributor

Nice. I'd recommend setting chmod a+rwX so that we avoid permission errors when the Docker container is run as an existing host user, too!
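
For example, a sketch against the directories used above:

# The capital X sets the execute bit only on directories and on files that
# are already executable, so data files are not made executable.
RUN chmod -R a+rwX /workspace /usr/bin/miniconda3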

@lamw

lamw commented Sep 27, 2023

@ffalkenberg Can you share the manifest files you used to get this working on k8s?

@ffalkenberg
Contributor Author

Hello @lamw,

Certainly! Below is an example of the Kubernetes manifest I used to deploy h2ogpt:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cache-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: save-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: h2ogpt-deployment
  labels:
    app: h2ogpt
spec:
  replicas: 1
  selector:
    matchLabels:
      app: h2ogpt
  template:
    metadata:
      labels:
        app: h2ogpt
    spec:
      containers:
      - name: h2ogpt-container
        image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
        ports:
        - containerPort: 7860
        env:
        # proxy address, adjust or remove as necessary
        - name: HTTPS_PROXY
          value: http://your-proxy-address:8080
        - name: HOME
          value: /workspace/
        volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
        - name: save-volume
          mountPath: /workspace/save
        # This command is used to keep the pod running. Replace with h2ogpt command when needed.
        command:
          - tail
          - '-f'
          - /dev/null
        resources:
          limits:
            cpu: '4'
            memory: 25G
            # The NVIDIA MIG configuration specifies how GPU resources should be allocated
            # This depends on the Kubernetes cluster environment and can be skipped if not relevant.
            nvidia.com/mig-7g.40gb: '1'
          requests:
            cpu: '4'
            memory: 25G
            nvidia.com/mig-7g.40gb: '1'
      volumes:
      - name: cache-volume
        persistentVolumeClaim:
          claimName: cache-pvc
      - name: save-volume
        persistentVolumeClaim:
          claimName: save-pvc
---
kind: Service
apiVersion: v1
metadata:
  name: h2ogpt-service
spec:
  ports:
    - protocol: TCP
      port: 80
      targetPort: 7860
  selector:
    app: h2ogpt
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    haproxy.router.openshift.io/balance: roundrobin
    haproxy.router.openshift.io/disable_cookies: 'true'
  labels:
    app: h2ogpt-route
  name: h2ogpt-route
spec:
  to:
    kind: Service
    name: h2ogpt-service
    weight: 100
  port:
    targetPort: 7860
  wildcardPolicy: None

You might need to adjust it based on your specific requirements and environment.

Additionally, there's now a Helm chart available as part of the repository. It can greatly simplify the deployment process on Kubernetes. I'd highly recommend checking it out if you haven't already.
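
For instance, something like this (a sketch; the exact chart path and name in the repo may differ):

helm install h2ogpt ./helm/h2ogpt-chart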

I hope this helps! Let me know if you have any further questions or need additional details.

@lamw

lamw commented Sep 28, 2023

Thank you @ffalkenberg - I was close in my YAML, but had used args rather than command.

I'm able to get the basics up and running, but what is the strategy for pre-downloading the models and simply referencing them? The same goes for the generated DB files, which I've created manually and want to include. Since there are no k8s examples, it's been a bit challenging to figure out the right arguments/flags ...

@ffalkenberg
Contributor Author

Hello @lamw,

Persistent Volume Claims (PVCs) in Kubernetes ensure that once the models and DB files are downloaded, they persist across pod restarts and evictions; they won't be downloaded again in subsequent deployments.

However, if you want to avoid downloading the models and DB files entirely within the Kubernetes environment, you can use an initContainer. The initContainer can be set up to copy the files from a predefined location (e.g., mounted storage or another container) to the PVC before the main application starts.

Here's a basic approach:

  1. Create the PVC: This PVC will store your models and DB files.
  2. Deploy with initContainer:
    • The initContainer accesses the PVC.
    • It copies the models and DB files from a predefined location to the PVC.
  3. Run the Main Application: With the data already on the PVC, the main container(s) start without needing to fetch or download the models and DB files.

This approach ensures that your models and DB files are made available to your application without any download operations within the Kubernetes environment.

As for where the data gets stored: typically in directories like /workspace/.cache or /workspace/save, depending on your application's configuration.
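
A minimal sketch of the initContainer pattern described above; the seed image name and its /seed path are assumptions, not part of h2ogpt:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: h2ogpt-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: h2ogpt
  template:
    metadata:
      labels:
        app: h2ogpt
    spec:
      # Hypothetical seed container: copies pre-downloaded models/DB files
      # onto the PVC before the main container starts.
      initContainers:
      - name: seed-data
        image: your-registry/h2ogpt-seed:latest
        command: ['sh', '-c', 'cp -r /seed/. /workspace/.cache/']
        volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
      containers:
      - name: h2ogpt-container
        image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
        volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
      volumes:
      - name: cache-volume
        persistentVolumeClaim:
          claimName: cache-pvc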

@umairmalik904

Hello @ffalkenberg
I am trying to run h2ogpt on OpenShift but am getting the error "exec /workspace/generate.py: exec format error". This error usually indicates an architecture mismatch, but I built the image on amd64 and tried to run it on an amd64-based minikube. I can run the image with docker but not with k8s or OpenShift. What might be causing this error?

@pseudotensor
Collaborator

The only arch-specific binary wheels we have are llama_cpp_python, auto-awq, auto-gptq, and exllama. If you are using llama_cpp_python, you can do this kind of thing and rebuild the Docker image, and it should work: #1440 (comment)

But instead of the CUDA line you should use the AMD one from here:

https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#supported-backends

CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
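
As a sketch of such a rebuild layer (the base image is the one used earlier in this thread; whether a plain reinstall of the wheel is sufficient is an assumption):

FROM gcr.io/vorvan/h2oai/h2ogpt-runtime:latest

# Rebuild llama-cpp-python from source with the AMD (hipBLAS) backend enabled
RUN CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install --force-reinstall --no-cache-dir llama-cpp-python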
