Docker Image Permission Issues #651
Comments
Thanks, we will take a look at it ASAP.

Please take a look @achraf-mer and @ChathurindaRanasinghe.

Over at https://github.com/h2oai/h2o-llmstudio/blob/main/Dockerfile they're using an "llmstudio" user in the container. Perhaps it would help to do the same here?
I don't think such a user alone would help; this is from the llmstudio container:
However, user + changing dir + changing owner could work. Something like this:

```dockerfile
# ... (existing Dockerfile content up to the point of the Miniconda installation)

# Create a new user 'llmstudio'
RUN useradd -m llmstudio

# Install Miniconda into /usr/bin/miniconda3
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py310_23.1.0-1-Linux-x86_64.sh && \
    bash ./Miniconda3-py310_23.1.0-1-Linux-x86_64.sh -b -p /usr/bin/miniconda3

# Update the PATH environment variable
ENV PATH="/usr/bin/miniconda3/bin:${PATH}"

# Change the ownership of the directories to 'llmstudio'
RUN chown -R llmstudio:llmstudio /workspace && \
    chown -R llmstudio:llmstudio /usr/bin/miniconda3

# Set the working directory to /workspace
WORKDIR /workspace

# Switch to 'llmstudio'
USER llmstudio

# ... (rest of your Dockerfile content)
```

By making these changes, you ensure that the container runs as the non-root `llmstudio` user.
Nice. I'd also recommend setting `chmod a+rwX` so that we avoid permission errors when the Docker container is run as an existing host user, too!
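In the Dockerfile above, that suggestion could look something like the following. This is a sketch, not a tested change to the project's actual Dockerfile:

```dockerfile
# Sketch: in addition to chown, make the directories world-read/write/traversable
# (a+rwX) so the container also works when run as an arbitrary host UID,
# e.g. with `docker run -u $(id -u):$(id -g)`.
RUN chown -R llmstudio:llmstudio /workspace /usr/bin/miniconda3 && \
    chmod -R a+rwX /workspace /usr/bin/miniconda3
```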
@ffalkenberg Can you share your manifest files on getting this to work on k8s?
Hello @lamw, Certainly! Below is an example of the Kubernetes manifest I used to deploy h2ogpt:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cache-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: save-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: h2ogpt-deployment
  labels:
    app: h2ogpt
spec:
  replicas: 1
  selector:
    matchLabels:
      app: h2ogpt
  template:
    metadata:
      labels:
        app: h2ogpt
    spec:
      containers:
        - name: h2ogpt-container
          image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
          ports:
            - containerPort: 7860
          env:
            # Proxy address; adjust or remove as necessary
            - name: HTTPS_PROXY
              value: http://your-proxy-address:8080
            - name: HOME
              value: /workspace/
          volumeMounts:
            - name: cache-volume
              mountPath: /workspace/.cache
            - name: save-volume
              mountPath: /workspace/save
          # This command keeps the pod running. Replace with the h2ogpt command when needed.
          command:
            - tail
            - '-f'
            - /dev/null
          resources:
            limits:
              cpu: '4'
              memory: 25G
              # The NVIDIA MIG configuration specifies how GPU resources should be allocated.
              # This depends on the Kubernetes cluster environment and can be skipped if not relevant.
              nvidia.com/mig-7g.40gb: '1'
            requests:
              cpu: '4'
              memory: 25G
              nvidia.com/mig-7g.40gb: '1'
      volumes:
        - name: cache-volume
          persistentVolumeClaim:
            claimName: cache-pvc
        - name: save-volume
          persistentVolumeClaim:
            claimName: save-pvc
---
kind: Service
apiVersion: v1
metadata:
  name: h2ogpt-service
spec:
  ports:
    - protocol: TCP
      port: 80
      targetPort: 7860
  selector:
    app: h2ogpt
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    haproxy.router.openshift.io/balance: roundrobin
    haproxy.router.openshift.io/disable_cookies: 'true'
  labels:
    app: h2ogpt-route
  name: h2ogpt-route
spec:
  to:
    kind: Service
    name: h2ogpt-service
    weight: 100
  port:
    targetPort: 7860
  wildcardPolicy: None
```

You might need to adjust it based on your specific requirements and environment. Additionally, there is now a Helm chart available as part of the repository; it can greatly simplify the deployment process on Kubernetes, and I'd highly recommend checking it out if you haven't already. I hope this helps! Let me know if you have any further questions or need additional details.
Thank you @ffalkenberg. I was almost close in my YAML, but used ... I'm able to get the basics up and running, but what is the strategy for pre-downloading the models and simply referencing them? The same goes for generated DB files, which I've created manually and want to include. Since there are no k8s examples, it's been a bit challenging to figure out the right arguments/flags.
Hello @lamw, The nature of Persistent Volume Claims (PVCs) in Kubernetes guarantees that once the models and DB files are downloaded, they persist and survive pod restarts and evictions. This means that once the files are on the PVC, they won't be downloaded again on subsequent deployments or pod restarts. However, if you want to avoid downloading the models and DB files entirely within the Kubernetes environment, you can use an `initContainer` that copies pre-downloaded files into the volume before the main container starts.

This approach ensures that your models and DB files are made available to your application without any download operations within the Kubernetes environment. As for where the data gets stored, it would typically be in directories like `/workspace/.cache` or `/workspace/save`, depending on your application's configuration.
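As a rough sketch of that idea, the pod spec could gain an init container that pre-fills the cache PVC. The helper image name `your-registry/h2ogpt-models` and its `/models` path are illustrative placeholders, not something from this thread:

```yaml
# Hypothetical sketch: an initContainer copies pre-downloaded model/DB files
# from a helper image into the shared PVC before the h2ogpt container starts.
spec:
  initContainers:
    - name: prefill-cache
      image: your-registry/h2ogpt-models:latest  # placeholder image with models baked in
      command: ['sh', '-c', 'cp -rn /models/. /workspace/.cache/']
      volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
  containers:
    - name: h2ogpt-container
      image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
      volumeMounts:
        - name: cache-volume
          mountPath: /workspace/.cache
  volumes:
    - name: cache-volume
      persistentVolumeClaim:
        claimName: cache-pvc
```

Because `cp -rn` skips files that already exist, the copy is effectively a no-op on restarts once the PVC is populated.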
Hello @ffalkenberg |
The only arch-specific binary wheels we have are llama_cpp_python, auto-awq, auto-gptq, and exllama. If you are using llama_cpp_python, you can do this kind of thing and rebuild the Docker image, and it should work: #1440 (comment). But instead of the CUDA line you should use the AMD one from here: https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#supported-backends
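As a sketch of what that Dockerfile change might look like for an AMD GPU, assuming the ROCm/hipBLAS backend (the exact `CMAKE_ARGS` flag comes from the llama-cpp-python README and may differ between releases):

```dockerfile
# Hypothetical sketch: rebuild llama-cpp-python against the AMD (hipBLAS/ROCm)
# backend instead of CUDA. The flag name follows the llama-cpp-python README
# "supported backends" section and may vary between versions.
RUN CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```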
Hello maintainers,

I've encountered a problem when trying to deploy the `h2ogpt` Docker image on a Kubernetes/OpenShift cluster. The deployment fails due to a `Permission denied` error. I believe the root cause is the way the Docker image is set up.

**Issue Details:**
The image relies on the `/root` directory. This is problematic, as many Kubernetes environments restrict running containers as root for security reasons.

**Deployment Configuration:**
We are using the following `deployment.yaml`:

**Temporary Solution:**
As a quick fix, we built an additional layer on top of the existing image:
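A fix-up layer of this kind might look roughly like the following; the base image tag, user name, and paths are illustrative assumptions rather than the exact layer used:

```dockerfile
# Hypothetical sketch of a fix-up layer: create a non-root user and open up
# the directories the app writes to. Tag, user, and paths are assumptions.
FROM gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
RUN useradd -m h2ogpt && \
    mkdir -p /workspace && \
    chown -R h2ogpt:h2ogpt /workspace && \
    chmod -R a+rwX /workspace
WORKDIR /workspace
USER h2ogpt
```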
However, this is not a sustainable long-term solution.
**Request:**
I kindly request that the Docker image be restructured to avoid such permission issues, especially when deploying in environments like Kubernetes. Ideally, the image should not rely on the `/root` directory for operations and should be set up in a way that respects common security practices.

Thank you for your attention to this matter.
Dockerfile: Link
Dockerimage: Link