
JIT GitHub docker runner #2

Merged
merged 20 commits · Apr 3, 2024
Conversation

phymbert
Contributor

@phymbert phymbert commented Mar 24, 2024

Motivation

In the context of:

A balanced approach between raw ggml-ci and GitHub self-hosted runners.

Approach

Periodically, a Python script polls for jobs waiting for a runner and starts an ephemeral Just-In-Time (JIT) GitHub runner within a Docker container using the NVIDIA runtime.
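The polling loop can be sketched roughly as follows. This is a minimal illustration, not the actual manager.py from this PR; `spawn_jit_runner` is a hypothetical placeholder for the `docker run` of an ephemeral runner, and the `"status" == "queued"` check mirrors the field the GitHub REST API reports for workflow runs waiting on a runner:

```python
# Minimal sketch of the manager's polling loop (illustration only, not
# the real manager.py from this PR).

def queued_runs(workflow_runs):
    """Select workflow runs that are still waiting for a runner."""
    return [run for run in workflow_runs if run.get("status") == "queued"]

def spawn_jit_runner(run):
    # Hypothetical placeholder: the real manager would start an ephemeral
    # JIT GitHub runner container here, e.g. something along the lines of
    #   docker run --runtime=nvidia --rm ggml-github-runner ...
    print(f"starting JIT runner for run id={run['id']}")

def poll_once(workflow_runs):
    """One polling iteration: start a JIT runner per queued run."""
    started = []
    for run in queued_runs(workflow_runs):
        spawn_jit_runner(run)
        started.append(run["id"])
    return started
```

Each iteration corresponds to one "fetching workflows ... workflows iteration done" pair in the manager logs shown below.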

Test

Tested here: https://github.com/phymbert/llama.cpp/actions/runs/8417731437

How to install a new runner manager:

git clone https://github.com/ggml-org/ci
./install-cuda.sh
./install-docker.sh
./start-github-runner-manager.sh REPO TOKEN RUNNER_LABEL

Example:

./start-github-runner-manager.sh phymbert/llama.cpp XYZ Standard_NC4as_T4_v3

ggml-ci: downloading models...
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q8_0.gguf
ggml-github-runners-manager
ggml-ci: starting github runner manager on repo=phymbert/llama.cpp label=Standard_NC4as_T4_v3...
0ff70e42bae2cc57696a36828c17ae6ffdbac9d8badd78d5addb3d68dc1c78d6
ggml-ci: github runner manager started.
ggml-ci: github runner manager logs:
         CTRL+C to stop logs pulling
ggml-ci: fetching workflows of phymbert/llama.cpp ...
ggml-ci: workflows iteration done.
ggml-ci: fetching workflows of phymbert/llama.cpp ...
ggml-ci: workflows iteration done.
ggml-ci: fetching workflows of phymbert/llama.cpp ...
ggml-ci:     ggml-runner-90932568-23048096941-workflow_dispatch-1711360334 triggered for workflow_name=Benchmark
ggml-ci:     ggml-runner-90932568-23048096941-workflow_dispatch-1711360334 running Github job runner id=165 os=linux labels=['self-hosted', 'X64', 'Standard_NC4as_T4_v3', 'linux']
ggml-ci:     ggml-runner-90932568-23048096941-workflow_dispatch-1711360334 done
ggml-ci: workflows iteration done.
ggml-ci: fetching workflows of phymbert/llama.cpp ...
ggml-ci: workflows iteration done.
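The runner names in the log above appear to combine a repository id, a workflow run id, the triggering event, and a Unix timestamp (and one of the commits below mentions lowercasing the container/runner name, since Docker container names must be lowercase). A hypothetical helper that reproduces that shape:

```python
import time

def runner_name(repo_id, run_id, event, ts=None):
    """Build an ephemeral runner name shaped like the ones in the log.

    The pattern and parameter names are assumptions inferred from the
    log output, not the actual implementation. Lowercased because
    Docker container names must be lowercase.
    """
    ts = int(time.time()) if ts is None else ts
    return f"ggml-runner-{repo_id}-{run_id}-{event}-{ts}".lower()
```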

@phymbert
Contributor Author

@ggerganov @ngxson FYI

 - properly create the user
 - add autoremove and tmpfs
 - add netcat for the workflow to check if the server starts
…llation in the image, lowercase container/runner name
- use a tmpfs for the runner workdir
- add security_opt
- mount the models folder
ci: github-runner-manager: fix tmpfs
ci: github-runner-manager: fix tmpfs exec right, nice logs
# Conflicts:
#	install-docker.sh
@phymbert phymbert requested review from ngxson and ggerganov and removed request for ngxson March 25, 2024 09:37
@phymbert phymbert requested a review from ngxson March 25, 2024 09:40
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Contributor

@ngxson ngxson left a comment


LGTM, thanks for taking time for this!

(Btw the "Resolve" button doesn't show on my side. Maybe I don't have permission. You can "resolve" my comments above if you want.)

@ggerganov
Member

./install-docker.sh requires:

apt install uidmap

@ggerganov
Member

The ./start-github-runner-manager.sh script seems to exit without an error when trying to download the models:

Building github runner manager image...
[+] Building 0.3s (12/12) FINISHED                                                                                                                                                      docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                               0.0s
 => => transferring dockerfile: 653B                                                                                                                                                               0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                                                                                                   0.0s
 => [internal] load .dockerignore                                                                                                                                                                  0.0s
 => => transferring context: 59B                                                                                                                                                                   0.0s
 => [1/7] FROM docker.io/library/ubuntu:latest                                                                                                                                                     0.0s
 => [internal] load build context                                                                                                                                                                  0.0s
 => => transferring context: 99B                                                                                                                                                                   0.0s
 => CACHED [2/7] RUN set -eux ;     apt update ;     apt -y upgrade ;     apt -y install             git             openssh-client             python3             python3-pip             curl   0.0s
 => CACHED [3/7] WORKDIR /ggml-ci                                                                                                                                                                  0.0s
 => CACHED [4/7] ADD requirements.txt ./                                                                                                                                                           0.0s
 => CACHED [5/7] RUN set -eux ;     pip install -r requirements.txt ;                                                                                                                              0.0s
 => CACHED [6/7] ADD manager.py ./                                                                                                                                                                 0.0s
 => CACHED [7/7] ADD entrypoint.sh /                                                                                                                                                               0.0s
 => exporting to image                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                            0.0s
 => => writing image sha256:56e4ae8104fb3bb75c0325de003e2db4d37d84122f429deccf02f9c92967b1e5                                                                                                       0.0s
 => => naming to docker.io/library/ggml-github-runners-manager                                                                                                                                     0.0s
Building github runner image...
[+] Building 3.2s (14/14) FINISHED                                                                                                                                                      docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                               0.0s
 => => transferring dockerfile: 1.78kB                                                                                                                                                             0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                                                                                                   0.0s
 => [internal] load .dockerignore                                                                                                                                                                  0.0s
 => => transferring context: 59B                                                                                                                                                                   0.0s
 => [1/9] FROM docker.io/library/ubuntu:latest                                                                                                                                                     0.0s
 => [internal] load build context                                                                                                                                                                  0.0s
 => => transferring context: 35B                                                                                                                                                                   0.0s
 => CACHED [2/9] RUN set -eux ;     apt update ;     apt -y upgrade ;     apt -y install             libicu-dev             curl             wget             build-essential             cmake    0.0s
 => CACHED [3/9] RUN set -eux ;     wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb ;     dpkg -i cuda-keyring_1.1-1_all.deb ;     apt  0.0s
 => CACHED [4/9] RUN set -eux ;     mkdir /ggml-ci /tmp/github-runner ;     chown 1000:1000 /ggml-ci /tmp/github-runner ;                                                                          0.0s
 => CACHED [5/9] WORKDIR /ggml-ci                                                                                                                                                                  0.0s
 => CACHED [6/9] RUN set -eux ;     groupadd --gid 1000 ggml ;     useradd --uid 1000 --gid ggml --shell /bin/bash --create-home ggml ;                                                            0.0s
 => CACHED [7/9] RUN set -eux ;     curl -o actions-runner-linux-x64.tar.gz -L https://github.com/actions/runner/releases/download/v2.314.1/actions-runner-linux-x64-2.314.1.tar.gz ;      echo "  0.0s
 => CACHED [8/9] ADD entrypoint.sh /entrypoint.sh                                                                                                                                                  0.0s
 => CACHED [9/9] WORKDIR /github-runner                                                                                                                                                            0.0s
 => exporting to image                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                            0.0s
 => => writing image sha256:8e405871f8db791b252bfebe69b0e1c1fc33d7489162217c179e1ae376e839a7                                                                                                       0.0s
 => => naming to docker.io/library/ggml-github-runner                                                                                                                                              0.0s
Building models downloader...
[+] Building 1.1s (10/10) FINISHED                                                                                                                                                      docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                               0.0s
 => => transferring dockerfile: 729B                                                                                                                                                               0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:12.2.2-devel-ubuntu22.04                                                                                                                    0.3s
 => [internal] load .dockerignore                                                                                                                                                                  0.1s
 => => transferring context: 59B                                                                                                                                                                   0.0s
 => [1/5] FROM docker.io/nvidia/cuda:12.2.2-devel-ubuntu22.04@sha256:ae8a022c02aec945c4f8c52f65deaf535de7abb58e840350d19391ec683f4980                                                              0.0s
 => [internal] load build context                                                                                                                                                                  0.0s
 => => transferring context: 35B                                                                                                                                                                   0.0s
 => CACHED [2/5] RUN set -eux ;     apt update ;     apt -y install             git             cmake             libcurl4-openssl-dev ;                                                           0.0s
 => CACHED [3/5] WORKDIR /llama.cpp                                                                                                                                                                0.0s
 => CACHED [4/5] RUN set -eux;     git clone https://github.com/ggerganov/llama.cpp.git . ;     mkdir build ;     cd build ;     cmake ..       -DLLAMA_CURL=ON       -DLLAMA_CUBLAS=ON       -DC  0.0s
 => CACHED [5/5] ADD entrypoint.sh /entrypoint.sh                                                                                                                                                  0.0s
 => exporting to image                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                            0.0s
 => => writing image sha256:686088d20a073e2300ada195309735607552510e1172e96cc82cf5579b748d53                                                                                                       0.0s
 => => naming to docker.io/library/llama.cpp-model-downloader                                                                                                                                      0.0s
ggml-ci: downloading models...
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q8_0.gguf
ggml@ggml-5-x86-cuda-t4:~/ci$ 

@phymbert
Contributor Author

Can you please share the docker logs or remove the stdout redirection? Maybe it's a mount issue.

@ggerganov
Member

It does not have permission:

cat download_model.ggml-model-q4_0.gguf.log 
Wed Mar 27 14:52:34 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000001:00:00.0 Off |                    0 |
| N/A   36C    P0             27W /   70W |      95MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
HF_REPO ggml-org/models
HF_FILE phi-2/ggml-model-q4_0.gguf
Failed to open logfile 'main.log' with error 'Permission denied'
[1711551154] Log start
[1711551154] Cmd: ./build/bin/main --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf --model /models/phi-2/ggml-model-q4_0.gguf --random-prompt --n-predict 1
[1711551154] main: build = 2551 (e5b89a44)
[1711551154] main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
[1711551154] main: seed  = 1711551154
[1711551154] main: llama backend init
[1711551154] main: load the model and apply lora adapter, if any
llama_download_file: error opening local file for writing: /models/phi-2/ggml-model-q4_0.gguf
llama_init_from_gpt_params: error: failed to load model '/models/phi-2/ggml-model-q4_0.gguf'
[1711551154] main: error: unable to load model

What user is docker using? /mnt is owned by the ggml user:

$ ls -l /mnt
total 20
drwx------ 2 root root 16384 Mar 27 14:00 lost+found
drwxrwxr-x 3 ggml ggml  4096 Mar 27 14:52 models

@phymbert
Contributor Author

It runs with user 1000:1000, is this the ggml uid:gid?
Is docker rootless?

@ggerganov
Member

ggerganov commented Mar 27, 2024

$ id
uid=1000(ggml) gid=1000(ggml)

is docker rootless ?

no idea
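One way to tell is `docker info`, whose SecurityOptions list contains a `name=rootless` entry in rootless mode (the `docker:rootless` tag in the build logs above is another giveaway). A small sketch of the check, with the `docker info` invocation left as a comment so the parsing stays a plain helper:

```python
def is_rootless(security_options):
    """Return True if docker's SecurityOptions indicate rootless mode.

    security_options is the list reported by:
        docker info --format '{{json .SecurityOptions}}'
    In rootless mode it contains an entry like "name=rootless".
    """
    return any("rootless" in opt for opt in security_options)
```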

@ngxson
Contributor

ngxson commented Mar 27, 2024

Seems like it's a mismatch between uid/gid inside and outside of the container. Note that docker may use uid mapping, which maps user 1000 inside the container to something like 1001000 on the host. I'm installing docker rootless on my side to test whether that's the case.

@phymbert
Contributor Author

Please pull, I have added some debug commands for permission and docker service

@ngxson
Contributor

ngxson commented Mar 27, 2024

@phymbert I understand the problem now:

  1. uid 1000 is in fact mapped to 100999 on the host machine. From the runner's perspective (inside the container), /models is owned by root, not 1000. There are two solutions:
  • chmod -R 777 /mnt/models ==> works, but from a security engineer's perspective this is not acceptable
  • chown -R 100999:100999 /mnt/models ==> should work, but I'm not sure whether the number 100999 will change in the future (seems like it won't) (See below)
  2. Inside the container, /models is currently mounted as read-only. I thought we would only read from it and never write, but it seems our test wants to write to that folder. Can we maybe do a cp /models/* /tmp/models/* and only work with /tmp? This would also solve the first problem I mentioned above.
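The 1000 → 100999 mapping comes from rootless docker's use of subordinate uids: container uid 0 maps to the unprivileged host user, and container uids 1..n map into the range allocated in /etc/subuid (typically starting at 100000), offset by one. A small helper sketching that arithmetic; the start value 100000 is an assumption, so check the host's /etc/subuid entry:

```python
def host_uid(container_uid, owner_uid=1000, subuid_start=100000):
    """Map a container uid to the host uid under rootless docker.

    container uid 0        -> the unprivileged host user (owner_uid)
    container uid k (k>=1) -> subuid_start + k - 1
    subuid_start should match the host's /etc/subuid entry; 100000 is
    only the common default.
    """
    if container_uid == 0:
        return owner_uid
    return subuid_start + container_uid - 1
```

With the defaults, container uid 1000 lands on host uid 100999, matching the ownership mismatch observed above.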

@phymbert
Contributor Author

phymbert commented Mar 27, 2024

  2. Inside the container, /models is currently mounted as read-only. I thought we would only read from it and never write, but it seems our test wants to write to that folder. Can we maybe do a cp /models/* /tmp/models/* and only work with /tmp? This would also solve the first problem I mentioned above.

Here we are at the model downloader step; it is mounted rw. Let's wait for the logs, we never know.

@ngxson
Contributor

ngxson commented Mar 27, 2024

Oh I see, then it's in start-github-runner-manager.sh: you can simply remove -u "1000:1000" when running llama.cpp-model-downloader. I suggested this change earlier, but in fact I overlooked it, sorry.

It's true that, as you said, we're running docker rootless, so -u "1000:1000" in the download step makes the actual uid 100999, not 1000. What I suggested is only applicable in rootful mode.

Ref: https://github.com/ggml-org/ci/pull/2/files#r1537304334

@phymbert
Contributor Author

Oh I see, then it's in start-github-runner-manager.sh: you can simply remove -u "1000:1000" when running llama.cpp-model-downloader. I suggested this change earlier, but in fact I overlooked it, sorry.

It's true that, as you said, we're running docker rootless, so -u "1000:1000" in the download step makes the actual uid 100999, not 1000. What I suggested is only applicable in rootful mode.

Ref: https://github.com/ggml-org/ci/pull/2/files#r1537304334

OK, but it was working fine on the other VM, so it's probably not this.

@ggerganov
Member

Logs after pull:

Building github runner manager image...
[+] Building 0.3s (12/12) FINISHED                                                                                                                                                                                                                                                   docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                                                                                                                            0.0s
 => => transferring dockerfile: 653B                                                                                                                                                                                                                                                            0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                                                                                                                                                                                                0.0s
 => [internal] load .dockerignore                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 59B                                                                                                                                                                                                                                                                0.0s
 => [1/7] FROM docker.io/library/ubuntu:latest                                                                                                                                                                                                                                                  0.0s
 => [internal] load build context                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 99B                                                                                                                                                                                                                                                                0.0s
 => CACHED [2/7] RUN set -eux ;     apt update ;     apt -y upgrade ;     apt -y install             git             openssh-client             python3             python3-pip             curl              dbus-user-session              uidmap ;     curl -sSL https://get.docker.com/ |   0.0s
 => CACHED [3/7] WORKDIR /ggml-ci                                                                                                                                                                                                                                                               0.0s
 => CACHED [4/7] ADD requirements.txt ./                                                                                                                                                                                                                                                        0.0s
 => CACHED [5/7] RUN set -eux ;     pip install -r requirements.txt ;                                                                                                                                                                                                                           0.0s
 => CACHED [6/7] ADD manager.py ./                                                                                                                                                                                                                                                              0.0s
 => CACHED [7/7] ADD entrypoint.sh /                                                                                                                                                                                                                                                            0.0s
 => exporting to image                                                                                                                                                                                                                                                                          0.0s
 => => exporting layers                                                                                                                                                                                                                                                                         0.0s
 => => writing image sha256:56e4ae8104fb3bb75c0325de003e2db4d37d84122f429deccf02f9c92967b1e5                                                                                                                                                                                                    0.0s
 => => naming to docker.io/library/ggml-github-runners-manager                                                                                                                                                                                                                                  0.0s
Building github runner image...
[+] Building 0.2s (14/14) FINISHED                                                                                                                                                                                                                                                   docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                                                                                                                            0.0s
 => => transferring dockerfile: 1.78kB                                                                                                                                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                                                                                                                                                                                                0.0s
 => [internal] load .dockerignore                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 59B                                                                                                                                                                                                                                                                0.0s
 => [1/9] FROM docker.io/library/ubuntu:latest                                                                                                                                                                                                                                                  0.0s
 => [internal] load build context                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 35B                                                                                                                                                                                                                                                                0.0s
 => CACHED [2/9] RUN set -eux ;     apt update ;     apt -y upgrade ;     apt -y install             libicu-dev             curl             wget             build-essential             cmake             git             python3-pip             python3-venv             language-pack-en   0.0s
 => CACHED [3/9] RUN set -eux ;     wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb ;     dpkg -i cuda-keyring_1.1-1_all.deb ;     apt-get update ;     apt-get -y install        cuda-nvcc-12-2       libcublas-dev-12-2;           0.0s
 => CACHED [4/9] RUN set -eux ;     mkdir /ggml-ci /tmp/github-runner ;     chown 1000:1000 /ggml-ci /tmp/github-runner ;                                                                                                                                                                       0.0s
 => CACHED [5/9] WORKDIR /ggml-ci                                                                                                                                                                                                                                                               0.0s
 => CACHED [6/9] RUN set -eux ;     groupadd --gid 1000 ggml ;     useradd --uid 1000 --gid ggml --shell /bin/bash --create-home ggml ;                                                                                                                                                         0.0s
 => CACHED [7/9] RUN set -eux ;     curl -o actions-runner-linux-x64.tar.gz -L https://github.com/actions/runner/releases/download/v2.314.1/actions-runner-linux-x64-2.314.1.tar.gz ;      echo "6c726a118bbe02cd32e222f890e1e476567bf299353a96886ba75b423c1137b5  actions-runner-linux-x64.ta  0.0s
 => CACHED [8/9] ADD entrypoint.sh /entrypoint.sh                                                                                                                                                                                                                                               0.0s
 => CACHED [9/9] WORKDIR /github-runner                                                                                                                                                                                                                                                         0.0s
 => exporting to image                                                                                                                                                                                                                                                                          0.0s
 => => exporting layers                                                                                                                                                                                                                                                                         0.0s
 => => writing image sha256:8e405871f8db791b252bfebe69b0e1c1fc33d7489162217c179e1ae376e839a7                                                                                                                                                                                                    0.0s
 => => naming to docker.io/library/ggml-github-runner                                                                                                                                                                                                                                           0.0s
uid=1000(ggml) gid=1000(ggml) groups=1000(ggml),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),118(netdev),119(lxd)
/mnt/models:
total 12
drwxrwxr-x 3 ggml ggml 4096 Mar 27 14:52 .
drwxr-xr-x 4 ggml ggml 4096 Mar 27 14:55 ..
drwxrwxr-x 2 ggml ggml 4096 Mar 27 14:52 phi-2

/mnt/models/phi-2:
total 8
drwxrwxr-x 2 ggml ggml 4096 Mar 27 14:52 .
drwxrwxr-x 3 ggml ggml 4096 Mar 27 14:52 ..
● docker.service - Docker Application Container Engine (Rootless)
     Loaded: loaded (/home/ggml/.config/systemd/user/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-03-27 14:35:18 UTC; 1h 54min ago
       Docs: https://docs.docker.com/go/rootless/
   Main PID: 5388 (rootlesskit)
     CGroup: /user.slice/user-1000.slice/user@1000.service/docker.service
             ├─5388 rootlesskit --state-dir=/run/user/1000/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
             ├─5399 /proc/self/exe --state-dir=/run/user/1000/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
             ├─5417 slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 5399 tap0
             ├─5424 dockerd
             └─5444 containerd --config /run/user/1000/docker/containerd/containerd.toml

Mar 27 15:28:22 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:22.452586824Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
Mar 27 15:28:22 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:22.452655617Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
Mar 27 15:28:22 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:22.452664504Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
Mar 27 15:28:22 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:22.452773995Z" level=info msg="starting signal loop" namespace=moby path=/run/.ro1265498934/user/1000/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15ed>
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:23.234137294Z" level=info msg="shim disconnected" id=8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15edbba2460915518842478
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:23.234178223Z" level=warning msg="cleaning up after shim disconnected" id=8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15edbba2460915518842478 namespace=moby
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:23.234186238Z" level=info msg="cleaning up dead shim"
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5424]: time="2024-03-27T15:28:23.234224813Z" level=info msg="ignoring event" container=8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15edbba2460915518842478 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:23.239300079Z" level=warning msg="cleanup warnings time=\"2024-03-27T15:28:23Z\" level=info msg=\"starting signal loop\" namespace=moby pid=31563 runtime=io.containerd.runc.v2\n"
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5424]: time="2024-03-27T15:28:23.240170910Z" level=warning msg="failed to close stdin: task 8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15edbba2460915518842478 not found: not found"
OK Docker rootless
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
OK Docker root stopped
Building models downloader...
[+] Building 0.9s (10/10) FINISHED                                                                                                                                                                                                                                                   docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                                                                                                                            0.0s
 => => transferring dockerfile: 729B                                                                                                                                                                                                                                                            0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:12.2.2-devel-ubuntu22.04                                                                                                                                                                                                                 0.7s
 => [internal] load .dockerignore                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 59B                                                                                                                                                                                                                                                                0.0s
 => [1/5] FROM docker.io/nvidia/cuda:12.2.2-devel-ubuntu22.04@sha256:ae8a022c02aec945c4f8c52f65deaf535de7abb58e840350d19391ec683f4980                                                                                                                                                           0.0s
 => [internal] load build context                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 35B                                                                                                                                                                                                                                                                0.0s
 => CACHED [2/5] RUN set -eux ;     apt update ;     apt -y install             git             cmake             libcurl4-openssl-dev ;                                                                                                                                                        0.0s
 => CACHED [3/5] WORKDIR /llama.cpp                                                                                                                                                                                                                                                             0.0s
 => CACHED [4/5] RUN set -eux;     git clone https://github.com/ggerganov/llama.cpp.git . ;     mkdir build ;     cd build ;     cmake ..       -DLLAMA_CURL=ON       -DLLAMA_CUBLAS=ON       -DCMAKE_CUDA_ARCHITECTURES=75       -DLLAMA_NATIVE=OFF       -DCMAKE_BUILD_TYPE=Release;     cma  0.0s
 => CACHED [5/5] ADD entrypoint.sh /entrypoint.sh                                                                                                                                                                                                                                               0.0s
 => exporting to image                                                                                                                                                                                                                                                                          0.0s
 => => exporting layers                                                                                                                                                                                                                                                                         0.0s
 => => writing image sha256:ecef68e568b4a5739d932f4524307e23800cc865f0edf5ed7322afde1fd982a1                                                                                                                                                                                                    0.0s
 => => naming to docker.io/library/llama.cpp-model-downloader                                                                                                                                                                                                                                   0.0s
ggml-ci: downloading models...
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q8_0.gguf
ggml@ggml-5-x86-cuda-t4:~/ci$ 

@phymbert
Contributor Author

cat download_model.ggml-model-q4_0.gguf.log

Looks good from the VM. Could you please share `cat download_model.ggml-model-q4_0.gguf.log` again?

@ggerganov
Member

$  cat download_model.ggml-model-q4_0.gguf.log
Wed Mar 27 16:30:01 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000001:00:00.0 Off |                    0 |
| N/A   36C    P0             33W /   70W |      96MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
uid=1000 gid=1000 groups=1000
/models:
total 12
drwxrwxr-x 3 root root 4096 Mar 27 14:52 .
drwxr-xr-x 1 root root 4096 Mar 27 16:30 ..
drwxrwxr-x 2 root root 4096 Mar 27 14:52 phi-2

/models/phi-2:
total 8
drwxrwxr-x 2 root root 4096 Mar 27 14:52 .
drwxrwxr-x 3 root root 4096 Mar 27 14:52 ..
HF_REPO ggml-org/models
HF_FILE phi-2/ggml-model-q4_0.gguf
Failed to open logfile 'main.log' with error 'Permission denied'
[1711557001] Log start
[1711557001] Cmd: ./build/bin/main --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf --model /models/phi-2/ggml-model-q4_0.gguf --random-prompt --n-predict 1
[1711557001] main: build = 2551 (e5b89a44)
[1711557001] main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
[1711557001] main: seed  = 1711557001
[1711557001] main: llama backend init
[1711557001] main: load the model and apply lora adapter, if any
llama_download_file: error opening local file for writing: /models/phi-2/ggml-model-q4_0.gguf
llama_init_from_gpt_params: error: failed to load model '/models/phi-2/ggml-model-q4_0.gguf'
[1711557002] main: error: unable to load model

@phymbert
Contributor Author

@ggerganov please pull and try again ;)

@ggerganov
Member

I think it works now. It sits here:

ggml-ci: downloading models...
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q8_0.gguf
ggml-ci: starting github runner manager on repo=ggerganov/llama.cpp label=Standard_NC4as_T4_v3...
19dfafa44ee4642dcdaef67d0cb122c048c6305708a2e1be247127711c14516a
ggml-ci: github runner manager started.
ggml-ci: github runner manager logs:
         CTRL+C to stop logs pulling
ggml-ci: fetching workflows of ggerganov/llama.cpp ...

@phymbert
Contributor Author


Scheduling: https://github.com/ggerganov/llama.cpp/actions/runs/8434231566?pr=6283

@phymbert
Contributor Author

phymbert commented Apr 3, 2024

@ggerganov Should we merge this?


@ggerganov ggerganov left a comment


Yes, thanks for the reminder. Good job!

@ggerganov ggerganov merged commit b96b89b into ggml-org:master Apr 3, 2024
@phymbert phymbert deleted the hp/github-runner branch April 3, 2024 17:10