
JIT GitHub docker runner #2

Merged
merged 20 commits · Apr 3, 2024
Conversation

phymbert
Contributor

@phymbert phymbert commented Mar 24, 2024

Motivation

In the context of:

A balanced approach between raw ggml-ci and GitHub self-hosted runners.

Approach

Periodically, a Python script polls for jobs waiting for a runner and starts an ephemeral Just-In-Time (JIT) GitHub runner within a Docker container using the NVIDIA runtime.
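The polling loop can be sketched roughly as follows. This is a minimal illustration, not the actual manager.py from this PR; `spawn_jit_runner` is a hypothetical placeholder for the `docker run` of an ephemeral runner, and the `"status" == "queued"` check mirrors the field the GitHub REST API reports for workflow runs waiting on a runner:

```python
# Minimal sketch of the manager's polling loop (illustration only, not
# the real manager.py from this PR).

def queued_runs(workflow_runs):
    """Select workflow runs that are still waiting for a runner."""
    return [run for run in workflow_runs if run.get("status") == "queued"]

def spawn_jit_runner(run):
    # Hypothetical placeholder: the real manager would start an ephemeral
    # JIT GitHub runner container here, e.g. something along the lines of
    #   docker run --runtime=nvidia --rm ggml-github-runner ...
    print(f"starting JIT runner for run id={run['id']}")

def poll_once(workflow_runs):
    """One polling iteration: start a JIT runner per queued run."""
    started = []
    for run in queued_runs(workflow_runs):
        spawn_jit_runner(run)
        started.append(run["id"])
    return started
```

Each iteration corresponds to one "fetching workflows ... workflows iteration done" pair in the manager logs shown below.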

Test

Tested here: https://github.com/phymbert/llama.cpp/actions/runs/8417731437

How to install a new runner manager:

git clone https://github.com/ggml-org/ci
./install-cuda.sh
./install-docker.sh
./start-github-runner-manager.sh REPO TOKEN RUNNER_LABEL

Example:

./start-github-runner-manager.sh phymbert/llama.cpp XYZ Standard_NC4as_T4_v3

ggml-ci: downloading models...
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q8_0.gguf
ggml-github-runners-manager
ggml-ci: starting github runner manager on repo=phymbert/llama.cpp label=Standard_NC4as_T4_v3...
0ff70e42bae2cc57696a36828c17ae6ffdbac9d8badd78d5addb3d68dc1c78d6
ggml-ci: github runner manager started.
ggml-ci: github runner manager logs:
         CTRL+C to stop logs pulling
ggml-ci: fetching workflows of phymbert/llama.cpp ...
ggml-ci: workflows iteration done.
ggml-ci: fetching workflows of phymbert/llama.cpp ...
ggml-ci: workflows iteration done.
ggml-ci: fetching workflows of phymbert/llama.cpp ...
ggml-ci:     ggml-runner-90932568-23048096941-workflow_dispatch-1711360334 triggered for workflow_name=Benchmark
ggml-ci:     ggml-runner-90932568-23048096941-workflow_dispatch-1711360334 running Github job runner id=165 os=linux labels=['self-hosted', 'X64', 'Standard_NC4as_T4_v3', 'linux']
ggml-ci:     ggml-runner-90932568-23048096941-workflow_dispatch-1711360334 done
ggml-ci: workflows iteration done.
ggml-ci: fetching workflows of phymbert/llama.cpp ...
ggml-ci: workflows iteration done.
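The runner names in the log above appear to combine a repository id, a workflow run id, the triggering event, and a Unix timestamp (and one of the commits below mentions lowercasing the container/runner name, since Docker container names must be lowercase). A hypothetical helper that reproduces that shape:

```python
import time

def runner_name(repo_id, run_id, event, ts=None):
    """Build an ephemeral runner name shaped like the ones in the log.

    The pattern and parameter names are assumptions inferred from the
    log output, not the actual implementation. Lowercased because
    Docker container names must be lowercase.
    """
    ts = int(time.time()) if ts is None else ts
    return f"ggml-runner-{repo_id}-{run_id}-{event}-{ts}".lower()
```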

@phymbert
Contributor Author

@ggerganov @ngxson FYI

 - properly create the user
 - add autoremove and tmpfs
 - add netcat for the workflow to check if the server starts
…llation in the image, lowercase container/runner name
- use a tmpfs for the runner workdir
- add security_opt
- mount the models folder
ci: github-runner-manager: fix tmpfs
ci: github-runner-manager: fix tmpfs exec right, nice logs
# Conflicts:
#	install-docker.sh
@phymbert phymbert requested review from ngxson and ggerganov and removed request for ngxson March 25, 2024 09:37
@phymbert phymbert requested a review from ngxson March 25, 2024 09:40
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Contributor

@ngxson ngxson left a comment


LGTM, thanks for taking time for this!

(Btw the "Resolve" button doesn't show on my side. Maybe I don't have permission. You can "resolve" my comments above if you want.)

@ggerganov
Member

./install-docker.sh requires:

apt install uidmap

@ggerganov
Member

The ./start-github-runner-manager.sh script seems to exit without an error when trying to download the models:

Building github runner manager image...
[+] Building 0.3s (12/12) FINISHED                                                                                                                                                      docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                               0.0s
 => => transferring dockerfile: 653B                                                                                                                                                               0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                                                                                                   0.0s
 => [internal] load .dockerignore                                                                                                                                                                  0.0s
 => => transferring context: 59B                                                                                                                                                                   0.0s
 => [1/7] FROM docker.io/library/ubuntu:latest                                                                                                                                                     0.0s
 => [internal] load build context                                                                                                                                                                  0.0s
 => => transferring context: 99B                                                                                                                                                                   0.0s
 => CACHED [2/7] RUN set -eux ;     apt update ;     apt -y upgrade ;     apt -y install             git             openssh-client             python3             python3-pip             curl   0.0s
 => CACHED [3/7] WORKDIR /ggml-ci                                                                                                                                                                  0.0s
 => CACHED [4/7] ADD requirements.txt ./                                                                                                                                                           0.0s
 => CACHED [5/7] RUN set -eux ;     pip install -r requirements.txt ;                                                                                                                              0.0s
 => CACHED [6/7] ADD manager.py ./                                                                                                                                                                 0.0s
 => CACHED [7/7] ADD entrypoint.sh /                                                                                                                                                               0.0s
 => exporting to image                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                            0.0s
 => => writing image sha256:56e4ae8104fb3bb75c0325de003e2db4d37d84122f429deccf02f9c92967b1e5                                                                                                       0.0s
 => => naming to docker.io/library/ggml-github-runners-manager                                                                                                                                     0.0s
Building github runner image...
[+] Building 3.2s (14/14) FINISHED                                                                                                                                                      docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                               0.0s
 => => transferring dockerfile: 1.78kB                                                                                                                                                             0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                                                                                                   0.0s
 => [internal] load .dockerignore                                                                                                                                                                  0.0s
 => => transferring context: 59B                                                                                                                                                                   0.0s
 => [1/9] FROM docker.io/library/ubuntu:latest                                                                                                                                                     0.0s
 => [internal] load build context                                                                                                                                                                  0.0s
 => => transferring context: 35B                                                                                                                                                                   0.0s
 => CACHED [2/9] RUN set -eux ;     apt update ;     apt -y upgrade ;     apt -y install             libicu-dev             curl             wget             build-essential             cmake    0.0s
 => CACHED [3/9] RUN set -eux ;     wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb ;     dpkg -i cuda-keyring_1.1-1_all.deb ;     apt  0.0s
 => CACHED [4/9] RUN set -eux ;     mkdir /ggml-ci /tmp/github-runner ;     chown 1000:1000 /ggml-ci /tmp/github-runner ;                                                                          0.0s
 => CACHED [5/9] WORKDIR /ggml-ci                                                                                                                                                                  0.0s
 => CACHED [6/9] RUN set -eux ;     groupadd --gid 1000 ggml ;     useradd --uid 1000 --gid ggml --shell /bin/bash --create-home ggml ;                                                            0.0s
 => CACHED [7/9] RUN set -eux ;     curl -o actions-runner-linux-x64.tar.gz -L https://github.com/actions/runner/releases/download/v2.314.1/actions-runner-linux-x64-2.314.1.tar.gz ;      echo "  0.0s
 => CACHED [8/9] ADD entrypoint.sh /entrypoint.sh                                                                                                                                                  0.0s
 => CACHED [9/9] WORKDIR /github-runner                                                                                                                                                            0.0s
 => exporting to image                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                            0.0s
 => => writing image sha256:8e405871f8db791b252bfebe69b0e1c1fc33d7489162217c179e1ae376e839a7                                                                                                       0.0s
 => => naming to docker.io/library/ggml-github-runner                                                                                                                                              0.0s
Building models downloader...
[+] Building 1.1s (10/10) FINISHED                                                                                                                                                      docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                               0.0s
 => => transferring dockerfile: 729B                                                                                                                                                               0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:12.2.2-devel-ubuntu22.04                                                                                                                    0.3s
 => [internal] load .dockerignore                                                                                                                                                                  0.1s
 => => transferring context: 59B                                                                                                                                                                   0.0s
 => [1/5] FROM docker.io/nvidia/cuda:12.2.2-devel-ubuntu22.04@sha256:ae8a022c02aec945c4f8c52f65deaf535de7abb58e840350d19391ec683f4980                                                              0.0s
 => [internal] load build context                                                                                                                                                                  0.0s
 => => transferring context: 35B                                                                                                                                                                   0.0s
 => CACHED [2/5] RUN set -eux ;     apt update ;     apt -y install             git             cmake             libcurl4-openssl-dev ;                                                           0.0s
 => CACHED [3/5] WORKDIR /llama.cpp                                                                                                                                                                0.0s
 => CACHED [4/5] RUN set -eux;     git clone https://github.com/ggerganov/llama.cpp.git . ;     mkdir build ;     cd build ;     cmake ..       -DLLAMA_CURL=ON       -DLLAMA_CUBLAS=ON       -DC  0.0s
 => CACHED [5/5] ADD entrypoint.sh /entrypoint.sh                                                                                                                                                  0.0s
 => exporting to image                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                            0.0s
 => => writing image sha256:686088d20a073e2300ada195309735607552510e1172e96cc82cf5579b748d53                                                                                                       0.0s
 => => naming to docker.io/library/llama.cpp-model-downloader                                                                                                                                      0.0s
ggml-ci: downloading models...
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q8_0.gguf
ggml@ggml-5-x86-cuda-t4:~/ci$ 

@phymbert
Contributor Author

Can you please share the docker logs or remove the stdout redirection? Maybe it's a mount issue.

@ggerganov
Member

It does not have permission:

cat download_model.ggml-model-q4_0.gguf.log 
Wed Mar 27 14:52:34 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000001:00:00.0 Off |                    0 |
| N/A   36C    P0             27W /   70W |      95MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
HF_REPO ggml-org/models
HF_FILE phi-2/ggml-model-q4_0.gguf
Failed to open logfile 'main.log' with error 'Permission denied'
[1711551154] Log start
[1711551154] Cmd: ./build/bin/main --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf --model /models/phi-2/ggml-model-q4_0.gguf --random-prompt --n-predict 1
[1711551154] main: build = 2551 (e5b89a44)
[1711551154] main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
[1711551154] main: seed  = 1711551154
[1711551154] main: llama backend init
[1711551154] main: load the model and apply lora adapter, if any
llama_download_file: error opening local file for writing: /models/phi-2/ggml-model-q4_0.gguf
llama_init_from_gpt_params: error: failed to load model '/models/phi-2/ggml-model-q4_0.gguf'
[1711551154] main: error: unable to load model

What user is docker using? /mnt is owned by the ggml user:

$ ls -l /mnt
total 20
drwx------ 2 root root 16384 Mar 27 14:00 lost+found
drwxrwxr-x 3 ggml ggml  4096 Mar 27 14:52 models

@phymbert
Contributor Author

It runs with user 1000:1000, is this the ggml uid:gid?
Is docker rootless?

@ggerganov
Member

ggerganov commented Mar 27, 2024

$ id
uid=1000(ggml) gid=1000(ggml)

is docker rootless ?

no idea
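One way to tell is `docker info`, whose SecurityOptions list contains a `name=rootless` entry in rootless mode (the `docker:rootless` tag in the build logs above is another giveaway). A small sketch of the check, with the `docker info` invocation left as a comment so the parsing stays a plain helper:

```python
def is_rootless(security_options):
    """Return True if docker's SecurityOptions indicate rootless mode.

    security_options is the list reported by:
        docker info --format '{{json .SecurityOptions}}'
    In rootless mode it contains an entry like "name=rootless".
    """
    return any("rootless" in opt for opt in security_options)
```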

@ngxson
Contributor

ngxson commented Mar 27, 2024

Seems like it's a mismatch between uid/gid inside and outside of the container. Note that docker may use uid mapping, which maps user 1000 inside the container to something like 1001000 on the host. I'm installing docker rootless on my side to test whether that's the case.

@phymbert
Contributor Author

Please pull, I have added some debug commands for permission and docker service

@ngxson
Contributor

ngxson commented Mar 27, 2024

@phymbert I understand the problem now:

  1. uid 1000 is in fact mapped to 100999 on the host machine. From the runner's perspective (inside the container), /models is owned by root, not 1000. There are two solutions:
  • chmod -R 777 /mnt/models ==> works, but from a security engineer's perspective this is not acceptable
  • chown -R 100999:100999 /mnt/models ==> should work, but I'm not sure whether the number 100999 will change in the future (seems like it won't) (See below)
  2. Inside the container, /models is currently mounted as read-only. I thought we would only read from it and never write, but it seems our test wants to write to that folder. Can we maybe do a cp /models/* /tmp/models/* and only work with /tmp? This would also solve the first problem I mentioned above.
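The 1000 → 100999 mapping comes from rootless docker's use of subordinate uids: container uid 0 maps to the unprivileged host user, and container uids 1..n map into the range allocated in /etc/subuid (typically starting at 100000), offset by one. A small helper sketching that arithmetic; the start value 100000 is an assumption, so check the host's /etc/subuid entry:

```python
def host_uid(container_uid, owner_uid=1000, subuid_start=100000):
    """Map a container uid to the host uid under rootless docker.

    container uid 0        -> the unprivileged host user (owner_uid)
    container uid k (k>=1) -> subuid_start + k - 1
    subuid_start should match the host's /etc/subuid entry; 100000 is
    only the common default.
    """
    if container_uid == 0:
        return owner_uid
    return subuid_start + container_uid - 1
```

With the defaults, container uid 1000 lands on host uid 100999, matching the ownership mismatch observed above.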

@phymbert
Contributor Author

phymbert commented Mar 27, 2024

  2. Inside the container, /models is currently mounted as read-only. I thought we would only read from it and never write, but it seems our test wants to write to that folder. Can we maybe do a cp /models/* /tmp/models/* and only work with /tmp? This would also solve the first problem I mentioned above.

Here we are at the model downloader step; it is mounted rw. Let's wait for the logs, we never know.

@ngxson
Contributor

ngxson commented Mar 27, 2024

Oh I see, then it's in start-github-runner-manager.sh: you can simply remove -u "1000:1000" when running llama.cpp-model-downloader. I suggested this change earlier, but in fact I overlooked it, sorry.

It's true that, as you said, we're running docker rootless, so -u "1000:1000" in the download step makes the actual uid 100999, not 1000. What I suggested is only applicable in rootful mode.

Ref: https://github.com/ggml-org/ci/pull/2/files#r1537304334

@phymbert
Contributor Author

Oh I see, then it's in start-github-runner-manager.sh: you can simply remove -u "1000:1000" when running llama.cpp-model-downloader. I suggested this change earlier, but in fact I overlooked it, sorry.

It's true that, as you said, we're running docker rootless, so -u "1000:1000" in the download step makes the actual uid 100999, not 1000. What I suggested is only applicable in rootful mode.

Ref: https://github.com/ggml-org/ci/pull/2/files#r1537304334

OK, but it was working fine on the other VM, so it's probably not this.

@ggerganov
Member

Logs after pull:

Building github runner manager image...
[+] Building 0.3s (12/12) FINISHED                                                                                                                                                                                                                                                   docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                                                                                                                            0.0s
 => => transferring dockerfile: 653B                                                                                                                                                                                                                                                            0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                                                                                                                                                                                                0.0s
 => [internal] load .dockerignore                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 59B                                                                                                                                                                                                                                                                0.0s
 => [1/7] FROM docker.io/library/ubuntu:latest                                                                                                                                                                                                                                                  0.0s
 => [internal] load build context                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 99B                                                                                                                                                                                                                                                                0.0s
 => CACHED [2/7] RUN set -eux ;     apt update ;     apt -y upgrade ;     apt -y install             git             openssh-client             python3             python3-pip             curl              dbus-user-session              uidmap ;     curl -sSL https://get.docker.com/ |   0.0s
 => CACHED [3/7] WORKDIR /ggml-ci                                                                                                                                                                                                                                                               0.0s
 => CACHED [4/7] ADD requirements.txt ./                                                                                                                                                                                                                                                        0.0s
 => CACHED [5/7] RUN set -eux ;     pip install -r requirements.txt ;                                                                                                                                                                                                                           0.0s
 => CACHED [6/7] ADD manager.py ./                                                                                                                                                                                                                                                              0.0s
 => CACHED [7/7] ADD entrypoint.sh /                                                                                                                                                                                                                                                            0.0s
 => exporting to image                                                                                                                                                                                                                                                                          0.0s
 => => exporting layers                                                                                                                                                                                                                                                                         0.0s
 => => writing image sha256:56e4ae8104fb3bb75c0325de003e2db4d37d84122f429deccf02f9c92967b1e5                                                                                                                                                                                                    0.0s
 => => naming to docker.io/library/ggml-github-runners-manager                                                                                                                                                                                                                                  0.0s
Building github runner image...
[+] Building 0.2s (14/14) FINISHED                                                                                                                                                                                                                                                   docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                                                                                                                            0.0s
 => => transferring dockerfile: 1.78kB                                                                                                                                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                                                                                                                                                                                                0.0s
 => [internal] load .dockerignore                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 59B                                                                                                                                                                                                                                                                0.0s
 => [1/9] FROM docker.io/library/ubuntu:latest                                                                                                                                                                                                                                                  0.0s
 => [internal] load build context                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 35B                                                                                                                                                                                                                                                                0.0s
 => CACHED [2/9] RUN set -eux ;     apt update ;     apt -y upgrade ;     apt -y install             libicu-dev             curl             wget             build-essential             cmake             git             python3-pip             python3-venv             language-pack-en   0.0s
 => CACHED [3/9] RUN set -eux ;     wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb ;     dpkg -i cuda-keyring_1.1-1_all.deb ;     apt-get update ;     apt-get -y install        cuda-nvcc-12-2       libcublas-dev-12-2;           0.0s
 => CACHED [4/9] RUN set -eux ;     mkdir /ggml-ci /tmp/github-runner ;     chown 1000:1000 /ggml-ci /tmp/github-runner ;                                                                                                                                                                       0.0s
 => CACHED [5/9] WORKDIR /ggml-ci                                                                                                                                                                                                                                                               0.0s
 => CACHED [6/9] RUN set -eux ;     groupadd --gid 1000 ggml ;     useradd --uid 1000 --gid ggml --shell /bin/bash --create-home ggml ;                                                                                                                                                         0.0s
 => CACHED [7/9] RUN set -eux ;     curl -o actions-runner-linux-x64.tar.gz -L https://github.com/actions/runner/releases/download/v2.314.1/actions-runner-linux-x64-2.314.1.tar.gz ;      echo "6c726a118bbe02cd32e222f890e1e476567bf299353a96886ba75b423c1137b5  actions-runner-linux-x64.ta  0.0s
 => CACHED [8/9] ADD entrypoint.sh /entrypoint.sh                                                                                                                                                                                                                                               0.0s
 => CACHED [9/9] WORKDIR /github-runner                                                                                                                                                                                                                                                         0.0s
 => exporting to image                                                                                                                                                                                                                                                                          0.0s
 => => exporting layers                                                                                                                                                                                                                                                                         0.0s
 => => writing image sha256:8e405871f8db791b252bfebe69b0e1c1fc33d7489162217c179e1ae376e839a7                                                                                                                                                                                                    0.0s
 => => naming to docker.io/library/ggml-github-runner                                                                                                                                                                                                                                           0.0s
uid=1000(ggml) gid=1000(ggml) groups=1000(ggml),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),118(netdev),119(lxd)
/mnt/models:
total 12
drwxrwxr-x 3 ggml ggml 4096 Mar 27 14:52 .
drwxr-xr-x 4 ggml ggml 4096 Mar 27 14:55 ..
drwxrwxr-x 2 ggml ggml 4096 Mar 27 14:52 phi-2

/mnt/models/phi-2:
total 8
drwxrwxr-x 2 ggml ggml 4096 Mar 27 14:52 .
drwxrwxr-x 3 ggml ggml 4096 Mar 27 14:52 ..
● docker.service - Docker Application Container Engine (Rootless)
     Loaded: loaded (/home/ggml/.config/systemd/user/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-03-27 14:35:18 UTC; 1h 54min ago
       Docs: https://docs.docker.com/go/rootless/
   Main PID: 5388 (rootlesskit)
     CGroup: /user.slice/user-1000.slice/user@1000.service/docker.service
             ├─5388 rootlesskit --state-dir=/run/user/1000/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
             ├─5399 /proc/self/exe --state-dir=/run/user/1000/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
             ├─5417 slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 5399 tap0
             ├─5424 dockerd
             └─5444 containerd --config /run/user/1000/docker/containerd/containerd.toml

Mar 27 15:28:22 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:22.452586824Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
Mar 27 15:28:22 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:22.452655617Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
Mar 27 15:28:22 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:22.452664504Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
Mar 27 15:28:22 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:22.452773995Z" level=info msg="starting signal loop" namespace=moby path=/run/.ro1265498934/user/1000/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15ed>
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:23.234137294Z" level=info msg="shim disconnected" id=8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15edbba2460915518842478
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:23.234178223Z" level=warning msg="cleaning up after shim disconnected" id=8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15edbba2460915518842478 namespace=moby
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:23.234186238Z" level=info msg="cleaning up dead shim"
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5424]: time="2024-03-27T15:28:23.234224813Z" level=info msg="ignoring event" container=8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15edbba2460915518842478 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5444]: time="2024-03-27T15:28:23.239300079Z" level=warning msg="cleanup warnings time=\"2024-03-27T15:28:23Z\" level=info msg=\"starting signal loop\" namespace=moby pid=31563 runtime=io.containerd.runc.v2\n"
Mar 27 15:28:23 ggml-5-x86-cuda-t4 dockerd-rootless.sh[5424]: time="2024-03-27T15:28:23.240170910Z" level=warning msg="failed to close stdin: task 8ee3da0a86eef6f097c10b5c8dac7eb46ecfda37e15edbba2460915518842478 not found: not found"
OK Docker rootless
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
OK Docker root stopped
Building models downloader...
[+] Building 0.9s (10/10) FINISHED                                                                                                                                                                                                                                                   docker:rootless
 => [internal] load build definition from Dockerfile                                                                                                                                                                                                                                            0.0s
 => => transferring dockerfile: 729B                                                                                                                                                                                                                                                            0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:12.2.2-devel-ubuntu22.04                                                                                                                                                                                                                 0.7s
 => [internal] load .dockerignore                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 59B                                                                                                                                                                                                                                                                0.0s
 => [1/5] FROM docker.io/nvidia/cuda:12.2.2-devel-ubuntu22.04@sha256:ae8a022c02aec945c4f8c52f65deaf535de7abb58e840350d19391ec683f4980                                                                                                                                                           0.0s
 => [internal] load build context                                                                                                                                                                                                                                                               0.0s
 => => transferring context: 35B                                                                                                                                                                                                                                                                0.0s
 => CACHED [2/5] RUN set -eux ;     apt update ;     apt -y install             git             cmake             libcurl4-openssl-dev ;                                                                                                                                                        0.0s
 => CACHED [3/5] WORKDIR /llama.cpp                                                                                                                                                                                                                                                             0.0s
 => CACHED [4/5] RUN set -eux;     git clone https://github.com/ggerganov/llama.cpp.git . ;     mkdir build ;     cd build ;     cmake ..       -DLLAMA_CURL=ON       -DLLAMA_CUBLAS=ON       -DCMAKE_CUDA_ARCHITECTURES=75       -DLLAMA_NATIVE=OFF       -DCMAKE_BUILD_TYPE=Release;     cma  0.0s
 => CACHED [5/5] ADD entrypoint.sh /entrypoint.sh                                                                                                                                                                                                                                               0.0s
 => exporting to image                                                                                                                                                                                                                                                                          0.0s
 => => exporting layers                                                                                                                                                                                                                                                                         0.0s
 => => writing image sha256:ecef68e568b4a5739d932f4524307e23800cc865f0edf5ed7322afde1fd982a1                                                                                                                                                                                                    0.0s
 => => naming to docker.io/library/llama.cpp-model-downloader                                                                                                                                                                                                                                   0.0s
ggml-ci: downloading models...
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q8_0.gguf
ggml@ggml-5-x86-cuda-t4:~/ci$ 

@phymbert
Contributor Author

cat download_model.ggml-model-q4_0.gguf.log

Looks good from the VM. Could you please share `cat download_model.ggml-model-q4_0.gguf.log` again?

@ggerganov
Member

$  cat download_model.ggml-model-q4_0.gguf.log
Wed Mar 27 16:30:01 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000001:00:00.0 Off |                    0 |
| N/A   36C    P0             33W /   70W |      96MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
uid=1000 gid=1000 groups=1000
/models:
total 12
drwxrwxr-x 3 root root 4096 Mar 27 14:52 .
drwxr-xr-x 1 root root 4096 Mar 27 16:30 ..
drwxrwxr-x 2 root root 4096 Mar 27 14:52 phi-2

/models/phi-2:
total 8
drwxrwxr-x 2 root root 4096 Mar 27 14:52 .
drwxrwxr-x 3 root root 4096 Mar 27 14:52 ..
HF_REPO ggml-org/models
HF_FILE phi-2/ggml-model-q4_0.gguf
Failed to open logfile 'main.log' with error 'Permission denied'
[1711557001] Log start
[1711557001] Cmd: ./build/bin/main --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf --model /models/phi-2/ggml-model-q4_0.gguf --random-prompt --n-predict 1
[1711557001] main: build = 2551 (e5b89a44)
[1711557001] main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
[1711557001] main: seed  = 1711557001
[1711557001] main: llama backend init
[1711557001] main: load the model and apply lora adapter, if any
llama_download_file: error opening local file for writing: /models/phi-2/ggml-model-q4_0.gguf
llama_init_from_gpt_params: error: failed to load model '/models/phi-2/ggml-model-q4_0.gguf'
[1711557002] main: error: unable to load model

@phymbert
Contributor Author

@ggerganov please pull and try again ;)

@ggerganov
Member

I think it works now. It sits here:

ggml-ci: downloading models...
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q4_0.gguf
ggml-ci:     --hf-repo ggml-org/models --hf-file phi-2/ggml-model-q8_0.gguf
ggml-ci: starting github runner manager on repo=ggerganov/llama.cpp label=Standard_NC4as_T4_v3...
19dfafa44ee4642dcdaef67d0cb122c048c6305708a2e1be247127711c14516a
ggml-ci: github runner manager started.
ggml-ci: github runner manager logs:
         CTRL+C to stop logs pulling
ggml-ci: fetching workflows of ggerganov/llama.cpp ...

@phymbert
Contributor Author


Scheduling: https://github.com/ggerganov/llama.cpp/actions/runs/8434231566?pr=6283

@phymbert
Contributor Author

phymbert commented Apr 3, 2024

@ggerganov Should we merge this?


@ggerganov ggerganov left a comment


Yes, thanks for the reminder. Good job!

@ggerganov ggerganov merged commit b96b89b into ggml-org:master Apr 3, 2024
@phymbert phymbert deleted the hp/github-runner branch April 3, 2024 17:10