self hosted runner stuck on queued #69
Comments
Just curious, but does it work with the summerwind runner image? At a glance this seems more like a GitHub issue, since you say you can see the runner is registered. All the controller does is register runner pods for you; anything after that depends on your runner image and GitHub. One thing I'm wondering, though: do you have any stale runner or runner deployment resources in your k8s cluster? If so, could you try deleting them all and then creating only the one you need, to see if that resolves your issue?
Yes, the summerwind runner image worked fine; it only started happening when I switched to a custom image. As soon as I remove the custom image field and rerun the workflow, it works :/ but I'm not sure why it's not working with my custom image. I've tried deleting the entire system and recreating it, then recreating the runnerdeployment with a custom image, but it still gets stuck. It looks like there is an issue when using custom images?
Can someone else check and confirm for me that custom images actually work? I've tried a few different things now and I simply can't get this to work.
So I noticed that the pod terminates shortly after running the workflow with the custom image (I'm not sure why; it attempts to update something and then shuts down, but ONLY after running the workflow):
Without the custom image (in this case I believe it's normal, because the workflow finished):
@kaykhancheckpoint Thanks, that makes sense. You need to rebuild your custom image from the latest summerwind image, which has the latest runner agent installed, or update the agent in your Dockerfile. In #33 we're trying to add support for the runner update, but have had no luck so far. Also, after reading actions/runner#246, the runner update does not seem to be supported upstream.
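A minimal sketch of pointing a RunnerDeployment at a freshly rebuilt custom image. The `apiVersion` is the one quoted in this thread; the resource name, repository, registry, and tag are hypothetical placeholders, and using a new tag for each rebuild is only an assumption about how to keep nodes from reusing a cached, outdated image.

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy                 # hypothetical name
spec:
  replicas: 1
  template:
    spec:
      repository: example-org/example-repo   # hypothetical repository
      # Custom image rebuilt FROM the current summerwind/actions-runner image.
      # Registry and tag are placeholders; a fresh tag per rebuild means the
      # node cannot satisfy the pull from an older cached copy.
      image: registry.example.com/custom-runner:rebuild-1
```

If the runner agent inside the image is older than what GitHub expects, the pod may still try to self-update, so the image needs rebuilding whenever the base image moves.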
Updating to the latest image
and there was a newer version from the GitHub runner
@rezmuh Of course there is a new runner: https://hub.docker.com/r/summerwind/actions-runner/tags
This auto-update behaviour is a bit of a concern though. The summerwind image should always be up to date; otherwise the runner will do an auto-update (and restart with the old container, do an auto-update, and so on, looping forever). At the moment, though, I noticed the summerwind image is already on a pre-release image. This will probably be good enough not to trigger an auto-update (I hope; I'm not 100% sure about the exact update behaviour).
CMIIW, but it looks like summerwind's newest image is still on
I experienced the same issue recently: the base image had been updated, which meant I had to rebuild my custom image.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I'm running into this issue as well with the latest image. Was there a fix for this?
@kkmoslehpour I bet there are many different underlying causes, even though the problems reported in this issue all look the same. That said, I think I've encountered this when my custom runner image was outdated and it triggered an auto-update in every runner pod/container. Could you try rebuilding your custom runner image, if you're using one? If not, I think this is generally an issue in actions/runner, not actions-runner-controller.
Hey @mumoshu, I used FROM summerwind/actions-runner:latest in my custom image.
@YatinGulati94 It's working fine for me, so it's probably due to some issue in your GHES deployment or your GitHub cloud tenant.
Hey @mumoshu, I've been trying since yesterday but the result is the same. I have deleted my cluster twice as well.
@mumoshu It's very important for me to resolve this. If you could look into my setup, that would be great.
This doesn't mean anything; you could have last built your custom image months ago. Have you tried disabling the runner self-update process? https://github.com/actions-runner-controller/actions-runner-controller/blob/master/docs/detailed-docs.md#runner-entrypoint-features. Be aware of #1914 (comment)
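For readers following along, the linked docs section describes a `DISABLE_RUNNER_UPDATE` environment variable read by the runner entrypoint. A sketch of setting it on a RunnerDeployment, assuming the custom image still uses the summerwind entrypoint; the resource name, repository, and image are hypothetical placeholders.

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy                 # hypothetical name
spec:
  template:
    spec:
      repository: example-org/example-repo   # hypothetical repository
      image: registry.example.com/custom-runner:rebuild-1   # placeholder image
      env:
        # Ask the runner entrypoint not to self-update the agent.
        # The value must be a quoted string, not a YAML boolean.
        - name: DISABLE_RUNNER_UPDATE
          value: "true"
```

With self-update disabled, the agent version only changes when the image is rebuilt, so the image has to be kept current by hand.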
@toast-gear FYI, I rebuilt my image just this morning and am currently testing it. I haven't had any luck, which is why I commented here.
Please do report back the results. I'm highly suspicious of the self-update process, as it has caused tonnes of verified problems. We're tempted to start recommending that people disable it by default.
@toast-gear Unfortunately the result is still the same, even though I had disabled runner_update in my runnerdeployment.yml.
I have tried everything since yesterday, but the container is terminated automatically when I launch it with my custom image, which was built this morning using "FROM summerwind/actions-runner:latest".
Show me your Dockerfile.
Can we connect over a short call?
FROM summerwind/actions-runner:latest
USER root
# Install Node.js v14.x
RUN apt-get update -qq &&
RUN apt-get install nodejs -y
RUN apt-get install npm -y
# Install OpenJDK-8
RUN apt-get update -qq &&
ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
# Install Python
RUN apt-get update -qq &&
# Install BS4
RUN pip3 install beautifulsoup4 &&
# Install latest chrome dev package and fonts to support major charsets (Chinese, Japanese, Arabic, Hebrew, Thai and a few others)
# Note: this installs the necessary libs to make the bundled version of Chromium that Puppeteer
# installs, work.
RUN apt-get update
# Set XDG environment variables explicitly so that GitHub Actions does not apply
# default paths that do not point to the plugins directory
# https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html
ENV XDG_DATA_HOME=/sfdx_plugins/.local/share
# Create isolated plugins directory with rwx permission for all users
# Azure pipelines switches to a container-user which does not have access
# to the root directory where plugins are normally installed
RUN mkdir -p $XDG_DATA_HOME &&
RUN export XDG_DATA_HOME &&
# Install SFDX CLI
# Install AWS CLI for executing the commands
RUN npm install sfdx-cli --global
Nothing obvious. Raise a new ticket with all your manifests + Dockerfile + environment details. Please use the backtick syntax: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-and-highlighting-code-blocks
@toast-gear This is my runner deployment.yml file: apiVersion: actions.summerwind.dev/v1alpha1
@toast-gear Can you please update?
.
@toast-gear Where do I need to raise a ticket?
@toast-gear I have raised a ticket, but I guess it will take time to resolve. In the meantime, can you help me resolve the issue?
I have a self-hosted runner that uses a custom image. It has been deployed, and I can see in the pod logs that it's listening for jobs. I can see in my organisation that there is an idle runner. But when I run my pipeline, it is stuck.
runner.yml
Pod logs:
pipeline.yml
You can see the custom Docker image I am using here; it just contains the AWS and MySQL CLIs.
dockerfile.yml
It only became stuck like this when I added the custom image.