Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gce: add scripts for CI test and nightly builds #445

Merged
merged 2 commits into from
May 19, 2021

Conversation

cpanato
Copy link
Member

@cpanato cpanato commented Dec 3, 2020

The first part to add GCE image builder to run on CI

This PR adds the script that will be used in prow to build the images for CAPI/GCE

Also adds scripts to run nightly builds

Fixes: #437

Will open the test-infra PR to add the job

Please let me know if there is a better way to use boskos

/assign @detiber @vincepri

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 3, 2020
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 3, 2020
@cpanato
Copy link
Member Author

cpanato commented Dec 3, 2020

Also, do we need to create any accounts?

@cpanato
Copy link
Member Author

cpanato commented Dec 3, 2020

/hold for discussion and pending items

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 3, 2020
@detiber
Copy link
Member

detiber commented Dec 3, 2020

Also, do we need to create any accounts?

Shouldn't we should be able to leverage existing accounts similar to what we are doing with CAPG :)

Copy link
Member

@detiber detiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

images/capi/hack/checkout_account.py Outdated Show resolved Hide resolved
images/capi/hack/checkin_account.py Outdated Show resolved Hide resolved
images/capi/hack/checkout_account.py Outdated Show resolved Hide resolved
images/capi/hack/checkout_account.py Outdated Show resolved Hide resolved
images/capi/hack/heartbeat_account.py Outdated Show resolved Hide resolved
images/capi/packer/gce/scripts/ci-gce.sh Outdated Show resolved Hide resolved
@cpanato cpanato force-pushed the GH-437 branch 2 times, most recently from fc17458 to f281516 Compare December 5, 2020 10:15
images/capi/hack/boskos.py Outdated Show resolved Hide resolved
images/capi/packer/gce/scripts/ci-gce.sh Outdated Show resolved Hide resolved
images/capi/hack/boskos.py Outdated Show resolved Hide resolved
images/capi/hack/boskos.py Outdated Show resolved Hide resolved
images/capi/hack/boskos.py Outdated Show resolved Hide resolved
Copy link
Member

@detiber detiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few suggestions around the exception handling, otherwise this is looking pretty good

images/capi/hack/boskos.py Outdated Show resolved Hide resolved
images/capi/hack/boskos.py Outdated Show resolved Hide resolved
images/capi/hack/boskos.py Outdated Show resolved Hide resolved
@cpanato cpanato force-pushed the GH-437 branch 3 times, most recently from 788009b to d2e19aa Compare December 12, 2020 14:30
@cpanato cpanato requested a review from detiber December 15, 2020 13:23
@cpanato cpanato force-pushed the GH-437 branch 3 times, most recently from a4a5f5e to 5b3fa6a Compare February 11, 2021 15:49
@detiber
Copy link
Member

detiber commented Mar 8, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 8, 2021
@CecileRobertMichon
Copy link
Contributor

Is there an associated test-infra PR to add the job as optional so we can use it to test the PR before merging?

Copy link
Contributor

@codenrhoden codenrhoden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think everything I have is a nit, but it would be nice to see the typo fixed first.

images/capi/packer/gce/ci/nightly/overwrite-1-18.json Outdated Show resolved Hide resolved
images/capi/packer/gce/ci/nightly/overwrite-1-19.json Outdated Show resolved Hide resolved
images/capi/packer/gce/ci/nightly/overwrite-1-20.json Outdated Show resolved Hide resolved

# build image for 1.18
# using PACKER_FLAGS=-force to overwrite the previous image and keep the same name
su - packer -c "bash -c 'cd /home/prow/go/src/sigs.k8s.io/image-builder/images/capi && PATH=$PATH:~packer/.local/bin:/home/prow/go/src/sigs.k8s.io/image-builder/images/capi/.local/bin GCP_PROJECT_ID=$GCP_PROJECT GOOGLE_APPLICATION_CREDENTIALS=$GOOGLE_APPLICATION_CREDENTIALS PACKER_VAR_FILES=packer/gce/ci/nightly/overwrite-1-18.json PACKER_FLAGS=-force make deps-gce build-gce-all'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deps-gce is already a dependency of the gce-all build target, so adding that explicitly ins't strictly necessary. But I have no problem with it being listed out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will keep that just for visibility

su - packer -c "bash -c 'cd /home/prow/go/src/sigs.k8s.io/image-builder/images/capi && PATH=$PATH:~packer/.local/bin:/home/prow/go/src/sigs.k8s.io/image-builder/images/capi/.local/bin GCP_PROJECT_ID=$GCP_PROJECT GOOGLE_APPLICATION_CREDENTIALS=$GOOGLE_APPLICATION_CREDENTIALS PACKER_VAR_FILES=packer/gce/ci/nightly/overwrite-1-20.json PACKER_FLAGS=-force make deps-gce build-gce-all'"

echo "Displaying the generated image information"
filter="name~cluster-api-ubuntu-1804-*"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice this filter includes 1804 only, but the ci-gce script does not (just has cluster-api-ubuntu-*. Do we want to make this one the same so that it picks up 2004? I see that 20.04 Ubuntu builds are currently commented out in the Makefile (however, not completely because it still shows up in make help), but if we change this filter I think it would mean that you wouldn't need to change anything when Ubuntu 20.04 does get reinstated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your review and feedback @codenrhoden
made the changes requested

@codenrhoden
Copy link
Contributor

/test pull-azure-vhds
/lgtm

That windows Azure failure (I want to make some changes soon that are a little more sane about what tests run depending on what changed - there's no need for Azure tests in this case for example) doesn't quite look like a flake to me, but we'll see what happens.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 10, 2021
@spiffxp
Copy link
Member

spiffxp commented May 10, 2021

Have you all looked into https://cloud.google.com/build/docs/building/build-vm-images-with-packer at all? My glance of the CI scripts is that they boil down to packer calls

https://cloud.google.com/build/docs/building/build-vm-images-with-packer#required_iam_permissions is something we would want scripted for the projects that need this

https://cloud.google.com/build/docs/building/build-vm-images-with-packer#creating_a_packer_builder_image is something we might want to push to k8s-staging-test-infra or something

@cpanato
Copy link
Member Author

cpanato commented May 11, 2021

@spiffxp to be honest this is the first time I'm seeing this, looks very nice.

I tried the scripts using my personal GCP account and everything worked fine, and regarding the API I just enable the compute one.

I think this looks interesting, can we do that in a follow up maybe? I think we need to change some things in the image builder to make this works
but if you think this should be the approach for now, I can refactor everything, I don't have strong opinions here.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 16, 2021
@cpanato
Copy link
Member Author

cpanato commented May 16, 2021

/test pull-image-builder-gcp-all

@cpanato
Copy link
Member Author

cpanato commented May 16, 2021

I think we can override the azure-vhds tests and investigate that in a follow you because that is not part of this PR

WDYT @codenrhoden ?

and to close this PR need some final reviews/approval:
/assign @vincepri @detiber @CecileRobertMichon

@spiffxp
Copy link
Member

spiffxp commented May 16, 2021

I am ok merging this to unblock and finding an alternate approach as followup

@spiffxp
Copy link
Member

spiffxp commented May 16, 2021

I'll review more closely at keyboard tomorrow

@cpanato
Copy link
Member Author

cpanato commented May 17, 2021

I am ok merging this to unblock and finding an alternate approach as followup

I'm checking what are the changes need to apply here to use cloudbuild together with packer to build the image

@cpanato
Copy link
Member Author

cpanato commented May 17, 2021

open a 🧵 to discuss better the cloudbuild option to build CAPG GCE images: https://kubernetes.slack.com/archives/CCK68P2Q2/p1621237438253400

@codenrhoden
Copy link
Contributor

/lgtm
/approve

Agreed that merging this now and exploring other options as follow-up makes sense. Will see what happens with the flaky Azure test.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 18, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: codenrhoden, cpanato

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 18, 2021
@CecileRobertMichon
Copy link
Contributor

The error in the failed pull-azure-vhds job above is:

�[0;32m    vhd-windows-2019: fatal: [default]: FAILED! => {"changed": true, "cmd": "$exists=docker network ls -f name=host -q\nif (-not $exists) { docker network create -d nat host }", "delta": "0:00:01.455251", "end": "2021-05-16 08:12:06.223301", "msg": "non-zero return code", "rc": 1, "start": "2021-05-16 08:12:04.768050", "stderr": "error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/networks?filters=%7B%22name%22%3A%7B%22host%22%3Atrue%7D%7D: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.\nerror during connect: Post http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/networks/create: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.", "stderr_lines": ["error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/networks?filters=%7B%22name%22%3A%7B%22host%22%3Atrue%7D%7D: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.", "error during connect: Post http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/networks/create: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running."], "stdout": "", "stdout_lines": []}�[0m

@jsturtevant @perithompson have you seen this before?

@cpanato
Copy link
Member Author

cpanato commented May 19, 2021

will override this failure

@cpanato
Copy link
Member Author

cpanato commented May 19, 2021

/override pull-azure-vhds

@cpanato
Copy link
Member Author

cpanato commented May 19, 2021

I cant :/

/test pull-azure-vhds

@k8s-ci-robot k8s-ci-robot merged commit 517f780 into kubernetes-sigs:master May 19, 2021
@cpanato cpanato deleted the GH-437 branch May 19, 2021 08:43
@perithompson
Copy link
Contributor

The error in the failed pull-azure-vhds job above is:

�[0;32m    vhd-windows-2019: fatal: [default]: FAILED! => {"changed": true, "cmd": "$exists=docker network ls -f name=host -q\nif (-not $exists) { docker network create -d nat host }", "delta": "0:00:01.455251", "end": "2021-05-16 08:12:06.223301", "msg": "non-zero return code", "rc": 1, "start": "2021-05-16 08:12:04.768050", "stderr": "error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/networks?filters=%7B%22name%22%3A%7B%22host%22%3Atrue%7D%7D: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.\nerror during connect: Post http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/networks/create: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.", "stderr_lines": ["error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/networks?filters=%7B%22name%22%3A%7B%22host%22%3Atrue%7D%7D: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.", "error during connect: Post http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.40/networks/create: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running."], "stdout": "", "stdout_lines": []}�[0m

@jsturtevant @perithompson have you seen this before?

No this is a new one on me, looks like the docker service didn't start on this node?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add CI job for GCP
10 participants