releng: Teach jobs using latest-fast marker to extract from K8s Infra #19841

Merged
merged 1 commit into from
Nov 4, 2020

Conversation

justaugustus
Member

@justaugustus justaugustus commented Nov 4, 2020

As we continue to migrate release-blocking jobs to a dedicated K8s Infra
cluster, jobs that use the latest-fast marker need to extract builds
from gs://k8s-release-dev, which is the K8s Infra equivalent of
gs://kubernetes-release-dev.

A new flag (--extract-ci-bucket=k8s-release-dev) was added to support
this transitional use case, so we employ it here.

Exceptions:
Of note, the ci-kubernetes-e2e-gce-master-new-gci-kubectl-skew-serial job
is not included in this PR.

This job does two extractions:

  • --extract=ci/k8s-stable1
  • --extract=ci/latest-fast

As the generic version markers (like k8s-stable1) have not been
migrated to K8s Infra, this job cannot take advantage of the flag yet.

We'll plan to fix this job in a follow-up.
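
The rule described above can be sketched as a small check, assuming a job's arguments are available as a plain list of strings. The helper names `extract_markers` and `can_use_ci_bucket_flag` are hypothetical and not part of test-infra; only the flag and marker names come from this PR.

```python
def extract_markers(args):
    """Return the version markers named by --extract=... arguments."""
    prefix = "--extract="
    return [a[len(prefix):] for a in args if a.startswith(prefix)]


def can_use_ci_bucket_flag(args):
    """True only if the job extracts something and every marker is ci/latest-fast."""
    markers = extract_markers(args)
    return bool(markers) and all(m == "ci/latest-fast" for m in markers)


# A job that only extracts latest-fast is eligible.
fast_only = ["--scenario=kubernetes_e2e", "--", "--extract=ci/latest-fast"]
# The skew job also extracts a generic marker, so it is not.
skew = ["--extract=ci/k8s-stable1", "--extract=ci/latest-fast"]

print(can_use_ci_bucket_flag(fast_only))  # True
print(can_use_ci_bucket_flag(skew))       # False
```

The check is deliberately strict: a single non-fast marker disqualifies the job, which matches why the skew job is left for a follow-up.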

(May mitigate #19838.)

This is part of migrating release-blocking jobs to K8s Infra (ref: #19484, #18549).


Previous thought process (2b282b3) is explained here:

This reverts commit f1fd414. (ref: #19660).

tl;dr of what's happening is that any existing job that extracts the latest-fast version marker is extracting the marker from gs://kubernetes-release-dev/ci/fast (Google Infra) instead of gs://k8s-release-dev/ci/fast (K8s Infra), which means they are using stale builds.

We need to make an accompanying change to the extract logic, which will take some time to roll out new images for.
This is the quickest course of action in the meantime.
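
To make the stale-build problem concrete, here is a minimal sketch of which bucket path a latest-fast job ends up reading. It assumes the bucket defaults to kubernetes-release-dev when no flag is passed, as the tl;dr above implies. `ci_fast_location` is an illustrative helper, not kubetest's actual extract logic.

```python
# Assumed default (Google Infra), per the tl;dr above.
DEFAULT_CI_BUCKET = "kubernetes-release-dev"


def ci_fast_location(args):
    """Return the gs:// path a latest-fast job would read from, given its args."""
    bucket = DEFAULT_CI_BUCKET
    flag = "--extract-ci-bucket="
    for arg in args:
        if arg.startswith(flag):
            bucket = arg[len(flag):]  # an explicit flag overrides the default
    return f"gs://{bucket}/ci/fast"


# Without the flag the job reads the Google Infra bucket.
print(ci_fast_location(["--extract=ci/latest-fast"]))
# With the flag it reads the K8s Infra bucket.
print(ci_fast_location(["--extract=ci/latest-fast",
                        "--extract-ci-bucket=k8s-release-dev"]))
```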

/assign @cpanato @saschagrunert @hasheddan
/priority critical-urgent
cc: @kubernetes/sig-scalability @kubernetes/release-engineering

@k8s-ci-robot k8s-ci-robot added the labels priority/critical-urgent, cncf-cla: yes, size/M, and approved Nov 4, 2020
@k8s-ci-robot k8s-ci-robot added the labels area/config, area/jobs, area/release-eng, sig/release, and sig/testing Nov 4, 2020
@justaugustus
Member Author

@kubernetes/ci-signal -- please keep an eye on these jobs.

@justaugustus
Member Author

justaugustus commented Nov 4, 2020

Actually... @spiffxp may have already provided a mechanism for fixing this with the --extract-ci-bucket=k8s-release-dev flag.

ref: #19484 (comment)

Now that #19634 has confirmed the new bucket works, remaining steps are:

* wait for #19631 to merge or bump kubekins in jobs to be migrated (see #19632)

* any jobs that have `--extract=ci/latest-fast` should have `--extract-ci-bucket=k8s-release-dev` added (see #19634)

* confirm migrated jobs work

* update canary to match old job name and annotations, delete old job

Let me rework this PR.
/hold
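
The second step in the list above (adding the bucket flag wherever `--extract=ci/latest-fast` appears) could be sketched roughly as follows. This assumes job arguments are a flat list of strings and that the job's only extraction is latest-fast. `add_ci_bucket_flag` is a hypothetical helper, not an existing test-infra tool.

```python
CI_BUCKET_FLAG = "--extract-ci-bucket=k8s-release-dev"


def add_ci_bucket_flag(args):
    """Return a copy of args with the bucket flag placed right after
    --extract=ci/latest-fast. Args that already set a bucket are left as-is."""
    if any(a.startswith("--extract-ci-bucket=") for a in args):
        return list(args)
    out = []
    for arg in args:
        out.append(arg)
        if arg == "--extract=ci/latest-fast":
            out.append(CI_BUCKET_FLAG)
    return out


before = ["--scenario=kubernetes_e2e", "--", "--extract=ci/latest-fast"]
after = add_ci_bucket_flag(before)
print(after)
# Running it a second time changes nothing.
print(add_ci_bucket_flag(after) == after)
```

Placing the flag directly after the extract argument mirrors the layout used in the diff in this PR.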

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold label Nov 4, 2020
Member

@saschagrunert saschagrunert left a comment


/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Nov 4, 2020
Member

@cpanato cpanato left a comment


/lgtm

@justaugustus justaugustus force-pushed the revert-fast-k8s-infra branch from 2b282b3 to c6cd9b7 Compare November 4, 2020 14:05
@k8s-ci-robot k8s-ci-robot added the labels size/S, area/provider/gcp, sig/cloud-provider, sig/network, and sig/scalability, and removed the labels lgtm, size/M, and approved Nov 4, 2020
@justaugustus justaugustus changed the title from "releng: Revert moving build-fast jobs running on K8s Infra" to "releng: Teach jobs using latest-fast marker to extract from K8s Infra" Nov 4, 2020
@@ -23,6 +23,7 @@ periodics:
 - --scenario=kubernetes_e2e
 - --
 - --extract=ci/latest-fast
+- --extract-ci-bucket=k8s-release-dev
Contributor


@justaugustus do we always want to get version markers from k8s-release-dev now, or do some jobs (not necessarily owned by sig-release) still require kubernetes-release-dev?

Member Author


@hasheddan -- explained in the updated description, but pasting to fire the email notification:

As we continue to migrate release-blocking jobs to a dedicated K8s Infra
cluster, jobs that use the latest-fast marker need to extract builds
from gs://k8s-release-dev, which is the K8s Infra equivalent of
gs://kubernetes-release-dev.

A new flag (--extract-ci-bucket=k8s-release-dev) was added to support
this transitional use case, so we employ it here.

Contributor


@justaugustus gotcha 👍 was just wondering if we were ready to modify the extract logic to use this by default

Member Author


@hasheddan -- Not yet. The primary (non-fast) build jobs still need to be migrated. That's being tracked in #19483.

As we continue to migrate release-blocking jobs to a dedicated K8s Infra
cluster, jobs that use the latest-fast marker need to extract builds
from gs://k8s-release-dev, which is the K8s Infra equivalent of
gs://kubernetes-release-dev.

A new flag ('--extract-ci-bucket=k8s-release-dev') was added to support
this transitional use case, so we employ it here.

Exceptions:
Of note, the ci-kubernetes-e2e-gce-master-new-gci-kubectl-skew-serial is
not included in this PR.

This job does two extractions:
- --extract=ci/k8s-stable1
- --extract=ci/latest-fast

As the generic version markers (like 'k8s-stable1') have not been
migrated to K8s Infra, we cannot take advantage of this flag.

We'll plan to fix this job in a follow-up.

Signed-off-by: Stephen Augustus <saugustus@vmware.com>
@justaugustus justaugustus force-pushed the revert-fast-k8s-infra branch from c6cd9b7 to 0cb4807 Compare November 4, 2020 14:10
Member

@xmudrii xmudrii left a comment


/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Nov 4, 2020
@justaugustus
Member Author

Need a config/ approver now...
/assign @dims

Contributor

@hasheddan hasheddan left a comment


/lgtm

@wojtek-t
Member

wojtek-t commented Nov 4, 2020

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cpanato, hasheddan, justaugustus, saschagrunert, wojtek-t, xmudrii


@k8s-ci-robot k8s-ci-robot added the approved label Nov 4, 2020
@wojtek-t
Member

wojtek-t commented Nov 4, 2020

/hold

@justaugustus - are we sure that this will fix all jobs?

@wojtek-t
Member

wojtek-t commented Nov 4, 2020

/hold cancel

copying from slack discussion:

so I guess it only impacts those that are using "--extract=ci/latest-fast" (as opposed to "--extract=ci/latest", which the majority of jobs are still using), right?
exactly 🙂

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold label Nov 4, 2020
@k8s-ci-robot k8s-ci-robot merged commit fbee538 into kubernetes:master Nov 4, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.20 milestone Nov 4, 2020
@k8s-ci-robot
Contributor

@justaugustus: Updated the job-config configmap in namespace default at cluster test-infra-trusted using the following files:

  • key gce-conformance.yaml using file config/jobs/kubernetes/sig-cloud-provider/gcp/gce-conformance.yaml
  • key gcp-gce.yaml using file config/jobs/kubernetes/sig-cloud-provider/gcp/gcp-gce.yaml
  • key gpu-gce.yaml using file config/jobs/kubernetes/sig-cloud-provider/gcp/gpu/gpu-gce.yaml
  • key sig-network-misc.yaml using file config/jobs/kubernetes/sig-network/sig-network-misc.yaml
  • key sig-scalability-release-blocking-jobs.yaml using file config/jobs/kubernetes/sig-scalability/sig-scalability-release-blocking-jobs.yaml


@justaugustus
Member Author

@justaugustus - are we sure that this will fix all jobs?

Fairly certain!
ci-kubernetes-e2e-gci-gce is already using this flag, and its most recent successful run shows a build version a few commits ahead of v1.20.0-beta.1 (which was released yesterday).

W1104 14:02:14.106] 2020/11/04 14:02:14 extract_k8s.go:295: U=https://storage.googleapis.com/k8s-release-dev/ci/fast R=v1.20.0-beta.1.49+71fea80155e5c1 get-kube.sh
W1104 14:02:14.107] 2020/11/04 14:02:14 process.go:153: Running: ./get-kube.sh
I1104 14:02:14.207] Downloading kubernetes release v1.20.0-beta.1.49+71fea80155e5c1
I1104 14:02:14.207]   from https://storage.googleapis.com/k8s-release-dev/ci/fast/v1.20.0-beta.1.49+71fea80155e5c1/kubernetes.tar.gz
I1104 14:02:14.208]   to /workspace/kubernetes.tar.gz
W1104 14:02:15.408] Copying gs://k8s-release-dev/ci/fast/v1.20.0-beta.1.49+71fea80155e5c1/kubernetes.tar.gz...
W1104 14:02:15.559] / [0 files][    0.0 B/498.3 KiB]
/ [1 files][498.3 KiB/498.3 KiB]
W1104 14:02:15.560] Operation completed over 1 objects/498.3 KiB.
I1104 14:02:15.821] Unpacking kubernetes release v1.20.0-beta.1.49+71fea80155e5c1
I1104 14:02:15.889] Kubernetes release: v1.20.0-beta.1.49+71fea80155e5c1 
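
One way to check a run like this mechanically is to pull the `U=` (base URL) and `R=` (release version) fields out of the extract log line and confirm the bucket. The sketch below assumes the line format shown in the log above. `parse_extract_line` and `bucket_of` are illustrative names, not kubetest functions.

```python
import re

# The extract line from the log above, split across string literals.
LINE = ("W1104 14:02:14.106] 2020/11/04 14:02:14 extract_k8s.go:295: "
        "U=https://storage.googleapis.com/k8s-release-dev/ci/fast "
        "R=v1.20.0-beta.1.49+71fea80155e5c1 get-kube.sh")


def parse_extract_line(line):
    """Return (base_url, version) from an extract log line, or None if absent."""
    match = re.search(r"\bU=(\S+)\s+R=(\S+)", line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def bucket_of(url):
    """Return the bucket name from a storage.googleapis.com URL."""
    prefix = "https://storage.googleapis.com/"
    if not url.startswith(prefix):
        raise ValueError(f"unexpected URL: {url}")
    return url[len(prefix):].split("/", 1)[0]


base_url, version = parse_extract_line(LINE)
print(bucket_of(base_url))  # k8s-release-dev
# Rebuild the tarball URL reported in the "Downloading" lines of the log.
print(f"{base_url}/{version}/kubernetes.tar.gz")
```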

Labels
approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
area/config (Issues or PRs related to code in /config)
area/jobs
area/provider/gcp (Issues or PRs related to gcp provider)
area/release-eng (Issues or PRs related to the Release Engineering subproject)
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
lgtm ("Looks good to me", indicates that a PR is ready to be merged.)
priority/critical-urgent (Highest priority. Must be actively worked on as someone's top priority right now.)
sig/cloud-provider (Categorizes an issue or PR as relevant to SIG Cloud Provider.)
sig/network (Categorizes an issue or PR as relevant to SIG Network.)
sig/release (Categorizes an issue or PR as relevant to SIG Release.)
sig/scalability (Categorizes an issue or PR as relevant to SIG Scalability.)
sig/testing (Categorizes an issue or PR as relevant to SIG Testing.)
size/S (Denotes a PR that changes 10-29 lines, ignoring generated files.)

8 participants