Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bump CNI to v0.6.0 #446

Closed
wants to merge 1 commit into from
Closed

Conversation

dixudx
Copy link
Member

@dixudx dixudx commented Oct 19, 2017

xref kubernetes/kubernetes#49480

The debs/rpms in kubernetes/release needs to be updated as well.

/cc @luxas

@k8s-ci-robot k8s-ci-robot requested a review from luxas October 19, 2017 02:18
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 19, 2017
@dixudx
Copy link
Member Author

dixudx commented Oct 20, 2017

/assign @luxas @ixdy

@ixdy
Copy link
Member

ixdy commented Oct 20, 2017

seems fine to me; not sure if there's someone who understand this better, though.

@dims
Copy link
Member

dims commented Oct 21, 2017

@ixdy +1 Let's ship it!

@dims
Copy link
Member

dims commented Oct 25, 2017

/assign @mikedanese

@dims
Copy link
Member

dims commented Oct 25, 2017

@mikedanese @luxas @ixdy - Can we please ship this?

@dims
Copy link
Member

dims commented Oct 25, 2017

cc @timothysc

Copy link
Member

@luxas luxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dixudx, LGTM
However, we can't merge this yet as we need to figure out how to make k8s v1.8 NOT pick up this change...

@dixudx
Copy link
Member Author

dixudx commented Oct 26, 2017

@luxas Seems we can't restrict installing kubelet (< v1.9) not to install kubernetes-cni (>= 0.6.0), if kubernetes-cni is not present on the machine.

@dims
Copy link
Member

dims commented Nov 2, 2017

@luxas So what's the plan here?

@timothysc
Copy link
Member

However, we can't merge this yet as we need to figure out how to make k8s v1.8 NOT pick up this change...

That's just a spec file version dep change >= 0.6.0 .
If the release repo is flat then there is a problem. If it's branched then it should be fine.

@enisoc
Copy link
Member

enisoc commented Nov 6, 2017

If the release repo is flat then there is a problem.

It's flat. :(

@david-mcmahon
Copy link
Contributor

There hasn't been a need so far to branch this repo.
Any branching comes with a (maintenance) cost of course. It would be good to keep this repo agnostic, with maybe a few branch-specific files. It looks like there's some precedence here for that.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 7, 2017
@dixudx
Copy link
Member Author

dixudx commented Nov 7, 2017

Thanks @david-mcmahon for the link.

@luxas @mikedanese Updated. PTAL. Thanks.

debian/build.go Outdated
if sv.GTE(v190) {
return fmt.Sprintf(">= %s", cniVersion), nil
}
return fmt.Sprintf("< %s", cniVersion), nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we pin to exactly 0.5.1 for k8s <v1.9? We probably don't want this to move the next time we bump cniVersion.

Copy link
Member Author

@dixudx dixudx Nov 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasonable. Agree.

curl -sSL --fail --retry 5 \
"https://dl.k8s.io/network-plugins/cni-{{ .Arch }}-$(CNI_RELEASE).tar.gz" \
| tar xz
"https://dl.k8s.io/network-plugins/cni-plugins-{{ .Arch }}-$(CNI_RELEASE).tgz" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change CNI_RELEASE to CNI_VERSION?

rpm/kubelet.spec Outdated
@@ -24,12 +24,12 @@ Source1: kubelet.service
Source2: https://dl.k8s.io/v%{KUBE_VERSION}/bin/linux/%{ARCH}/kubectl
Source3: https://dl.k8s.io/v%{KUBE_VERSION}/bin/linux/%{ARCH}/kubeadm
Source4: 10-kubeadm.conf
Source5: https://dl.k8s.io/network-plugins/cni-%{ARCH}-%{CNI_RELEASE}.tar.gz
Source5: https://dl.k8s.io/network-plugins/cni-plugins-%{ARCH}-%{CNI_VERESION}.tgz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo in CNI_VERSION.

rpm/kubelet.spec Outdated


BuildRequires: curl
Requires: iptables >= 1.4.21
Requires: kubernetes-cni >= 0.5.1
Requires: kubernetes-cni >= 0.6.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this need to be pinned for k8s <1.9 like for debs?

if sv.GTE(v190) {
return fmt.Sprintf(">= %s", cniVersion), nil
}
return fmt.Sprint("<= 0.5.1"), nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a little weird to have this constant floating here, but otherwise this looks fine to me.

@pipejakob
Copy link
Contributor

Hey @dixudx, I'm taking a look now, but I'm hoping you can comment on how much testing this has gone through. Is the current expectation that with this PR, previous versions of Kubernetes should have their CNI versions untouched, and upgrading/downgrading with 1.9.x works as expected?

Also, can we use the updated scripts to cut prerelease debs/rpms against 1.8.x and 1.9.x so that we can have independent verification that everything works as expected? Who would be the right person to QA the new packages from a networking perspective? Thanks!

@pipejakob
Copy link
Contributor

@dixudx In order to better understand the flow, I cloned your branch locally, but am actually seeing errors when trying to build RPMs for either 1.8.0 or 1.9.0. This build happens in a container, so the failures should be reproducible, and are certainly consistent in my local testing.

If I leave kubelet.spec as-is in your PR (at version 1.8.0), I get the following error:

release/rpm$ ./docker-build.sh
... snipped ...
warning: bogus date in %changelog: Mon Jun 30 2017 Mike Danese <mikedanese@google.com> - 1.7.0
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.DP641c
+ umask 022
+ cd /root/rpmbuild/BUILD
+ ln -s 10-kubeadm-post-1.8.conf /root/rpmbuild/SOURCES/x86_64/10-kubeadm.conf
+ Source5: https://dl.k8s.io/network-plugins/cni-amd64-0799f5732f2a11b329d9e3d51b9c8f2e3759f2ff.tar.gz
/var/tmp/rpm-tmp.DP641c: line 43: Source5:: command not found
error: Bad exit status from /var/tmp/rpm-tmp.DP641c (%prep)


RPM build errors:
    bogus date in %changelog: Mon Jun 30 2017 Mike Danese <mikedanese@google.com> - 1.7.0
    Bad exit status from /var/tmp/rpm-tmp.DP641c (%prep)

I think that the changelog warning is a red herring, and the real failure is around Source5.

If I change kubelet.spec to use 1.9.0 (and change the source URLs to actually use 1.9.0-beta.2, since the real release doesn't exist yet), I get a completely different error:

release/rpm$ ./docker-build.sh
... snipped ...
+ install -m 755 -d /root/rpmbuild/BUILDROOT/kubelet-1.9.0-1.x86_64/opt/cni/bin
+ mv ./ /root/rpmbuild/BUILDROOT/kubelet-1.9.0-1.x86_64/opt/cni/bin/
mv: cannot move './' to '/root/rpmbuild/BUILDROOT/kubelet-1.9.0-1.x86_64/opt/cni/bin/.': Device or resource busy
error: Bad exit status from /var/tmp/rpm-tmp.ndchbk (%install)


RPM build errors:
    bogus date in %changelog: Mon Jun 30 2017 Mike Danese <mikedanese@google.com> - 1.7.0
    Bad exit status from /var/tmp/rpm-tmp.ndchbk (%install)

Again, the changelog failure looks like a warning, and this time the error is from the mv command.

@pipejakob
Copy link
Contributor

I was able to overcome the 1.8.x build errors with a local patch to move the Source5 version switch to the appropriate place, but am less sure how the mv ./ line should be updated. I certainly think that attempting to move the current working directory is expected to be an error, and a simple test confirms:

$ mkdir /tmp/blah
$ cd /tmp/blah
$ mv . /tmp
mv: cannot move ‘.’ to ‘/tmp/.’: Device or resource busy

Even if we semantically wanted to move everything from the CWD into /opt/cni/bin, the CWD contains all of the other source files, like the kubeadm/kubectl/kubelet binaries, configuration files, etc., not just the contents of the cni-plugins archive, so this seems doubly-broken. I'll see if there's a better way to extract the archive to a stand-alone directory, or else we can just enumerate all of the individual files we want moved into /opt/cni/bin.

@pipejakob
Copy link
Contributor

Okay, I believe I have workarounds for the rpm build problems, and will open another PR with my patches for this diff. I still think it's worthwhile to set up test clusters using the 1.8 and 1.9 artifacts to vet that they're doing the right thing.

@pipejakob
Copy link
Contributor

Another issue I found: when building the rpms for Kubernetes 1.8.0, even though it looks like it pulls the old version of CNI, it will still name the output rpm kubernetes-cni-0.6.0-1.$ARCH.rpm (instead of 0.5.1), which is definitely going to cause problems. Looking at the diff, it doesn't seem like there's any version switching for the %package section where this gets set.

@pipejakob
Copy link
Contributor

I've created #486 to build on top of this PR and fix the rpm build problems I found. To view just my changes, see its second commit.

I would still like to do some cluster end-to-end tests using the OS packages before the real release, though.

@dixudx
Copy link
Member Author

dixudx commented Dec 12, 2017

@pipejakob Thanks for your work.

@pipejakob
Copy link
Contributor

I tried building and installing the debs using 1.9.0-beta.2, but get an error:

dpkg: dependency problems prevent configuration of kubelet:
 kubelet depends on kubernetes-cni (<= 0.5.1); however:
  Version of kubernetes-cni on system is 0.6.0-00.

I think the v190 var in getKubeletCNIVersion might need to be set to 1.9.0-alpha.0 to get the comparison right for prerelease versions?

@dixudx
Copy link
Member Author

dixudx commented Dec 12, 2017

I think the v190 var in getKubeletCNIVersion might need to be set to 1.9.0-alpha.0 to get the comparison right for prerelease versions?

@pipejakob This will help solve the problem. But I am not sure whether we should keep this in the later official release.

keep open on this. For such prerelease versions, this seems to be the only way to fix it.

@pipejakob
Copy link
Contributor

It also seems that if you try to build debs for older releases, it will always build CNI 0.6.0 instead of 0.5.1. I believe the kubelet dependency on kubernetes-cni is still correct, so I'm not sure how important it is to be able to build kubernetes-cni 0.5.1 on HEAD (since the 0.5.1 version is already published in our stable Debian repository).

@dixudx
Copy link
Member Author

dixudx commented Dec 12, 2017

I'm not sure how important it is to be able to build kubernetes-cni 0.5.1 on HEAD (since the 0.5.1 version is already published in our stable Debian repository)

Since we've already shipped kubernetes-cni 0.5.1, IMO there is no need to build that on HEAD.

@danehans
Copy link

@dixudx does #486 need to get merged in order to merge this pr?

@pipejakob
Copy link
Contributor

@dixudx If my additions in #486 look good to you, how about you just pull those commits into this PR so we can get the full change LGTMd and merged?

@pipejakob
Copy link
Contributor

@enisoc reminded me about a problem that was brought up earlier on this PR: preventing 1.8 users from accidentally picking up kubernetes-cni 0.6.0, since the old debs/rpms give the range ">= 0.5.1".

TL;DR: I think we should build new OS packages for older releases using the new, corrected CNI dependency, change that dependency to always be "=" instead of ">=" or "<=", and make sure we publish them before the 1.9 release.

Even if we rebuilt old 1.8.x artifacts to fix the cni dependency to use "= 0.5.1" or "<= 0.5.1", I don't think we can overwrite them in the repository in such a way that client-side caching is going to pick up the new definitions for package versions it already knows about. I'll do some testing to verify that. It also seems a little dirty to overwrite previously-published versions anyway. However, we can roll forward by publishing new versions with a new package revision and optionally deleting the old ones (although I don't think that's really necessary).

How about this plan: we build new debs/rpms for old < 1.9 releases using a new package revision (e.g. if we already published 1.8.4-00, then we create a new 1.8.4-01), like we do when we find an error in the OS packaging that needs fixing, but doesn't actually require Kubernetes changes. These artifacts will have the corrected kubernetes-cni dependency, and will be published to the repositories first, ahead of the 1.9.0 / CNI 0.6.0 OS packages release.

Then afterward, we publish the 1.9.0 and CNI 0.6.0 OS packages. If my understanding is correct, this will leave no window in time where a full system OS package upgrade would erroneously pick up kubernetes-cni 0.6.0 while leaving kubelet < 1.9, and a user would have to go out of their way to get into a bad state. These are the two cases I think that could leave a user in a bad state:

  1. User already has Kubernetes 1.8.x (or earlier) installed. Instead of doing apt-get upgrade of everything, the user chooses to specifically do apt-get upgrade kubernetes-cni.
  2. User doesn't have Kubernetes installed, and deliberately wants to use 1.8.x instead of 1.9, so they do apt-get install kubelet=1.8.4-00, but select an older package revision instead of the newest one.

I think (1) is unlikely because we have kept CNI frozen for so long, I can't imagine someone specifically wanting to upgrade CNI while keeping Kubernetes untouched. It is much more common to do apt-get upgrade or apt-get dist-upgrade to upgrade all available packages, and these will handle our new dependencies fine. I think wanting to deliberately install an older version of Kubernetes might be more likely, but (2) can be prevented completely if we remove these old versions from the repository when we publish their replacements. I still think (2) is unlikely, though, since someone who has picky requirements likely also knows how to use commands like apt-cache madison to print all available versions of a package, and they would see the newer 1.8.4-01 entry and opt for that instead.

I also think that none of this would have been necessary if our dependencies used = <version> instead of >= <version> to begin with for CNI. I don't think we ever had a usecase where it was desirable to upgrade CNI independently of kubelet, so I would err on the side of always using kubernetes-cni = <version> until we have a real-world request to support more flexibility, which we could always add in future package builds when we understand that usecase.

@enisoc
Copy link
Member

enisoc commented Dec 12, 2017

@pipejakob That plan SGTM.

@jdumars
Copy link
Member

jdumars commented Dec 12, 2017 via email

@danehans
Copy link

@pipejakob thank you for taking ownership of this pkg'ing issue.

pipejakob added a commit to pipejakob/release that referenced this pull request Dec 13, 2017
@pipejakob
Copy link
Contributor

I'm working on a script to backfill the old debs/rpms with new builds with the updated dependency. Unfortunately, since this process is still Google-internal, I can only do an internal code/process review for it.

@pipejakob
Copy link
Contributor

As an update, I've sent out my backfill script for review internally, and it has already been running to rebuild old versions. I've been spot-checking the artifacts, and things look good, so when it's done, I can upload them to our unstable channels for more testing before promoting to stable.

It would be great to get this and #486 checked in, though. I don't have permissions to the release repo, and I don't think we have a bot running here to autosubmit, so we still need someone else to review/approve. @ixdy, do you have bandwidth to review these?

@danehans
Copy link

@thockin can you help with the request from @pipejakob to bring cni 0.6.0 over the finish line?

@enisoc
Copy link
Member

enisoc commented Dec 14, 2017

I believe we included these commits in #486 which has been merged.

@enisoc enisoc closed this Dec 14, 2017
@dixudx dixudx deleted the bump_cni_v0.6.0 branch December 15, 2017 01:23
justaugustus pushed a commit to justaugustus/release that referenced this pull request Jun 10, 2019
justaugustus pushed a commit to justaugustus/release that referenced this pull request Jun 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.