Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MCS: change machine-config-server ports #368

Merged

Conversation

kikisdeliveryservice
Copy link
Contributor

@kikisdeliveryservice kikisdeliveryservice commented Feb 1, 2019

Change machine-config-server ports from 49500/49501 -> 22623/22624
to avoid conflict with local port and node port ranges.

Closes: #166
Related-to: openshift/installer#1180

@openshift-ci-robot openshift-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Feb 1, 2019
@cgwalters
Copy link
Member

Don't we need to listen on both ports at least temporarily? See #166 (comment)

@abhinavdahiya
Copy link
Contributor

Don't we need to listen on both ports at least temporarily? See #166 (comment)

+1

@kikisdeliveryservice kikisdeliveryservice changed the title change machine-config-server ports WIP: change machine-config-server ports Feb 1, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 1, 2019
@cgwalters
Copy link
Member

The problem is we'd need to land these PRs simultaneously for it to work, and there's no way to really do that today with our CI system.

Also, we want to be nice to e.g. people who built an installer from a few days ago and are pulling a new update payload.

So I think the path here is something like having the MCS listen on both ports, then merge the installer PR, and then drop the old port from this repo?

@kikisdeliveryservice
Copy link
Contributor Author

Thanks @cgwalters @abhinavdahiya ! Will do!

@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 1, 2019
@kikisdeliveryservice kikisdeliveryservice changed the title WIP: change machine-config-server ports MCS: change machine-config-server ports Feb 1, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 1, 2019
@kikisdeliveryservice
Copy link
Contributor Author

/retest

@ashcrow
Copy link
Member

ashcrow commented Feb 4, 2019

/retest

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 4, 2019
@cgwalters cgwalters changed the title MCS: change machine-config-server ports WIP: MCS: change machine-config-server ports Feb 4, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2019
@cgwalters cgwalters changed the title WIP: MCS: change machine-config-server ports MCS: change machine-config-server ports Feb 4, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2019
@cgwalters
Copy link
Member

/approve

Sorry about changing this to WIP, I had missed the updated patch.
It looks good to me.

Now let's talk about testing.

As far as I understand it in order to get "real" coverage of this one would need to build a custom update payload and run it through the installer. I'm guessing you didn't do that (I tried and failed at that...need to get back to trying again).

The test cases (e2e-aws and e2e-aws-op) though will boot a new cluster with this. Both are currently failing which could be flakes or could be real issues. At the moment it's only e2e-aws that will gather full artifacts from a failed install.

https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/368/pull-ci-openshift-machine-config-operator-master-e2e-aws/1455/artifacts/e2e-aws/bootstrap/bootkube.service
looks relevant here; we're failing to contact etcd on the new masters I think?

Not much more info than that I can find though 😦. If for example the booted masters weren't able to contact the original port they'd end up stalling in Ignition and I don't think we have a way to get logs out of that scenario today (other than possibly calling out to EC2 to scrape the console?).

@kikisdeliveryservice kikisdeliveryservice changed the title MCS: change machine-config-server ports WIP: MCS: change machine-config-server ports Feb 4, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2019
@cgwalters
Copy link
Member

/lgtm

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Feb 5, 2019
@kikisdeliveryservice kikisdeliveryservice changed the title MCS: change machine-config-server ports WIP: MCS: change machine-config-server ports Feb 5, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 5, 2019
@kikisdeliveryservice
Copy link
Contributor Author

this needs change in cmd/machine-config-server/bootstrap.go

Thanks @abhinavdahiya , I'll update this tomorrow.

@kikisdeliveryservice kikisdeliveryservice changed the title WIP: MCS: change machine-config-server ports MCS: change machine-config-server ports Feb 5, 2019
@kikisdeliveryservice kikisdeliveryservice removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 5, 2019
@cgwalters
Copy link
Member

/lgtm

Transition machine-config-server ports from 49500/49501 -> 22623/22624
to avoid conflict with local port and node port ranges. Listeners added
for legacy ports until installer transitions to using the new ports.

Closes: openshift#166
@kikisdeliveryservice
Copy link
Contributor Author

@cgwalters there is something wrong with CI so I repushed to try to kick it off and it removed your lgtm. Didn't help though.

@openshift-merge-robot
Copy link
Contributor

/retest

1 similar comment
@openshift-merge-robot
Copy link
Contributor

/retest

@cgwalters
Copy link
Member

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 6, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashcrow, cgwalters, kikisdeliveryservice

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [ashcrow,cgwalters,kikisdeliveryservice]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kikisdeliveryservice
Copy link
Contributor Author

/retest

@kikisdeliveryservice
Copy link
Contributor Author

/test e2e-aws-op

@kikisdeliveryservice
Copy link
Contributor Author

kikisdeliveryservice commented Feb 7, 2019

e2e-aws now passes, attempting to get the e2e-aws-op test unstuck from yesterday's run...
update: yay got it running!

@kikisdeliveryservice
Copy link
Contributor Author

So e2e-aws-op successfully got thru bootstrap to master and workers but timed out. Seeing in master logs:
Feb 07 00:59:53 ip-10-0-7-112 hyperkube[4433]: E0207 00:59:53.715901 4433 pod_workers.go:186] Error syncing pod 920aaf30-2a6f-11e9-8d74-0e1b167d187a ("apiserver-lxhz4_openshift-apiserver(920aaf30-2a6f-11e9-8d74-0e1b167d187a)"), skipping: timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-lxhz4". list of unmounted volumes=[serving-cert]. list of unattached volumes=[aggregator-client-ca client-ca config etcd-client etcd-serving-ca image-import-ca serving-cert openshift-apiserver-sa-token-6fdbt]

/test e2e-aws-op

@openshift-merge-robot openshift-merge-robot merged commit 89c82cf into openshift:master Feb 7, 2019
@joelsmith
Copy link
Contributor

@kikisdeliveryservice Thanks for this fix. I just ran into a conflict again on the (now legacy) port. Is there a separate issue tracking the removal of the legacy ports? Until they are fully removed, we still have the potential for problems where conflicts will prevent the machine config server from starting. If so, is there an ETA on when the legacy ports might be removed?

@runcom
Copy link
Member

runcom commented Feb 11, 2019

We never GA'ed, the eta should still be before 4.0 afaict, let's have an issue tracking it

@kikisdeliveryservice
Copy link
Contributor Author

@joelsmith The legacy port will be removed here once openshift/installer#1180 lands in the installer.

@joelsmith
Copy link
Contributor

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants