Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

needed changes to install 1-master-2-worker fedora coreos31 on libvirt #5987

Closed
wants to merge 4 commits into from
Closed

needed changes to install 1-master-2-worker fedora coreos31 on libvirt #5987

wants to merge 4 commits into from

Conversation

aanno
Copy link

@aanno aanno commented Apr 19, 2020

What type of PR is this?

This is not a PR.

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change
/kind cleanup
/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake

/kind bug

Instead I use it to document the changes I have needed to install kube based on fedora coreos f31 on 3 libvirt machines (1 master, 2 workers).

It fixes #5988

Rationale

First of all: Thank you very much for this great ansible stuff to set up kubernetes! I found it very helpful!

I tried to set up kube based on fedora coreos f31 on 3 libvirt machines (1 master, 2 workers). However, it was not possible because of a few glitches. But I was able to correct them.

Hence, here is what I did. I guess that I did not fix the problems the way they should be fixed. But I would like to learn what to 'really' to do and I'm really interested in some feedback.

PS:
I tried to do the cri-o variant install first, but I did not get it working. So I switched back to the docker variant.

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Apr 19, 2020
@k8s-ci-robot
Copy link
Contributor

Welcome @aanno!

It looks like this is your first PR to kubernetes-sigs/kubespray 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/kubespray has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 19, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @aanno. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aanno
To complete the pull request process, please assign atoms
You can assign the PR to them by writing /assign @atoms in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from bozzo and holmsten April 19, 2020 09:25
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Apr 19, 2020
cluster.yml Outdated
@@ -12,6 +12,8 @@
- check
vars:
ansible_connection: local
# TODO aanno: where is the right place for this?
kubelet_cgroup_driver: systemd
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the wrong place to set the variable. But what would be the right place?

Copy link
Contributor

@LuckySB LuckySB Apr 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You offer very bad things.

Assigning variables to a playbook is bad practice
Changing the script so that it works only on one os - even worse

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not in the change set any more (use 'Files changed' tab).

BTW, I asked were the right place to put that would be...

@@ -37,7 +37,7 @@

from collections import OrderedDict
from ipaddress import ip_address
from ruamel.yaml import YAML
from ruamel_yaml import YAML
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With ruamel.yaml I was not able to call the python script on my fedora 32 host. But ruamel_yaml has worked for me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is ok on my system, but it seems to fail in your CI pipeline. Has anyone an idea on this?

# name: libselinux-python3
name: python3-libselinux
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package name libselinux-python3 seems to be wrong with fedora 31 coreos (it is wrong with fedora 32 as well). It must be python3-libselinux. However, I have no idea when fedora changed the package name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this on F30 and F31, dnf install libselinux-python3 installs python3-libselinux

Like:

dnf install libselinux-python3
Last metadata expiration check: 0:01:01 ago on Sun Apr 19 13:17:57 2020.
Dependencies resolved.

Package Architecture Version Repository Size

Installing:
python3-libselinux x86_64 2.9-3.1.fc30 updates 162 k

Transaction Summary

Install 1 Package

Total download size: 162 k
Installed size: 635 k
Is this ok [y/N]:
So good to use the canonical package name. 👍

state: present
use: dnf
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, the install has not worked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate why boostrap-fedora.yml is called? With fedora coreos this yml should not not be called but bootstrap-fedora-coreos.yml.
for fedora-coreos:

- include_tasks: bootstrap-fedora-coreos.yml
  when: '"ID=fedora" in os_release.stdout and "VARIANT_ID=coreos" in os_release.stdout'
..
- include_tasks: bootstrap-fedora.yml
  when:
    - '"Fedora" in os_release.stdout'
    - '"VARIANT_ID=coreos" not in os_release.stdout'

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question. I'm afraid that the answer it: It from my very first tries: At the very beginning, I tried to install kube on my fedora 32 host system (as master node) and on 2 libvirt fedora coreos VMs (as workers). Hence it is a problem, but wrong in this PR?!?

Copy link
Contributor

@spaced spaced Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not any job fails with, im fine with..it may also be a fedora 32/ansible version support thing.
from ansible package:

use: The default 'auto' will use existing facts or try to autodetect it

and fedora 32 is not yet tested/supported by kubespray/ansible.

@@ -17,7 +17,7 @@ containerd_package: 'containerd.io'
containerd_cfg_dir: /etc/containerd

# Path to runc binary
runc_binary: /usr/sbin/runc
runc_binary: /usr/bin/runc
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the wrong place to set the variable. But what would be the right place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OS specific variables are in vars/*.yml, I don't think it's necessary to change the default value for all OS platforms.

@@ -5,4 +5,4 @@ crio_seccomp_profile: "/etc/crio/seccomp.json"

crio_cgroup_manager: "{{ kubelet_cgroup_driver | default('cgroupfs') }}"

crio_runc_path: "/usr/sbin/runc"
crio_runc_path: "/usr/bin/runc"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the wrong place to set the variable. But what would be the right place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OS specific variables are in vars/*.yml, I don't think it's necessary to change the default value for all OS platforms.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not in the change set any more (use 'Files changed' tab).

@@ -1,5 +1,6 @@
---
required_pkgs:
- libselinux-python3
# - libselinux-python3
- python3-libselinux
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package name libselinux-python3 seems to be wrong with fedora 31 coreos (it is wrong with fedora 32 as well). It must be python3-libselinux. However, I have no idea when fedora changed the package name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this on F30 and F31, dnf install libselinux-python3 installs python3-libselinux

Like:

dnf install libselinux-python3
Last metadata expiration check: 0:01:01 ago on Sun Apr 19 13:17:57 2020.
Dependencies resolved.
===========================================================================================
 Package                     Architecture    Version                Repository        Size
===========================================================================================
Installing:
 python3-libselinux          x86_64          2.9-3.1.fc30           updates          162 k

Transaction Summary
===========================================================================================
Install  1 Package

Total download size: 162 k
Installed size: 635 k
Is this ok [y/N]:

So good to use the canonical package name. 👍

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change set now renames all occurences of libselinux-python3 to python3-libselinux.

@@ -86,7 +86,7 @@
- "fedora-modular"

- name: Install cri-o packages with osttree
command: "rpm-ostree install {{ crio_packages|join(' ') }}"
command: "rpm-ostree install --idempotent {{ crio_packages|join(' ') }}"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the --idempotent I encountered problems if the package has already be installed from a proceeding (unsuccessful) ansible run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a check

- name: Check if already installed
  stat:
    path: "/bin/crio"
  register: need_bootstrap_crio
  when: is_ostree

to avoid install and reboot. Can you elaborate what encountered problems are?

Copy link
Author

@aanno aanno Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think it has stroked me because of my working flow: I tried to get kube running, stepped upon a problem, tried to fix it and then rerun the ansible scripts on the old VMs.

There are several packages involved in installing crio and if one of the is already installed (for whatever reason) rpm-ostree will indicate an error without the --idempotent. I think ansible scripts should be as idempotent as possible.

However, at the end I restraint from the crio/podman variant (I have not found a way to get it running). Hence it is a problem, but wrong in this PR?!?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, just wondering why i never struggled with and i just want understand why it was not idempotent before. if need_boostrap_crio == /bin/crio file exits, it will install and reboot, after rerun ansible script it should not try to reinstall it..

Comment on lines 14 to 15
# TODO aanno: should use kubelet_group_driver (if available)
ExecStart={{ bin_dir }}/kubelet --cgroup-driver=systemd \
Copy link
Author

@aanno aanno Apr 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On fedora coreos 31, kubelet will error out (and stop running) if --cgroup-driver=systemd is not given when started.

Of course, this way to get it running is all wrong. There should be a env variable like for all other options.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the kubelet_cgroup_driver var should be used instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not in the change set any more (use 'Files changed' tab).

@aanno
Copy link
Author

aanno commented Apr 19, 2020

I signed it (just to see if this triggers the cncf-cla switch...)

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Apr 19, 2020
@@ -1,5 +1,6 @@
---
required_pkgs:
- libselinux-python3
# - libselinux-python3
- python3-libselinux
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this on F30 and F31, dnf install libselinux-python3 installs python3-libselinux

Like:

dnf install libselinux-python3
Last metadata expiration check: 0:01:01 ago on Sun Apr 19 13:17:57 2020.
Dependencies resolved.
===========================================================================================
 Package                     Architecture    Version                Repository        Size
===========================================================================================
Installing:
 python3-libselinux          x86_64          2.9-3.1.fc30           updates          162 k

Transaction Summary
===========================================================================================
Install  1 Package

Total download size: 162 k
Installed size: 635 k
Is this ok [y/N]:

So good to use the canonical package name. 👍

Comment on lines 14 to 15
# TODO aanno: should use kubelet_group_driver (if available)
ExecStart={{ bin_dir }}/kubelet --cgroup-driver=systemd \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the kubelet_cgroup_driver var should be used instead.

@@ -5,4 +5,4 @@ crio_seccomp_profile: "/etc/crio/seccomp.json"

crio_cgroup_manager: "{{ kubelet_cgroup_driver | default('cgroupfs') }}"

crio_runc_path: "/usr/sbin/runc"
crio_runc_path: "/usr/bin/runc"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OS specific variables are in vars/*.yml, I don't think it's necessary to change the default value for all OS platforms.

@@ -17,7 +17,7 @@ containerd_package: 'containerd.io'
containerd_cfg_dir: /etc/containerd

# Path to runc binary
runc_binary: /usr/sbin/runc
runc_binary: /usr/bin/runc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OS specific variables are in vars/*.yml, I don't think it's necessary to change the default value for all OS platforms.

# name: libselinux-python3
name: python3-libselinux
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this on F30 and F31, dnf install libselinux-python3 installs python3-libselinux

Like:

dnf install libselinux-python3
Last metadata expiration check: 0:01:01 ago on Sun Apr 19 13:17:57 2020.
Dependencies resolved.

Package Architecture Version Repository Size

Installing:
python3-libselinux x86_64 2.9-3.1.fc30 updates 162 k

Transaction Summary

Install 1 Package

Total download size: 162 k
Installed size: 635 k
Is this ok [y/N]:
So good to use the canonical package name. 👍

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 19, 2020
@aanno
Copy link
Author

aanno commented Apr 19, 2020

@Miouge1 : thank you for reviewing this, I tried hard to improve things with your suggestions

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 19, 2020
cluster.yml Outdated
@@ -12,6 +12,8 @@
- check
vars:
ansible_connection: local
# TODO aanno: where is the right place for this?
kubelet_cgroup_driver: systemd
Copy link
Contributor

@LuckySB LuckySB Apr 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You offer very bad things.

Assigning variables to a playbook is bad practice
Changing the script so that it works only on one os - even worse

@@ -17,7 +17,7 @@ containerd_package: 'containerd.io'
containerd_cfg_dir: /etc/containerd

# Path to runc binary
runc_binary: /usr/bin/runc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not in the change set any more (use 'Files changed' tab).

@k8s-ci-robot k8s-ci-robot added do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 21, 2020
@spaced
Copy link
Contributor

spaced commented Apr 28, 2020

@aanno : the good part: crio is now tested with CI, bad part, the cri-o role was rewritten. i really recommend you to rebase

@k8s-ci-robot
Copy link
Contributor

Keywords which can automatically close issues and at(@) mentions are not allowed in commit messages.

The list of commits with invalid commit messages:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 30, 2020
@aanno
Copy link
Author

aanno commented Apr 30, 2020

@spaced: rebased this PR as you suggested

One of the commits has a wrong commit message but I don't know how to tackle the problem...

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 29, 2020
@k8s-ci-robot
Copy link
Contributor

@aanno: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@spaced
Copy link
Contributor

spaced commented Aug 11, 2020

@aanno: several places are reworked (crio), can you rebase this PR?

@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 10, 2020
@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

spaced pushed a commit to spaced/kubespray that referenced this pull request Jun 10, 2024
New Features:

NGINX 1.19.2
New configmap option enable-real-ip to enable realip_module
Use k8s.gcr.io vanity domain
Go 1.15
client-go v0.18.6
Migrate to klog v2
Changes:

 kubernetes-sigs#5887 Add force-enable-realip-module
 kubernetes-sigs#5888 Update dev-env.sh script
 kubernetes-sigs#5923 Fix error in grpcbin deployment and enable e2e test
 kubernetes-sigs#5924 Validate endpoints are ready in e2e tests
 kubernetes-sigs#5931 Add opentracing operation name settings
 kubernetes-sigs#5933 Update opentracing nginx module
 kubernetes-sigs#5946 Do not add namespace to cluster-scoped resources
 kubernetes-sigs#5951 Use env expansion to provide namespace in container args
 kubernetes-sigs#5952 Refactor shutdown e2e tests
 kubernetes-sigs#5957 bump fsnotify to v1.4.9
 kubernetes-sigs#5958 Disable enable-access-log-for-default-backend e2e test
 kubernetes-sigs#5984 Fix panic in ingress class validation
 kubernetes-sigs#5986 Migrate to klog v2
 kubernetes-sigs#5987 Fix wait times in e2e tests
 kubernetes-sigs#5990 Fix nginx command env variable reference
 kubernetes-sigs#6004 Update nginx to 1.19.2
 kubernetes-sigs#6006 Update nginx image
 kubernetes-sigs#6007 Update e2e-test-runner image
 kubernetes-sigs#6008 Rollback update of Jaeger library to 0.5.0 and update datadog to 1.2.0
 kubernetes-sigs#6014 Update go dependencies
 kubernetes-sigs#6039 Add configurable serviceMonitor metricRelabelling and targetLabels
 kubernetes-sigs#6046 Add new Dockerfile label org.opencontainers.image.revision
 kubernetes-sigs#6047 Increase wait times in e2e tests
 kubernetes-sigs#6049 Improve docs and logging for --ingress-class usage
 kubernetes-sigs#6052 Fix flaky e2e test
 kubernetes-sigs#6056 Rollback to Poll instead of PollImmediate
 kubernetes-sigs#6062 Adjust e2e timeouts
 kubernetes-sigs#6063 Remove file system paths executables
 kubernetes-sigs#6080 Use k8s.gcr.io vanity domain
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

needed changes to install 1-master-2-worker fedora coreos31 on libvirt
6 participants