Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add gather facts on all hosts in inventory #5936

Closed
wants to merge 4 commits into from

Conversation

LuckySB
Copy link
Contributor

@LuckySB LuckySB commented Apr 10, 2020

/kind bug
What this PR does / why we need it:
playbook failed with scale.yml

no facts gather for kube-masters group, but need in fallback_ip_base vars definition

Which issue(s) this PR fixes:
Fixes #5927

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 10, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LuckySB

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from bozzo and holmsten April 10, 2020 15:00
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 10, 2020
@LuckySB
Copy link
Contributor Author

LuckySB commented Apr 10, 2020

/assign @mattymo

@mattymo
Copy link
Contributor

mattymo commented Apr 10, 2020

This won't solve the related issue because you usually run scale.yml with --limit argument. As a result, you won't gather facts on all hosts here. We had previously a much worse task that used delegate_to to gather facts on all nodes. It slowed deployments tremendously and I had to remove it.

@LuckySB
Copy link
Contributor Author

LuckySB commented Apr 10, 2020

Then we need to refactor the assembly cycle for variable falback_ips_base

limit is evil

@EppO
Copy link
Contributor

EppO commented Apr 10, 2020

can you run the scale playbook without limit? what would it do on the other nodes?
Using a new variable to list the new nodes to add would help, no?

@fktkrt
Copy link
Contributor

fktkrt commented Apr 10, 2020

This won't solve the related issue because you usually run scale.yml with --limit argument. As a result, you won't gather facts on all hosts here. We had previously a much worse task that used delegate_to to gather facts on all nodes. It slowed deployments tremendously and I had to remove it.

Can using the setup module be a viable option to fine-tune this?

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 10, 2020
@mattymo
Copy link
Contributor

mattymo commented Apr 11, 2020

@LuckySB Correct. For some reason the lookup in hostvars is failing for nodes without cache. That should be improved to fall back to 127.0.0.1

Copy link
Contributor

@fktkrt fktkrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment regarding hostvars lookup.

@@ -49,7 +49,7 @@
setup:
delegate_to: "{{ item }}"
delegate_facts: True
when: hostvars[item]['ansible_machine_id'] is not defined
when: hostvars[item]['ansible_default_ipv4'] is not defined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be ansible_default_ipv4.address?

Or using something like this:
when: (hostvars[item] is defined) and (hostvars[item]['ansible_default_ipv4'] is defined) and (hostvars[item]['ansible_default_ipv4']['address'] is defined)

@EppO
Copy link
Contributor

EppO commented Apr 11, 2020

Do we have a CI test for the scale playbook? I know we have one for the upgrade but couldn't find anything for scale up/down.

@LuckySB
Copy link
Contributor Author

LuckySB commented Apr 11, 2020

@LuckySB Correct. For some reason the lookup in hostvars is failing for nodes without cache. That should be improved to fall back to 127.0.0.1

final result will depend on --limit flag
It is not idempotent

@LuckySB
Copy link
Contributor Author

LuckySB commented Apr 11, 2020

Do we have a CI test for the scale playbook? I know we have one for the upgrade but couldn't find anything for scale up/down.

no

@champtar
Copy link
Contributor

should we also add meta: clear_facts, so we are sure we don't have some old set_facts around ?

@LuckySB
Copy link
Contributor Author

LuckySB commented Apr 14, 2020

well.
try to transfer the fallback_ips_base variable assembly to
roles /kubespray-defaults/tasks
with delegate_to: localhost
and run_once: true.

and set_facts: fallback_ips: hostvar.localhost.fallback_ips

@LuckySB
Copy link
Contributor Author

LuckySB commented Apr 14, 2020

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 14, 2020
delegate_to: "{{ item }}"
delegate_facts: True
when: hostvars[item]['ansible_default_ipv4'] is not defined
with_items: "{{ groups['all'] }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to ignore errors, else we can't run the playbook when a host is down

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add new node using scale.yml fails
6 participants