Use the firewall role and the selinux role from the ha_cluster role #63
Conversation

nhosoi commented Aug 24, 2022 (edited)
- Introduce ha_cluster_manage_firewall to use the firewall role to manage the high-availability service and the fence-virt port. ha_cluster_manage_firewall is set to true by default.
- Introduce ha_cluster_manage_selinux to use the selinux role to manage the ports in the high-availability service. ha_cluster_manage_selinux is set to true by default.
- Add the test check task tasks/check_firewall_selinux.yml to verify the port status.
- Add meta/collection-requirements.yml.
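As an illustration, here is a minimal playbook sketch of how these two variables might be set explicitly; the host names, cluster name, and password are placeholders, not values taken from this PR:

```yaml
# Illustrative sketch only: apply the ha_cluster role with the new
# firewall/selinux management switches set explicitly.
- hosts: node1 node2
  vars:
    ha_cluster_cluster_name: my-new-cluster
    ha_cluster_hacluster_password: password
    ha_cluster_manage_firewall: true
    ha_cluster_manage_selinux: true
  roles:
    - linux-system-roles.ha_cluster
```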
README.md
Outdated

boolean, default: `false`

Manage the `firewall high-availability service` as well as the `fence-virt port`.
Now that firewall is not configured by default, README must also specify what users do when they want to manage firewall manually. I.e. enable the high-availability service and the 1229/tcp port.
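For instance, a hedged sketch of what such a manual step could look like with the firewall role (the exact wording and values that end up in the README may differ):

```yaml
# Sketch only: open the high-availability service and the fence-virt port
# (1229/tcp) directly with the firewall system role.
- name: Open HA cluster ports manually
  include_role:
    name: fedora.linux_system_roles.firewall
  vars:
    firewall:
      - service: high-availability
        state: enabled
        permanent: true
      - port: 1229/tcp
        state: enabled
        permanent: true
```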
If firewall was previously configured by default, then we should make the default value for `ha_cluster_manage_firewall` be `true`.
Right. That's what I was worried about. Otherwise `ha_cluster_manage_firewall: false` would wipe it out... I'm adding the code.
@richm, now I'm wondering how we could keep the ability to "disable" the ports... How about this?
ha_cluster_manage_firewall is not defined by default.

if (firewall is previously configured and ha_cluster_manage_firewall is not defined):
    ha_cluster_manage_firewall is set to true
if (firewall is not previously configured and ha_cluster_manage_firewall is not defined):
    skip the firewall management
    # Please note that this fits in the current behavior.
if ha_cluster_manage_firewall is true:
    enable firewall ports
else:
    disable firewall ports
Please take a look at 0ae86fb.
I would prefer to have `ha_cluster_manage_firewall` default to `true` and just have:

if ha_cluster_manage_firewall is true:
    enable firewall ports
else:
    disable firewall ports
I'm not sure what problem we're trying to solve by adding the more complex code.
If the default value is `true`, it always configures and enables the firewall, which is different from the current behavior. Is it ok?
> If the default value is `true`, it always configures and enables the firewall, which is different from the current behavior. Is it ok?

I think so - I think that's what the author originally intended (@tomjelinek ?), but we didn't have a firewall role at that time.
Assuming it's ok to always configure firewall, I pushed a new commit. Thanks!
> If the default value is `true`, it always configures and enables the firewall, which is different from the current behavior. Is it ok?
>
> I think so - I think that's what the author originally intended (@tomjelinek ?), but we didn't have a firewall role at that time.
Agreed. If we want the role to create a working cluster by default, and I think we do, then we need to open the high-availability ports in firewall by default.
Advanced users, who set custom corosync or other ports, can set `ha_cluster_manage_firewall` to `false` and run the firewall role with custom settings.
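A hedged sketch of that advanced-user setup; the custom port value is purely illustrative:

```yaml
# Illustrative only: disable the role's built-in firewall handling and
# open a custom corosync port with the firewall role instead.
- hosts: cluster_nodes
  vars:
    ha_cluster_manage_firewall: false
    firewall:
      - port: 5406/udp    # hypothetical custom corosync port
        state: enabled
        permanent: true
  roles:
    - fedora.linux_system_roles.firewall
    - fedora.linux_system_roles.ha_cluster
```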
Thank you for your input, @tomjelinek. Glad to hear it does not break our customers' needs.
README.md
Outdated

@@ -836,6 +836,12 @@ ha_cluster_constraints_ticket:
You may take a look at
[an example](#creating-a-cluster-with-resource-constraints).

#### `ha_cluster_manage_firewall`
In mssql, I use the `mssql_firewall_configure` variable for this purpose. That's because, historically, I was using the same format as `mssql_ha_configure`. Do we want to make this consistent across all roles?
`mssql_manage_firewall` would make more sense, however, because firewall is a service. I can change mssql. I did the previous release yesterday, so this should be ok.
I would put this variable description at the top, perhaps right below `ha_cluster_enable_repos`. That's where system-like settings are described. Here, at the bottom, cluster configuration is documented.
Thanks a lot, @tomjelinek! I've pushed another commit to fix it. (I will squash, or please do, before merging this PR.)
BTW, here's the list of the
Not sure - @tomjelinek ?
This PR will be updated once the selinux pr/122 is completed and merged.
f4f62ff to e7ed732
Notes:
Apart from the mentioned minor issues, this looks good to me.
defaults/main.yml
Outdated

# If true, manage the ports belonging to the high-availability service
# and the fence-virt using the selinux role.
ha_cluster_manage_selinux: true
Do we want this variable to be documented in README.md? If not, can we consider moving it to vars/main.yml?
Thanks a lot, @tomjelinek! I've pushed another commit to fix it. (I will squash, or please do, before merging this PR.)
Please feel free to improve the sentences.
833ad73 to 4f37b5c
Thanks so much for reviewing this PR, @tomjelinek! I've squashed the multiple commits into one. Actually, due to the BaseOS CI status, we have not run the CI tests yet.
[citest]
4f37b5c to 40d07ce
8a61047 to 451535f
You will probably need to rebase now that #64 is merged.
Thank you, @richm. I will. Before doing that, I'd like to discuss having two variables. The issue is when the ports are managed by the firewall service (in the ha_cluster case
tasks/selinux.yml
Outdated

'fence-agents-all' in ha_cluster_extra_packages
)

- name: Get the high-availability tcp service ports
You can use `firewall-cmd --info-service=high-availability` to get the ports - here is the output:

$ sudo firewall-cmd --info-service=high-availability
high-availability
  ports: 2224/tcp 3121/tcp 5403/tcp 5404/udp 5405-5412/udp 9929/tcp 9929/udp 21064/tcp
  protocols:
  source-ports:
  modules:
  destination:
  includes:
  helpers:
In fact, you can eliminate using `set_fact` altogether:
- name: get firewall ports
command: firewall-cmd --info-service=high-availability
register: __out
- name: construct selinux args
debug:
msg: selinux_ports {{ selinux_ports | to_nice_json }}
vars:
_state_value: enabled
__pat: '(?sm)^.* ports: ([^\n]+)\n.*$'
__port_list: "{{ __out.stdout | regex_replace(__pat, '\\1') | split |
map('split', '/') }}"
selinux_ports: |
{% set selinux_ports = [] %}
{% for port_proto in __port_list %}
{% set item = {'ports': port_proto[0], 'proto': port_proto[1],
'setype': 'cluster_port_t',
'state': _state_value, 'local': true} %}
{% set _ = selinux_ports.append(item) %}
{% endfor %}
{{ selinux_ports }}
This is for example purposes, but basically you construct the port list by regex and split on the firewall-cmd output, then pass that to a Jinja loop that constructs the arguments for `selinux_ports`.
This is the output:
selinux_ports [
{
"local": true,
"ports": "2224",
"proto": "tcp",
"setype": "cluster_port_t",
"state": "enabled"
},
{
"local": true,
"ports": "3121",
"proto": "tcp",
"setype": "cluster_port_t",
"state": "enabled"
},
{
"local": true,
"ports": "5403",
"proto": "tcp",
"setype": "cluster_port_t",
"state": "enabled"
},
{
"local": true,
"ports": "5404",
"proto": "udp",
"setype": "cluster_port_t",
"state": "enabled"
},
{
"local": true,
"ports": "5405-5412",
"proto": "udp",
"setype": "cluster_port_t",
"state": "enabled"
},
{
"local": true,
"ports": "9929",
"proto": "tcp",
"setype": "cluster_port_t",
"state": "enabled"
},
{
"local": true,
"ports": "9929",
"proto": "udp",
"setype": "cluster_port_t",
"state": "enabled"
},
{
"local": true,
"ports": "21064",
"proto": "tcp",
"setype": "cluster_port_t",
"state": "enabled"
}
]
Or - the ha_cluster role will have to use the
I think we should have both - there may be a case for the user wanting
Ok! You mean, it's ok to fail as long as we tell the customer to specify
Yes, in the README and in the error message.
Sorry... I must have been sleeping... In the current implementation, there's no way not to install
Yes, that's ok.
d523da3 to c0cd6fc
[citest]
c0cd6fc to 454082f
[citest]
454082f to def26e1
[citest]
README.md
Outdated

@@ -42,7 +42,7 @@ manage the firewall.

NOTE: `ha_cluster_manage_firewall` is limited to *adding* ports.
It cannot be used for *removing* ports.
-If you want to remove ports, you will need to use the firewall selinux system
+If you want to remove ports, you will need to use the firewall system
roles directly.
Suggested change: "roles directly." → "role directly."
README.md
Outdated

@@ -62,7 +62,7 @@ firewall is not installed, managing selinux policy is skipped.

NOTE: `ha_cluster_manage_selinux` is limited to *adding* policy.
It cannot be used for *removing* policy.
-If you want to remove policy, you will need to use the selinux selinux system
+If you want to remove policy, you will need to use the selinux system
roles directly.
Suggested change: "roles directly." → "role directly."
[citest]
f74ed70 to d598299
May take several hours for CI tests to run...
d598299 to ed61ee7
[citest]
[citest]
[citest]
Sorry, needed to rerun CI, we had an outage at that time.
  vars:
    selinux_ports: "{{ _ha_cluster_selinux }}"
  when:
    - '"firewalld.service" in ansible_facts.services'
hmm - seems this condition is not sufficient - not only does the firewalld.service have to be present, it must also be running - here is the state in the test: https://dl.fedoraproject.org/pub/alt/linuxsystemroles/logs/lsr-citool_ha_cluster-63-ed61ee7_CentOS-Stream-8_20220923-130045/artifacts/tests_cib_constraints_create-FAILED.log
"firewalld.service": {
"name": "firewalld.service",
"source": "systemd",
"state": "inactive",
"status": "disabled"
},
here is what a running service looks like:
"crond.service": {
"name": "crond.service",
"source": "systemd",
"state": "running",
"status": "enabled"
},
so maybe a condition like:

- '"firewalld.service" in ansible_facts.services'
- ansible_facts.services["firewalld.service"]["state"] == "running"
Thank you, @richm. I will fix it as you suggested.
I'm wondering why I could not capture this failure in this local test... (;_;)
tox -e qemu-ansible-core-2.13 -- --image-name rhel-9 --log-level info tests/tests_cib_constraints_create.yml
> Thank you, @richm. I will fix it as you suggested. I'm wondering why I could not capture this failure in this local test... (;_;)
>
> tox -e qemu-ansible-core-2.13 -- --image-name rhel-9 --log-level info tests/tests_cib_constraints_create.yml
It must have something to do with how the images are built. I'm guessing the qcow2 images we use from the composes are built with firewalld active by default, but the openstack images used by baseos ci do not have firewalld active by default?
Ok. I removed firewalld before running the test and made sure the test passed. (Don't worry, I do not plan to include it in the CI. :)

- name: Remove firewall package
  package:
    name: firewalld
    state: absent
ed61ee7 to e5ad980
- Introduce ha_cluster_manage_firewall to use the firewall role to manage the high-availability service and the fence-virt port. Default to false - meaning the firewall role is not used.
- Introduce ha_cluster_manage_selinux to use the selinux role to manage the ports in the high-availability service. Assign cluster_port_t to the high-availability service ports. Default to false - meaning the selinux role is not used.
- Add the test check task tasks/check_firewall_selinux.yml to verify the port status.
- Add meta/collection-requirements.yml.

Note: This PR changes the ha_cluster role's behavior slightly. It used to configure the firewall without any settings if the firewall service was enabled. With this change, unless ha_cluster_manage_firewall is set to true, the firewall is not configured.
e5ad980 to 27843f5
[citest]
@nhosoi looks good - ready to merge?
Yes, please merge. Thanks a lot for your review comments, @richm, @tomjelinek, and @spetrosi.
- name: Ensure the service and the ports status with the firewall role
  include_role:
    name: fedora.linux_system_roles.firewall
This causes the role to fail on my VMs as the firewall role cannot be found. It seems like ansible collections need to be installed for this to work. Is the rhel-system-roles package going to depend on ansible collections? Should this be linux-system-roles.firewall instead, so that it uses the firewall role from the rhel-system-roles package?
> This causes the role to fail on my VMs

How are you running your VMs? The tox-lsr tox plugin using `-e qemu[-ansible-core-2.x]` supports installing collection dependencies for the test.
We are following https://linux-system-roles.github.io/documentation/role-requirements.html
And if you use https://github.com/linux-system-roles/tox-lsr#qemu-testing e.g.

tox -e qemu-ansible-core-2.13 -- --image-name centos-9 tests/tests_cluster_basic_new_psks.yml

then tox will automatically install the dependencies from meta/collection-requirements.yml in your test environment (i.e. not in ~/.ansible/collections, so it does not "pollute" your local workstation).
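For reference, a hypothetical sketch of what such a requirements file might contain; the actual file added by this PR may list different collections or versions:

```yaml
# meta/collection-requirements.yml - illustrative sketch, not the file from this PR
collections:
  - fedora.linux_system_roles
```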
> as the firewall role cannot be found. It seems like ansible collections need to be installed for this to work.

Yes.

> Is the rhel-system-roles package going to depend on ansible collections?

For upstream (users installing the ha_cluster role from Galaxy), users will need to follow directions like this: https://github.com/linux-system-roles/storage/#requirements
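Those directions essentially boil down to installing the collection dependencies before running the role, roughly (the exact command in the linked README may differ):

    ansible-galaxy collection install -r meta/collection-requirements.yml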
@nhosoi The ha_cluster README needs to be updated with a Requirements section.

Users using the ha_cluster role as part of the fedora.linux_system_roles collection will not need to do this, nor will users using the role from the Fedora RPM package.
For downstream, the dependencies will be automatically handled, so the user will need to do nothing.

> Should this be linux-system-roles.firewall instead, so that it uses the firewall role from the rhel-system-roles package?

No, see above.
Ok, thanks for explaining. I don't use tox. I'm actually developing and testing the role on VMs which I use for developing pcs as well. Once I installed the collection on the VMs, everything started working just fine.
Thank you for the confirmation, @tomjelinek!
I'm updating the README.md with the new requirements and the examples as suggested by @richm and @spetrosi. (stay tuned...)
@tomjelinek, @richm, @spetrosi, could you please take a look at #69?
Thanks!