Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the firewall role and the selinux role from the ha_cluster role #63

Merged
merged 1 commit into from
Sep 26, 2022

Conversation

nhosoi
Copy link
Contributor

@nhosoi nhosoi commented Aug 24, 2022

  • Introduce ha_cluster_manage_firewall to use the firewall role to manage the high-availability service and the fence-virt port. ha_cluster_manage_firewall is set to true, by default.
  • Introduce ha_cluster_manage_selinux to use the selinux role to manage the ports in the high-availability service. ha_cluster_manage_selinux is set to true, by default.
  • Add the test check task tasks/check_firewall_selinux.yml for verify the ports status.
  • Add meta/collection-requirements.yml.

README.md Outdated

boolean, default: `false`

Manage the `firewall high-availability service` as well as the `fence-virt port`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that firewall is not configured by default, README must also specify what users do when they want to manage firewall manually. I.e. enable the high-availability service and the 1229/tcp port.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If firewall was previously configured by default, then we should make the default value for ha_cluster_manage_firewall to be true

Copy link
Contributor Author

@nhosoi nhosoi Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. That's what I was worried. Otherwise ha_cluster_manage_firewall: false would wipe it out... I'm adding the code.

@richm, now I'm wondering how we could keep the ability to "disable" the ports... How about this?

ha_cluster_manage_firewall is not defined, by default.

if (firewall is previously configured and
    ha_cluster_manage_firewall is not defined):
    ha_cluster_manage_firewall is set to true.
if (firewall is not previously configured and
    ha_cluster_manage_firewall is not defined):
    Skip the firewall management.
# Please note that this fits in the current behavior.

if ha_cluster_manage_firewall is true:
    enable firewall ports
else:
    disable firewall ports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at 0ae86fb.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to have ha_cluster_manage_firewall default to true
and just have

if ha_cluster_manage_firewall is true:
    enable firewall ports
else:
    disable firewall ports

I'm not sure what is the problem that we're trying to solve by adding the more complex code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the default value is true, it always configures and enables the firewall, which is different from the current behavior. Is it ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the default value is true, it always configures and enables the firewall, which is different from the current behavior. Is it ok?

I think so - I think that's what the author originally intended (@tomjelinek ?), but we didn't have a firewall role at that time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming it's ok to always configure firewall, I pushed a new commit. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the default value is true, it always configures and enables the firewall, which is different from the current behavior. Is it ok?

I think so - I think that's what the author originally intended (@tomjelinek ?), but we didn't have a firewall role at that time.

Agreed. If we want the role to create a working cluster by default, and I think we do, then we need to open the high-availability ports in firewall by default.

Advanced users, who set custom corosync or other ports, can set ha_cluster_manage_firewall to false and run the firewall role with custom settings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your input, @tomjelinek. Glad to hear it does not break our customers' needs.

tasks/firewall.yml Outdated Show resolved Hide resolved
README.md Outdated
@@ -836,6 +836,12 @@ ha_cluster_constraints_ticket:
You may take a look at
[an example](#creating-a-cluster-with-resource-constraints).

#### `ha_cluster_manage_firewall`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In mssql, I use mssql_firewall_configure variable for this purpose. That's because historically, I was using the same format for mssql_ha_configure. Do we want to make this consistent across all roles?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mssql_manage_firewall would make more sense however because firewall is a service. I can change mssql. I did the previous release yesterday, so this should be ok.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this variable description to the top, perhaps right bellow ha_cluster_enable_repos. That's where system-like settings are described. Here, at the bottom, cluster configuration is documented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, @tomjelinek! I've pushed another commit to fix it. (I will or please do squash before merging this pr.)

@nhosoi
Copy link
Contributor Author

nhosoi commented Aug 24, 2022

BTW, here's the list of the semanage port -l result of the ports associated with this firewall role support. As listed, most of them are not managed for selinux. (Please note that 1229, 5404 and 5405 are defined in the policy.) Do we want to add the selinux support to the rest of the ports? (I'd think the answer is yes.) If so, what port type we should give? netsupport_port_t?

# semanage port -l | egrep "1229|2224|3121|5403|5404|5405|21064|9929"
netsupport_port_t              tcp      5404, 5405
netsupport_port_t              udp      5404, 5405
zented_port_t                  tcp      1229
zented_port_t                  udp      1229

@richm
Copy link
Contributor

richm commented Aug 24, 2022

BTW, here's the list of the semanage port -l result of the ports associated with this firewall role support. As listed, most of them are not managed for selinux. (Please note that 1229, 5404 and 5405 are defined in the policy.) Do we want to add the selinux support to the rest of the ports? (I'd think the answer is yes.) If so, what port type we should give? netsupport_port_t?

# semanage port -l | egrep "1229|2224|3121|5403|5404|5405|21064|9929"
netsupport_port_t              tcp      5404, 5405
netsupport_port_t              udp      5404, 5405
zented_port_t                  tcp      1229
zented_port_t                  udp      1229

Not sure - @tomjelinek ?

@nhosoi
Copy link
Contributor Author

nhosoi commented Aug 25, 2022

This PR will be updated once this selinux pr/122 is completed and merged.

@nhosoi nhosoi changed the title [WIP] Use the firewall role from the ha_cluster role [WIP] Use the firewall role and the selinux role from the ha_cluster role Sep 1, 2022
@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 1, 2022

Notes:

@nhosoi nhosoi changed the title [WIP] Use the firewall role and the selinux role from the ha_cluster role Use the firewall role and the selinux role from the ha_cluster role Sep 1, 2022
Copy link
Member

@tomjelinek tomjelinek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the mentioned minor issues this looks good to me.


# If true, manage the ports belonging to the high-availability service
# and the fence-virt using the selinux role.
ha_cluster_manage_selinux: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this variable to be documented in README.md? If not, can we consider moving it to vars/main.yml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, @tomjelinek! I've pushed another commit to fix it. (I will or please do squash before merging this pr.)

Please feel free to improve the sentences.

README.md Outdated
@@ -836,6 +836,12 @@ ha_cluster_constraints_ticket:
You may take a look at
[an example](#creating-a-cluster-with-resource-constraints).

#### `ha_cluster_manage_firewall`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this variable description to the top, perhaps right bellow ha_cluster_enable_repos. That's where system-like settings are described. Here, at the bottom, cluster configuration is documented.

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 2, 2022

Thanks so much for reviewing this pr, @tomjelinek! I've squashed the multiple commits into one.

Actually, due to the BaseOS CI status, we have not run the CI tests yet.
I'm going to launch it manually. This pr requires new enhancements in the BaseOS CI. Unless they are provisioned in the production system, the tests for this pr would fail. I'd think we should wait to merge this pr until the tests on the BaseOS CI env successfully pass.

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 2, 2022

[citest]

@richm
Copy link
Contributor

richm commented Sep 8, 2022

You will probably need to rebase now that #64 is merged

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 8, 2022

You will probably need to rebase now that #64 is merged

Thank you, @richm. I will.

Before doing that, I'd like to discuss about having two variables ha_cluster_manage_firewall and ha_cluster_manage_selinux, which is shared with other roles, such as metrics and vpn (so far :).

The issue is when the ports are managed by the firewall service (in ha_cluster case high-availability), we read the firewall service xml file to get the port numbers. If ha_cluster_manage_firewall is false, it may no exist unless the firewalld is already installed on the managed host prior to running the ha_cluster role. In that case, we have to give up the selinux configuration. Do you think having the two variables is reasonable? Or we'd better have just one to cover both? Can we think of a use case configuring just one of firewalld and selinux and not the other???

'fence-agents-all' in ha_cluster_extra_packages
)

- name: Get the high-availability tcp service ports
Copy link
Contributor

@richm richm Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use firewall-cmd --info-service=high-availability to get the ports - here is the output:

>sudo firewall-cmd --info-service=high-availability
high-availability
  ports: 2224/tcp 3121/tcp 5403/tcp 5404/udp 5405-5412/udp 9929/tcp 9929/udp 21064/tcp
  protocols: 
  source-ports: 
  modules: 
  destination: 
  includes: 
  helpers: 

Copy link
Contributor

@richm richm Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, you can eliminate using set_fact altogether:

    - name: get firewall ports
      command: firewall-cmd --info-service=high-availability
      register: __out
    - name: construct selinux args
      debug:
        msg: selinux_ports {{ selinux_ports | to_nice_json }}
      vars:
        _state_value: enabled
        __pat: '(?sm)^.* ports: ([^\n]+)\n.*$'
        __port_list: "{{ __out.stdout | regex_replace(__pat, '\\1') | split |
          map('split', '/') }}"
        selinux_ports: |
          {% set selinux_ports = [] %}
          {% for port_proto in __port_list %}
          {%   set item = {'ports': port_proto[0], 'proto': port_proto[1],
                           'setype': 'cluster_port_t',
                           'state': _state_value, 'local': true} %}
          {%   set _ = selinux_ports.append(item) %}
          {% endfor %}
          {{ selinux_ports }}

This is for example purpose, but basically you construct the port list by regex and split on the firewall-cmd output, then pass that to a jinja loop that constructs the arguments for selinux_ports
This is the output:

selinux_ports [
    {
        "local": true,
        "ports": "2224",
        "proto": "tcp",
        "setype": "cluster_port_t",
        "state": "enabled"
    },
    {
        "local": true,
        "ports": "3121",
        "proto": "tcp",
        "setype": "cluster_port_t",
        "state": "enabled"
    },
    {
        "local": true,
        "ports": "5403",
        "proto": "tcp",
        "setype": "cluster_port_t",
        "state": "enabled"
    },
    {
        "local": true,
        "ports": "5404",
        "proto": "udp",
        "setype": "cluster_port_t",
        "state": "enabled"
    },
    {
        "local": true,
        "ports": "5405-5412",
        "proto": "udp",
        "setype": "cluster_port_t",
        "state": "enabled"
    },
    {
        "local": true,
        "ports": "9929",
        "proto": "tcp",
        "setype": "cluster_port_t",
        "state": "enabled"
    },
    {
        "local": true,
        "ports": "9929",
        "proto": "udp",
        "setype": "cluster_port_t",
        "state": "enabled"
    },
    {
        "local": true,
        "ports": "21064",
        "proto": "tcp",
        "setype": "cluster_port_t",
        "state": "enabled"
    }
]

@richm
Copy link
Contributor

richm commented Sep 8, 2022

You will probably need to rebase now that #64 is merged

Thank you, @richm. I will.

Before doing that, I'd like to discuss about having two variables ha_cluster_manage_firewall and ha_cluster_manage_selinux, which is shared with other roles, such as metrics and vpn (so far :).

The issue is when the ports are managed by the firewall service (in ha_cluster case high-availability), we read the firewall service xml file to get the port numbers. If ha_cluster_manage_firewall is false, it may no exist unless the firewalld is already installed on the managed host prior to running the ha_cluster role.

Or - the ha_cluster role will have to use the firewall role first, if enabled.

In that case, we have to give up the selinux configuration. Do you think having the two variables is reasonable? Or we'd better have just one to cover both? Can we think of a use case configuring just one of firewalld and selinux and not the other???

I think we should have both - there may be a case for the user wanting ha_cluster_manage_firewall without ha_cluster_manage_selinux. However, as you point out, there is no way to dynamically determine the high-availability service ports without firewall being installed. In that case, if the user specifies ha_cluster_manage_selinux: true, we should always use firewall-cmd --info-service=high-availability to get the ports, and the role will fail if that command fails.

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 8, 2022

I think we should have both - there may be a case for the user wanting ha_cluster_manage_firewall without ha_cluster_manage_selinux.

Ok!

However, as you point out, there is no way to dynamically determine the high-availability service ports without firewall being installed. In that case, if the user specifies ha_cluster_manage_selinux: true, we should always use firewall-cmd --service=high-availability --get-ports to get the ports, and the role will fail if that command fails.

You mean, it's ok to fail as long as we tell the customer to specify ha_cluster_manage_firewall: true (maybe in the error message?) if they want to configure selinux?

@richm
Copy link
Contributor

richm commented Sep 8, 2022

I think we should have both - there may be a case for the user wanting ha_cluster_manage_firewall without ha_cluster_manage_selinux.

Ok!

However, as you point out, there is no way to dynamically determine the high-availability service ports without firewall being installed. In that case, if the user specifies ha_cluster_manage_selinux: true, we should always use firewall-cmd --service=high-availability --get-ports to get the ports, and the role will fail if that command fails.

You mean, it's ok to fail as long as we tell the customer to specify ha_cluster_manage_firewall: true (maybe in the error message?) if they want to configure selinux?

Yes, in the README and in the error message.

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 8, 2022

Sorry... I must have been sleeping... In the current implementation, there's no way not to install firewalld. If ha_cluster_manage_firewall: true, it enables the high-availability ports. If false, it disables them. (and true, by default.) So, there's no worry about not having firewall-cmd. Hope it's ok...

@richm
Copy link
Contributor

richm commented Sep 8, 2022

Sorry... I must have been sleeping... In the current implementation, there's no way not to install firewalld. If ha_cluster_manage_firewall: true, it enables the high-availability ports. If false, it disables them. (and true, by default.) So, there's no worry about not having firewall-cmd. Hope it's ok...

Yes, that's ok.

@nhosoi nhosoi force-pushed the use_firewall_role branch 2 times, most recently from d523da3 to c0cd6fc Compare September 9, 2022 17:19
@richm
Copy link
Contributor

richm commented Sep 15, 2022

[citest]

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 21, 2022

[citest]

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 21, 2022

[citest]

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
tasks/selinux.yml Outdated Show resolved Hide resolved
tasks/selinux.yml Outdated Show resolved Hide resolved
README.md Outdated
@@ -42,7 +42,7 @@ manage the firewall.

NOTE: `ha_cluster_manage_firewall` is limited to *adding* ports.
It cannot be used for *removing* ports.
If you want to remove ports, you will need to use the firewall selinux system
If you want to remove ports, you will need to use the firewall system
roles directly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
roles directly.
role directly.

README.md Outdated
@@ -62,7 +62,7 @@ firewall is not installed, managing selinux policy is skipped.

NOTE: `ha_cluster_manage_selinux` is limited to *adding* policy.
It cannot be used for *removing* policy.
If you want to remove policy, you will need to use the selinux selinux system
If you want to remove policy, you will need to use the selinux system
roles directly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
roles directly.
role directly.

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 21, 2022

[citest]

Copy link
Contributor

@richm richm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May take several hours for CI tests to run . . .

@richm
Copy link
Contributor

richm commented Sep 22, 2022

[citest]

2 similar comments
@richm
Copy link
Contributor

richm commented Sep 23, 2022

[citest]

@ukulekek
Copy link
Contributor

[citest]

@ukulekek
Copy link
Contributor

Sorry, needed to rerun CI, we had an outage at that time

vars:
selinux_ports: "{{ _ha_cluster_selinux }}"
when:
- '"firewalld.service" in ansible_facts.services'
Copy link
Contributor

@richm richm Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm - seems this condition is not sufficient - not only does the firewalld.service have to be present, it must also be running - here is the state in the test: https://dl.fedoraproject.org/pub/alt/linuxsystemroles/logs/lsr-citool_ha_cluster-63-ed61ee7_CentOS-Stream-8_20220923-130045/artifacts/tests_cib_constraints_create-FAILED.log

            "firewalld.service": {
                "name": "firewalld.service",
                "source": "systemd",
                "state": "inactive",
                "status": "disabled"
            },

here is what a running service looks like:

            "crond.service": {
                "name": "crond.service",
                "source": "systemd",
                "state": "running",
                "status": "enabled"
            },

so maybe a condition like

Suggested change
- '"firewalld.service" in ansible_facts.services'
- '"firewalld.service" in ansible_facts.services'
- ansible_facts.services["firewalld.service"]["state"] == "running"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @richm. I will fix it as you suggested.
I'm wondering why I could not capture this failure in this local test... (;_;)

tox -e qemu-ansible-core-2.13 -- --image-name rhel-9 --log-level info tests/tests_cib_constraints_create.yml

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @richm. I will fix it as you suggested. I'm wondering why I could not capture this failure in this local test... (;_;)

tox -e qemu-ansible-core-2.13 -- --image-name rhel-9 --log-level info tests/tests_cib_constraints_create.yml

It must have something to do with how the images are built. I'm guessing the qcow2 images we use from the composes are built with firewalld active by default, but the openstack images used by baseos ci do not have firewalld active by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I removed firewalld before running the test and made sure the test passed. (no worry, I do not plan to include it in the CI. :)

    - name: Remove firewall package
      package:
        name: firewalld
        state: absent

- Introduce ha_cluster_manage_firewall to use the firewall role to
  manage the high-availability service and the fence-virt port.
  Default to false - means the firewall role is not used.

- Introduce ha_cluster_manage_selinux to use the selinux role to
  manage the ports in the high-availability service.
  Assign cluster_port_t to the high-availability service ports.
  Default to false - means the selinux role is not used.

- Add the test check task tasks/check_firewall_selinux.yml for
  verify the ports status.

- Add meta/collection-requirements.yml.

Note: This pr changes the ha_cluster role's behavior slightly.
It used to configure firewall without any settings if the firewall
service is enabled. With this change made by this pr, unless
ha_cluster_manage_firewall is set to true, the firewall is not
configured.
@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 23, 2022

[citest]

@richm
Copy link
Contributor

richm commented Sep 23, 2022

@nhosoi looks good - ready to merge?

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 23, 2022

@nhosoi looks good - ready to merge?

Yes, please merge. Thanks a lot for your review comments, @richm, @tomjelinek, and @spetrosi.

@richm richm merged commit 59db2cf into linux-system-roles:master Sep 26, 2022

- name: Ensure the service and the ports status with the firewall role
include_role:
name: fedora.linux_system_roles.firewall
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes the role to fail on my VMs as the firewall role cannot be found. It seems like ansible collections need to be installed for this to work. Is the rhel-system-roles packages going to depend on ansible collections? Should this be linux-system-roles.firewall instead, so that it uses the firewall role from the rhel-system-roles package?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes the role to fail on my VMs

How are you running your VMs? The tox-lsr tox plugin using -e qemu[-ansible-core-2.x] supports installing collection dependencies for the test.
We are following https://linux-system-roles.github.io/documentation/role-requirements.html
And if you use https://github.com/linux-system-roles/tox-lsr#qemu-testing e.g.
tox -e qemu-ansible-core-2.13 -- --image-name centos-9 tests/tests_cluster_basic_new_psks.yml
then tox will automatically install the dependencies from meta/collection-requirements.yml in your test environment (i.e. not in ~/.ansible/collections and "pollute" your local workstation).

as the firewall role cannot be found. It seems like ansible collections need to be installed for this to work.

Yes.

Is the rhel-system-roles packages going to depend on ansible collections?

For upstream (users installing the ha_cluster role from Galaxy), users will need to follow directions like this: https://github.com/linux-system-roles/storage/#requirements

@nhosoi The ha_cluster README needs to be updated with a Requirements section

Users using the ha_cluster role as part of the fedora.linux_system_roles collection will not need to do this, nor users using the role from the Fedora RPM package.

For downstream, the dependencies will be automatically handled, so the user will need to do nothing.

Should this be linux-system-roles.firewall instead, so that it uses the firewall role from the rhel-system-roles package?

No, see above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for explaining. I don't use tox. I'm actually developing and testing the role on VMs which I use for developing pcs as well. Once I installed the collection on the VMs, everything started working just fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the confirmation, @tomjelinek!
I'm updating the README.md with the new requirements and the examples as suggested by @richm and @spetrosi. (stay tuned...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomjelinek, @richm, @spetrosi, could you please take a look at #69?
Thanks!

tasks/selinux.yml Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants