Layer 3 clusters
PacketFence supports having clusters where servers are located in multiple layer 3 networks, which we will also refer to as cluster zones.

Simple RADIUS-only clusters are easier to set up and can be configured without much in-depth knowledge. However, using the captive portal with a layer 3 cluster makes the setup significantly more complex and requires a solid understanding of how PacketFence works in order to design such a cluster properly.

This section describes the changes to make to your cluster.conf when dealing with layer 3 clusters, but it does not cover the full cluster installation. In order to install your cluster, follow the instructions in [_cluster_setup] and refer to this section when you reach the step to configure your cluster.conf.

Simple RADIUS only cluster

In order to configure a RADIUS only layer 3 cluster, you will need at least 3 servers (5 are used in this example) with a single interface (used for management).

Cluster design and behavior:

  • This example will use 3 servers in a network (called DC1), and 2 in another network (called DC2).

  • Each group of servers (in the same L2 network) will have a virtual IP address and will perform load balancing to members in the same L2 zone (i.e. the same network).

  • All the servers will use MariaDB Galera cluster and will be part of the same database cluster, meaning all servers will have the same data (a quick membership check is sketched after this list).

  • In the event of the loss of DC1 or a network split between DC1 and DC2, the databases on DC2 will go into read-only mode and will exhibit the behavior described in "Quorum behavior".

  • All the servers will share the same configuration and same cluster.conf. The data in cluster.conf will serve as an overlay to the data in pf.conf to perform changes specific to each layer 3 zone.
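
Once the cluster is up, a minimal way to confirm that all the servers across both zones joined the same MariaDB Galera cluster is to check the wsrep_cluster_size status variable. This is a generic Galera check, shown here only as a sketch; use the MariaDB root credentials you set during the cluster setup:

mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"

The reported value should match the total number of servers in both zones (5 in this example).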

Things to take into consideration while performing the cluster setup:

  • While going through the configurator to configure the network interfaces, you only need to have a single interface and set its type to management and high-availability.

Cluster configuration

When you reach the step of the cluster setup where you need to configure cluster.conf, refer to the example below to build it.

[general]
multi_zone=enabled

[DC1 CLUSTER]
management_ip=192.168.1.10

[DC1 CLUSTER interface ens192]
ip=192.168.1.10

[DC1 pf1-dc1.example.com]
management_ip=192.168.1.11

[DC1 pf1-dc1.example.com interface ens192]
ip=192.168.1.11

[DC1 pf2-dc1.example.com]
management_ip=192.168.1.12

[DC1 pf2-dc1.example.com interface ens192]
ip=192.168.1.12

[DC1 pf3-dc1.example.com]
management_ip=192.168.1.13

[DC1 pf3-dc1.example.com interface ens192]
ip=192.168.1.13

[DC2 CLUSTER]
management_ip=192.168.2.10

[DC2 CLUSTER interface ens192]
ip=192.168.2.10

[DC2 pf1-dc2.example.com]
management_ip=192.168.2.11

[DC2 pf1-dc2.example.com interface ens192]
ip=192.168.2.11

[DC2 pf2-dc2.example.com]
management_ip=192.168.2.12

[DC2 pf2-dc2.example.com interface ens192]
ip=192.168.2.12

Notes on the configuration:

  • The hostnames (pf1-dc1.example.com, pf2-dc1.example.com, etc.) are not directly related to the cluster logic and the servers can have any hostname without impacting the cluster behavior. The assignment of a server to a cluster zone is made by the first part of the section name (ex: "DC1 pf.example.com" assigns server "pf.example.com" to cluster zone "DC1").

  • Each cluster zone needs to have its own "CLUSTER" definition that declares the virtual IPs to use for this cluster zone. This also declares the management IP on which the zone should be joined.

  • Given your zones aren’t in the same layer 2 network, you cannot use the same virtual IP between your zones.

  • You should always declare a "CLUSTER" definition even though a zone has only a single server.

  • Your network equipment should point RADIUS authentication and accounting to both virtual IPs (192.168.1.10 and 192.168.2.10 in this example), either in primary/secondary or load-balancing mode (a quick verification sketch follows this list).

  • RFC3576 servers (CoA and RADIUS disconnect) should be declared on your network equipment (if supported) for both virtual IPs (192.168.1.10 and 192.168.2.10 in this example).

  • You can use any of the virtual IPs to update the configuration: changes will be synced between cluster zones.
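
As a minimal verification sketch, you can send a test RADIUS authentication request to each virtual IP using the radtest utility shipped with FreeRADIUS. The username, password and shared secret below are placeholders; substitute the test credentials and the NAS secret configured in your environment:

radtest testuser testpassword 192.168.1.10 0 useStrongerSecret
radtest testuser testpassword 192.168.2.10 0 useStrongerSecret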

RADIUS server with captive-portal

Warning
As briefly explained above, deploying a captive portal across a layer 3 network using cluster zones is complex and requires in-depth knowledge of networking and of how PacketFence works.

In order to configure a RADIUS server with a captive-portal on a layer 3 cluster, you will need at least 3 servers (5 are used in this example) with 2 interfaces (one for management and one for registration).

Note
Isolation is omitted in this example for brevity and should be configured the same way as registration if needed.

Cluster design and behavior:

  • This example will use 3 servers in a network (called DC1), and 2 in another network (called DC2).

  • Each group of servers (in the same L2 network) will have a virtual IP address and will perform load balancing (RADIUS, HTTP) to members in the same L2 zone (i.e. the same network).

  • All the servers will use MariaDB Galera cluster and will be part of the same database cluster, meaning all servers will have the same data.

  • In the event of the loss of DC1 or a network split between DC1 and DC2, the databases on DC2 will go into read-only mode and will exhibit the behavior described in "Quorum behavior".

  • All the servers will share the same configuration and same cluster.conf. The data in cluster.conf will serve as an overlay to the data in pf.conf and networks.conf to perform changes specific to each layer 3 zone.

The schema below presents the routing that needs to be setup in your network in order to deploy this example:

Cluster zones with registration network

Notes on the schema:

  • The static routes from the PacketFence servers to the gateways on your network equipment will be configured through networks.conf and do not need to be configured manually on the servers. You will simply need to declare the remote networks so that PacketFence offers DHCP on them and routes them properly.

  • Since the network of the clients is not directly connected to the PacketFence servers via layer 2, you will need to use IP helpers (DHCP relaying) on your network equipment that point to both virtual IPs of your cluster (see the relay sketch after this list).

  • We assume that your routers are able to route all the different networks that are involved for registration (192.168.11.0/24, 192.168.22.0/24, 192.168.100.0/24) and that any client in these 3 networks can be routed to any of these networks via its gateway (192.168.11.1, 192.168.22.2, 192.168.100.1).

  • Access lists should be put in place to restrict the clients (network 192.168.100.0/24) from accessing networks other than the 3 registration networks.

  • No special routing is required for the management interface.
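
As an illustration only, here is roughly what the DHCP relay configuration could look like on a Cisco IOS-style router carrying the client network (the interface name is hypothetical; the addresses follow this example):

interface Vlan100
 ip address 192.168.100.1 255.255.255.0
 ip helper-address 192.168.11.10
 ip helper-address 192.168.22.10

Equivalent relay statements exist on most vendors; the key point is that DHCP requests from 192.168.100.0/24 must be relayed to both registration virtual IPs.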

Things to take into consideration while performing the cluster setup:

  • While going through the configurator to configure the network interfaces, you will need to set an interface to management and high-availability.

  • While going through the configurator to configure the network interfaces, you will need to set an interface to registration.

Cluster configuration

When you reach the step of the cluster setup where you need to configure cluster.conf, refer to the example below to build it.

[general]
multi_zone=enabled

[DC1 CLUSTER]
management_ip=192.168.1.10

[DC1 CLUSTER interface ens192]
ip=192.168.1.10

[DC1 CLUSTER interface ens224]
ip=192.168.11.10

[DC1 pf1-dc1.example.com]
management_ip=192.168.1.11

[DC1 pf1-dc1.example.com interface ens192]
ip=192.168.1.11

[DC1 pf1-dc1.example.com interface ens224]
ip=192.168.11.11

[DC1 pf2-dc1.example.com]
management_ip=192.168.1.12

[DC1 pf2-dc1.example.com interface ens192]
ip=192.168.1.12

[DC1 pf2-dc1.example.com interface ens224]
ip=192.168.11.12

[DC1 pf3-dc1.example.com]
management_ip=192.168.1.13

[DC1 pf3-dc1.example.com interface ens192]
ip=192.168.1.13

[DC1 pf3-dc1.example.com interface ens224]
ip=192.168.11.13

[DC2 CLUSTER]
management_ip=192.168.2.10

[DC2 CLUSTER interface ens192]
ip=192.168.2.10

[DC2 CLUSTER interface ens224]
ip=192.168.22.10

[DC2 pf1-dc2.example.com]
management_ip=192.168.2.11

[DC2 pf1-dc2.example.com interface ens192]
ip=192.168.2.11

[DC2 pf1-dc2.example.com interface ens224]
ip=192.168.22.11

[DC2 pf2-dc2.example.com]
management_ip=192.168.2.12

[DC2 pf2-dc2.example.com interface ens192]
ip=192.168.2.12

[DC2 pf2-dc2.example.com interface ens224]
ip=192.168.22.12

Notes on the configuration:

  • The hostnames (pf1-dc1.example.com, pf2-dc1.example.com, etc.) are not directly related to the cluster logic and the servers can have any hostname without impacting the cluster behavior. The assignment of a server to a cluster zone is made by the first part of the section name (ex: "DC1 pf.example.com" assigns server "pf.example.com" to cluster zone "DC1").

  • Each cluster zone needs to have its own "CLUSTER" definition that declares the virtual IPs to use for this cluster zone. This also declares the management IP on which the zone should be joined.

  • Given your zones aren’t in the same layer 2 network, you cannot use the same virtual IP between your zones.

  • You should always declare a "CLUSTER" definition even though a zone has only a single server.

  • Your network equipment should point RADIUS authentication and accounting to both virtual IPs (192.168.1.10 and 192.168.2.10 in this example) either in primary/secondary or load-balancing mode.

  • RFC3576 servers (CoA and RADIUS disconnect) should be declared on your network equipment (if supported) for both virtual IPs (192.168.1.10 and 192.168.2.10 in this example).

  • You can use any of the virtual IPs to update the configuration: changes will be synced between cluster zones.

Note
You should use the configuration above to perform the cluster setup and complete all the steps required to build your cluster. You should only continue with the following steps after it is fully set up and running.

Servers network configuration

After you’ve finished configuring your cluster, on one of the servers, add the following to cluster.conf in order to configure the registration networks of both zones:

[DC1 CLUSTER network 192.168.11.0]
dns=192.168.11.10
split_network=disabled
dhcp_start=192.168.11.10
gateway=192.168.11.10
domain-name=vlan-registration.example.com
nat_enabled=disabled
named=enabled
dhcp_max_lease_time=30
fake_mac_enabled=disabled
dhcpd=enabled
dhcp_end=192.168.11.246
type=vlan-registration
netmask=255.255.255.0
dhcp_default_lease_time=30

[DC2 CLUSTER network 192.168.22.0]
dns=192.168.22.10
split_network=disabled
dhcp_start=192.168.22.10
gateway=192.168.22.10
domain-name=vlan-registration.example.com
nat_enabled=disabled
named=enabled
dhcp_max_lease_time=30
fake_mac_enabled=disabled
dhcpd=enabled
dhcp_end=192.168.22.246
type=vlan-registration
netmask=255.255.255.0
dhcp_default_lease_time=30

Client network configuration

Now, add the following to networks.conf in order to declare the common parameters for the clients in both zones:

[192.168.100.0]
gateway=192.168.100.1
dhcp_start=192.168.100.20
domain-name=vlan-registration.example.com
nat_enabled=0
named=enabled
dhcp_max_lease_time=30
dhcpd=enabled
fake_mac_enabled=disabled
netmask=255.255.255.0
type=vlan-registration
dhcp_end=192.168.100.254
dhcp_default_lease_time=30

Then, to complete the client network configuration, you will need to override the next hop (route to reach the network) and DNS server in cluster.conf by adding the following:

[DC1 CLUSTER network 192.168.100.0]
next_hop=192.168.11.1
dns=192.168.11.10

[DC2 CLUSTER network 192.168.100.0]
next_hop=192.168.22.1
dns=192.168.22.10

Synchronization and wrapping-up

Then, from the server on which you’ve performed the configuration, sync the cluster and reload the configuration:

/usr/local/pf/bin/cluster/sync --as-master
/usr/local/pf/bin/pfcmd configreload hard

You should now restart PacketFence on all servers using:

/usr/local/pf/bin/pfcmd service pf restart
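
Once the services are back up, a quick sanity check (purely illustrative) is to confirm on each server that the static route to the client network was installed via the next_hop declared for its zone:

ip route show | grep 192.168.100.0
# on a DC1 member this should show something like: 192.168.100.0/24 via 192.168.11.1
# on a DC2 member this should show something like: 192.168.100.0/24 via 192.168.22.1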

Remote MariaDB slave server

In a layer 3 cluster configuration, you can configure a remote MariaDB server in slave mode. This allows the primary database to be replicated from the central DC to the remote site. In normal operation, the remote server uses the main site's database. However, when the link is broken, the remote server falls back to its own local database in read-only mode. The packetfence-haproxy-db service is responsible for detecting when the link between DC2 and DC1 is down.

For this configuration, at least 3 servers are needed at the main site and at least 1 server is needed at the remote site.
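
The link detection and fallback described above are handled by the packetfence-haproxy-db service; as a simple, illustrative check, you can confirm that it is running on the remote server:

systemctl status packetfence-haproxy-db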

Prepare the configuration

To configure PacketFence, first turn ON "Master/Slave mode" in Configuration → System Configuration → Database → Advanced.

Next, configure cluster.conf during the initial cluster setup; refer to the example below.

[general]
multi_zone=enabled

[DC1 CLUSTER]
management_ip=192.168.1.10

[DC1 CLUSTER interface ens192]
ip=192.168.1.10
mask=255.255.255.0
type=management

[DC1 CLUSTER interface ens224]
ip=192.168.11.10
mask=255.255.255.0
enforcement=vlan
type=internal

[DC1 pf1-dc1.example.com]
management_ip=192.168.1.11

[DC1 pf1-dc1.example.com interface ens192]
ip=192.168.1.11
mask=255.255.255.0
type=management

[DC1 pf1-dc1.example.com interface ens224]
ip=192.168.11.11
mask=255.255.255.0
enforcement=vlan
type=internal

[DC1 pf2-dc1.example.com]
management_ip=192.168.1.12

[DC1 pf2-dc1.example.com interface ens192]
ip=192.168.1.12
mask=255.255.255.0
type=management

[DC1 pf2-dc1.example.com interface ens224]
ip=192.168.11.12
mask=255.255.255.0
enforcement=vlan
type=internal

[DC1 pf3-dc1.example.com]
management_ip=192.168.1.13

[DC1 pf3-dc1.example.com interface ens192]
ip=192.168.1.13
mask=255.255.255.0
type=management

[DC1 pf3-dc1.example.com interface ens224]
ip=192.168.11.13
mask=255.255.255.0
enforcement=vlan
type=internal

[DC2 CLUSTER]
management_ip=192.168.2.10
masterslavemode=SLAVE
masterdb=DC1

[DC2 CLUSTER interface ens192]
ip=192.168.2.10
mask=255.255.255.0
type=management

[DC2 CLUSTER interface ens224]
ip=192.168.22.10
mask=255.255.255.0
enforcement=vlan
type=internal

[DC2 pf1-dc2.example.com]
management_ip=192.168.2.11

[DC2 pf1-dc2.example.com interface ens192]
ip=192.168.2.11
mask=255.255.255.0
type=management

[DC2 pf1-dc2.example.com interface ens224]
ip=192.168.22.11
mask=255.255.255.0
enforcement=vlan
type=internal

Note that in the DC2 CLUSTER section we defined these two values:

masterslavemode=SLAVE
masterdb=DC1

This means that the DC2 cluster will run in SLAVE mode and will use the database of the DC1 cluster.

Also note that the type and the enforcement MUST be defined on all the interfaces.

Prepare and start the master/slave replication

In order to set up the master, a recent backup is required. Backups created prior to the inclusion of this feature will not work, since recent backups now include the replication runtime position of the binary log file. First, restart packetfence-mariadb on all of the servers in the main cluster:

systemctl restart packetfence-mariadb

Run the /usr/local/pf/addons/exportable-backup.sh script on the master node of the main cluster. If you do not know which server is the master, run the script on all nodes in the main cluster: the master will produce the largest backup file (eg: /root/backup/packetfence-exportable-backup-YYYY-MM-DD_HHhss.tgz), since it contains both the configuration files and the database. Transfer this file to the remote server (eg: /root/backup/), as illustrated below.
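
For example, assuming the remote server from this example (192.168.2.11) and the backup file name above, the transfer could be done with scp:

scp /root/backup/packetfence-exportable-backup-YYYY-MM-DD_HHhss.tgz root@192.168.2.11:/root/backup/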

Connect to the remote server and perform the following to sync the configuration from the master cluster:

/usr/local/pf/bin/cluster/sync --from=192.168.1.11 --api-user=packet --api-password=anotherMoreSecurePassword
/usr/local/pf/bin/pfcmd configreload hard

Then run the following commands to import the backup and start MariaDB:

/usr/local/pf/addons/full-import/import.sh --db -f /root/backup/packetfence-exportable-backup-YYYY-MM-DD_HHhss.tgz
systemctl start packetfence-mariadb

On the master node of the main cluster, grant replication for the replication user:

mysql -uroot -p
MariaDB [(none)]> GRANT REPLICATION SLAVE ON *.* TO  'pfcluster'@'%';
MariaDB [(none)]> FLUSH PRIVILEGES;

Lastly, run the following script on the remote server to start the slave replication.

/usr/local/pf/addons/makeslave.pl
Enter the MySQL root password: password
Enter the MySQL master ip address: 192.168.1.11

The "MySQL master ip address" is the ip address of the master server where you created the backup file. Not the VIP of the primary cluster.

If, when you run the script, you get the following message:

ERROR 1045 (28000) at line 1: Access denied for user 'root'@'%' (using password: YES)
Unable to grant replication on user pfcluster at ./addons/makeslave.pl line 42, <STDIN> line 2.

then you need to make sure that the root user exists in the remote database and has the correct permissions (SELECT and GRANT):

SELECT * FROM mysql.user WHERE User='root' and host ='%'\G
GRANT GRANT OPTION  ON *.* TO root@'%' identified by 'password';
GRANT SELECT ON *.* TO root@'%' identified by 'password';
FLUSH PRIVILEGES;

Alternatively, to start the slave manually, proceed as follows:

Open the file /root/backup/restore/xtrabackup_binlog_info and note the binary log file name and position:

mariadb-bin.000014      7473

On the master server of the main cluster - where the backup was created - run the following command:

mysql -uroot -p -e "SELECT BINLOG_GTID_POS('mariadb-bin.000014', 7473)"
+---------------------------------------------+
| BINLOG_GTID_POS('mariadb-bin.000014', 7473) |
+---------------------------------------------+
| 22-2-10459                                  |
+---------------------------------------------+
mysql -uroot -p
MariaDB [(none)]> GRANT REPLICATION SLAVE ON *.* TO  'pfcluster'@'%';
MariaDB [(none)]> FLUSH PRIVILEGES;

On the remote site master server, run the following MySQL commands as root:

SET GLOBAL gtid_slave_pos = '22-2-10459';
CHANGE MASTER TO MASTER_HOST='192.168.1.11', MASTER_PORT=3306, MASTER_USER='pfcluster', MASTER_PASSWORD='clusterpf', MASTER_USE_GTID=slave_pos;
START SLAVE;

The replication MASTER_USER and MASTER_PASSWORD can be found in the main site's pf.conf. The MASTER_HOST is the IP address of the master server on the main site (where the backup was created); do not use the VIP.
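
A simple, non-authoritative way to locate these credentials is to search pf.conf for replication-related keys (the exact key names may vary between versions):

grep -i replication /usr/local/pf/conf/pf.conf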

At the end, if you want to check the status of the slave server for debugging purposes, you can run the following command:

SHOW SLAVE STATUS;
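
In the output (appending \G instead of ; gives a more readable vertical layout), replication is healthy when the standard MariaDB status fields Slave_IO_Running and Slave_SQL_Running both report Yes and Seconds_Behind_Master stays low:

SHOW SLAVE STATUS\G
# look for:
#   Slave_IO_Running: Yes
#   Slave_SQL_Running: Yes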