These instructions all assume that the failed server has been destroyed and is being replaced with a completely new VM.
The new system needs to be provisioned with the same certificate name as the system it is replacing.
-
Promote the replica (official docs)
-
Purge the failed primary server
puppet node purge <failed-primary-server-fqdn>
-
Replace missing replica server (same as Replace missing or failed replica Puppet server below)
This procedure uses the following placeholder references.
- <primary-server-fqdn> - The FQDN and certname of the primary Puppet server
- <old-replica-fqdn> - The FQDN and certname of the old replica Puppet server that has failed or is missing
- <replacement-replica-fqdn> - The FQDN and certname of the new replica Puppet server
- <replacement-avail-group-letter> - Either A or B; whichever of the two letter designations is appropriate for the replacement server. It will be the opposite of the primary server.
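Because the replacement replica must land in the group opposite the primary server, a small helper can derive the letter instead of relying on memory. This is a minimal sketch; the function name opposite_group is hypothetical and not part of any Puppet tooling.

```shell
# Print the availability group letter opposite to the one given.
# Accepts A or B (case-sensitive); anything else is an error.
opposite_group() {
  case "$1" in
    A) echo B ;;
    B) echo A ;;
    *) echo "expected A or B, got '$1'" >&2; return 1 ;;
  esac
}
```

For example, if the primary server is in group A, opposite_group A gives the value to use for <replacement-avail-group-letter>.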
-
Ensure the old replica server is forgotten.
puppet infrastructure forget <old-replica-fqdn>
-
Install the Puppet agent on the replacement replica.
curl -k https://<primary-server-fqdn>:8140/packages/current/install.bash \
  | bash -s -- \
  main:certname=<replacement-replica-fqdn> \
  extension_requests:1.3.6.1.4.1.34380.1.1.9812=puppet/server \
  extension_requests:1.3.6.1.4.1.34380.1.1.9813=<replacement-avail-group-letter>
source /etc/profile.d/puppet-agent.sh
puppet agent -t
-
Sign the certificate on the primary server.
puppetserver ca sign --certname <replacement-replica-fqdn>
-
On the PE-PostgreSQL server in the <replacement-avail-group-letter> group
-
Stop puppet.service
puppet resource service puppet ensure=stopped
-
Add the following two lines to /opt/puppetlabs/server/data/postgresql/<postgres_version>/data/pg_ident.conf
where <postgres_version> is the appropriate major version of PostgreSQL as detailed in Component versions in recent PE releases. For PE release 2023.8.0 the PostgreSQL version is 14.
pe-puppetdb-pe-puppetdb-map <replacement-replica-fqdn> pe-puppetdb
pe-puppetdb-pe-puppetdb-migrator-map <replacement-replica-fqdn> pe-puppetdb-migrator
-
Restart pe-postgresql.service
puppet resource service pe-postgresql ensure=stopped
puppet resource service pe-postgresql ensure=running
-
Run Puppet
puppet agent -t
-
-
Provision the new system as a replica
puppet infrastructure provision replica <replacement-replica-fqdn> --topology mono-with-compile --skip-agent-config --enable
-
On the PE-PostgreSQL server in the <replacement-avail-group-letter> group, start puppet.service
puppet resource service puppet ensure=running
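The pg_ident.conf edit in the procedure above is easy to apply twice by accident if the procedure is repeated. The following is a minimal sketch of an idempotent append, assuming the two map lines are written exactly as shown in the procedure; the function name add_puppetdb_ident_maps is hypothetical.

```shell
# Append the two PuppetDB user-name map entries for a host to a pg_ident.conf
# file, skipping any entry that is already present (safe to re-run).
add_puppetdb_ident_maps() {
  ident_file=$1
  fqdn=$2
  [ -f "$ident_file" ] || { echo "no such file: $ident_file" >&2; return 1; }
  for entry in \
    "pe-puppetdb-pe-puppetdb-map $fqdn pe-puppetdb" \
    "pe-puppetdb-pe-puppetdb-migrator-map $fqdn pe-puppetdb-migrator"
  do
    # -x matches the whole line, -F treats the entry as a fixed string
    # (the FQDN contains dots that would otherwise act as regex wildcards).
    grep -qxF -- "$entry" "$ident_file" || printf '%s\n' "$entry" >> "$ident_file"
  done
}
```

It would be called with the full path to pg_ident.conf and <replacement-replica-fqdn>, while puppet.service is stopped on the PE-PostgreSQL server.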
The procedure for replacing a failed PE-PostgreSQL server is the same regardless of which PE-PostgreSQL server is missing. This procedure uses the following placeholder references.
- <replacement-postgres-server-fqdn> - The FQDN and certname of the new server being brought in to replace the failed PE-PostgreSQL server
- <working-postgres-server-fqdn> - The FQDN and certname of the still-working PE-PostgreSQL server
- <replacement-avail-group-letter> - Either A or B; whichever of the two letter designations is appropriate for the server being replaced. It will be the opposite of the still-working PE-PostgreSQL server
- <primary-server-fqdn> - The FQDN and certname of the primary Puppet server
- <replica-server-fqdn> - The FQDN and certname of the replica Puppet server
Procedure:
-
Clean the old <replacement-postgres-server-fqdn> cert so that the restored node will be able to request a new one with the same name
puppetserver ca clean --certname <replacement-postgres-server-fqdn>
-
Stop puppet.service and pe-puppetdb.service on all compilers in the <replacement-avail-group-letter> group, and on whichever Puppet server (primary or replica) is in the <replacement-avail-group-letter> group.
-
Pre-seed the following configuration files on the new <replacement-postgres-server-fqdn> node, before installing PE.
-
/etc/puppetlabs/puppet/puppet.conf
[main]
certname = <replacement-postgres-server-fqdn>
-
/etc/puppetlabs/puppet/csr_attributes.yaml
---
extension_requests:
  1.3.6.1.4.1.34380.1.1.9812: puppet/puppetdb-database
  1.3.6.1.4.1.34380.1.1.9813: <replacement-avail-group-letter>
-
/tmp/pe.conf
{
  "console_admin_password": "not used",
  "puppet_enterprise::puppet_master_host": "<primary-server-fqdn>",
  "puppet_enterprise::database_host": "<replacement-postgres-server-fqdn>",
  "puppet_enterprise::profile::database::puppetdb_hosts": [
    "<primary-server-fqdn>",
    "<replica-server-fqdn>"
  ]
}
-
-
Download the appropriate version of the Puppet Enterprise installer to <replacement-postgres-server-fqdn>, and run it. Use the pe.conf file created in the previous step.
./puppet-enterprise-installer -c /tmp/pe.conf
-
Run the Puppet agent on <replacement-postgres-server-fqdn>.
puppet agent -t
Running this procedure should re-attach <replacement-postgres-server-fqdn> to the cluster. It will not have restored its database, however.
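The puppet.conf and csr_attributes.yaml pre-seed files from the procedure above can be generated rather than typed by hand. This is a minimal sketch, not an official tool: the function name preseed_postgres_node and its root argument (which allows a rehearsal in a scratch directory) are assumptions, and pe.conf is left out because it is only a fixed template.

```shell
# Write puppet.conf and csr_attributes.yaml for a replacement PE-PostgreSQL
# node under a root directory.
#   root   - filesystem root to write into ("/" on the real node, or a
#            scratch directory when rehearsing)
#   fqdn   - certname of the replacement PE-PostgreSQL server
#   letter - availability group letter, A or B
preseed_postgres_node() {
  root=${1%/}
  fqdn=$2
  letter=$3
  case "$letter" in
    A|B) ;;
    *) echo "availability group must be A or B" >&2; return 1 ;;
  esac
  confdir="$root/etc/puppetlabs/puppet"
  mkdir -p "$confdir" || return 1
  printf '%s\n' '[main]' "certname = $fqdn" > "$confdir/puppet.conf"
  printf '%s\n' \
    '---' \
    'extension_requests:' \
    '  1.3.6.1.4.1.34380.1.1.9812: puppet/puppetdb-database' \
    "  1.3.6.1.4.1.34380.1.1.9813: $letter" > "$confdir/csr_attributes.yaml"
}
```

Running it against a scratch directory first makes it possible to inspect both files before writing them to / on the new node.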
pg_basebackup:
On <working-postgres-server-fqdn>:
-
Stop puppet.
systemctl stop puppet
-
Add this line to /opt/puppetlabs/server/data/postgresql/<postgres_version>/data/pg_ident.conf
where <postgres_version> is the appropriate major version of PostgreSQL as detailed in Component versions in recent PE releases. For PE release 2023.8.0 the PostgreSQL version is 14.
replication-pe-ha-replication-map <replacement-postgres-server-fqdn> pe-ha-replication
-
Add these lines to /opt/puppetlabs/server/data/postgresql/<postgres_version>/data/pg_hba.conf
# REPLICATION RESTORE PERMISSIONS
hostssl replication pe-ha-replication 0.0.0.0/0 cert map=replication-pe-ha-replication-map clientcert=1
hostssl replication pe-ha-replication ::/0 cert map=replication-pe-ha-replication-map clientcert=1
-
Reload pe-postgresql.service
systemctl reload pe-postgresql.service
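Before moving on to the replacement server, it may help to confirm that the temporary replication entries really are in both files, since a missing line only shows up later as a failed pg_basebackup. The check below is a sketch under the assumption that the entries match the text above apart from whitespace; the function names has_entry and replication_access_ready are hypothetical.

```shell
# Succeed if FILE has a non-comment line containing TEXT, treating any run
# of spaces or tabs as a single space.
has_entry() {
  grep -v '^[[:blank:]]*#' "$1" | tr -s '[:blank:]' ' ' | grep -qF -- "$2"
}

# Succeed only if the replication map entry and both hostssl entries
# (IPv4 and IPv6) are present under the given PostgreSQL data directory.
replication_access_ready() {
  data_dir=$1
  fqdn=$2
  has_entry "$data_dir/pg_ident.conf" \
    "replication-pe-ha-replication-map $fqdn pe-ha-replication" || return 1
  for cidr in 0.0.0.0/0 ::/0; do
    has_entry "$data_dir/pg_hba.conf" \
      "hostssl replication pe-ha-replication $cidr cert map=replication-pe-ha-replication-map" \
      || return 1
  done
}
```

It takes the data directory (the one containing pg_ident.conf and pg_hba.conf) and <replacement-postgres-server-fqdn>, and returns non-zero if any entry is missing.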
On <replacement-postgres-server-fqdn>:
Run the following commands (using the appropriate PostgreSQL version number)
systemctl stop puppet.service pe-postgresql.service
mv /opt/puppetlabs/server/data/postgresql/14/data/certs /opt/puppetlabs/server/data/pg_certs
rm -rf /opt/puppetlabs/server/data/postgresql/*
runuser -u pe-postgres -- \
/opt/puppetlabs/server/bin/pg_basebackup \
-D /opt/puppetlabs/server/data/postgresql/14/data \
-d "host=<working-postgres-server-fqdn>
user=pe-ha-replication
sslmode=verify-full
sslcert=/opt/puppetlabs/server/data/pg_certs/_local.cert.pem
sslkey=/opt/puppetlabs/server/data/pg_certs/_local.private_key.pem
sslrootcert=/etc/puppetlabs/puppet/ssl/certs/ca.pem"
rm -rf /opt/puppetlabs/server/data/pg_certs
systemctl start puppet.service pe-postgresql.service
puppet agent -t
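The rm -rf step above is destructive, and pg_basebackup cannot authenticate if the certificates were not moved out of the data directory first. A guard along these lines could be run between the mv and the rm -rf; it is a sketch only, and the function name certs_preserved is hypothetical.

```shell
# Succeed only if both client certificate files needed by pg_basebackup
# exist and are non-empty in the given directory.
certs_preserved() {
  certs_dir=$1
  for pem in _local.cert.pem _local.private_key.pem; do
    [ -s "$certs_dir/$pem" ] || { echo "missing or empty: $certs_dir/$pem" >&2; return 1; }
  done
}
```

For example, running certs_preserved /opt/puppetlabs/server/data/pg_certs and continuing only on success avoids wiping the data directory when the mv failed.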
On <working-postgres-server-fqdn>:
Start puppet again and run it to remove the replication configs.
systemctl start puppet.service
puppet agent -t
Finalize:
After you finish the procedure and pg_basebackup, restart pe-puppetdb.service and puppet.service, first on whichever Puppet server (primary or replica) is in the <replacement-avail-group-letter> group, then on all the compilers in the <replacement-avail-group-letter> group.
This procedure uses the following placeholder references.
- <avail-group-letter> - Either A or B; whichever of the two letter designations the compiler is being assigned to
- <new-compiler-fqdn> - The FQDN and certname of the new compiler
- <dns-alt-names> - A comma-separated list of DNS alt names for the compiler
- <primary-server-fqdn> - The FQDN and certname of the primary Puppet server
- <postgresql-server-fqdn> - The FQDN and certname of the PE-PostgreSQL server with availability group <avail-group-letter>
-
On <postgresql-server-fqdn>:
-
Stop puppet.service
-
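Add the following two lines to /opt/puppetlabs/server/data/postgresql/<postgres_version>/data/pg_ident.conf
where <postgres_version> is the appropriate major version of PostgreSQL for your PE release. For PE release 2023.8.0 the PostgreSQL version is 14.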
pe-puppetdb-pe-puppetdb-map <new-compiler-fqdn> pe-puppetdb
pe-puppetdb-pe-puppetdb-migrator-map <new-compiler-fqdn> pe-puppetdb-migrator
-
Reload pe-postgresql.service
-
-
On <new-compiler-fqdn>:
-
Install the Puppet agent, making sure to specify an availability group letter (A or B) as an extension request.
curl -k https://<primary-server-fqdn>:8140/packages/current/install.bash \
  | sudo bash -s -- \
  extension_requests:pp_auth_role=pe_compiler \
  extension_requests:1.3.6.1.4.1.34380.1.1.9813=<avail-group-letter> \
  main:dns_alt_names=<dns-alt-names> \
  main:certname=<new-compiler-fqdn>
-
If necessary, manually submit a CSR
puppet ssl submit_request
-
-
On <primary-server-fqdn>, if necessary, sign the certificate request.
puppetserver ca sign --certname <new-compiler-fqdn>
-
On <new-compiler-fqdn>, run the puppet agent
puppet agent -t
-
On <postgresql-server-fqdn>:
-
Run the puppet agent
puppet agent -t
-
Start puppet.service
-
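When several compilers are added, filling in the install command by hand invites a wrong group letter or certname. The following sketch prints the command from the compiler procedure above with the placeholders substituted, so it can be reviewed before being run on the new compiler. The function name compiler_install_command is hypothetical, and the command is printed on a single line rather than executed.

```shell
# Print the agent install command for a new compiler without running it.
#   primary   - FQDN of the primary Puppet server
#   compiler  - FQDN and certname of the new compiler
#   letter    - availability group letter, A or B
#   alt_names - comma-separated DNS alt names
compiler_install_command() {
  primary=$1
  compiler=$2
  letter=$3
  alt_names=$4
  case "$letter" in
    A|B) ;;
    *) echo "availability group must be A or B" >&2; return 1 ;;
  esac
  printf 'curl -k https://%s:8140/packages/current/install.bash | sudo bash -s -- extension_requests:pp_auth_role=pe_compiler extension_requests:1.3.6.1.4.1.34380.1.1.9813=%s main:dns_alt_names=%s main:certname=%s\n' \
    "$primary" "$letter" "$alt_names" "$compiler"
}
```

The letter check rejects anything other than A or B, which catches a lowercase or mistyped group before a certificate is requested with the wrong extension.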