These instructions all assume that the failed server has been destroyed and is being replaced with a completely new VM.
The new system needs to be provisioned with the same certificate name as the system it is replacing.
Promote the replica (official docs)
Purge the failed primary server
puppet node purge <failed-primary-server-fqdn>
Replace missing replica server (same as Replace missing or failed replica Puppet server below)
This procedure uses the following placeholder references.
- <primary-server-fqdn> - The FQDN and certname of the primary Puppet server
- <old-replica-fqdn> - The FQDN and certname of the old replica Puppet server that has failed or is missing
- <replacement-replica-fqdn> - The FQDN and certname of the new replica Puppet server
- <replacement-avail-group-letter> - Either A or B; whichever of the two letter designations is appropriate for the replacement server. It will be the opposite of the primary server.
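Since the replacement replica always takes the letter opposite to the primary server, the choice can be derived rather than looked up. The following is a minimal sketch; the function name opposite_group is hypothetical and not part of any Puppet tooling.

```shell
# Hypothetical helper: print the availability group letter opposite to the
# one passed in. Accepts A or B in either case; anything else is an error.
opposite_group() {
    case "$1" in
        A|a) echo "B" ;;
        B|b) echo "A" ;;
        *)
            echo "expected A or B, got: $1" >&2
            return 1
            ;;
    esac
}
```

For example, if the primary server is in group A, opposite_group A prints B, which is the value to use for <replacement-avail-group-letter>.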
Ensure the old replica server is forgotten.
puppet infrastructure forget <old-replica-fqdn>
Install the Puppet agent on the replacement replica.
curl -k https://<primary-server-fqdn>:8140/packages/current/install.bash \
  | bash -s -- \
  main:certname=<replacement-replica-fqdn> \
  extension_requests: \
  extension_requests:<replacement-avail-group-letter>
source /etc/profile.d/
puppet agent -t
Sign the certificate on the primary server.
puppetserver ca sign --certname <replacement-replica-fqdn>
On the PE-PostgreSQL server in the <replacement-avail-group-letter> group
Stop puppet.service
puppet resource service puppet ensure=stopped
Add the following two lines to /opt/puppetlabs/server/data/postgresql/<postgres_version>/data/pg_ident.conf
where <postgres_version> is the appropriate major version of PostgreSQL as detailed in Component versions in recent PE releases. For PE release 2023.8.0 the PostgreSQL version is 14.
pe-puppetdb-pe-puppetdb-map <replacement-replica-fqdn> pe-puppetdb
pe-puppetdb-pe-puppetdb-migrator-map <replacement-replica-fqdn> pe-puppetdb-migrator
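If this step may be repeated (for example after an interrupted recovery), the two map entries can be appended in a way that avoids duplicates. The sketch below is one possible approach, not the documented method; the function name add_puppetdb_ident_entries is hypothetical, and it assumes one entry per line in pg_ident.conf.

```shell
# Hypothetical helper: append the two pg_ident.conf map entries for a host,
# skipping any entry that is already present so that reruns are harmless.
# Arguments: path to pg_ident.conf, FQDN of the host being added.
add_puppetdb_ident_entries() {
    ident_file="$1"
    host_fqdn="$2"
    for entry in \
        "pe-puppetdb-pe-puppetdb-map $host_fqdn pe-puppetdb" \
        "pe-puppetdb-pe-puppetdb-migrator-map $host_fqdn pe-puppetdb-migrator"
    do
        # -x matches whole lines only, -F treats the entry as a fixed string.
        if ! grep -qxF -- "$entry" "$ident_file" 2>/dev/null; then
            printf '%s\n' "$entry" >> "$ident_file"
        fi
    done
}
```

Call it with the full path to pg_ident.conf (with the correct <postgres_version> substituted) as the first argument and the value of <replacement-replica-fqdn> as the second. The service restart in the next step is still required for the change to take effect.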
Restart pe-postgresql.service
puppet resource service pe-postgresql ensure=stopped
puppet resource service pe-postgresql ensure=running
Run Puppet
puppet agent -t
Provision the new system as a replica
puppet infrastructure provision replica <replacement-replica-fqdn> --topology mono-with-compile --skip-agent-config --enable
On the PE-PostgreSQL server in the <replacement-avail-group-letter> group, start puppet.service
puppet resource service puppet ensure=running
The procedure for replacing a failed PE-PostgreSQL server is the same regardless of which PE-PostgreSQL server is missing. This procedure uses the following placeholder references.
- <replacement-postgres-server-fqdn> - The FQDN and certname of the new server being brought in to replace the failed PE-PostgreSQL server
- <working-postgres-server-fqdn> - The FQDN and certname of the still-working PE-PostgreSQL server
- <replacement-avail-group-letter> - Either A or B; whichever of the two letter designations is appropriate for the server being replaced. It will be the opposite of the still-working PE-PostgreSQL server
- <primary-server-fqdn> - The FQDN and certname of the primary Puppet server
Clean the old <replacement-postgres-server-fqdn> cert so that the restored node will be able to request a new one with the same name
puppetserver ca clean --certname <replacement-postgres-server-fqdn>
Stop puppet.service and pe-puppetdb.service on all compilers in the <replacement-avail-group-letter> group, and on whichever Puppet server (primary or replica) is in the <replacement-avail-group-letter> group.
Pre-seed the following configuration files on the new <replacement-postgres-server-fqdn> node, before installing PE.
[main]
certname = <replacement-postgres-server-fqdn>
---
extension_requests:
  puppet/puppetdb-database
  <replacement-avail-group-letter>
{
  "console_admin_password": "not used",
  "puppet_enterprise::puppet_master_host": "<primary-server-fqdn>",
  "puppet_enterprise::database_host": "<replacement-postgres-server-fqdn>",
  "puppet_enterprise::profile::database::puppetdb_hosts": [
    "<primary-server-fqdn>",
    "<replica-server-fqdn>"
  ]
}
Download the appropriate version of the Puppet Enterprise installer to <replacement-postgres-server-fqdn>, and run it. Use the pe.conf file created in the previous step.
./puppet-enterprise-installer -c /tmp/pe.conf
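The pe.conf passed to the installer above can be generated from the three FQDNs instead of edited by hand, which reduces the chance of a typo in a host name. This is a sketch under that assumption; the function name render_pe_conf is hypothetical, and the settings are copied from the pe.conf content in this procedure.

```shell
# Hypothetical helper: print a pe.conf containing the four settings used in
# this procedure.
# Arguments: primary FQDN, replica FQDN, replacement PE-PostgreSQL FQDN.
render_pe_conf() {
    primary="$1"
    replica="$2"
    database="$3"
    cat <<EOF
{
  "console_admin_password": "not used",
  "puppet_enterprise::puppet_master_host": "$primary",
  "puppet_enterprise::database_host": "$database",
  "puppet_enterprise::profile::database::puppetdb_hosts": [
    "$primary",
    "$replica"
  ]
}
EOF
}
```

Redirect the output to /tmp/pe.conf before running the installer, passing the real FQDNs in the order primary, replica, replacement PE-PostgreSQL server.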
Run Puppet on <replacement-postgres-server-fqdn>:
puppet agent -t
Running this procedure should re-attach <replacement-postgres-server-fqdn> to the cluster. It will not have restored its database, however.
On <working-postgres-server-fqdn>:
Stop puppet.
systemctl stop puppet
Add this line to /opt/puppetlabs/server/data/postgresql/<postgres_version>/data/pg_ident.conf
where <postgres_version> is the appropriate major version of PostgreSQL as detailed in Component versions in recent PE releases. For PE release 2023.8.0 the PostgreSQL version is 14.
replication-pe-ha-replication-map <replacement-postgres-server-fqdn> pe-ha-replication
Add these lines to /opt/puppetlabs/server/data/postgresql/<postgres_version>/data/pg_hba.conf
# REPLICATION RESTORE PERMISSIONS
hostssl replication pe-ha-replication cert map=replication-pe-ha-replication-map clientcert=1
hostssl replication pe-ha-replication ::/0 cert map=replication-pe-ha-replication-map clientcert=1
Reload pe-postgresql.service
systemctl reload pe-postgresql.service
On <replacement-postgres-server-fqdn>:
Run the following commands (using the appropriate PostgreSQL version number)
systemctl stop puppet.service pe-postgresql.service
mv /opt/puppetlabs/server/data/postgresql/14/data/certs /opt/puppetlabs/server/data/pg_certs
rm -rf /opt/puppetlabs/server/data/postgresql/*
runuser -u pe-postgres -- \
/opt/puppetlabs/server/bin/pg_basebackup \
-D /opt/puppetlabs/server/data/postgresql/14/data \
-d "host=<working-postgres-server-fqdn>
rm -rf /opt/puppetlabs/server/data/pg_certs
systemctl start puppet.service pe-postgresql.service
puppet agent -t
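After the pg_basebackup step above, one quick sanity check is to confirm that the restored data directory exists and reports the expected major version. A PostgreSQL data directory contains a PG_VERSION file holding that number. The sketch below relies only on that file; the function name check_pg_major_version is hypothetical, and it most likely needs to run as root or pe-postgres, since the data directory is normally not readable by other users.

```shell
# Hypothetical helper: succeed only if a PostgreSQL data directory reports the
# expected major version in its PG_VERSION file.
# Arguments: data directory path, expected major version (for example 14).
check_pg_major_version() {
    data_dir="$1"
    expected="$2"
    version_file="$data_dir/PG_VERSION"
    if [ ! -r "$version_file" ]; then
        echo "no readable PG_VERSION in $data_dir" >&2
        return 1
    fi
    actual="$(cat "$version_file")"
    if [ "$actual" != "$expected" ]; then
        echo "expected PostgreSQL $expected, found $actual" >&2
        return 1
    fi
    echo "PostgreSQL $actual data directory looks present"
}
```

A passing check only shows that a base backup landed in the right place. It does not prove that replication is healthy.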
On <working-postgres-server-fqdn>:
Start puppet again and run it to remove the replication configs.
systemctl start puppet.service
puppet agent -t
After pg_basebackup completes and the rest of the procedure is finished, restart pe-puppetdb.service and puppet.service, first on whichever Puppet server (primary or replica) is in the <replacement-avail-group-letter> group, then on all the compilers in that group.
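Because the restart order matters (Puppet server first, compilers afterwards), it can help to print the plan before touching any host. The following sketch only prints commands and does not run them; the function name print_restart_plan is hypothetical, and the service name pe-puppetdb.service is taken from the earlier stop step in this procedure.

```shell
# Hypothetical helper: print the restart order described above, one host per
# line, without executing anything.
# First argument: the Puppet server (primary or replica) in the group.
# Remaining arguments: the compilers in the same group.
print_restart_plan() {
    puppet_server="$1"
    shift
    for host in "$puppet_server" "$@"; do
        printf '%s: systemctl restart pe-puppetdb.service puppet.service\n' "$host"
    done
}
```

The output can be reviewed and then worked through host by host, top to bottom.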
This procedure uses the following placeholder references.
- <avail-group-letter> - Either A or B; whichever of the two letter designations the compiler is being assigned to
- <new-compiler-fqdn> - The FQDN and certname of the new compiler
- <dns-alt-names> - A comma-separated list of DNS alt names for the compiler
- <primary-server-fqdn> - The FQDN and certname of the primary Puppet server
- <postgresql-server-fqdn> - The FQDN and certname of the PE-PostgreSQL server with availability group <avail-group-letter>
On <postgresql-server-fqdn>:
Stop puppet.service
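Add the following two lines to /opt/puppetlabs/server/data/postgresql/<postgres_version>/data/pg_ident.conf, where <postgres_version> is the PostgreSQL major version for your PE release (for example, 11 or 14)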
pe-puppetdb-pe-puppetdb-map <new-compiler-fqdn> pe-puppetdb
pe-puppetdb-pe-puppetdb-migrator-map <new-compiler-fqdn> pe-puppetdb-migrator
Reload pe-postgresql.service
On <new-compiler-fqdn>:
Install the Puppet agent, making sure to specify an availability group letter, A or B, as an extension request.
curl -k https://<primary-server-fqdn>:8140/packages/current/install.bash \
  | sudo bash -s -- \
  extension_requests:pp_auth_role=pe_compiler \
  extension_requests:<avail-group-letter> \
  main:dns_alt_names=<dns-alt-names> \
  main:certname=<new-compiler-fqdn>
If necessary, manually submit a CSR
puppet ssl submit_request
On <primary-server-fqdn>, if necessary, sign the certificate request.
puppetserver ca sign --certname <new-compiler-fqdn>
On <new-compiler-fqdn>, run the puppet agent
puppet agent -t
On <postgresql-server-fqdn>:
Run the puppet agent
puppet agent -t
Start puppet.service