Merge pull request #42 from rug-cit-hpc/develop
Develop -> Master
Gerbenvandervries authored Jan 25, 2019
2 parents 1eb904c + e8b8e65 commit 5ca01be
Showing 44 changed files with 389 additions and 224 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -12,4 +12,4 @@ promtools/results/*
roles/hpc-cloud
roles/HPCplaybooks
roles/HPCplaybooks/*
ssh-host-ca
ssh-host-ca/umcg-hpc-ca
48 changes: 37 additions & 11 deletions README.md
@@ -79,6 +79,9 @@ These roles install various docker images built and hosted by RuG webhosting. Th
#### Deployment of OpenStack
The steps below describe how to get from machines with a bare Ubuntu 16.04 installation to a running OpenStack installation.

#### Steps to upgrade the OpenStack cluster

### 3. Steps to deploy HPC compute cluster on top of OpenStack cluster
---

0. Clone this repo.
@@ -108,13 +111,13 @@ The steps below describe how to get from machines with a bare ubuntu 16.04 insta

3. Configure Ansible settings including the vault.
* To create (a new) secrets.yml:
Generate and encrypt the passwords for the various openstack components.
Generate and encrypt the passwords for the various OpenStack components.
```bash
./generate_secrets.py
ansible-vault --vault-password-file=.vault_pass.txt encrypt secrets.yml
```
The encrypted secrets.yml can now safely be comitted.
The `.vault_pass.txt` file is in the .gitignore and needs to be tranfered in a secure way.
The encrypted secrets.yml can now safely be committed.
The `.vault_pass.txt` file is listed in .gitignore and needs to be transferred in a secure way.

* To use an existing encrypted secrets.yml, add .vault_pass.txt to the root folder of this repo
and create ansible.cfg in the same location using the following template:
@@ -126,10 +129,37 @@ The steps below describe how to get from machines with a bare ubuntu 16.04 insta
remote_user = your_local_account_not_from_the_LDAP
```

4. Build Prometheus Node Exporter
4. Configure the Certificate Authority (CA).
We use an SSH public-private key pair to sign the host keys of all the machines in a cluster.
This way users only need the public key of the CA in their ```~/.ssh/known_hosts``` file
and will not get bothered by messages like this:
```
The authenticity of host '....' can't be established.
ECDSA key fingerprint is ....
Are you sure you want to continue connecting (yes/no)?
```
* The filename of the CA private key is specified using the ```ssh_host_signer_ca_private_key``` variable defined in ```group_vars/*/vars.yml```
* The filename of the corresponding CA public key must be the same as the one of the private key suffixed with ```.pub```
* The password required to decrypt the CA private key must be specified using the ```ssh_host_signer_ca_private_key_pass``` variable defined in ```group_vars/*/secrets.yml```,
which must be encrypted with ```ansible-vault```.
* Each user must add the content of the CA public key to their ```~/.ssh/known_hosts``` like this:
```
@cert-authority [names of the hosts for which the cert is valid] [content of the CA public key]
```
E.g.:
```
@cert-authority reception*,*talos,*tl-* ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDWNAF....VMZpZ5b9+5GA3O8w== UMCG HPC Development CA
```
* Example to create a new CA key pair with the ```ed25519``` algorithm:
```bash
ssh-keygen -t ed25519 -a 101 -f ssh-host-ca/ca-key-file-name -C "CA key for ..."
```
5. Build Prometheus Node Exporter
* Make sure you are a member of the `docker` group.
Otherwise you will get this error:
```ERRO[0000] failed to dial gRPC: cannot connect to the Docker daemon.
```
ERRO[0000] failed to dial gRPC: cannot connect to the Docker daemon.
Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect:
permission denied
context canceled
@@ -140,7 +170,7 @@ The steps below describe how to get from machines with a bare ubuntu 16.04 insta
./build.sh
```
5. Running playbooks. Some examples:
6. Running playbooks. Some examples:
* Install the OpenStack cluster.
```bash
ansible-playbook site.yml
@@ -150,8 +180,4 @@ The steps below describe how to get from machines with a bare ubuntu 16.04 insta
ansible-playbook site.yml -i talos_hosts slurm.yml
```
6. verify operation.

#### Steps to upgrade openstack cluster.

### 3. Steps to install Compute cluster on top of openstack cluster.
7. verify operation.
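
For step 4 above, the `@cert-authority` entry can be assembled from the host name patterns and the content of the CA public key file. The following is a minimal sketch, not part of this repository: the function names and the truncated example key are hypothetical, and `fnmatchcase` only approximates how OpenSSH matches host patterns.

```python
from fnmatch import fnmatchcase


def cert_authority_line(host_patterns, ca_public_key):
    """Build a known_hosts entry that trusts host certificates signed by a CA.

    host_patterns: list of host name patterns, e.g. ["reception*", "*talos"].
    ca_public_key: full content of the CA's .pub file (type, key, comment).
    """
    return "@cert-authority {} {}".format(",".join(host_patterns), ca_public_key.strip())


def entry_covers(host_patterns, hostname):
    """Approximate check whether a host name matches any of the patterns.

    Assumption: fnmatchcase is close enough for '*' and '?' wildcards, but it
    also accepts [seq] character classes, which OpenSSH patterns do not have.
    """
    return any(fnmatchcase(hostname, pattern) for pattern in host_patterns)


# Hypothetical, truncated key material for illustration only.
example_key = "ssh-ed25519 AAAAC3Nza...example Example CA\n"
patterns = ["reception*", "*talos", "*tl-*"]

print(cert_authority_line(patterns, example_key))
print(entry_covers(patterns, "tl-vcompute01"))
```

Keeping the patterns in a list makes it easy to check which hosts an entry would cover before distributing it to users.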
16 changes: 10 additions & 6 deletions cluster.yml
@@ -1,4 +1,10 @@
---
- name: Sign host keys of all cluster hosts.
hosts: all
roles:
- ssh_host_signer
- ssh_known_hosts

- name: Install roles needed for all virtual cluster components except jumphosts.
hosts: cluster
become: true
@@ -11,8 +17,8 @@

- name: Install ansible on admin interfaces (DAI & SAI).
hosts:
- imperator
- sugarsnax
- sys-admin-interface
- deploy-admin-interface
become: True
tasks:
- name: install Ansible
@@ -65,16 +71,14 @@
- isilon
- slurm-client


- name: export /home
- name: Export /home on NFS server.
hosts: user-interface:&talos-cluster
roles:
- nfs_home_server

- name: export /home
- name: Mount /home on NFS clients.
hosts: compute-vm:&talos-cluster
roles:
- nfs_home_client

- import_playbook: users.yml
#- import_playbook: ssh-host-signer.yml
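
The `hosts:` values in this playbook use Ansible host patterns, where `:` joins groups and a `&` prefix means intersection. The following is a simplified sketch of that selection logic, assuming plain group names only; `select_hosts` and the host names are hypothetical, and this is not Ansible's own implementation.

```python
def select_hosts(pattern, groups):
    """Resolve an Ansible-style host pattern against a dict of group -> set of hosts.

    Supports only plain group names joined with ':' plus the '&' (intersection)
    and '!' (exclusion) prefixes; wildcards, ranges and regexes are not handled.
    """
    include, intersect, exclude = [], [], []
    for term in pattern.split(":"):
        if term.startswith("&"):
            intersect.append(term[1:])
        elif term.startswith("!"):
            exclude.append(term[1:])
        else:
            include.append(term)

    # Union of the included groups first, then intersections, then exclusions.
    selected = set()
    for name in include:
        selected |= groups.get(name, set())
    for name in intersect:
        selected &= groups.get(name, set())
    for name in exclude:
        selected -= groups.get(name, set())
    return selected


# Hypothetical inventory: group names follow the playbook, host names are made up.
groups = {
    "user-interface": {"ui-a", "ui-b"},
    "compute-vm": {"vm-a1", "vm-b1"},
    "talos-cluster": {"ui-a", "vm-a1"},
}

print(sorted(select_hosts("user-interface:&talos-cluster", groups)))
print(sorted(select_hosts("compute-vm:&talos-cluster", groups)))
```

In this sketch a pattern written without the colon, such as `a&b`, is treated as one group name and selects nothing, so the separator matters.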
2 changes: 1 addition & 1 deletion galaxy-requirements.yml
@@ -1,7 +1,7 @@
---
- src: chrisgavin.ansible-ssh-host-signer
- src: geerlingguy.firewall
version: 2.4.0
- src: geerlingguy.postfix
- src: geerlingguy.repo-epel
- src: geerlingguy.security
...
7 changes: 0 additions & 7 deletions gearshift_cluster.yml

This file was deleted.

12 changes: 8 additions & 4 deletions gearshift_hosts.ini
@@ -44,13 +44,16 @@ airlock
[slurm]
imperator

[sys-admin-interface]
imperator

[deploy-admin-interface]
sugarsnax

[administration]
gearshift
imperator
sugarsnax
[administration:children]
sys-admin-interface
deploy-admin-interface
user-interface

[user-interface]
gearshift
@@ -64,6 +67,7 @@ administration

[gearshift-cluster:children]
cluster
jumphost

[metal]
gs-openstack
5 changes: 4 additions & 1 deletion group_vars/all/vars.yml
@@ -1,5 +1,8 @@
---
admin_ranges: "129.125.249.0/24,172.23.40.1/24"
ssh_host_signer_hostnames: "{{ ansible_fqdn }},{{ ansible_hostname }},{% for host in groups['jumphost'] %}{{ host }}+{{ ansible_hostname }}{% endfor %}"
ssh_host_signer_ca_keypair_dir: "{{ inventory_dir }}/ssh-host-ca"
ssh_host_signer_ca_private_key: "{{ ssh_host_signer_ca_keypair_dir }}/hpc-ca"
ssh_host_signer_key_types: '.*(rsa|ed25519).*'
ssh_host_signer_hostnames: "{{ ansible_fqdn }},{{ ansible_hostname }}{% for host in groups['jumphost'] %},{{ host }}+{{ ansible_hostname }}{% endfor %}"
spacewalk_server_url: 'http://spacewalk.hpc.rug.nl/XMLRPC'
...
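
The two `ssh_host_signer_hostnames` lines above differ only in where the comma sits relative to the Jinja2 loop. The sketch below emulates both templates in plain Python to show the effect on the rendered list. The function names and host names are hypothetical, and the string handling is an emulation of the templates rather than a Jinja2 render.

```python
def principals_old(fqdn, hostname, jumphosts):
    """Emulate the previous template: one comma before the loop, none inside it."""
    return "{},{},".format(fqdn, hostname) + "".join(
        "{}+{}".format(jump, hostname) for jump in jumphosts
    )


def principals_new(fqdn, hostname, jumphosts):
    """Emulate the updated template: each jumphost entry carries its own leading comma."""
    return "{},{}".format(fqdn, hostname) + "".join(
        ",{}+{}".format(jump, hostname) for jump in jumphosts
    )


# Hypothetical host and jumphost names, for illustration only.
fqdn, hostname = "node01.example.org", "node01"

for jumphosts in (["jump-a", "jump-b"], []):
    print("old:", principals_old(fqdn, hostname, jumphosts))
    print("new:", principals_new(fqdn, hostname, jumphosts))
```

With two jumphosts the old form runs the entries together, and with none it leaves a trailing comma. The new form yields a clean comma-separated list in both cases.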
File renamed without changes.
@@ -16,6 +16,7 @@ ui_cores_per_socket: 2
ui_real_memory: 8192
ui_local_disk: 0
ui_features: 'prm01,tmp01'
ssh_host_signer_ca_private_key: "{{ ssh_host_signer_ca_keypair_dir }}/umcg-hpc-ca"
uri_ldap: 172.23.40.249
uri_ldaps: comanage-in.id.rug.nl
ldap_port: 389
20 changes: 0 additions & 20 deletions group_vars/gearshift_secrets.yml

This file was deleted.

File renamed without changes.
@@ -16,6 +16,7 @@ ui_cores_per_socket: 1
ui_real_memory: 3000
ui_local_disk: 0
ui_features: 'prm07,tmp07'
ssh_host_signer_ca_private_key: "{{ ssh_host_signer_ca_keypair_dir }}/umcg-hpc-ca"
key_name: Gerben
image_cirros: cirros-0.3.4-x86_64-disk.img
image_centos7: centos7
27 changes: 27 additions & 0 deletions group_vars/talos-cluster/secrets.yml
@@ -0,0 +1,27 @@
$ANSIBLE_VAULT;1.1;AES256
36363232356235643436383162303734376463343966373436646339303861326236666337633138
6561663835303037373831383233333134366461653539360a643237333166393266656338613530
66366266643264383761313831343934636261666366396539376130666465313662313537366332
3235616432613462370a623130393439636466663734326136646139373962393331316663326662
30643837373934343337646430373463623865383931383764366466376261663034306234356133
64386666643562653664653933336236363462346134336534616166363561306235356463653963
65656463663232626137613533316139623462653434666532343263316362656361623032333230
33343630376437613033333263343439636666636365336263393938383264346138333364393832
31613663303362353364663038366637303932353364333661303635623030323666346433393265
36303739313338353932326139373038316130323639323938613764623833353631623539316663
33653636653865323733383133653338303861313434383136653830393637636264363234303161
38666363636563613464313362333839643631363333636137343231306433373235336165346438
39643634383863333631313764303161333764623930343731353037326530633937646263326234
38656236366665663737333235336632303835333530303236363336333766626666386330303138
34636433316163376335656431613631646436386530363837366133383764326465303865343961
63663833303732373762303034636465383639623232663664386334323931313034353631666366
30336266616137373862316531646464336132363436396430373233316330343336346635646537
36613439623561393365383539366435393235356434643937323733313462386430303832653635
35663966356561383161386464396635353935623738333965336637613931383235336138393931
32663066613062326231393865363137613932356237343762626536316266663332333566383737
66663933313361653639336664653934303761363966373536623231366364666362393535663933
30383166643366613230333739333165336364383637303236316137333865313762643361383363
65666263343463353966353461616231623164393738373034396662616563306536616538346335
35356238323965346639313063306365313031366563653866616534636538373534653233663535
38633963323534616336353164333435616635373165623761363864666337353764636333303132
3132333039393739303933646165343862306632613032336538
@@ -16,4 +16,11 @@ ui_cores_per_socket: 2
ui_real_memory: 8192
ui_local_disk: 0
ui_features: 'prm08,tmp08'
ssh_host_signer_ca_private_key: "{{ ssh_host_signer_ca_keypair_dir }}/umcg-hpc-development-ca"
uri_ldap: 172.23.40.249
uri_ldaps: comanage-in.id.rug.nl
ldap_port: 389
ldaps_port: 636
ldap_base: ou=umcg,o=asds
ldap_binddn: cn=clusteradminumcg,o=asds
...
22 changes: 0 additions & 22 deletions group_vars/talos/secrets.yml

This file was deleted.

3 changes: 3 additions & 0 deletions host_vars/hc-sai
@@ -0,0 +1,3 @@
---
mailhub: 192.168.0.5
rewrite_domain: hc-sai.gcc.rug.nl
7 changes: 0 additions & 7 deletions hyperchicken_cluster.yml

This file was deleted.

16 changes: 10 additions & 6 deletions hyperchicken_hosts.ini
@@ -1,19 +1,22 @@
[jumphost]
portal

[slurm]
hc-sai

[jumphost]
portal
[sys-admin-interface]
hc-sai

[user-interface]
hyperchicken

[deploy-admin-interface]
hc-dai

[administration]
hc-sai
hc-dai
hyperchicken
[administration:children]
sys-admin-interface
deploy-admin-interface
user-interface

[compute-vm]
hc-vcompute[01:05]
@@ -24,3 +27,4 @@ administration

[hyperchicken-cluster:children]
cluster
jumphost
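
The inventory above replaces the flat `[administration]` host list with an `[administration:children]` group built from three smaller groups. The following sketch resolves such nested groups from INI text, assuming an inventory without host ranges or inline variables. It uses Python's `configparser` for illustration, not Ansible's inventory parser, and the helper names are hypothetical.

```python
from configparser import ConfigParser

# Excerpt modelled on the inventory above.
INVENTORY = """
[sys-admin-interface]
hc-sai

[user-interface]
hyperchicken

[deploy-admin-interface]
hc-dai

[administration:children]
sys-admin-interface
deploy-admin-interface
user-interface
"""


def load_inventory(text):
    """Parse INI text whose sections contain bare host or group names."""
    parser = ConfigParser(allow_no_value=True, delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep host names case-sensitive
    parser.read_string(text)
    return parser


def hosts_in(parser, group):
    """Return all hosts of a group, following '<group>:children' sections recursively."""
    hosts = set()
    if parser.has_section(group):
        hosts.update(parser.options(group))
    children = group + ":children"
    if parser.has_section(children):
        for child in parser.options(children):
            hosts |= hosts_in(parser, child)
    return hosts


print(sorted(hosts_in(load_inventory(INVENTORY), "administration")))
```

For this excerpt the nested group resolves to the same three hosts that the old flat list named, so membership is unchanged while each host is defined only once.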
1 change: 0 additions & 1 deletion roles/ansible-ssh-host-signer
Submodule ansible-ssh-host-signer deleted from 1ef7f5
1 change: 0 additions & 1 deletion roles/cluster/files/known_hosts

This file was deleted.

11 changes: 3 additions & 8 deletions roles/cluster/tasks/main.yml
@@ -14,12 +14,12 @@
hostname:
name: '{{ inventory_hostname }}'

- name: set selinux in permissive mode
- name: Set selinux in permissive mode
selinux:
policy: targeted
state: permissive

- name: install some standard software
- name: Install some standard software
yum:
state: latest
update_cache: yes
@@ -41,9 +41,4 @@
- figlet
tags:
- software

- name: Create ssh_known_hosts file with CA used for signed host keys.
copy:
dest: /etc/ssh/ssh_known_hosts
src: files/known_hosts
tags: ['known_hosts']
...
7 changes: 7 additions & 0 deletions roles/ldap/defaults/main.yml
@@ -1,3 +1,10 @@
---
firewall_allowed_tcp_ports:
- "22"
ldap_port: 389
ldaps_port: 636
uri_ldap: ''
uri_ldaps: ''
ldap_base: ''
ldap_binddn: ''
...
1 change: 1 addition & 0 deletions roles/ldap/meta/main.yml
@@ -4,3 +4,4 @@ dependencies:
vars:
firewall_allowed_tcp_ports:
- "22"
...