Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cron (slurm group accounts), sql password (bugfix and changed HC sql root password), delete backup older than 7 days #539

Merged
merged 9 commits into from
Feb 15, 2022

Conversation

scimerman
Copy link
Contributor

  • cron is executed every 30 minutes, it looks for the subfolders of the /mnt/*/group/ folder. It creates only missing slurm accounts. The output is suppressed if no accounts have been added. This prevents users to get the error when they submit their first job.
  • slurm database dump was not working correctly due to the offending characters in the password "<>|". I changed the sql root password for the HC and redeployed the slurm. The quote in the playbook should prevent any similar problems in the future.
  • Slurm database backup is keeping all the backups, but we wished to keep only 7 backups.

@scimerman scimerman requested a review from pneerincx February 14, 2022 17:33
login_host: 'localhost'
login_user: 'root'
login_password: "{{ MYSQL_ROOT_PASSWORD }}"
login_unix_socket: "/var/lib/mysql/mysql.sock"
name: "{{ slurm_database_name }}"
state: 'present'
no_log: true
no_log: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will log also the password, which is Ok for debugging, but a security risk for production.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct - I forgot to change it back.

@@ -282,4 +282,20 @@
when: functional_admin_group is defined and functional_admin_group | length >= 1
become: true

- name: 'Deploy script to verify and if necessary fix prmissions of the environment managed by the envsync account.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to update name after copy-paste?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly 🤦‍♂️

IFS=$'\n'
for _each_group in "${_accounts_list}"; do
_output="$(sacctmgr -i create account ${_each_group} descr=scientists org=various parent=users fairshare=parent 2>&1)"
if [[ "${?}" -ne "1" ]]; then # suppress the normal output of " Nothing new added." with exit code 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also suppress other errors that will result in the same exit status code "1"...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I presumed the errors will be returned under another number (255), and not 1 ... will fix it

# Search for all existing groups inside the '/groups' folder
# that are inside /mnt/*/ folder on the SAI machines
for _each_pfs in /mnt/*; do
cd ${_each_pfs}/groups
Copy link
Contributor

@pneerincx pneerincx Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use an

if [[ -e "${_each_pfs}/groups" ]]; then
  ...
else
  continue
fi

check as not all PFS-ses need to have a groups LFS; some may only have an apps or home LFS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. I thought we have it needs to be defined everywhere as I checked the vars of FD, HC and GS.

Comment on lines 9 to 12
_accounts_list="$(find . -maxdepth 1 -mindepth 1 -type d | sed "s/^\.\///g")"

IFS=$'\n'
for _each_group in "${_accounts_list}"; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_accounts_list="$(find . -maxdepth 1 -mindepth 1 -type d | sed "s/^\.\///g")"
IFS=$'\n'
for _each_group in "${_accounts_list}"; do
readarray -t _groups < <(find . -maxdepth 1 -mindepth 1 -type d | sed "s/^\.\///g")
for _each_group in "${_groups[@]}"; do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unaware of the readarray. Thanks, I already like it.

Copy link
Contributor

@pneerincx pneerincx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments

Copy link
Contributor

@pneerincx pneerincx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments.

@scimerman scimerman requested a review from pneerincx February 15, 2022 11:48
@pneerincx pneerincx merged commit 407fbc7 into rug-cit-hpc:develop Feb 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants