Check for git clone failure in rt.sh #680

Closed
GeorgeGayno-NOAA opened this issue Aug 8, 2022 · 2 comments · Fixed by #681

@GeorgeGayno-NOAA
Collaborator

The rt.sh script runs from cron on all supported machines. Recently, it failed at the git clone command on Hera because the machine's disk quota had been exceeded. Every later step then ran against a directory that did not exist. The log file showed:

Cloning into 'UFS_UTILS'...
error: copy-fd: write returned: Disk quota exceeded
fatal: cannot copy '/usr/share/git-core/templates/hooks/commit-msg.sample' to '/scratch2/NCEPDEV/stmp1/role.ufsutils/reg_tests.cron/UFS_UTILS/.git/hooks/commit-msg.sample': Disk quota exceeded
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 20: cd: UFS_UTILS: No such file or directory
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 22: sorc/machine-setup.sh: No such file or directory
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 37: echo: write error: Disk quota exceeded
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 39: ./build_all.sh: No such file or directory
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 52: cd: fix: No such file or directory
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 53: ./link_fixdirs.sh: No such file or directory
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 55: cd: ../reg_tests: No such file or directory
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 84: cd: snow2mdl: No such file or directory
/scratch1/NCEPDEV/nems/role.ufsutils/ufs_utils/UFS_UTILS/reg_tests/rt.sh: line 85: ./driver..sh: No such file or directory

An email was sent to the maintainers, but the subject line did not identify the machine and the body was blank.

The script needs better error handling: it should detect a failed clone, stop, and send an email that clearly describes the failure.
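
A minimal sketch of the kind of check this calls for, not the actual rt.sh code. The helper names `clone_succeeded` and `failure_subject`, the subject text, and the `$target` and `$MAILTO` variables in the commented usage are all assumptions made for illustration:

```shell
# Hypothetical helper: succeed only if the clone exited with status 0
# AND the expected directory exists.
# $1 = exit status of git clone, $2 = directory the clone should create.
clone_succeeded() {
  [ "$1" -eq 0 ] && [ -d "$2" ]
}

# Hypothetical helper: build an email subject that names the machine.
failure_subject() {
  echo "UFS_UTILS regression tests: git clone FAILED on $1"
}

# Possible use inside the script (commented out so this block has no
# side effects; $target and $MAILTO are assumed variable names):
#
#   git clone --recursive https://github.com/ufs-community/UFS_UTILS.git
#   rc=$?
#   if ! clone_succeeded "$rc" UFS_UTILS; then
#     echo "git clone failed with status $rc" \
#       | mail -s "$(failure_subject "$target")" "$MAILTO"
#     exit 1
#   fi
```

Checking both the exit status and the directory guards against a clone that reports success but leaves nothing usable behind. Exiting immediately avoids the cascade of "No such file or directory" errors seen in the log above.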

@GeorgeGayno-NOAA GeorgeGayno-NOAA added the bug Something isn't working label Aug 8, 2022
@GeorgeGayno-NOAA GeorgeGayno-NOAA self-assigned this Aug 9, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 9, 2022
@GeorgeGayno-NOAA
Collaborator Author

How can this logic be tested? The disk on Hera has since been cleaned up, so I can't reproduce the original problem. All I can do is modify the script to trip the error. For example, check for the wrong directory name:

diff --git a/reg_tests/rt.sh b/reg_tests/rt.sh
index e2d975a..b84cfd8 100755
--- a/reg_tests/rt.sh
+++ b/reg_tests/rt.sh
 # Check to see if the clone was successful. Previously, it has
 # failed due to lack of disk space.

-if [[ $rc == 0 ]] && [[ -d UFS_UTILS ]];then
+if [[ $rc == 0 ]] && [[ -d US_UTILS ]];then
   echo "Clone Successful"
 else

Or make the clone fail:

 rm -f reg_test_results.txt
 rm -rf UFS_UTILS

-git clone --recursive https://github.com/ufs-community/UFS_UTILS.git
+git clone --recursive www.github.com/ufs-community/UFS_UTILS.git
 rc=$?
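
Another possibility, sketched here as an assumption rather than something that was tried for this issue, is to leave the script untouched and shadow `git` with a stub that always fails. The stub directory, its error message, and the exit status 128 are made up for the example:

```shell
# Hypothetical test harness: create a fake "git" that always fails.
stub_dir=$(mktemp -d)
cat > "$stub_dir/git" <<'EOF'
#!/bin/sh
echo "fatal: simulated clone failure" >&2
exit 128
EOF
chmod +x "$stub_dir/git"

# Run the clone in a subshell with the stub first on PATH, then capture
# the exit status the same way the script does with rc=$?.
rc=0
(
  PATH="$stub_dir:$PATH"
  export PATH
  git clone --recursive https://github.com/ufs-community/UFS_UTILS.git 2>/dev/null
) || rc=$?

echo "simulated git clone exit status: $rc"
rm -rf "$stub_dir"
```

Running the real script with the stub directory prepended to `PATH` would exercise the failure branch without a full disk and without editing the clone URL.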

@GeorgeGayno-NOAA
Collaborator Author

The error was tripped on Hera, Jet, Orion and WCOSS2. As expected, the script execution stopped and a proper email was sent to my NOAA email account.

GeorgeGayno-NOAA added a commit that referenced this issue Aug 9, 2022
Add logic to check if the git clone failed. Upon failure, exit script and send 
a clear email message.

Fixes #680.