Add -f (force) option to auto clean stale locks
If a local lock exists and the process recorded in it (by PID) is no longer
running, or if a remote lock exists that was created by this same (local) host
and the recorded PID is no longer running, then clean up the lock files and
continue.
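
The staleness check described above comes down to asking whether the PID
recorded in a lock still names a live process. A minimal sketch, assuming a
hypothetical helper `lock_is_stale` (not part of this commit) and a lock file
that contains nothing but a PID:

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only; not code from this commit.
# Return codes: 0 = lock is stale, 1 = owner is still running, 2 = no lock file.
lock_is_stale() {
  local pid_file="$1" pid
  # No readable lock file means there is nothing to clean up.
  [ -r "$pid_file" ] || return 2
  pid="$(cat "$pid_file")"
  # An empty or non-numeric PID cannot belong to a live owner: treat as stale.
  case "$pid" in
    ''|*[!0-9]*) return 0 ;;
  esac
  # kill -0 delivers no signal; it only checks that the process can be signalled.
  # Caveat: it also fails for a live process owned by another user, so this
  # sketch would report such a lock as stale.
  if kill -0 "$pid" 2>/dev/null; then
    return 1
  fi
  return 0
}
```

A caller would remove the lock only when the helper returns 0, and only if the
user asked for it with `-f`.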

This changes the remote locking mechanism so that the client's hostname, PID,
and starting timestamp are written to a lock file called `remote` in the
tmp/lock folder on the remote host. This allows a client to detect whether it
owns the lock file and to determine whether the PID which created the lock file
is still running.
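
To illustrate the record format, here is a sketch of how a client might split
a `host:pid:timestamp` line and compare it with its own identity. The function
names `parse_lock_info` and `owns_lock`, the `LOCK_*` variables, and the
timestamp format in the usage note are assumptions made for this example; the
commit itself reads the fields into an array.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only; not code from this commit.

# Split a "host:pid:timestamp" record into LOCK_HOST, LOCK_PID, LOCK_STARTED.
parse_lock_info() {
  # IFS is changed for this read only. Because LOCK_STARTED is the last name,
  # it receives the rest of the line, including any further colons.
  IFS=: read -r LOCK_HOST LOCK_PID LOCK_STARTED <<EOF
$1
EOF
}

# Succeed only if the record names the given host and PID.
# Arguments: record, hostname, pid
owns_lock() {
  parse_lock_info "$1"
  [ "$LOCK_HOST" = "$2" ] && [ "$LOCK_PID" = "$3" ]
}
```

For example, `owns_lock "laptop:4242:2018-08-06 10:15:00" "$(hostname)" "$$"`
succeeds only when run on a host named `laptop` by process 4242.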

If two clients have the same hostname, they could clobber each other with the
`-f` option. It's up to the administrator to make sure all connecting clients
have different hostnames when using this new option.

A new exit status, `6`, indicates that the remote lock file is owned by this
host and is considered stale. The `1` status is reused if it can be detected
that this host is currently syncing with the remote in a different process.
The `3` status is reused to indicate that another client is syncing with this
remote. If the remote lock file is stale because another client crashed, then
that client would have to remove the lock or it would have to be manually
removed.
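
As a sketch of how a wrapper script might react to these statuses, the
hypothetical function below (not part of this commit) maps an exit status to a
short message. Status `2` for a stale local lock is taken from the spec added
in this commit; the wording of every message is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only; not code from this commit.
# Translate a bitpocket exit status into a one-line description.
describe_sync_status() {
  case "$1" in
    0) echo "sync completed" ;;
    1) echo "another instance on this host is already syncing" ;;
    2) echo "stale local lock; retry with -f" ;;
    3) echo "another client holds the remote lock" ;;
    6) echo "stale remote lock owned by this host; retry with -f" ;;
    *) echo "unhandled exit status $1" ;;
  esac
}
```

A cron wrapper could call `describe_sync_status "$?"` immediately after the
sync command and log the result.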
Jared Hancock committed Aug 6, 2018
1 parent c63d3a1 commit 5bc8402
Showing 3 changed files with 79 additions and 7 deletions.
69 changes: 63 additions & 6 deletions bin/bitpocket
@@ -65,6 +65,7 @@ else
fi

REMOTE_TMP_DIR="$REMOTE_PATH/$DOT_DIR/tmp"
HOSTNAME="$(hostname)"

# Don't sync user excluded files
if [[ -f "$DOT_DIR/exclude" ]]; then
@@ -489,6 +490,12 @@ function acquire_lock {
echo "There's already an instance of BitPocket syncing this directory. Exiting."
exit 1
else
if [[ $OPTIONS =~ 'force' ]]
then
echo -e "${YELLOW}Removing stale, local lock file${CLEAR}"
rm "$LOCK_DIR/pid" && rmdir "$LOCK_DIR" && acquire_lock && return 0
fi

echo -e "${RED}bitpocket error:${CLEAR} Bitpocket found a stale lock directory:"
echo " | Root dir: $(pwd)"
echo " | Lock dir: $LOCK_DIR"
@@ -506,17 +513,64 @@ function release_lock {
}

function acquire_remote_lock {
$REMOTE_RUNNER "mkdir -p \"$REMOTE_TMP_DIR\"; cd \"$REMOTE_PATH\" && mkdir \"$LOCK_DIR\" 2>/dev/null"
# TODO: Place the local hostname and this PID in a file, which will make
# automatic lock file cleanup possible. It will also offer better output if
# another host is truly syncing with the remote host.
local INFO="$HOSTNAME:$$:$TIMESTAMP"
local REMOTE_INFO=$($REMOTE_RUNNER "
mkdir -p '$REMOTE_TMP_DIR' && cd '$REMOTE_PATH'
[[ -d '$LOCK_DIR' ]] || mkdir '$LOCK_DIR'
[[ -e '$LOCK_DIR'/remote ]] || echo '$INFO' > '$LOCK_DIR'/remote
cat '$LOCK_DIR'/remote")

[[ "$INFO" == "$REMOTE_INFO" ]] && return 0

IFS=":" read -ra INFO <<< "$REMOTE_INFO"

# From here down, assume the lock could not be acquired
local code=3
if [[ -z $REMOTE_INFO ]]
then
echo "Couldn't acquire remote lock or lock file couldn't be created. Exiting."
elif [[ "$HOSTNAME" != "${INFO[0]}" ]]
then
echo -e "${YELLOW}Another client is syncing with '$REMOTE'${CLEAR}"
echo ">> Host: ${INFO[0]}"
echo ">> PID: ${INFO[1]}"
echo ">> Started: ${INFO[2]}"
elif [[ "$$" != "${INFO[1]}" ]]
then
# This host is syncing with the remote host. Check if the PID is still running
if kill -0 "${INFO[1]}" &>/dev/null
then
# XXX: This should be handled in the `acquire_lock` function
echo "Another instance of Bitpocket is currently syncing this" \
"host with '$REMOTE'"
code=1
else
# In this case, this host is holding the lock with the remote server
# but the sync is no longer running. It is perhaps possible to remove
# the lock?
if [[ $OPTIONS =~ 'force' ]]
then
echo -e "${YELLOW}Removing stale, remote lock file${CLEAR}"
$REMOTE_RUNNER "cd '$REMOTE_PATH' && rm '$LOCK_DIR/remote' && rmdir '$LOCK_DIR'"
# Try again
acquire_remote_lock && return 0
fi

echo "The remote lock is held by this host and is stale." \
"It should be removed, and the sync should be retried."
code=6
fi
fi

if [[ $? != 0 ]]; then
echo "Couldn't acquire remote lock. Another client is syncing with $REMOTE or lock file couldn't be created. Exiting."
release_lock
exit 3
fi
exit $code
}

function release_remote_lock {
$REMOTE_RUNNER "cd \"$REMOTE_PATH\" && rmdir \"$LOCK_DIR\" &>/dev/null"
$REMOTE_RUNNER "cd \"$REMOTE_PATH\" && grep -q '$HOSTNAME:$$' '$LOCK_DIR/remote' && rm '$LOCK_DIR/remote' && rmdir '$LOCK_DIR' &>/dev/null"
}

function assert_dotdir {
@@ -575,6 +629,7 @@ Available commands:
help Show this message.
Options:
-f, --force Clean up stale lock files automatically
-p, --pretend Don't really perform the sync or update the current
state. Instead, show what would be synchronized.
@@ -588,6 +643,7 @@ function parseargs() {
case $1 in
# Switches and configuration
-p|--pretend) OPTIONS+=('pretend');;
-f|--force) OPTIONS+=('force');;
-h|--help|-*) COMMANDS+=('help');;
# Arguments (commands)
init) if [[ $# < 2 ]]; then
@@ -603,6 +659,7 @@ function parseargs() {
shift;;
sync|init|pack|cron|log|list|help)
COMMANDS+=($1);;
# Anything else
*) echo "!!! Invalid command: $1";;
esac
shift
15 changes: 15 additions & 0 deletions spec/locking_spec.rb
@@ -17,6 +17,7 @@

it "exits with status 3 when can't acquire remote lock" do
mkdir remote_path('.bitpocket/tmp/lock')
cat 'remote-host:0:0', remote_path('.bitpocket/tmp/lock/remote')

sync.should exit_with(3)
end
@@ -33,6 +34,20 @@
remote_path('.bitpocket/tmp/lock').should_not exist
end

it 'should cleanup remote stale lock files if forced' do
cat %x[hostname].rstrip + ':' + max_pid.to_s + ':0', remote_path('.bitpocket/tmp/lock/remote')

sync.should exit_with(6)
sync(:flags => '-f').should succeed
end

it 'should cleanup local stale lock files if forced' do
cat max_pid, local_path('.bitpocket/tmp/lock/pid')

sync.should exit_with(2)
sync(:flags => '-f').should succeed
end

def max_pid
if RUBY_PLATFORM =~ /darwin/
99998
2 changes: 1 addition & 1 deletion spec/spec_helper.rb
@@ -5,7 +5,7 @@
PATH = "#{RSYNC_STUB_BIN_PATH}:#{ENV['PATH']}"

def sync(opts={})
system "bash -c 'CALLBACK=#{opts[:callback]} PATH=#{PATH} bash #{BP_BIN_PATH}' >/dev/null"
system "bash -c 'CALLBACK=#{opts[:callback]} PATH=#{PATH} bash #{BP_BIN_PATH} #{opts[:flags]}' >/dev/null"
$?.exitstatus
end
