WIP: backup and restore scripts #72
base: master
Changes from all commits
f856972
7787703
67fcc69
51f3e31
0173872
1aaa465
7acd8e2
bf7f30a
27842b9
aa6c7db
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,3 +9,6 @@ config/**/* | |
|
||
data/**/* | ||
!data/.gitkeep | ||
|
||
## backup directories | ||
backup/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,174 @@ | ||
#! /usr/bin/env bash | ||
|
||
set -euo pipefail | ||
|
||
#### Detect Toolkit Project Root #### | ||
# if realpath is not available, create a semi-equivalent function | ||
command -v realpath >/dev/null 2>&1 || realpath() { | ||
[[ $1 = /* ]] && echo "$1" || echo "$PWD/${1#./}" | ||
} | ||
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")" | ||
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")" | ||
TOOLKIT_ROOT="$(realpath "$SCRIPT_DIR/..")" | ||
if [[ ! -d "$TOOLKIT_ROOT/bin" ]] || [[ ! -d "$TOOLKIT_ROOT/config" ]]; then | ||
echo "ERROR: could not find root of overleaf-toolkit project (inferred project root as '$TOOLKIT_ROOT')" | ||
exit 1 | ||
fi | ||
|
||
TMP_ROOT_DIR="$TOOLKIT_ROOT/tmp" | ||
|
||
IS_SERVER_PRO="$(grep -q 'SERVER_PRO=true' \ | ||
"$TOOLKIT_ROOT/config/overleaf.rc" && echo 'true' || echo 'false')" | ||
|
||
function usage() { | ||
cat <<EOF | ||
Usage: bin/backup | ||
|
||
Makes a backup of the data in this installation, and writes it to a | ||
timestamped tar.gz file in the ./backup/ directory. | ||
|
||
This file can then be consumed by the bin/restore script. | ||
|
||
EOF | ||
} | ||
|
||
function wait-for-mongo () { | ||
while ! "$TOOLKIT_ROOT/bin/docker-compose" exec -T mongo \ | ||
mongo --eval "db.version()" \ | ||
> /dev/null; do echo '[mongo is not ready]' && sleep 1; done | ||
echo '[mongo is ready]' | ||
} | ||
|
||
function create-tmp-dir () { | ||
local now | ||
now="$(date '+%F-%H%M%S')" | ||
local random_part | ||
random_part="$(head -c 8 /dev/urandom | md5sum | cut -c 1-4)" | ||
if ! [[ -d "$TMP_ROOT_DIR" ]]; then | ||
mkdir "$TMP_ROOT_DIR" | ||
fi | ||
local tmp_dir="$TMP_ROOT_DIR/backup-$now-$random_part" | ||
if [[ -d "$tmp_dir" ]]; then | ||
echo "Error: temp directory '$tmp_dir' already exists" >&2 | ||
exit 1 | ||
fi | ||
mkdir -p "$tmp_dir/backup" | ||
echo "$tmp_dir" | ||
} | ||
|
||
function get-container-name () { | ||
local name="$1" | ||
"$TOOLKIT_ROOT/bin/docker-compose" ps | grep "$name" | cut -d ' ' -f 1 | head -n 1 | ||
} | ||
|
||
function dump-mongo () { | ||
local tmp_dir="$1" | ||
local mongo_tmp_dir="$tmp_dir/backup/mongo" | ||
mkdir "$mongo_tmp_dir" | ||
|
||
"$TOOLKIT_ROOT/bin/docker-compose" up -d mongo | ||
Review comment: Can we handle the case where there is an external MongoDB here? I think there will be people running without the internal DBs.
Reply: Yeah, would be useful to have a
Reply: In addition, if we go with external Redis, it could also have a
Reply: I was thinking that we could look at the config in some way and back up the external Mongo/Redis using the same mechanism. Given that we've already checked the prerequisites for making a backup (the service is stopped, so nothing will be in flight), it makes sense to roll up the backup from the external DB at the same time. WDYT?
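One way the "look at the config" idea might be sketched is shown here. This is only an illustration, not code from this PR: the `MONGO_ENABLED` and `MONGO_URL` keys, the helper names, and the assumption that `config/overleaf.rc` holds plain unquoted `KEY=value` lines are all assumptions. The external branch also assumes `mongodump` is installed on the host.

```shell
# Hypothetical helper: print the value of KEY from a KEY=value rc file.
# Assumes unquoted values; the last occurrence of the key wins.
read_rc_value() {
  local key="$1"
  local rc_file="$2"
  sed -n "s/^${key}=//p" "$rc_file" | tail -n 1
}

# Hypothetical helper: print "external" when the config disables the bundled
# mongo container and supplies a URL, otherwise print "internal".
mongo_backup_mode() {
  local rc_file="$1"
  local enabled
  local url
  enabled="$(read_rc_value MONGO_ENABLED "$rc_file")"
  url="$(read_rc_value MONGO_URL "$rc_file")"
  if [ "$enabled" = "false" ] && [ -n "$url" ]; then
    echo "external"
  else
    echo "internal"
  fi
}

# Hypothetical external dump: runs mongodump on the host against the
# configured URL and writes straight into the backup temp directory.
dump_external_mongo() {
  local rc_file="$1"
  local out_dir="$2"
  mongodump --quiet --uri="$(read_rc_value MONGO_URL "$rc_file")" --out="$out_dir"
}
```

In this sketch, `dump-mongo` would call `mongo_backup_mode "$TOOLKIT_ROOT/config/overleaf.rc"` first and fall back to the existing container-based path when the result is `internal`.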
||
wait-for-mongo | ||
|
||
# shellcheck disable=SC1004 | ||
"$TOOLKIT_ROOT/bin/docker-compose" exec mongo bash -lc '\ | ||
[[ -d /tmp/dump ]] && rm -rf /tmp/dump; \ | ||
cd /tmp && mongodump --quiet;' | ||
|
||
docker cp "$(get-container-name mongo)":/tmp/dump \ | ||
"$mongo_tmp_dir/dump" | ||
|
||
if [[ ! -d "$mongo_tmp_dir/dump" ]]; then | ||
echo "Error: did not get mongo backup" >&2 | ||
exit 1 | ||
fi | ||
|
||
# shellcheck disable=SC1004 | ||
"$TOOLKIT_ROOT/bin/docker-compose" exec mongo bash -lc '\ | ||
rm -rf /tmp/dump;' | ||
} | ||
|
||
function copy-data-files () { | ||
Review comment: Are data files really needed for the backup? Most of them would be intermediate compile results, right?
Reply: I assumed that there would be e.g. filestore data here?
||
local tmp_dir="$1" | ||
local sharelatex_tmp_dir="$tmp_dir/backup/data/sharelatex" | ||
mkdir -p "$sharelatex_tmp_dir" | ||
|
||
rsync -a "$TOOLKIT_ROOT/data/sharelatex/" "$sharelatex_tmp_dir" | ||
} | ||
|
||
function backup-tar () { | ||
local tmp_dir="$1" | ||
local backup_name | ||
backup_name="$(basename "$tmp_dir")" | ||
local tar_file="$TOOLKIT_ROOT/backup/${backup_name}.tar.gz" | ||
echo "Writing backup to backup/$(basename "$tar_file")" | ||
pushd "$tmp_dir" 1>/dev/null | ||
tar zcvf "$tar_file" backup info.txt > /dev/null | ||
popd 1>/dev/null | ||
} | ||
|
||
function write-info-file () { | ||
local tmp_dir="$1" | ||
cat <<EOF > "$tmp_dir/info.txt" | ||
Backup info: | ||
- time: $(date '+%F-%H%M%S') | ||
- user: $(whoami) | ||
- server pro: $IS_SERVER_PRO | ||
EOF | ||
|
||
} | ||
|
||
function _main() { | ||
## Help, and such | ||
if [[ "${1:-null}" == '--help' ]] || [[ "${1:-null}" == "help" ]]; then | ||
usage | ||
exit 0 | ||
fi | ||
|
||
## Get a temp directory | ||
local tmp_dir | ||
tmp_dir="$(create-tmp-dir)" | ||
echo "Using temp directory: $tmp_dir" | ||
|
||
## Stop docker services | ||
echo "Stopping docker-compose services..." | ||
"$TOOLKIT_ROOT/bin/docker-compose" stop 2>/dev/null | ||
|
||
## Dump mongo | ||
echo "Dumping mongo..." | ||
Review comment: Do we also need to back up Redis?
Reply: I think we have gone back and forth on this. The docs used to say it wasn't needed but were changed earlier this year: https://github.com/overleaf/overleaf/wiki/Backup-of-Data/_compare/dd55f820bf868464db70cbf8f584a608b6425c1b...f07e059281545c18907f9971e0fd18d9492b0127
I believe that if the editor is closed, users are disconnected, and then everything is shut down cleanly (in that order), nothing left in Redis is absolutely essential. But since we don't currently have a way to close the editor and disconnect users from a script, it seems safer to back up both, given the various problems that admins have run into on support. (I think scripting the close-editor and disconnect-users actions was being discussed elsewhere. If we add that to the backup script, then maybe the Redis backup is indeed not needed.)
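If Redis is backed up as well, a step modelled on `copy-data-files` could look roughly like the sketch below. It assumes the Redis persistence files sit on the host under `data/redis` inside the toolkit root, and that the services are already stopped so the files are not changing; both points are assumptions, and `copy_redis_files` is a hypothetical name. The sketch uses `cp -a` rather than `rsync` only to stay dependency-free.

```shell
# TOOLKIT_ROOT mirrors the variable in bin/backup; the default below exists
# only so that the sketch can run on its own.
TOOLKIT_ROOT="${TOOLKIT_ROOT:-$PWD}"

# Hypothetical step: copy the host-side redis data directory into the
# backup tree, next to the sharelatex data.
copy_redis_files() {
  local tmp_dir="$1"
  local redis_src="$TOOLKIT_ROOT/data/redis"   # assumed host path
  local redis_tmp_dir="$tmp_dir/backup/data/redis"
  if [ ! -d "$redis_src" ]; then
    # Nothing to copy, e.g. when an external redis is in use.
    echo "Warning: no redis data directory at '$redis_src', skipping" >&2
    return 0
  fi
  mkdir -p "$redis_tmp_dir"
  # "src/." copies the directory contents, including dotfiles.
  cp -a "$redis_src/." "$redis_tmp_dir/"
}
```

With this layout the archive structure stays the same, with a `data/redis` directory appearing under `backup/` alongside `data/sharelatex`.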
||
dump-mongo "$tmp_dir" | ||
|
||
## Copy data files | ||
echo "Copying data/ files..." | ||
copy-data-files "$tmp_dir" | ||
|
||
## Add info file | ||
echo "Writing info file..." | ||
write-info-file "$tmp_dir" | ||
|
||
## Prepare backup directory | ||
[[ ! -d "$TOOLKIT_ROOT/backup" ]] && mkdir "$TOOLKIT_ROOT/backup" | ||
|
||
## Archive structure: | ||
## - backup/ | ||
## - mongo/ | ||
## - data/ | ||
## - ... | ||
## - info.txt | ||
|
||
## Create backup archive | ||
echo "Creating tar.gz archive..." | ||
backup-tar "$tmp_dir" | ||
|
||
## Clean up temp dir | ||
echo "Removing temp files..." | ||
rm -rf "$tmp_dir" | ||
|
||
## Stop docker services | ||
echo "Stopping docker-compose services..." | ||
"$TOOLKIT_ROOT/bin/docker-compose" stop 2>/dev/null | ||
|
||
echo "Done" | ||
exit 0 | ||
} | ||
|
||
_main "$@" |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -118,6 +118,7 @@ function check_dependencies() { | |
perl | ||
awk | ||
openssl | ||
rsync | ||
) | ||
|
||
for binary in "${binaries[@]}"; do | ||
|
Review comment: Could mktemp be used here?
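As a rough illustration of that suggestion, the temp directory step could be built on `mktemp -d`, which creates the directory atomically and removes the need for the hand-rolled random suffix and the "already exists" check. This is a sketch, not the PR's code: `create_tmp_dir` is a hypothetical stand-in for `create-tmp-dir`, and the default value of `TMP_ROOT_DIR` is there only so that the snippet runs by itself.

```shell
# TMP_ROOT_DIR mirrors the variable in bin/backup; the fallback is an
# assumption made so the sketch is self-contained.
TMP_ROOT_DIR="${TMP_ROOT_DIR:-${TMPDIR:-/tmp}/toolkit-tmp}"

# Hypothetical mktemp-based variant of create-tmp-dir.
create_tmp_dir() {
  local now
  now="$(date '+%F-%H%M%S')"
  mkdir -p "$TMP_ROOT_DIR"
  local tmp_dir
  # The trailing XXXXXX is replaced with random characters. mktemp exits
  # non-zero if it cannot create a unique directory.
  tmp_dir="$(mktemp -d "$TMP_ROOT_DIR/backup-$now-XXXXXX")" || return 1
  mkdir "$tmp_dir/backup"
  echo "$tmp_dir"
}
```

The resulting path keeps the `backup-<timestamp>-<random>` shape, so `backup-tar` could still derive the archive name from `basename "$tmp_dir"`.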