Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proof of concept for pull mechanism #6270

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

qtc-de
Copy link

@qtc-de qtc-de commented Feb 10, 2022

Hi there 👋

first of all: This is not a real pull request. It is basically a proof of concept for an operation that I would love to see being implemented, but that is far to complex to implement it by myself. The operation itself is quite simple (as demonstrated by this PR), however, it probably requires changes in many other locations of this project.

So what is this PR about? It implements a simple proof of concept for a Pull Operation. Currently, the recommended way to perform a backup in pull mode is using sshfs, which has several drawbacks as outlined in the documentation. After looking into how push backups are implemented, I did not understand why pull backups are not implemented in the same way. Since the remote backup mechanism simply utilizes stdin and stdout of an ssh connection, it is basically invertible and it is possible to create a pull mechanism in the same way.

When applying the patch contained within this PR, the borg create command allows a new protocol type to use serve://<repo>. When using this protocol type, borg behaves basically the same as during a push operation, but outputs all RPC commands to stdout and obtains input from stdin. Therefore, borg create with the serve:// protocol is expected to run on the server side where the update is pulled from.

On the other side, on the local backup server, the borg serve operation got a new option --pull-command. This option expects the command to use for a connection to the serving borg server in pull mode.

And that's basically everything. So let's look at a short demonstration: On the server side we write the following in our authorized-keys file:

command="BORG_PASSPHRASE=example123 borg create serve:///opt/backup::'{now}' /etc /opt",restrict ssh-ed25519 AAAA...

On the client side we create a new repo and attempt a backup in pull mode:

[user@host opt]$ borg init --encryption repokey backup
Enter new passphrase: example123
[user@host opt]$ borg list /opt/backup/
Enter passphrase for key /opt/backup: example123
[user@host opt]$ borg serve --pull "ssh -i ./id_rsa root@172.17.0.2"
[user@host opt]$ borg list /opt/backup/
Enter passphrase for key /opt/backup: example123
2022-02-10T22:07:54                  Thu, 2022-02-10 23:07:54 [1713f6df6648055c3de2574e014ec45c91c8bf464cca4b27e72ef095d6820a94]

This example demonstrates that everything works like expected. That being said, this was obviously an easy example and there are many edge cases. One example is e.g. when unencrypted repositories are used. In this case borg create asks for user confirmation, which obviously breaks the pull backup. Furthermore, the whole approach is not very flexible. It would be nice if clients could specify additional parameters that are used during the backup on the server side.

So like I said, there is a lot of work to be done. However, this PR demonstrates that creating updates in pull mode is basically possible by recycling already existing mechanisms. Feel free to close this PR and to use the code snippets from it during your development process. I thought is would be easier to monitor my changes than creating an issue, but I'm aware that this PR should not be merged in it's current state.

Added a small proof of concept for a pull mechanism implementation.
Works fine for simple test cases, but probably needs many improvements
to be useful for more complex usage scenarios.
Copy link
Member

@ThomasWaldmann ThomasWaldmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting idea / PR, thanks for this!

as you said it likely needs more work (code as well as docs).

i can help with this, but considering i maybe won't use this and also that noone else made such a PR (IIRC), i guess you would be needed to do the main work.

whether this can be merged depends a bit on how big it gets in the end (I do not want to make the client/server stuff even more complex than it already is, esp. considering that debugging this is hard and a lot of work).

is this "PoC" practically working?

@@ -4630,6 +4632,9 @@ def define_borg_mount(parser):
'When a new repository is initialized, sets the storage quota on the new '
'repository as well. Default: no quota.')

subparser.add_argument('--pull-command', metavar='cmd', dest='pull_command',
help='command to use for pulling from a borg server started in serve:// mode')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't that confusing?

the source of the data is always where the borg client runs and the destination of the data is where borg serve(r) runs (which feeds the data into the repo).

so the data is either pushed from client to server (what we ever did) or pulled from the client to the server.

but we never are "pulling from a borg server" (I guess you meant that the pulling action is executed on the server, but that is not really clear).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same applies for the PR text (which in the end needs to be refactored and moved into the documentation).

isn't is simpler/less confusing if we just define borg server == where "borg serve" runs (and usually where repo is located) and borg client == where the to-be-backed-up files are located == where the stuff is encrypted/compressed/etc.?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the PR top-post text is confusing the hell out of me. is it just me? please re-read / re-write.

os.set_blocking(stdout_fd, True)
os.set_blocking(stderr_fd, True)

if self.pull_command:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please reverse this, so that what we had in the past (push) is in the if-block and the new stuff is generally in the else-block?

os.set_blocking(self.stdin_fd, False)
os.set_blocking(self.stdout_fd, False)
os.set_blocking(self.stderr_fd, False)
if serve:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also reverse blocks please.

@codecov-commenter
Copy link

codecov-commenter commented Feb 11, 2022

Codecov Report

Merging #6270 (577109c) into master (94c3d7e) will decrease coverage by 0.09%.
The diff coverage is 51.85%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6270      +/-   ##
==========================================
- Coverage   83.26%   83.17%   -0.10%     
==========================================
  Files          38       38              
  Lines       10392    10406      +14     
  Branches     2041     2043       +2     
==========================================
+ Hits         8653     8655       +2     
- Misses       1230     1242      +12     
  Partials      509      509              
Impacted Files Coverage Δ
src/borg/helpers/parseformat.py 89.17% <28.57%> (-0.53%) ⬇️
src/borg/remote.py 78.79% <55.55%> (-0.76%) ⬇️
src/borg/archiver.py 80.39% <100.00%> (+<0.01%) ⬆️
src/borg/platform/base.py 81.19% <0.00%> (-1.71%) ⬇️
src/borg/repository.py 84.07% <0.00%> (-0.19%) ⬇️
src/borg/archive.py 82.19% <0.00%> (+0.12%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 94c3d7e...577109c. Read the comment docs.

@ThomasWaldmann
Copy link
Member

BTW, considering there is quite some interest in a safe and good working "pull mode" and that it is quite some work making this work, we could have a bounty ($$$) for this.

Maybe there could be a part for related research / experiments independent of whether the stuff gets merged in the end.
Another part could be for getting it production ready (after we see that it can get merged).

@ThomasWaldmann
Copy link
Member

@qtc-de did you see my feedback?

@qtc-de
Copy link
Author

qtc-de commented May 24, 2022

Yes, I saw it. Unfortunately I currently don't have the time to implement this. Maybe at the end of the year, but not sure. However, I'm also fine if someone else picks up the idea and starts working on it 🙂

@PhrozenByte
Copy link
Contributor

PhrozenByte commented Feb 10, 2023

I'd ❤️ to see this implemented 👍

Just to get this straight, the only reason why this is just a PoC (i.e. non-mergeable) is that it lacks support for more than plain borg create, right? I was thinking about this idea, and isn't this PoC basically the same as #6183? The only difference is that instead of creating a socket, this PoC lets borg serve directly invoke borg create. But, do we really need borg serve to invoke borg create?

Why don't we simply create a "universal communication interface" - i.e. a socket - and leave things up to the user to use this socket? One can then utilize SSH's port forwarding to get the wished pull mechanism. Since borg serve blocks, we just need two terminals:

[user@host opt]$ borg init --encryption repokey backup
Enter new passphrase: example123
[user@host opt]$ borg list /opt/backup/
Enter passphrase for key /opt/backup: example123
[user@host opt]$ borg serve --socket /opt/backup/borg.sock
… … … blocking until terminated … … …
[user@host opt]$ ssh -i ./id_rsa -R /run/borg.sock:/opt/backup/borg.sock root@172.17.0.2 borg create socket:///run/borg.sock::'{now}' /etc /opt
Enter passphrase for key /run/borg.sock: example123
[user@host opt]$ borg list /opt/backup/
Enter passphrase for key /opt/backup: example123
2022-02-10T22:07:54                  Thu, 2022-02-10 23:07:54 [1713f6df6648055c3de2574e014ec45c91c8bf464cca4b27e72ef095d6820a94]

Note: If one adds borg create to the user's authorized keys file and therefore restricts the command to execute (as explained above), it would still work, the ssh command would just look like this (it's just slightly harder to understand what is going on):

[user@host opt]$ ssh -i ./id_rsa -R /run/borg.sock:/opt/backup/borg.sock root@172.17.0.2
Enter passphrase for key /run/borg.sock: example123

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants