Show backup size with excludes applied #961

Merged
merged 6 commits into from
Oct 18, 2021

Conversation

freder
Contributor

@freder freder commented Apr 25, 2021

since I want this feature and I believe it has already been requested elsewhere, I took a first stab at it, as a proof-of-concept.

[image: vorta]

I'm currently not sure if the globmatch package handles patterns the same way borg does.

@m3nu
Contributor

m3nu commented Apr 25, 2021

Nice take on a hard problem. 👍

If this works, just put the new, more accurate size. No need to add a new column (except for debugging).

I just don't think we can add this new globmatch dependency. Only 6 stars, 3 files with a few lines each. Would be too risky and hard to package.

And it’s not really needed: we mainly need shell patterns and maybe fnmatch. See here for the full list: https://borgbackup.readthedocs.io/en/stable/usage/help.html#borg-help-patterns

Shell patterns are translated into Python regex by Borg. We can borrow from there: https://github.com/borgbackup/borg/blob/c88a37eea430d7ec2e5da1ae503e43519ee90cb1/src/borg/shellpattern.py
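
To make the idea concrete, here is a minimal sketch of such a translation. It is not Borg's `shellpattern.py` and not the code in this PR; it is a simplified, hypothetical version that only handles `**/`, `*` and `?` and treats everything else literally (no character classes, no escaping rules). The helper `is_excluded` is also made up for illustration.

```python
import re


def pattern_to_regex(pattern: str) -> str:
    """Translate a simplified shell-style pattern into a regex string.

    Assumption: only `**/`, `*` and `?` are special; all other
    characters are matched literally.
    """
    parts = []
    i, n = 0, len(pattern)
    while i < n:
        if pattern.startswith("**/", i):
            # zero or more complete directory levels
            parts.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern[i] == "*":
            # any run of characters, but never across a path separator
            parts.append(r"[^/]*")
            i += 1
        elif pattern[i] == "?":
            # exactly one character that is not a path separator
            parts.append(r"[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    # anchor at the end so the whole path has to match
    return "".join(parts) + r"\Z"


def is_excluded(path: str, patterns: list) -> bool:
    """Return True if any pattern matches the full path (hypothetical helper)."""
    return any(re.match(pattern_to_regex(p), path) for p in patterns)
```

With this sketch, `is_excluded("a/b/c.pyc", ["**/*.pyc"])` is true while `is_excluded("a/b/c.py", ["**/*.pyc"])` is not. Whether the results agree with Borg in every edge case would still need to be checked against Borg itself.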

@m3nu m3nu marked this pull request as draft April 25, 2021 15:10
@m3nu
Contributor

m3nu commented Apr 25, 2021

Their code is already very similar to Borg's. Both say they copied it, to a degree, from Python’s fnmatch.

https://github.com/vidartf/globmatch/blob/master/globmatch/translation.py#L91

@freder
Contributor Author

freder commented Apr 30, 2021

(how) would it work if we simply copied https://github.com/borgbackup/borg/blob/master/src/borg/shellpattern.py into this repo? would we only have to add their license to the one respective file and that's it?

or what if we imported the file directly from the local borg installation? (a little hacky, I suppose)

@m3nu
Contributor

m3nu commented May 1, 2021

We can't import it because Borg is only called via its CLI. It may be installed in a different Python env.

Copying the file will work, since they copied it from fnmatch anyways. Or adjust and integrate the relevant function only.

@jose1711

Why not call borg with dry_run and parse the output? I am not a developer, so please don't judge me; this is for demonstration purposes only.

Example: show all excluded files in /etc and /tmp matching the **/r* exclude pattern:

from tempfile import NamedTemporaryFile
import argparse
import io

from borg import archiver as borgArchiver
from borg.helpers import Location
from borg.logger import setup_logging
from borg.patterns import CmdTuple, FnmatchPattern, IECommand

# directories to scan and the exclude pattern to test
paths = ['/etc', '/tmp']
patterns = [CmdTuple(val=FnmatchPattern('**/r*'), cmd=IECommand.Exclude)]

# capture borg's log output in memory so it can be parsed afterwards
logfile = io.StringIO()
setup_logging(logfile)

# throwaway path for a temporary repository
tmp_file = NamedTemporaryFile().name

args = argparse.Namespace()

archiver = borgArchiver.Archiver()
archiver.log_json = False
location = Location(tmp_file)
args.location = location
args.encryption = 'none'
archiver.do_init(args)

args.dry_run = True
args.output_list = True
args.paths = paths
args.patterns = patterns

args.output_filter = None
args.exclude_nodump = None
args.one_file_system = None
args.exclude_caches = None
args.exclude_if_present = None
args.read_special = None
args.keep_exclude_tags = None

archiver.do_create(args)

logfile.seek(0)
print('Files matched by the exclude pattern:')
print([x.strip() for x in logfile.readlines() if x.startswith('x ')])

The list of included files could be parsed from the same log file, plus a stat call per file to get the sizes.

@m3nu
Contributor

m3nu commented May 14, 2021

Vorta doesn't interface with Borg via Python, but via the CLI. So we can't use private Borg functions and classes. This is because there are different ways to install Borg and it may not be in the same Python environment.

@jose1711

> Vorta doesn't interface with Borg via Python, but via the CLI. So we can't use private Borg functions and classes. This is because there are different ways to install Borg and it may not be in the same Python environment.

Understood, but in that case the same can be achieved via Popen, right?

@m3nu
Contributor

m3nu commented May 14, 2021

If borg create --dry-run gives the required data, then yes.
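
As a rough sketch of what calling the CLI could look like from Python: the function names `build_dry_run_cmd` and `run_dry_run`, the `borg_bin` parameter and the archive name `DRYRUN` are placeholders for illustration, not Vorta code. Only the `create`, `--list`, `--dry-run` and `--exclude` options of the borg CLI are assumed.

```python
import subprocess


def build_dry_run_cmd(repo, source_dirs, excludes, borg_bin="borg"):
    """Assemble the argument list for a `borg create --dry-run --list` call."""
    cmd = [borg_bin, "create", "--list", "--dry-run"]
    for pattern in excludes:
        cmd += ["--exclude", pattern]
    # the archive name is arbitrary, nothing is written during a dry run
    cmd.append(f"{repo}::DRYRUN")
    cmd += list(source_dirs)
    return cmd


def run_dry_run(cmd):
    """Run the command and return its combined output as text."""
    # stderr is merged into stdout so the file listing is captured
    # no matter which stream borg writes it to
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=True,
    )
    return proc.stdout
```

Passing the arguments as a list avoids shell quoting problems with exclude patterns that contain `*` or spaces.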

@jose1711

> If borg create --dry-run gives the required data, then yes.

It is actually recommended in borg-patterns man page:
[image]

@freder
Contributor Author

freder commented Aug 21, 2021

borg create \
	--list \
	--dry-run \
	--exclude='...' \
	--exclude='...' \
	/path/to/archive::DRYRUN \
	/path/to/path \
	2>&1 \
		| perl -lne 'if (/^- (.*)/) { print $1 }' \
		| while IFS= read -r p ; do \
			[ -f "$p" ] && echo "$p" ; \
		done

prints the paths of all files that are going to be included in the backup. unfortunately --stats does not work with --dry-run.
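
Since `--stats` is not available here, the sizes would have to be summed separately. A minimal sketch of that step in Python, assuming the listing marks included entries with a leading `- ` as in the pipeline above; `included_paths` and `total_size` are hypothetical names, not the functions added in this PR:

```python
import os


def included_paths(listing):
    """Return the paths of all lines marked with '- ' in a dry-run listing."""
    return [line[2:] for line in listing.splitlines() if line.startswith("- ")]


def total_size(paths):
    """Sum the on-disk size in bytes of all regular files among the paths."""
    # directories and vanished files are skipped, like `[ -f "$p" ]` does
    return sum(os.path.getsize(p) for p in paths if os.path.isfile(p))
```

Note that this is the size of the source files on disk, which is not the same as the space the archive takes up in the repository after deduplication and compression.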

@freder
Contributor Author

freder commented Aug 21, 2021

@freder
Contributor Author

freder commented Sep 4, 2021

what do you guys think?

@m3nu
Contributor

m3nu commented Sep 4, 2021

Could be a good addition. Is it ready to merge?

@freder freder force-pushed the size-after-excludes branch from 67bc6e2 to fca5fc5 Compare October 9, 2021 09:25
@freder freder marked this pull request as ready for review October 9, 2021 09:25
@freder
Contributor Author

freder commented Oct 9, 2021

I rebased and cleaned up. should be good to go now.

@m3nu
Contributor

m3nu commented Oct 13, 2021

Good stuff. And timely to add while v0.8 is under heavy development.

2 thoughts:

  • Is translate the best function name, since we already use it to mean translation between languages?
  • There is a good bit of sizing-related code by now. Does it make sense to move it to a new module/file?

@freder
Contributor Author

freder commented Oct 16, 2021

> 2 thoughts:
> Is translate the best function name, since we already use it to mean translations of language.

translate is very generic, indeed. I renamed it to pattern_to_regex.

> There is a good bit of sizing-related code by now. Does it make sense to move it to a new module/file?

it's probably a good idea to refactor in a separate PR, or not?

@m3nu
Contributor

m3nu commented Oct 16, 2021

> it's probably a good idea to refactor in a separate PR, or not?

True. Thanks for the other change. Will locally test and merge it tomorrow morning.

@m3nu m3nu merged commit 9bad152 into borgbase:master Oct 18, 2021
@m3nu
Contributor

m3nu commented Oct 18, 2021

Thanks for the valuable addition! Now merged.

@freder
Contributor Author

freder commented Oct 18, 2021

👍

3 participants