After cloning repo, next Create/sync takes ages #8348

Open
cyberroneous opened this issue Aug 31, 2024 · 6 comments

@cyberroneous

For the purposes of this query, assume I do not care about security implications such as AES counter re-use; only performance matters. I am trying to find the simplest way to clone/copy a repo without a massive delay in the next Create immediately afterward. I have read the docs about how to do it, and the associated warnings.

I have a borg repo at: /mnt/harddrive1/borgrepo with id 12345

I clone it using rsync to: /mnt/harddrive2/borgrepo

I edit /mnt/harddrive2/borgrepo/config and change id 12345 to 12346

I use rsync to copy ~/.config/borg/security/12345 to ~/.config/borg/security/12346

I edit ~/.config/borg/security/12346/location to be /mnt/harddrive2/borgrepo
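The four manual steps above can be sketched as a small script. This is only an illustration of the procedure as described in this post, not a supported borg workflow. It assumes that the repo `config` file is INI-style with an `id` key in a `[repository]` section and that the security directory holds a plain-text `location` file. The function name `clone_repo` and its parameters are hypothetical, and `shutil.copytree` stands in for rsync.

```python
import configparser
import shutil
from pathlib import Path


def clone_repo(src_repo, dst_repo, security_root, old_id, new_id):
    """Copy a repo directory and its security entry under a new ID (sketch)."""
    src_repo, dst_repo = Path(src_repo), Path(dst_repo)
    security_root = Path(security_root)

    # Step 1: copy the repository directory (the post uses rsync for this).
    shutil.copytree(src_repo, dst_repo)

    # Step 2: change the ID in the copied repo's config file.
    # Assumption: INI-style file with an "id" key in a [repository] section.
    config_path = dst_repo / "config"
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)
    if parser["repository"]["id"] != old_id:
        raise ValueError("unexpected repository id in copied config")
    parser["repository"]["id"] = new_id
    with open(config_path, "w") as fh:
        parser.write(fh)

    # Step 3: copy the per-repo security directory to the new ID.
    shutil.copytree(security_root / old_id, security_root / new_id)

    # Step 4: point the copied "location" file at the new repo path.
    (security_root / new_id / "location").write_text(str(dst_repo))
```

With the paths from this post, a call would look like `clone_repo("/mnt/harddrive1/borgrepo", "/mnt/harddrive2/borgrepo", Path.home() / ".config/borg/security", "12345", "12346")`.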

As far as I can tell, these two repos should now be completely independent of each other, but contain the same data. However, when I run a Create command using the exact same source as previously used (no changed files or folders, about 100GB) on the original repo at /mnt/harddrive1, it takes about five minutes as expected and as usual.

But when I run the same Create command with the same unchanged 100GB source with the new repo on harddrive2 as the target, it first does "syncing chunks cache" for about an hour, and then proceeds to take about five hours to perform the actual create/backup, even though none of the source files have changed and the new copied repo already has all of the same deduplicated data in it as the original repo. The next Create after that takes only 5 minutes as expected again.

So, what am I doing wrong? What steps do I need to take/add/change to simply copy a repo and have the next Create only take five minutes like the original repo does?

@cyberroneous cyberroneous changed the title After cloming repo, next Create/sync takes ages After cloning repo, next Create/sync takes ages Aug 31, 2024
@ThomasWaldmann
Member

Maybe the simplest solution would be to just run borg init for both hdds individually.

Then you can backup with borg to both disks without any problem.
As usual, the first backup might take a while, but after that it should be always fast.

@cyberroneous
Author

cyberroneous commented Aug 31, 2024

> Maybe the simplest solution would be to just run borg init for both hdds individually.
>
> Then you can backup with borg to both disks without any problem. As usual, the first backup might take a while, but after that it should be always fast.

Thanks for the suggestion. I had considered that, but the problem is that the repo on the first HDD has months and months of history/snapshots/versioning of the backed-up folders, which I want to replicate to the second HDD as well. Doing a fresh init would back up the current state of the folders fine, but I'd lose those months' worth of past historical snapshots in the new repo.

I do sometimes need to go back and restore a file or folder version from months back and many iterations ago, and I need the repo on the second HDD to be able to carry over that capability from the first HDD. Thus copying with rsync seems to be my only reliable option.

Given that situation, is there any known procedure for doing so without causing that chunks-syncing and slow-create issue on first run? I'd have thought manually editing the ID and Location would cause Borg to see it as a unique repo that's already up-to-date, but something I'm missing is triggering that slow resync just the same.

@ThomasWaldmann
Member

ThomasWaldmann commented Sep 1, 2024

OK, understood. That should be possible, but you will have to deal with a lot of borg internals.

Guess the main issue is that you forgot to treat .cache/borg/... in the same way as .config.

If borg does not have its files cache, it will read and chunk all files in the first backup run.

If it does not have its chunks cache, it will read all the archives from the repo to rebuild that in the first borg run.
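As a rough sketch, treating the cache directory "in the same way" could mean copying the per-repo cache entry to the new ID and updating any reference to the old ID in its config file. The function name `clone_cache_entry` is hypothetical, and the assumption that the cache entry contains a small text `config` file recording the repo ID is not confirmed in this thread. The thread also does not confirm that this alone avoids the resync.

```python
import shutil
from pathlib import Path


def clone_cache_entry(cache_root, old_id, new_id):
    """Copy <cache_root>/<old_id> to <cache_root>/<new_id> (sketch).

    Returns the new directory, or None if there is no cache entry to copy.
    """
    cache_root = Path(cache_root)
    src, dst = cache_root / old_id, cache_root / new_id
    if not src.is_dir():
        return None

    # Copy the whole cache entry, including the chunks and files caches.
    shutil.copytree(src, dst)

    # Assumption: the entry has a small text "config" file that records
    # the repo ID; rewrite any occurrence of the old ID in it.
    cfg = dst / "config"
    if cfg.is_file():
        cfg.write_text(cfg.read_text().replace(old_id, new_id))
    return dst
```

For the layout discussed here, the call would be `clone_cache_entry(Path.home() / ".cache/borg", "12345", "12346")`.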

@ThomasWaldmann
Member

Side note: for borg2, there will be a borg transfer command to transfer archives from one repo to a related one. However, borg2 is not ready for production yet.

@cyberroneous
Author

> Guess the main issue is that you forgot to treat .cache/borg/... in the same way as .config.
> If borg does not have its files cache, it will read and chunk all files in the first backup run.
> If it does not have its chunks cache, it will read all the archives from the repo to rebuild that in the first borg run.

Ahh... that would indeed explain it. Did not occur to me to copy/rename the repo ID folder in .cache as well. But I'll keep it in mind for next time in order to avoid the long re/sync process on first backup. At least until the borg2 feature mentioned becomes available. Thanks!

@cyberroneous
Author

cyberroneous commented Sep 2, 2024

Interestingly, I repeated the repository cloning process, this time also copying, renaming, and editing the ~/.cache/borg/12346 folder accordingly, and on the first backup to the new repo, borg still went through the "syncing chunks cache" process and then did a slow backup thereafter. Clearly there is some other hidden data in some folder or file somewhere that I am neglecting to edit, which is still causing borg to treat the cloned repo as new rather than already up to date.
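One way to hunt for such leftover references is to scan the copied directories for files that still contain the old repo ID or the old repo path. The helper below is a hypothetical diagnostic sketch, not a borg feature. It only finds literal byte occurrences in reasonably small files, so it would miss anything stored in an encoded or binary form, and it makes no claim about what actually triggers the resync.

```python
from pathlib import Path


def find_references(roots, needle, max_bytes=1_000_000):
    """Return files under the given roots whose raw bytes contain needle."""
    needle_bytes = needle.encode()
    hits = []
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            try:
                # Skip directories and large files such as repo segments.
                if not path.is_file() or path.stat().st_size > max_bytes:
                    continue
                data = path.read_bytes()
            except OSError:
                # Unreadable entries are ignored in this sketch.
                continue
            if needle_bytes in data:
                hits.append(path)
    return hits
```

For example, `find_references([Path.home() / ".cache/borg/12346", Path.home() / ".config/borg/security/12346"], "12345")` would list files in the new entries that still mention the old ID, and the same call with `"/mnt/harddrive1/borgrepo"` as the needle would look for the old location.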
