-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pristine-lfs support #79
base: master
Are you sure you want to change the base?
Conversation
@andrewshadura thanks! This looks like a useful addition. Thinks i'm unsure about:
Hmm...would it make sense to use
That does't look too bad from a quick glance. |
So,
Given the above I think the first try should be
I agree: |
Thanks ore spelling out the details. If we make |
1e9af1e
to
9f547ff
Compare
@agx, I finally found time for this pull request. Given how long it’s been around — and I’ve been using it all that time! — I think we should merge the minimal dumb version of it without extra smartness, as described in the last message: import to either pristine-tar or pristine-lfs only when told to, when both are enabled, import to both. Export checks both, prefers pristine-lfs. I haven’t added any configuration for that, but I can’t really think about many situations when this might be necessary — unless someone commits different tarballs to both pristine-lfs and pristine-tar. I have described the approach and the rationale in the commit message for 531971c. |
Add pristine-lfs support similar to the existing pristine-tar support. Since pristine-lfs has a Python interface, use that instead of the command-line interface, but use it in a fail-safe way. Unlike pristine-tar, pristine-lfs supports looking up tarballs by a version, so use that instead of picking tarballs manually. If both pristine-lfs and pristine-tar are enabled, import tarballs into both; when checking out, prefer pristine-lfs, but also do pristine-tar. The rationale for the above behaviour: pristine-lfs and pristine-tar are built on different principles. What is stored: * pristine-tar records the hash of the Git tree the tarball corresponds to, the hash of the tarball itself, and a binary delta between its Git export and the actual tarball * pristine-lfs records the hash of the tarball only; the actual tarball is stored and transferred out-of-band using Git LFS mechanisms. How tarballs are recreated: * pristine-tar only needs the Git repository itself: it uses the Git tree referenced in its metadata and the binary delta to recreate the tarball. It needs to explicitly support each compressor and its quirks to be able to efficiently store the tarball and precisely recreate it * pristine-lfs needs the Git repository itself, but it also needs to download binary blobs for each tarball separately. They’re stored inside .git but not as Git objects. When a tarball is checked out, it’s basically downloaded if it’s missing and then copied from the internal storage to the desired location. This means that for off-line operation all tarballs need to be cached beforehand. Conclusions: * for a lot of tarballs especially with tricky compressors, when you need just one, the bandwidth to recreate it will be less for pristine-lfs than pristine-tar since pristine-lfs branch only contains text metadata but no tarball data — with pristine-tar the branch will contain all binary deltas for all tarballs, whereas pristine-lfs only downloads tarballs on-demand. (On the other hand, the tarball data is already on the upstream code branch.) * when you need all tarball ever imported, pristine-tar will probably save you a lot of bandwidth * pristine-lfs doesn’t care about compressors or the tarball format * pristine-tar doesn’t require any server-side support, pristine-lfs needs a Git LFS server. Given the above, it makes sense that during a checkout, the first try should be pristine-lfs (if available) and then pristine-tar as a fallback. During import, --pristine-lfs --pristine-tar on import should do both, not just one. This approach may be revisited in future. Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
9f547ff
to
309165b
Compare
By the way, I’ve just noticed the RPM test data repo is gone — is that intentional? |
@agx, is there any chance to get this merged before the freeze? |
Seems github ate my reply: Can we have tests for the new options (enabled via |
This pull request adds pristine-lfs support in a similar way to the existing pristine-tar support. Since pristine-lfs has a Python interface, I use that instead of the command-line interface, but in a fail-safe way, providing a plug raising an exception when pristine-lfs cannot be imported.
Unlike pristine-tar, pristine-lfs supports looking up tarballs by a version, so this implementation uses that instead of picking tarballs manually.
If both pristine-lfs and pristine-tar are enabled, tarballs are only imported with pristine-lfs; when checking out, pristine-lfs is preferred, but pristine-tar processed anyway if pristine-lfs is missing the necessary tarball.
This PR is still work in progress, it’s missing documentation and tests. Should I maybe add a subcommand for pristine-lfs too?
I’m also unsure what exceptions to raise depending on the error situation.