WIP - support multi-image docker archives #975

vrothberg · 2020-06-29T13:25:40Z

Add a MultiImageArchive{Reader,Writer} to docker/archive to support
docker archives with more than one image.

To allow the new archive reader/writer to be used for copying images,
add an Image{Destination,Source} to copy.Options. When set, the
destination/source referenced will be ignored and the specified
Image{Destination,Source} will be used instead.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>

vrothberg · 2020-06-29T13:28:42Z

libpod PR to illustrate how I imagine it to be used -> containers/podman#6811

mtrmac

Just a very quick first pass for now.

copy/copy.go

docker/archive/multi-reader.go

docker/archive/multi-writer.go

docker/tarfile/dest.go

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>

vrothberg · 2020-07-02T14:07:46Z

@mtrmac, mind taking another look? I've been staring at it too long.

mtrmac

From a quick look, this still seems to lean rather towards inheritance, almost towards monkey-patching rather than restructuring; most importantly I think we should not need two more ImageReference implementations.

I was thinking:

For sources:
- Split the part of tarfile.Source that contains archive-wide state (tarPath, removeTarPathOnClose, maybe some of the layer info) into a docker/internal/tarfile.ReadableArchive or so (no need to make this public? I might have to think about the public/private structure more. The name can certainly be better.)
- Keep the existing public tarfile.Source working, on top of that, but also add a way to create it with a caller-supplied ReadableArchive (in which case tarfile.Source does not drive deleting temporary files).
- Add a *ReadableArchive to archiveReference; in archive.newImageSource, if ref.readableArchive is set, use it to create a tarfile.Source; if not, use the old code path.
- Add support for lookup by tag and ID/index/something to archiveReference; it automatically applies both to the single-image and multi-image case.
- Then make MultiImageSource, which creates a ReadableArchive and provides an API to enumerate / create references.

And similarly for destinations.
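
The source-side structure outlined above could look roughly like the following minimal sketch. Every name here (`readableArchive`, `manifestItem`, `archiveReference` fields, `references`) is a hypothetical stand-in chosen for illustration, not the actual containers/image API; the point is only that archive-wide state lives in one shared object and a single reference type covers lookup by tag and by index.

```go
package main

import (
	"errors"
	"fmt"
)

// manifestItem mirrors the one field of a manifest.json entry used here.
type manifestItem struct {
	RepoTags []string
}

// readableArchive holds archive-wide state shared by every image in one
// tarball. Hypothetical stand-in for the proposed ReadableArchive.
type readableArchive struct {
	path      string
	manifests []manifestItem // parsed once for the whole archive
	closed    bool
}

// archiveReference selects one image by tag or by index. When archive is
// nil, a real implementation would open the file itself (the old code path).
type archiveReference struct {
	path    string
	tag     string // "" means "not selected by tag"
	index   int    // -1 means "not selected by index"
	archive *readableArchive
}

// chooseManifest resolves the reference against the shared archive state.
func (r archiveReference) chooseManifest() (*manifestItem, error) {
	if r.archive == nil {
		return nil, errors.New("no shared archive for " + r.path)
	}
	if r.index >= 0 {
		if r.index >= len(r.archive.manifests) {
			return nil, fmt.Errorf("index %d out of range", r.index)
		}
		return &r.archive.manifests[r.index], nil
	}
	for i := range r.archive.manifests {
		for _, t := range r.archive.manifests[i].RepoTags {
			if t == r.tag {
				return &r.archive.manifests[i], nil
			}
		}
	}
	return nil, fmt.Errorf("tag %q not found in %s", r.tag, r.path)
}

// references enumerates one reference per image, all sharing the archive.
func (a *readableArchive) references() []archiveReference {
	refs := make([]archiveReference, 0, len(a.manifests))
	for i := range a.manifests {
		refs = append(refs, archiveReference{path: a.path, index: i, archive: a})
	}
	return refs
}

// Close releases archive-wide resources once, on behalf of all references.
func (a *readableArchive) Close() error {
	a.closed = true
	return nil
}

func main() {
	a := &readableArchive{path: "images.tar", manifests: []manifestItem{
		{RepoTags: []string{"example.com/a:latest"}},
		{RepoTags: nil}, // an untagged image is still reachable by index
	}}
	defer a.Close()
	for _, ref := range a.references() {
		m, err := ref.chooseManifest()
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(ref.index, m.RepoTags)
	}
}
```

Because the archive owns cleanup, individual references stay cheap value types, and the single-image case is simply a reference whose `archive` field is unset.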

mtrmac · 2020-07-02T16:22:23Z

docker/archive/dest.go

+
+// Reference returns an ImageReference embedding the MultiImageDestination.
+func (m *MultiImageDestination) Reference() types.ImageReference {
+	ref := &archiveReference{path: m.path}


This does not include the tag

vrothberg · 2020-07-02

Tags are set via AdditionalTags.

https://github.com/containers/libpod/pull/6811/files#diff-3303ff4a5ad91328550c7bee8df0be69R259

mtrmac · 2020-07-02

Yes, but not on the reference, so not in copy.Image error messages.

OTOH there is still the case of saving untagged images, so references don’t always have any extra data anyway.

I’ll read all of this more carefully a bit later.

vrothberg · 2020-07-02

Much appreciated, thanks!

I am on PTO tomorrow but will get back to this on Monday morning. Ideally, we need to get the feature in next week.

vrothberg · 2020-07-06

@mtrmac, ideas how to proceed?

vrothberg · 2020-07-16

@mtrmac @rhatdan ... can we get this moving or are we blocked on something? I am getting increasing pressure to get this done.

vrothberg · 2020-07-24

Can we please proceed?

mtrmac · 2020-07-25

I’m sorry for not getting back, I did promise I would.

Still, #975 (review) was, I think, fairly clear about the general direction. See #991 for an unfinished prototype of the read side. Yes, the PR is larger, but it already includes the ability to read any (even untagged) image in an archive using a textual reference, and a lot of the “new” code is actually only moved: tarfile.Source was just split into two, and the only non-trivial net new code at that layer is chooseManifest.

vrothberg · 2020-07-27

Thanks! I am still not sure how to proceed.

Now we have two PRs and I am worried we are running out of time; in fact, we might not get it into RHEL 8.3 any more.

If you agree, I want to focus on Podman-only needs first. We can still make follow-up cards for a more generally applicable solution, in case that will buy us some time. Certainly, the API shouldn't break.

docker/archive/src.go

mtrmac · 2020-07-02T16:27:17Z

docker/tarfile/dest.go

+	// Reset the repoTags to prevent them from leaking into a following
+	// image/manifest.
+	d.repoTags = []reference.NamedTagged{}
+	return nil


This is still stateful, then…

mtrmac

More detailed comments after a careful read, finally. Conceptually, I’d still prefer for the code paths to be shared as much as possible, instead of a layer on top that partially overrides behavior like the destination here.

mtrmac · 2020-07-27T21:33:48Z

docker/tarfile/src.go

-	return s.loadTarManifest()
+	if s.manifests != nil {
+		return s.manifests, nil
+	}


This is somewhat contrary to the documentation of the function.

mtrmac · 2020-07-27T21:36:33Z

docker/archive/src.go

+// MultiImageSourceItem is a reference to _one_ image in a multi-image archive.
+// Note that MultiImageSourceItem implements types.ImageReference.  It's a
+// long-lived object that can only be closed via it's parent MultiImageSource.
+type MultiImageSourceItem struct {


Conceptually, I’m concerned about having two ImageReference implementations with different behavior but the same string syntax. ref.Transport().ParseReference(ref.StringWithinTransport()) is documented to be “equivalent to” ref; that’s easiest to do when there is just one implementation, and not the case here (with docker-archive:$path-formatted references that successfully access images in multi-image archives, but fail when used from the CLI).
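
To make the round-trip property concrete, here is a minimal sketch with a single hypothetical `ref` type whose string form carries everything needed to pick an image. The `path:tag` and `path:@index` spellings, and the function names, are assumptions of this sketch rather than the transport's documented syntax; it also assumes the path contains no colon and that at most one selector is set.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ref is a hypothetical archive reference: a path plus at most one selector.
type ref struct {
	path  string
	tag   string // e.g. "example.com/a:latest"; empty if unused
	index int    // position in manifest.json; -1 if unused
}

// stringWithinTransport encodes every field that affects which image is read.
func (r ref) stringWithinTransport() string {
	switch {
	case r.tag != "":
		return r.path + ":" + r.tag
	case r.index >= 0:
		return r.path + ":@" + strconv.Itoa(r.index)
	default:
		return r.path
	}
}

// parseReference is the inverse of stringWithinTransport.
func parseReference(s string) (ref, error) {
	i := strings.Index(s, ":")
	if i < 0 {
		return ref{path: s, index: -1}, nil
	}
	path, rest := s[:i], s[i+1:]
	if rest == "" {
		return ref{}, fmt.Errorf("empty selector in %q", s)
	}
	if strings.HasPrefix(rest, "@") {
		n, err := strconv.Atoi(rest[1:])
		if err != nil || n < 0 {
			return ref{}, fmt.Errorf("invalid index %q", rest[1:])
		}
		return ref{path: path, index: n}, nil
	}
	return ref{path: path, tag: rest, index: -1}, nil
}

func main() {
	for _, r := range []ref{
		{path: "/tmp/a.tar", index: -1},
		{path: "/tmp/a.tar", tag: "example.com/a:latest", index: -1},
		{path: "/tmp/a.tar", index: 1},
	} {
		back, err := parseReference(r.stringWithinTransport())
		// The second column reports whether parsing the string yields
		// a reference equal to the original.
		fmt.Println(r.stringWithinTransport(), err == nil && back == r)
	}
}
```

With one implementation and a selector in the string, a reference printed in an error message can be pasted back into a command line and still name the same image.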

mtrmac · 2020-07-27T22:07:43Z

docker/archive/src.go

+}
+
+// Manifest returns the tarfile.ManifestItem.
+func (m *MultiImageSourceItem) Manifest() (*tarfile.ManifestItem, error) {


Nothing but RepoTags is really usable by callers, see e.g. the use of Config in containers/podman#6811; so I’m a bit reluctant to commit to this as an API. OTOH, we clearly can keep this stable.

mtrmac · 2020-07-27T22:10:39Z

docker/archive/dest.go

+// MultiImageDestinations allows for creating and writing to docker archives
+// that include more than one image.
+type MultiImageDestination struct {
+	*archiveImageDestination


Doesn’t this externally expose PutBlob and everything else?
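
The concern follows from Go's method promotion: embedding a pointer to a type adds that type's exported methods to the outer type's method set, even when the embedded type itself is unexported. A small sketch with stand-in types (all names hypothetical) shows the difference between embedding and holding the destination in a named field.

```go
package main

import "fmt"

// imageDestination stands in for the unexported archiveImageDestination.
type imageDestination struct{ blobs int }

// PutBlob is a low-level operation that users of the multi-image type
// should arguably not reach directly.
func (d *imageDestination) PutBlob(data []byte) { d.blobs++ }

// embedded promotes every exported method of *imageDestination into its own
// method set, so PutBlob becomes callable on it from other packages.
type embedded struct {
	*imageDestination
}

// wrapped keeps the destination in a named, unexported field; only methods
// declared on wrapped itself are visible to callers.
type wrapped struct {
	dest *imageDestination
}

// blobPutter is the capability being checked for.
type blobPutter interface{ PutBlob([]byte) }

// exposesPutBlob reports whether v has PutBlob in its method set.
func exposesPutBlob(v interface{}) bool {
	_, ok := v.(blobPutter)
	return ok
}

func main() {
	d := &imageDestination{}
	fmt.Println(exposesPutBlob(&embedded{d})) // true
	fmt.Println(exposesPutBlob(&wrapped{d}))  // false
}
```

A named field costs a few forwarding methods but keeps the public surface limited to what the multi-image type deliberately declares.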

mtrmac · 2020-07-27T22:14:38Z

docker/archive/dest.go

+
+// Close is a NOP.  Please use Finalize() for committing the archive and
+// closing the underlying resources.
+func (m *MultiImageDestination) Close() error {


Close() is conceptually a bit different from Finalize() (or ImageDestination.Commit); it allows cleaning up the temporary files even on error.

Sure, this is a possible way to structure the API, but it’s a bit inconvenient to use: A typical caller will use something like defer multiDest.Close() and might not even check for errors if there is already a failure with a more important root cause to preserve, whereas on success the caller really wants to check that multiDest.Finalize() did succeed. Having “commit” and “deallocate” be the same operation forces every such caller to have a committed flag that’s checked inside the defer, or to have a critical part of the process in a defer.
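
The caller-side pattern described here, with cleanup and commit kept as separate operations, can be sketched as follows. The `writer` type and `save` function are hypothetical and exist only to show the control flow.

```go
package main

import (
	"errors"
	"fmt"
)

// writer is a hypothetical multi-image archive writer.
type writer struct {
	committed bool
	cleaned   bool
}

// Finalize commits the archive; its error matters on the success path.
func (w *writer) Finalize() error {
	w.committed = true
	return nil
}

// Close only releases temporary resources. It is safe to call after
// Finalize and after a failure, so it fits a plain defer.
func (w *writer) Close() error {
	w.cleaned = true
	return nil
}

// save writes the images and commits only if every step succeeded.
func save(w *writer, images []string) error {
	defer w.Close() // cleanup on every path; its error is deliberately ignored

	for _, img := range images {
		if img == "" {
			return errors.New("empty image name") // the root cause is preserved
		}
	}
	return w.Finalize() // the commit result is returned and checked by the caller
}

func main() {
	ok, bad := &writer{}, &writer{}
	fmt.Println(save(ok, []string{"a", "b"}), ok.committed, ok.cleaned)
	fmt.Println(save(bad, []string{"a", ""}), bad.committed, bad.cleaned)
}
```

No committed flag is needed: the failing path never reaches `Finalize`, and both paths still run `Close`.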

mtrmac · 2020-07-27T22:33:43Z

docker/tarfile/dest.go

+	// Reset the repoTags to prevent them from leaking into a following
+	// image/manifest.
+	d.repoTags = []reference.NamedTagged{}
+	return nil


… it’s also not documented that the caller is supposed to use (only) AddRepoTags after each PutManifest AFAICS.
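
As an illustration of the call-order contract in question, the sketch below contrasts a destination that accumulates tags in a field with one that receives them together with the manifest. Both types and their method signatures are hypothetical simplifications, not the tarfile.Destination API.

```go
package main

import "fmt"

// entry is one manifest.json item in this sketch.
type entry struct {
	Config   string
	RepoTags []string
}

// statefulDest mimics the pattern under review: tags accumulate in a field,
// and PutManifest consumes and then resets them.
type statefulDest struct {
	repoTags []string
	items    []entry
}

func (d *statefulDest) AddRepoTags(tags ...string) {
	d.repoTags = append(d.repoTags, tags...)
}

func (d *statefulDest) PutManifest(config string) {
	d.items = append(d.items, entry{Config: config, RepoTags: d.repoTags})
	d.repoTags = nil // without this reset, tags would leak into the next image
}

// explicitDest takes the tags together with the manifest, so there is no
// ordering rule for callers to learn.
type explicitDest struct{ items []entry }

func (d *explicitDest) PutManifest(config string, tags []string) {
	d.items = append(d.items, entry{Config: config, RepoTags: tags})
}

func main() {
	s := &statefulDest{}
	s.AddRepoTags("a:latest")
	s.PutManifest("a.json")
	s.PutManifest("b.json") // untagged only because of the reset above
	fmt.Println(s.items)

	e := &explicitDest{}
	e.PutManifest("a.json", []string{"a:latest"})
	e.PutManifest("b.json", nil)
	fmt.Println(e.items)
}
```

Both variants produce the same entries here; the difference is that the stateful one depends on callers invoking `AddRepoTags` at the right moment, which is the part that would need documenting.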

vrothberg · 2020-08-12T09:40:20Z

Closing as @mtrmac took over. Thanks again!

vrothberg force-pushed the fix-610 branch from 8acbdbf to 7d0e22a on June 29, 2020 13:26

vrothberg changed the title ~~support multi-image docker archives~~ WIP - support multi-image docker archives Jun 29, 2020

vrothberg mentioned this pull request Jun 29, 2020

podman load/save: support multi-image docker archive containers/podman#6811

Merged

mtrmac reviewed Jun 29, 2020

vrothberg force-pushed the fix-610 branch from 7d0e22a to b1afb52 on July 2, 2020 14:03

WIP - multi-image docker archives

0e3c401

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>

vrothberg force-pushed the fix-610 branch from b1afb52 to 0e3c401 on July 2, 2020 14:06

mtrmac reviewed Jul 2, 2020

mtrmac reviewed Jul 27, 2020

vrothberg closed this Aug 12, 2020
