-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Catalog for V2 API Implementation #653
Conversation
f438ee7
to
0761c38
Compare
I'm not sure this approach will work. The catalog handler will block the connection and given a large number of repositories could cause an idle socket timeout. This could be worked around by streaming repository entires back as soon as they are encountered during This won't make the response quicker, but will keep the connection active. |
Yeah.. it was meant to be quick and dirty, but you're correct in that it won't work for large numbers of repositories with slow storage. What would be a reasonable amount of repositories that we would want to support? |
maxEntries = maximumReturnedEntries | ||
} | ||
|
||
repos, err := storage.GetRepositories(ch, ch.App.driver, maxEntries, lastEntry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have enough time today to give you full details, but the handlers should not be directly making calls to storage. We'll need a new API method on distribution.Namespace
to provide a catalog object. Something like the following:
type Namespace interface {
...
Catalog(ctx context.Context) CatalogService
}
type CatalogService interface {
// Repositories fills p with up to len(p) entries, beginning after the provided q parameter.
Repositories(p []string, q string) (n int, err error)
}
May want to generalize the interpretation of q
to allow more interesting queries against the catalog in the future. Note that this is reentrant, so if you have a max value of n
entries, p
is filled with those entries, similar to the way readers work.
@@ -361,6 +361,48 @@ var routeDescriptors = []RouteDescriptor{ | |||
}, | |||
}, | |||
{ | |||
Name: RouteNameCatalog, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than copying this from #583, just rebase this branch on the commits from that PR branch. When everything is good to go, we can merge it as a unit.
ea85373
to
0e47b03
Compare
@@ -466,6 +470,7 @@ func (app *App) authorized(w http.ResponseWriter, r *http.Request, context *Cont | |||
repo := getName(context) | |||
|
|||
if app.accessController == nil { | |||
ctxu.GetLogger(context).Debug("access controller not set") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't need to be logged on every request.
How do you plan to authenticate this? Unless I'm missing something, this will either be unauthenticated, in which case the Catalog will be missing private images. Or it will be authenticated, in which case it is unclear what scope would be passed to the authentication endpoint to retrieve a Bearer token. |
@mattmoor The catalog API is considered an administration function. This is protected by a JWT access record with type="registry", name="catalog" and access="*" for the standard distribution/auth package. This is covered in the app.go portion of this PR. Other implementations may choose to use a more or less granular access model, but this endpoint is not meant for end users. If you have suggestions here, they would be appreciated, but this is a little late in the design process. |
@stevvooe Thanks for the pointer, so IIUC the auth endpoint would get: If this is not meant for end-user consumption, then maybe it should go in a separate doc covering administrative endpoints? I am not concerned with the design, just making sure I understand what this is (and isn't). thanks again. |
// Repositories returns a lexigraphically sorted catalog given a base URL. The 'entries' slice will be filled up to the size | ||
// of the slice, starting at the value provided in 'last'. The number of entries will be returned along with io.EOF if there | ||
// are no more entries | ||
func Repositories(ctx context.Context, baseURL string, entries []string, last string, transport http.RoundTripper) (int, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This client API doesn't follow the conventions of the rest of the API. Right now, the client doesn't have the notion of a repository or namespace, so this is better than anything.
Let's break this up into a struct, which takes the transport and baseURL, and a function that follows the signature of the Namespace
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was nervous about this when I was writing it. The original way was more straightforward as it followed already used conventions, but was awkward because it seemed like overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stevvooe should I create a separate Namespace interface, or use distribution.Namespace? The problem with using distribution.Namespace is that I'll have to create additional functions for Scope() and Repository() which will be broken.
@pdevine This is almost ready to go. The only aspect I am weary about is the client API for this. |
// Namespace provides an interface for calling Repositories. It's intended to be a slimmed down | ||
// version of distribution.Namespace since that interface implements additional functions which are | ||
// currently not supported through namespaces. | ||
type Namespace interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just call this Catalog
. It will be confusing to mix this up with distribution.Namespace
. We'll fix up the naming later.
The way Repositories() was initially called was somewhat different than other parts of the client bindings because there was no way to instantiate a Namespace. This change implements a NewRegistry() function which changes it so that Repositories() can be called the way one would expect. It doesn't implement any of the other functions of Namespaces. Signed-off-by: Patrick Devine <patrick.devine@docker.com>
LGTM |
Catalog for V2 API Implementation
For the v2 catalog api, consider the following scenario: We need to get a list of image: tag. So we used the v2 catalog api to get the repository names and looped it to get the tags for each one to form a list of image: tags. Right now, we manually remove the repository for which zero tags are present. This leads to x number of extra http calls to be made for x single tag repositories. Is there any way such repositories can be removed from the v2 catalog. Please let me know, if there's any thing I have missed out. |
Also, we need the date time of creation or modification when the tags were created or modified. Right now, we are manually walking the storage folder and accessing the date. Is there any provision for the same. Its impossible to distinguish between newly created image:tags and the old ones. |
Apart from that, need to filter these repositories from pagination too. |
@monikakatiyar16 This issue was merged like a 1.5 years ago. What makes you think this is a reasonable venue to ask these kinds of questions? It sounds like you have questions on how to use the API. I first recommend that you read the specification and documentation. If you still have questions, then I'd recommend reaching out on IRC or the community slack. If you still can't answer your question, open an issue, describing what you tried and why it didn't work so we can better help to route your request. As a hint, I'd recommend taking a look at notifications. |
Agreed. This was not the right place to ask questions. I have read about notifications. The only problem is that the v2 catalog api (to list repositories) also lists the repositories for which all the tags have been deleted. I will create a new issue for the same with all the required details. Thanks. |
Mike and I have had a lengthy discussion before realizing that we were not interpreting these terms the same way [1]. This commit replaces some explicit object-type lists with "objects defined in the image specification", which is more concrete than using terms without local definitions. I've also added image-spec links where I do use object-type terms. And I've used the wordy but more explicit "groups of image indexes" instead of "repositories" in most cases. I've also explicitly included /v2/_catalog as out of scope. It's a higher-level endpoint than the image-spec objects. As Patrick [2] and Stephen [3] pointed out when the endpoint landed, it's for internal administration. Matt suggested keeping it out of the user-facing API on those grounds [4], and while that didn't happen in docker/distribution, I think we want to keep the endpoint out of our distribution spec in order to avoid burdening implementors (because as Patrick pointed out [2], it's an expensive endpoint unrelated to image push/pull. [1]: opencontainers#44 (comment) [2]: distribution/distribution#653 (comment) [3]: distribution/distribution#653 (comment) [4]: distribution/distribution#653 (comment)
Mike and I have had a lengthy discussion before realizing that we were not interpreting these terms the same way [1]. This commit replaces some explicit object-type lists with "objects defined in the image specification", which is more concrete than using terms without local definitions. I've also added image-spec links where I do use object-type terms. And I've used the wordy but more explicit "groups of image indexes" instead of "repositories" in most cases. I've also explicitly listed /v2/_catalog as out of scope. It's a higher-level endpoint than the image-spec objects. As Patrick [2] and Stephen [3] pointed out when the endpoint landed, it's for internal administration. Matt suggested keeping it out of the user-facing API on those grounds [4], and while that didn't happen in docker/distribution, I think we want to keep the endpoint out of our distribution spec in order to avoid burdening implementors (because as Patrick pointed out [2], it's an expensive endpoint unrelated to image push/pull). [1]: opencontainers#44 (comment) [2]: distribution/distribution#653 (comment) [3]: distribution/distribution#653 (comment) [4]: distribution/distribution#653 (comment)
Mike and I have had a lengthy discussion before realizing that we were not interpreting these terms the same way [1]. This commit replaces some explicit object-type lists with "objects defined in the image specification", which is more concrete than using terms without local definitions. I've also added image-spec links where I do use object-type terms. And I've used the wordy but more explicit "groups of image indexes" instead of "repositories" in most cases. I've also explicitly listed /v2/_catalog as out of scope. It's a higher-level endpoint than the image-spec objects. As Patrick [2] and Stephen [3] pointed out when the endpoint landed, it's for internal administration. Matt suggested keeping it out of the user-facing API on those grounds [4], and while that didn't happen in docker/distribution, I think we want to keep the endpoint out of our distribution spec in order to avoid burdening implementors (because as Patrick pointed out [2], it's an expensive endpoint unrelated to image push/pull). [1]: opencontainers#44 (comment) [2]: distribution/distribution#653 (comment) [3]: distribution/distribution#653 (comment) [4]: distribution/distribution#653 (comment)
Mike and I have had a lengthy discussion before realizing that we were not interpreting these terms the same way [1]. This commit replaces some explicit object-type lists with "objects defined in the image specification", which is more concrete than using terms without local definitions. I've also added image-spec links where I do use object-type terms. And I've used the wordy but more explicit "groups of image indexes" instead of "repositories" in most cases. I've also explicitly listed /v2/_catalog as out of scope. It's a higher-level endpoint than the image-spec objects. As Patrick [2] and Stephen [3] pointed out when the endpoint landed, it's for internal administration. Matt suggested keeping it out of the user-facing API on those grounds [4], and while that didn't happen in docker/distribution, I think we want to keep the endpoint out of our distribution spec in order to avoid burdening implementors (because as Patrick pointed out [2], it's an expensive endpoint unrelated to image push/pull). [1]: opencontainers#44 (comment) [2]: distribution/distribution#653 (comment) [3]: distribution/distribution#653 (comment) [4]: distribution/distribution#653 (comment)
Mike and I have had a lengthy discussion before realizing that we were not interpreting these terms the same way [1]. This commit replaces some explicit object-type lists with "objects defined in the image specification", which is more concrete than using terms without local definitions. I've also added image-spec links where I do use object-type terms. And I've used the wordy but more explicit "groups of image indexes" instead of "repositories" in most cases. I've also explicitly listed /v2/_catalog as out of scope. It's a higher-level endpoint than the image-spec objects. As Patrick [2] and Stephen [3] pointed out when the endpoint landed, it's for internal administration. Matt suggested keeping it out of the user-facing API on those grounds [4], and while that didn't happen in docker/distribution, I think we want to keep the endpoint out of our distribution spec in order to avoid burdening implementors (because as Patrick pointed out [2], it's an expensive endpoint unrelated to image push/pull). [1]: opencontainers#44 (comment) [2]: distribution/distribution#653 (comment) [3]: distribution/distribution#653 (comment) [4]: distribution/distribution#653 (comment)
Catalog for V2 API Implementation
Catalog for V2 API Implementation
This change adds a basic catalog endpoint to the API, which returns a list,
or partial list, of all of the repositories contained in the registry. Calls
to this endpoint are somewhat expensive, as every call requires walking a
large part of the registry.
Instead, to maintain a list of repositories, you would first call the catalog
endpoint to get an initial list, and then use the events API to maintain
any future repositories.
ALSO.. it could probably stand to use more tests, but I wanted to make certain I was
on the right track.
Signed-off-by: Patrick Devine patrick.devine@docker.com