Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Registry catalog #87

Merged
merged 1 commit into from
Jun 26, 2018

Conversation

miminar
Copy link

@miminar miminar commented May 22, 2018

Implements the upstream repository catalog API call:

GET /v2/_catalog

Returned is a list of repository names as json:

200 OK
Content-Type: application/json
{ "repositories": [ <name>, ... ] }

Pagination is supported by mapping the last returned repository name to an opaque continue token received from master API and using it on the next invocation.

Task: https://trello.com/c/AZINw5qI

Associated origin PR to allow registry to list image stream: openshift/origin#19879

Documentation: openshift/openshift-docs#10102

TODO:

  • - tests
  • - use user token
  • - app holds just a lru cache

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 22, 2018
@miminar miminar changed the title Registry catalog [WIP] Registry catalog May 22, 2018
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 22, 2018
@miminar
Copy link
Author

miminar commented May 22, 2018

@dmage, @legionus PTAL

@bparees FYI

/test

) error {
var start string

osClient, err := r.client.Client()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably user's token should be used instead. The registry cannot list imagestreams ATM, but we can be sure that user can thanks to the SAR check.

Otherwise list imagestreams needs to be added to system:registry cluster role.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using the user's token seems reasonable to me...

return r.registry.Repository(ctx, name)
}

func (r *registry) Repositories(ctx context.Context, repos []string, last string) (n int, err error) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also a good candidate for metrics. Suitable as a follow-up after this and #79 are merged.

repositoryEnumerator RepositoryEnumerator
}

type registry struct {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to myself: Maybe worth its own file (to complement already existing registry_test.go).

@@ -55,16 +57,44 @@ type App struct {

// cache is a shared cache of digests and descriptors.
cache cache.DigestCache

// repositoryEnumerator allows to enumerate repository names known to registry
repositoryEnumerator RepositoryEnumerator
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be just the lru cache of opaque blobs. Putting on TODO.

last string,
) (n int, err error) {
if len(repos) == 0 {
// Client explicitely requested 0 results. Returning nil for error seems more appropriate but this is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicitly

// Client explicitely requested 0 results. Returning nil for error seems more appropriate but this is
// more in line with upstream does. Returning nil actually makes the upstream code panic.
// TODO: patch upstream? /_catalog?n=0 results in 500
return 0, errors.New("no space in slice")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we at least return a better error than "no space in slice", like "client requested zero entries"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we'll use this API somewhere else, the message about the client will provide misleading information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still seems like there is a more informative message we can provide/bubble up. it's pretty much a self-inflicted+correctable error, so the error text should make that clear.. "no space in slice" makes it sound like an internal error where we ran out of memory and nothing can be done w/o changing code.

) error {
var start string

osClient, err := r.client.Client()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using the user's token seems reasonable to me...

name := fmt.Sprintf("%s/%s", is.Namespace, is.Name)
if len(last) > 0 && name <= last {
if !warned {
context.GetLogger(ctx).Warnf("pagination not working -> filtering the full image stream list")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should I do when I see this warning in the log?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if you run the registry against different version of master API. The message seemed too long already but I can add the suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this situation be handled by enumerateImageStreams, so that we don't need to spread this hack by the code?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I'll rework it.

case nil:
err = io.EOF
default:
context.GetLogger(ctx).Errorf("failed to enumerate repositories: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? I would expect the upstream code to log this error.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, the returned error is logged, I've removed the logging statement.

Continue: start,
})

if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The master may respond with 410 ResourceExpired, that should be treated as an indication to repeat the query without Continue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll handle the case.

}

if len(last) > 0 {
if c, ok := r.cache.Get(last); !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use this token twice?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, from what I understand, the same token can be returned to two different clients or to two different requests.

@miminar
Copy link
Author

miminar commented May 24, 2018

Addressed most of the comments.


if len(last) > 0 {
if c, ok := r.cache.Get(last); !ok {
context.GetLogger(ctx).Warnf("failed to find opaque continue token for last repository=%q -> requesting the full image stream list instead of %d items", last, limit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should set warned=true?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, but it will add to legibility and robustness.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@miminar miminar changed the title [WIP] Registry catalog Registry catalog May 30, 2018
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 30, 2018
@miminar
Copy link
Author

miminar commented May 30, 2018

Added unit tests and squashed.

miminar pushed a commit to miminar/origin that referenced this pull request May 30, 2018
To sustain registry catalog API implemented in
openshift/image-registry#87

Not really necessary since registry will attempt to use user's token.
But a fall-back to registry client depends on this policy.

Signed-off-by: Michal Minář <miminar@redhat.com>
@miminar miminar force-pushed the repository-catalog branch 2 times, most recently from f2f9660 to 78f5f9e Compare May 30, 2018 12:03
@dmage
Copy link
Contributor

dmage commented Jun 7, 2018

My main concern that it will perform poorly behind load balancers without sticky sessions. And as I can see, the optimistic way when we have an opaque token is not tested.

If we are fine with slow requests, then LGTM.

@bparees
Copy link
Contributor

bparees commented Jun 7, 2018

My main concern that it will perform poorly behind load balancers without sticky sessions.

specifically the pagination you mean?

@miminar
Copy link
Author

miminar commented Jun 7, 2018

Our docker-registry's service has sessionAffinity=ClientIP by default. We can only hope that clients will not be too dumm to ask for all.

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 20, 2018
@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 21, 2018
@miminar
Copy link
Author

miminar commented Jun 21, 2018

Rebased.

@miminar
Copy link
Author

miminar commented Jun 21, 2018

Re: ci/prow/e2e - Job failed. - Looks like broken environment:

failed to build "172.30.42.94:5000/e2e-test-prune-images-hnpr6/prune-schema2-all-images-false:latest" image: dial unix /var/run/docker.sock: connect: no such file or directory

/retest

@miminar
Copy link
Author

miminar commented Jun 21, 2018

@openshift/sig-continuous-infrastructure to rescue:

failed to build "172.30.74.83:5000/e2e-test-prune-images-dt99r/prune:latest" image: dial unix /var/run/docker.sock: connect: no such file or directory

@smarterclayton
Copy link
Contributor

Does test-prune-images assume it has access to the docker socket on the host where the e2e test runs? IF so it should be [local] and excluded, or fixed to not assume that.

@bparees
Copy link
Contributor

bparees commented Jun 21, 2018

yes it looks like hardprune.go is building an image locally to push into the registry in order to tightly control the structure of the image being produced.

@bparees
Copy link
Contributor

bparees commented Jun 21, 2018

@smarterclayton if we mark it [local], under what conditions will it still get run?

@bparees
Copy link
Contributor

bparees commented Jun 21, 2018

(marking it local here: openshift/origin#20061)

@openshift-docker
Copy link

openshift-docker commented Jun 21, 2018 via email

@bparees
Copy link
Contributor

bparees commented Jun 21, 2018

When run as part of a single node installation.

@smarterclayton right, so does any of our PR testing on either origin or image-registry exercise the extended tests on a single node installation? That is my fundamental question here... once i tag the test as [local] will we still know if something breaks it.

Extended tests shouldn’t depend on local access to the node. Use a
privileged pod if you need to check host stuff (and remember, you would
fail on crio).

I get it, trying to evaluate how urgent it is that we fix this so the test is being executed again.

@smarterclayton
Copy link
Contributor

smarterclayton commented Jun 21, 2018 via email

@bparees
Copy link
Contributor

bparees commented Jun 22, 2018

The master AWS job currently tests it. We will probably have a single node
oc cluster up test that may be able to use this in the future.

ok, good enough, thanks.

@miminar
Copy link
Author

miminar commented Jun 22, 2018

/retest

@legionus
Copy link
Contributor

/retest

Copy link
Contributor

@legionus legionus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 25, 2018
Implements the upstream repository catalog API call:

    GET /v2/_catalog

Returned is a list of repository names as json:

    200 OK
    Content-Type: application/json
    { "repositories": [ <name>, ... ] }

Pagination is supported by mapping the last returned repository name to
the opaque continue token received from master API and using it on the
next invocation.

Signed-off-by: Michal Minář <miminar@redhat.com>
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jun 26, 2018
@miminar
Copy link
Author

miminar commented Jun 26, 2018

Rebased.

@legionus
Copy link
Contributor

/lgtm
/retest

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 26, 2018
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: legionus, miminar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@miminar
Copy link
Author

miminar commented Jun 26, 2018

/retest

@openshift-merge-robot openshift-merge-robot merged commit 746afe4 into openshift:master Jun 26, 2018
miminar pushed a commit to miminar/origin that referenced this pull request Jul 9, 2018
To sustain registry catalog API implemented in
openshift/image-registry#87

Not really necessary since registry will attempt to use user's token.
But a fall-back to registry client depends on this policy.

Signed-off-by: Michal Minář <miminar@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants