Skip to content
This repository has been archived by the owner on Apr 17, 2023. It is now read-only.

WIP: Implement Catalog/Manifest API synchronization #241

Closed
wants to merge 11 commits into from

Conversation

lsamayoa
Copy link
Contributor

Adding this for early review.

Fixes:

Missing:

  • Uni tests for synchronize methods

@lsamayoa lsamayoa force-pushed the feature/ISSUE-199 branch from c9da99a to f105c65 Compare July 31, 2015 13:21
def client
RegistryClient.new(hostname, use_ssl)
end
memoize :client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are cautious when it comes to introducing new dependencies: it takes time and resources to keep them up-to-date and secure.

In this case I would drop this gem and dome something simple like:

def client
  @client = RegistryClient.new(hostname, use_ssl) unless @client
  @client   
end

@flavio
Copy link
Member

flavio commented Aug 3, 2015

I think it would be better to address the issues with two separated PRs to make the review process easier.

I'm not particularly convinced by Registry#update_from_catalog!. The method is pretty long and it takes time to understand what is going on. I think it would be better to split it into smaller pieces.

As for the image layers, I was planning to store them using either closure_tree or awesome_nested_sets (more about the latter here). I'm think I was more inclined toward the 1st one, but I looked into them many months ago. However in both cases we will have to store the data in a different way.

@@ -26,6 +26,9 @@ gem 'search_cop'
# Pagination
gem 'kaminari'

# Memoization
gem 'memoist'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @flavio, we are careful when requiring new gems. This looks avoidable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used it so we can memoize a RegistryClient inside the Registry model. I can do the memoization manually doe.

@mssola
Copy link
Collaborator

mssola commented Aug 3, 2015

I have some problems with this PR. Let's see:

  1. As @flavio already pointed out, it tries to do too much. These are two non-trivial features, you shouldn't merge them in the same PR.
  2. Minor: you claim to fix both issues, but then in the commit message you always write "ISSUE-199", which makes hard to know which commit fixes things for the issue Take advantage of the Manifest API #200 .

Besides that, I don't see in your code when do you sync with the Catalog API. By that I mean that I can see in your code where you do the synchronization but I'm not sure when is that code actually executed. In my opinion there are two strategies here:

  1. We can use a separate process or something like Sidekiq that continuously performs the sync if needed (e.g. after 5min, sync if needed). You're not doing this in this PR.
  2. We can sync on push. That is, when a new tag gets pushed, sync. Something is telling me that you're striving for this strategy, but I don't see where you hook the synchronization mechanism on push either. Anyways, this strategy has the same flaw as in the current webhook mechanism.

In regards to the Catalog API, to be honest, I was already thinking on this and I'm still not decided on how is the best way to implement this feature. Note however that I'd prefer a solution that goes on the lines of my first point, since it's the safest way to guarantee that both the registry and the database will be in sync. Since we are asking you to try to divide this PR, if I were you I'd choose to just implement the Manifesto API, since I'm already researching on Catalog API. This doesn't necessarily mean that I "hate" your PR in regards to the Catalog API, but I'd prefer to keep things simple and allow you to work on the Manifesto API.

In regards to the Manifesto API, in general I like it, and I don't really have many thoughts put in this. I'd appreciate if you could consider @flavio's ideas ;)

Another issue that maybe is out of scope of these two issues but that I want to point out anyways: where and how should we show all the results that we gather from the Manifest API? I think that there should be a new issue for that, but we should also think about this as soon as possible because this affects to how we store the data from the manifest.

@mssola
Copy link
Collaborator

mssola commented Aug 3, 2015

Also @lsamayoa , note that we are updating the CONTRIBUTING.md file in the PR #248 . Since this is just a PR, this doesn't mean that these are the final contents of this file. This should help to avoid awkward situations like the one that just happenned here of "I was already working on this" ;)

Last but not least, I should've assigned myself into the issue #199 before, so sorry for the inconvenience :)

@lsamayoa
Copy link
Contributor Author

lsamayoa commented Aug 3, 2015

@mssola Catalog API is not that complex, it just lists the repos, so you have to actually use the Manifest API to update the whole repo.

I was already thinking about using sidekiq. But thought something like crono to be more appropiate so we don't have to add redis to the stack.

I was thinking that the synchronization should be indempotent and should only pull new stuff, not delete stuff is not there anymore, maybe we can mark stuff as deleted but I was not thinking that far.

@mssola
Copy link
Collaborator

mssola commented Aug 3, 2015

@lsamayoa I agree that is not that complex, but as you can see there are multiple options for doing this: its own process, sidekiq, crono, etc. So, in order to get this right, I'd prefer to investigate more on the different possibilities that we have and see which is the cleanest :)

Maybe we should discuss ideas in the issue page for this. Let me add the "discussing" label to that issue ;)

Since there's this discussion and we don't have a definitive candidate, that's enough reason to keep it in a different PR. Therefore, my suggestion would be to transform this PR into just the Manifest API. Also, take a look at the discussion on the issue #249 :)

Thanks for your work and your input, it's really appreciated btw :)

@mssola
Copy link
Collaborator

mssola commented Aug 12, 2015

Just as a heads up, the RegistryClient was basically unreliable, but this was fixed with the PR #259. Moreover, all we wanted from the Catalog API has been merged in the PR #260 (by using crono, which is really cool, thanks @lsamayoa for the suggestion ;) ).

Therefore, since the Catalog API is already implemented and this PR needs to be rebased and re-worked anyways, I suggest to close this PR for now.

@mssola mssola closed this Aug 12, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants