Refactor registry namespace check to be compatible with OSDF topology #1038

Merged
merged 10 commits into from
Apr 5, 2024

Conversation

haoming29
Contributor

@haoming29 haoming29 commented Apr 3, 2024

Closes #1029

Other fixes

  • Added a Registry health component to the UI to show whether registering the namespace succeeded or not
  • Better status tracking/updates of the health component overall
  • Better error messages and plumbing

The updated policy is as follows (only for OSDF):

Given that a namespace exists in the topology, say /foo:

  • If an origin server has prefix /foo or /foo/**/**, and the admin starts the server without first registering the namespace at the registry website, the origin's auto-registration will fail with an error message asking them to register the namespace through the registry web UI:
The namespace <namespace> already exists in the OSDF topology. To register a Pelican equivalence, you need to present your identity. If you are registering through Pelican CLI, try again with the flag '--with-identity' enabled. If this is an auto-registration from a Pelican origin or cache server, register your namespace or server through the Pelican registry website at <registry url> instead.
  • If the admin attempts to register a namespace /foo or /foo/**/** via Pelican CLI (pelican namespace register) but without the --with-identity flag, the registration will also fail, with the same error message as above
  • If the admin registers the namespace /foo or /foo/**/** via the Pelican CLI with the --with-identity flag, or via the registry web UI, the registration will succeed. However, a note will be added to the registration's Description field, [ Attention: Prefix exists in OSDF topology ], to warn the registry admin to pay extra attention when reviewing it.
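
For illustration, the policy above can be sketched in Go as follows. This is a minimal sketch, not the code in this PR: the names Decision, overlapsTopology, decide, and annotate are hypothetical, /foo/**/** is read as "any path beneath /foo", hasIdentity stands for either the CLI --with-identity flag or a registration through the web UI, and appending the note to the end of the description is an assumption.

```go
package main

import "strings"

// Decision is the (hypothetical) outcome of the topology check.
type Decision int

const (
	Allow         Decision = iota // prefix does not overlap the OSDF topology
	AllowWithNote                 // overlaps, identity presented: accept, but flag for review
	Reject                        // overlaps, no identity: refuse the registration
)

// Note text taken from the PR description.
const topologyNote = "[ Attention: Prefix exists in OSDF topology ]"

// overlapsTopology reports whether prefix equals a topology namespace
// or lies beneath one (e.g. /foo/bar under /foo).
func overlapsTopology(prefix string, topology []string) bool {
	p := strings.TrimSuffix(prefix, "/")
	for _, ns := range topology {
		n := strings.TrimSuffix(ns, "/")
		if p == n || strings.HasPrefix(p, n+"/") {
			return true
		}
	}
	return false
}

// decide applies the three policy bullets to a single registration request.
func decide(prefix string, topology []string, hasIdentity bool) Decision {
	if !overlapsTopology(prefix, topology) {
		return Allow
	}
	if hasIdentity {
		return AllowWithNote
	}
	return Reject
}

// annotate adds the attention note to the description when required.
// Assumption: the note is appended; the real placement may differ.
func annotate(description string, d Decision) string {
	if d != AllowWithNote {
		return description
	}
	if description == "" {
		return topologyNote
	}
	return description + " " + topologyNote
}
```

With a topology containing /foo, decide("/foo/bar", topology, false) yields Reject, while the same call with hasIdentity set to true yields AllowWithNote. A prefix such as /foobar is not treated as a sub-namespace of /foo because the match requires a path separator.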

This PR also removes the topology check from checkNamespaceExists and similar functions to maintain consistency.

Pitfall: This should fix all the incompatibilities between Pelican namespaces and legacy OSDF namespaces. However, the director also fetches namespaces from both OSDF and Pelican servers, so two situations can arise:

  • The Pelican origin server that serves the OSDF namespace prefix has exactly the same origin server advertisement as the one we parsed from the topology. In this case, the two server advertisements will race against each other every time we update the topology or the origin server advertises itself to the director. The consequence is unknown.
  • The Pelican origin server has different fields in its origin advertisement, and we record both in the director TTL cache. Then we essentially have two servers serving the same namespace, which shouldn't be an issue; it just makes things a bit intertwined.

How to test

Here are the instructions to test the fix end to end:

  • Configuration file (note the FederationPrefix, DbLocation, and Logging.Level):
Origin:
  Exports:
    - StoragePrefix: /tmp/stash
      FederationPrefix: /chtc/itb/<your mock name>
      Capabilities: ["Reads", "PublicReads", "DirectReads"]
Federation:
  DiscoveryUrl: "https://<your container hostname>:8444"
  TopologyNamespaceUrl: "https://topology.opensciencegrid.org/stashcache/namespaces.json"
TLSSkipVerify: true
Logging:
  Level: "Error"
Registry:
  InstitutionsUrl: "https://topology.opensciencegrid.org/institution_ids"
  DbLocation: "<new location>/<new file>.sqlite"
  • Build Pelican web UI by running make web-build in your Pelican root folder
  • Run the Pelican binary with the osdf alias.
    • First, run your federation in a box with the director and the registry
     ./osdf serve --module director,registry
    
    • Second, run your origin. Note that you need to change the origin web port to avoid a conflict
    OSDF_SERVER_WEBPORT=<YOUR PORT> ./osdf origin serve
    
    • You will see a (non-blocking) error message in your origin terminal, something like:
    ERROR[2024-04-04T18:31:17Z] Failed to register with namespace service: Failed to register prefix /chtc/itb/hmtest2: failed to register the namespace: You don't have permission to register the prefix: The namespace /chtc/itb/hmtest2 already exists in the OSDF topology. To register a Pelican equivalence, you need to present your identity. If you are registering through Pelican CLI, try again with the flag '--with-identity' enabled. If this is an auto-registration from a Pelican origin or cache server, register your namespace or server through the Pelican registry website at https://6ee28c5df997:8444 instead.: The POST attempt to https://6ee28c5df997:8444/api/v1.0/registry resulted in status code 403; will automatically retry in 10 seconds 
    
    • As the error suggests, you will need to go to the registry web UI to register your namespace first
    • If you go to the origin's web UI, the same error message will appear in the status section, under the Registry component
  • Before you close your origin (you don't have to, though), go to the public key endpoint and download your origin's public key at /.well-known/issuer.jwks
  • Go to your registry web UI, which should be at https://localhost:8444, and register a new namespace with your origin's federation prefix and the public key
  • Switch to your origin terminal and wait for a minute; it should no longer report the error
  • Go to your origin web UI; the error in the Registry component should be gone, and the Director component should report OK (green color)
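
The public key download step can also be scripted. Below is a minimal Go sketch, assuming the origin serves a standard JWKS document (a JSON object with a "keys" array) at /.well-known/issuer.jwks. The function names jwksURL, validateJWKS, and fetchJWKS are hypothetical and not part of Pelican; skipVerify mirrors the TLSSkipVerify: true setting from the test configuration and should only be used against a local test setup.

```go
package main

import (
	"crypto/tls"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// jwksURL builds the public key endpoint for an origin's base web URL.
func jwksURL(originURL string) string {
	return strings.TrimSuffix(originURL, "/") + "/.well-known/issuer.jwks"
}

// validateJWKS checks that body is JSON with a non-empty "keys" array.
func validateJWKS(body []byte) error {
	var set struct {
		Keys []json.RawMessage `json:"keys"`
	}
	if err := json.Unmarshal(body, &set); err != nil {
		return fmt.Errorf("response is not valid JSON: %w", err)
	}
	if len(set.Keys) == 0 {
		return errors.New("JWKS contains no keys")
	}
	return nil
}

// fetchJWKS downloads the origin's key set and returns the raw JSON,
// ready to be pasted into the registry web UI.
func fetchJWKS(originURL string, skipVerify bool) ([]byte, error) {
	client := &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			// Only skip verification for a local test with self-signed certs.
			TLSClientConfig: &tls.Config{InsecureSkipVerify: skipVerify},
		},
	}
	resp, err := client.Get(jwksURL(originURL))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status code %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if err := validateJWKS(body); err != nil {
		return nil, err
	}
	return body, nil
}
```

Call fetchJWKS with the origin's base web URL (the host plus the port chosen via OSDF_SERVER_WEBPORT) and print the returned bytes to get the key material for the registration form.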

If you want to go one step further, you can test whether a file can be transferred through this origin and the federation (I tested locally and it works for me).

@haoming29 haoming29 added bug Something isn't working critical High priority for next release registry Issue relating to the registry component labels Apr 3, 2024
@haoming29 haoming29 added this to the v7.7.0 milestone Apr 3, 2024
@haoming29 haoming29 requested a review from jhiemstrawisc April 3, 2024 18:48
@haoming29
Contributor Author

@jhiemstrawisc this one should be good to go. See PR description for how to test e2e locally.

@jhiemstrawisc
Member

One comment about the error I get from serving the origin:

The namespace /chtc/itb/jhiemstra already exists in the OSDF topology

I think this is somewhat misleading, because that full namespace prefix isn't registered in topology. Is there a way to indicate specifically that in order to register a sub namespace of /chtc/itb (or maybe it's just /chtc, whichever is actually registered) the user has to follow the provided steps?

@haoming29
Contributor Author

One comment about the error I get from serving the origin:

The namespace /chtc/itb/jhiemstra already exists in the OSDF topology

I think this is somewhat misleading, because that full namespace prefix isn't registered in topology. Is there a way to indicate specifically that in order to register a sub namespace of /chtc/itb (or maybe it's just /chtc, whichever is actually registered) the user has to follow the provided steps?

I agree the wording is kind of misleading. I can work on feeding in the exact topology namespace that's taken, but I think a quicker fix would be to reword the error message to say "part of the namespace ...." instead. I can go either way.
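
To make the first option concrete, here is a rough Go sketch of reporting the exact topology namespace that is taken. It is only an illustration under assumptions: matchedTopologyPrefix and topologyConflictMessage are hypothetical names, the topology is represented as a plain list of prefixes, and the sub-namespace wording is a suggestion rather than the message that was merged.

```go
package main

import (
	"fmt"
	"strings"
)

// matchedTopologyPrefix returns the longest topology namespace that equals
// prefix or contains it as a sub-namespace, and whether one was found.
func matchedTopologyPrefix(prefix string, topology []string) (string, bool) {
	p := strings.TrimSuffix(prefix, "/")
	best, found := "", false
	for _, ns := range topology {
		n := strings.TrimSuffix(ns, "/")
		overlaps := p == n || strings.HasPrefix(p, n+"/")
		if overlaps && (!found || len(n) > len(best)) {
			best, found = n, true
		}
	}
	return best, found
}

// topologyConflictMessage names the topology namespace that is actually
// registered, instead of claiming the full requested prefix exists there.
func topologyConflictMessage(prefix, matched string) string {
	p := strings.TrimSuffix(prefix, "/")
	if p == matched {
		return fmt.Sprintf("The namespace %s already exists in the OSDF topology.", p)
	}
	return fmt.Sprintf(
		"The namespace %s is a sub-namespace of %s, which already exists in the OSDF topology.",
		p, matched)
}
```

For the example in this thread, a topology that contains /chtc/itb would produce a message naming /chtc/itb as the registered namespace rather than /chtc/itb/jhiemstra.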

Member

@jhiemstrawisc jhiemstrawisc left a comment


A few cleanup items, but I'm convinced that this should solve our issue by giving federation admins a way to bypass the previous restrictions we had in place.

registry/registry.go Outdated Show resolved Hide resolved
registry/registry.go Show resolved Hide resolved
registry/registry_ui.go Outdated Show resolved Hide resolved
registry/registry_validation.go Outdated Show resolved Hide resolved
@haoming29 haoming29 requested a review from jhiemstrawisc April 5, 2024 15:01
Member

@jhiemstrawisc jhiemstrawisc left a comment


LGTM!

@jhiemstrawisc
Member

@turetske, I believe we need to patch this into 7.6 before EoD

@jhiemstrawisc jhiemstrawisc merged commit 8beb82c into PelicanPlatform:main Apr 5, 2024
19 checks passed
Development

Successfully merging this pull request may close these issues.

Topology Namespaces don't include their public key information
2 participants