systemd service files and corresponding configs for Pelican services and Pelican-based OSDF services. #584

Merged
matyasselmeci merged 26 commits into PelicanPlatform:main from the pr/systemd branch on Feb 1, 2024

Conversation

matyasselmeci
Contributor

@matyasselmeci matyasselmeci commented Jan 3, 2024

This will resolve #331

The osdf-cache and osdf-origin services are meant as a replacement for the 'stash-cache-auth' and the 'stash-origin-auth' xrootd instances of xcache, so they listen on the authenticated ports (8443 for the cache and 1095 for the origin).

@matyasselmeci matyasselmeci force-pushed the pr/systemd branch 2 times, most recently from 7a7f0ed to 6e9f697 Compare January 4, 2024 19:11
Collaborator

@bbockelm bbockelm left a comment

Systemd services look mostly fine - just the question about whether we should also provide a sysconfig-based override.

The bigger question is about the config files. For the most part, what's filled in here is the same as the code defaults - my inclination would be to drop them.

Once you drop the defaults, then there are really only one or two settings. At that point, is it better to just have a single pelican.yaml/osdf.yaml shared by all services?

Obviously, Pelican needs to add a config.d-style system ASAP. If we assume one is done in February, does this change any of the approaches here?

systemd/osdf-cache.yaml (outdated, resolved)
systemd/osdf-cache.service (outdated, resolved)
  DirectorUrl: https://director-origins.osg-htc.org
  TopologyNamespaceURL: https://topology.opensciencegrid.org/osdf/namespaces
Server:
  TLSCertificate: /etc/grid-security/xrd/xrdcert.pem
Collaborator

Is there not a better system location?

Contributor Author

Not really. There's no 'standard' location for a host cert; Let's Encrypt puts the cert/key in /etc/letsencrypt/live/$domain for example. At least /etc/grid-security/xrd/xrdcert.pem is backward compatible with xcache-based cache/origin configurations; I can pick something else like /etc/pelican/{cert,key}.pem for the registry, director, and non-osdf configs.

Contributor

I thought /etc/pki/tls/certs was standard? https://www.getpagespeed.com/server-setup/ssl-directory

systemd/osdf-cache.yaml (outdated, resolved)
systemd/osdf-cache.yaml (outdated, resolved)
systemd/osdf-director.yaml (outdated, resolved)
systemd/osdf-director.yaml (outdated, resolved)
@@ -0,0 +1,17 @@
Debug: false
Logging:
  Level: "Error"
Collaborator

Default values, I think. Can be removed.

systemd/osdf-director.service (resolved)
DetailedMonitoringHost: xrd-mon.osgstorage.org
Mount: "/mnt/osdf"
Port: 1095
# Authfile: /run/stash-origin-auth/Authfile
Collaborator

Let's remove commented-out lines (or add a comment about what they're for).

Contributor Author

The plan is to do the latter.

Contributor Author

Actually for now I'm going to uncomment them since we're going to need them until #564 is implemented.

@matyasselmeci
Contributor Author

The reason I kept all those options, even though they're the defaults, is that they're the ones the admin is most likely to change, and I don't want admins to have to go back and forth to the documentation to figure out which ones they are. I plan to add explanatory comments as well (but I will need help writing some of them).

@matyasselmeci matyasselmeci force-pushed the pr/systemd branch 2 times, most recently from 7d0ed01 to a7b916d Compare January 17, 2024 03:23
bbockelm pushed a commit to opensciencegrid/Software-Redhat that referenced this pull request Jan 23, 2024
@matyasselmeci
Contributor Author

I think this is ready for another look. For the service files, I added sysconfig files, restart on failure, and a working directory that we'll be able to write core files to. (The RPM will handle creating that directory.)

For the configs, I added comments describing many of the options. Biggest change is to have log files (/var/log/osdf-cache.log, /var/log/pelican-registry.log, etc.) instead of logging to the systemd journal. The osdf-* services now start the osdf binary so I was able to drop some options.

Collaborator

@bbockelm bbockelm left a comment

General approach is OK. I'm still ambivalent on having separate config files per service versus a shared one. I can see pros and cons for both directions. I'm going to defer to your experience here.

Approved! Let's get this in and move on to the RPM piece.

Contributor

@brianhlin brianhlin left a comment

Minor comments, the big one being the standard cert location.

I wish we could consolidate the service files down to 1 or 2. Maybe we could use the following vars?

       ├───────────┼─────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┤
       │ "%i"      │ Instance name                                       │ For instantiated units this is the string between the first "@"      │
       │           │                                                     │ character and the type suffix. Empty for non-instantiated units.     │
       ├───────────┼─────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┤
       │ "%p"      │ Prefix name                                         │ For instantiated units, this refers to the string before the first   │
       │           │                                                     │ "@" character of the unit name. For non-instantiated units, same as  │
       │           │                                                     │ "%N".                                                                │
       ├───────────┼─────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┤

And have users run something like systemctl start pelican@director? I think the only thing that would prevent that is that the director/registry don't have a working dir specified and the cache/origin do.
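
A rough sketch of what such a template unit could look like, for illustration only. The file name pelican@.service, the /usr/bin/pelican path, and the per-instance config and sysconfig names are assumptions, not files from this PR; only the %i specifier and the option style come from the material quoted above.

```ini
# /usr/lib/systemd/system/pelican@.service  (hypothetical template unit)
[Unit]
Description = Pelican %i service
After = network-online.target
Wants = network-online.target

[Service]
# %i expands to the instance name, e.g. "director" for pelican@director
EnvironmentFile = -/etc/sysconfig/pelican-%i
# Assumed binary path and config naming scheme
ExecStart = /usr/bin/pelican --config /etc/pelican/pelican-%i.yaml %i serve
Restart = on-failure
RestartSec = 20s

[Install]
WantedBy = multi-user.target
```

With a unit like this, systemctl start pelican@director would run the director with its own config file. A single template applies the same [Service] settings to every instance, which is why the working-directory difference between the director/registry and the cache/origin matters.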

  DirectorUrl: https://director-origins.osg-htc.org
  TopologyNamespaceURL: https://topology.opensciencegrid.org/osdf/namespaces
Server:
  TLSCertificate: /etc/grid-security/xrd/xrdcert.pem
Contributor

I thought /etc/pki/tls/certs was standard? https://www.getpagespeed.com/server-setup/ssl-directory

## Set Hostname to the external DNS name this can be accessed over, if
## different than the current hostname.
# Hostname:
Xrootd:
Contributor

@brianhlin brianhlin Jan 31, 2024

Is this case sensitive? If not, we should use XRootD

Contributor Author

re. cert location: I guess your google-fu is better than mine: I didn't find a good standard last time I looked. FWIW, we need a cert/key with permissions for the xrootd user to read.

Contributor Author

more re. cert location: Also the page you linked mentioned a directory but not a file name -- that is, it said the cert should be named after the FQDN; obviously I can't put that in the default config, not knowing what it is.

Contributor

I think I prefer mostly aligning with best practices, e.g. /etc/pki/tls/certs/pelican.{crt,key}, over maintaining the old-world order.

Contributor Author

Fine. #743

EnvironmentFile = -/etc/sysconfig/osdf-director
ExecStart = /usr/bin/osdf --config /etc/pelican/osdf-director.yaml director serve
Restart = on-failure
RestartSec = 20s
Contributor

I see your commit removing the working dir for the director/registry but it's not clear to me why they don't need it

Contributor Author

I suppose the director/registry can core dump too.

Most service files don't have a WorkingDirectory set; either the services run as root (in which case they can write their core files to /), or they run as non-root (in which case they start in their home directories and can write core files there). The reason we have to specify it in this case is that pelican starts as root but then spawns xrootd with reduced privileges (without changing the directory first).
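
As a sketch of the point above, a cache unit that starts as root and hands off to xrootd might set the directory explicitly. The /var/spool/osdf path follows the commit notes in this PR; the osdf-cache config and sysconfig file names and the cache serve subcommand are assumed by analogy with the director unit shown earlier.

```ini
# Hypothetical excerpt of osdf-cache.service
[Service]
EnvironmentFile = -/etc/sysconfig/osdf-cache
# Assumed config path and subcommand, mirroring the director unit
ExecStart = /usr/bin/osdf --config /etc/pelican/osdf-cache.yaml cache serve
# osdf starts as root and spawns xrootd with reduced privileges without
# changing directory, so pick a directory xrootd can write core files to.
# The RPM is expected to create it.
WorkingDirectory = /var/spool/osdf
Restart = on-failure
RestartSec = 20s
```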

Contributor

I mainly asked with the idea of consolidating service files

systemd/osdf-cache.yaml (resolved)
@brianhlin
Contributor

Mat and I discussed the split of systemd files and its effect on packaging, and the appropriate location for OSDF vs Pelican bits (we feel that OSG Software should package OSDF bits in a downstream metapackage). For the time being, this split of packages is OK and should allow for flexible future changes.

Mat convinced me that pelican@director isn't the right UX as it's not exactly an instance but I could see an interface like pelican-director@osdf or something like that in the future.

I think the cert location is the only thing that is left to address here before a merge.

…r Pelican services

and Pelican-based OSDF services.

The osdf-cache and osdf-origin services are meant as a replacement for
the 'stash-cache-auth' and the 'stash-origin-auth' xrootd instances of
xcache, so they listen on the authenticated ports (8443 for the cache
and 1095 for the origin).
systemd conflicts are documented here: https://www.freedesktop.org/software/systemd/man/latest/systemd.unit.html#Conflicts=

If you try to start both services at the same time, it will be an error; otherwise, turning on a service turns off the service it conflicts with.
For OSDF origins and caches, add the commented-out option to switch back
to the /etc/grid-security/certificates directory
…nfigs

we'll need them until Pelican can do authfile/scitokens.conf generation from Topology automatically
osdf is started as root; it can add the capabilities it wants to
…ache-public.yaml and osdf-cache-public.service for the public cache instances
…reate an osdf-origin-public service file and yaml
Either /var/spool/osdf for the osdf services or /var/spool/pelican for the pelican services.
*Note*: the RPMs will have to create the directories.
…gin (which use xrootd), not the director and registry
…t values. Remove some options admins are less likely to change.
Issue PelicanPlatform#621 got fixed in main so we don't need it anymore
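
The Conflicts= behaviour described in the commit notes above can be sketched as a unit fragment. Which units actually conflict is not visible here; the xrootd@stash-cache-auth.service name below is an assumption based on the statement that osdf-cache replaces the 'stash-cache-auth' xrootd instance.

```ini
# Hypothetical excerpt of osdf-cache.service
[Unit]
Description = OSDF cache (Pelican-based)
# Starting this unit stops the listed unit, and vice versa; asking for
# both in the same transaction fails. The unit name is an assumption.
Conflicts = xrootd@stash-cache-auth.service
```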
@matyasselmeci matyasselmeci marked this pull request as ready for review February 1, 2024 00:11
@matyasselmeci matyasselmeci merged commit 9474446 into PelicanPlatform:main Feb 1, 2024
9 checks passed
@matyasselmeci matyasselmeci deleted the pr/systemd branch February 1, 2024 15:10
Successfully merging this pull request may close these issues.

Add pelican-server RPM
3 participants