-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: run housekeeping, prolog, epilog in the flux systemd instance #6662
Open
garlick
wants to merge
6
commits into
flux-framework:master
Choose a base branch
from
garlick:issue#6619
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Problem: the scripts flux-run-{epilog,prolog,housekeeping} and their templated systemd unit files are no longer needed. The job manager will now start the site-provided scripts directly with the IMP in a transient systemd unit in the Flux systemd instance. Remove the systemd unit files. Instead of removing the run scripts and forcing sites to update the IMP config to run the site scripts immediately, retain the run scripts for now and modify them to exec the site scripts (making them a no-op). We can remove the run scripts later.
Problem: a housekeeping comment references 'flux imp exec' when flux 'flux imp run' was intended. Fix the comment.
Problem: when housekeeping (re-)parses its configuration, it recreates the subprocess command template within the config parsing function, which will become unwieldy given planned changes. Without altering functionality, move the command template creation to a separate function. Also, drop the unused hk->imp_path member.
Problem: we'd like to start housekeeping in the flux systemd instance but it currently hard codes the "rexec" subprocess exec service. If exec.service is set in configuration, use that for housekeeping too. If using the IMP with sdexec, set command options for proxy signaling. Change the bulk exec name from "housekeeping" to "flux-housekeeping". This becomes the unit name under sdexec, which must begin with "flux-" to get the required IMP signal forwarding behavior
Problem: perilog_procdesc.uses_imp is unused Drop unused struct member.
Problem: we'd like to start prolog/epilog in the flux systemd instance but it currently hard codes the "rexec" subprocess exec service. If exec.service is set in configuration, use that for perilog too. If using the IMP with sdexec, set command options for proxy signaling. Change the bulk exec name from prolog/epilog to flux-prolog/flux-epilog". This becomes the unit name under sdexec, which must begin with "flux-" to get the required IMP signal forwarding behavior Although perilog never sends SIGKILL after it was disabled in flux-framework#6330, set a kill_signal to SIGUSR1 or SIGKILL in case we want to revert that change in the future.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6662 +/- ##
==========================================
- Coverage 83.88% 83.88% -0.01%
==========================================
Files 533 533
Lines 88536 88576 +40
==========================================
+ Hits 74266 74299 +33
- Misses 14270 14277 +7
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
as discussed in #6619, the proposed
sdmon
service (#6616) could be simplified if it didn't have to monitor the system systemd instance for prolog/epilog/housekeeping, in addition to monitoring the flux one for jobs.This moves prolog/epilog/housekeeping to the flux systemd instance. The
flux-run-prolog
type scripts now just exec the user scripts directly to give the admins time to convert the IMP run config to run the user scripts instead of the wrappers.I'm marking this as a WIP pending
I've tested this manually on my test cluster, including signaling, but haven't really tried to write tests yet.