Commit
resource: conditionally monitor sdmon.idle
Problem: nodes are not checked for untracked running work when a Flux instance starts up.

Such work might be left behind, for example, if
- job-exec deems job shell(s) unkillable
- housekeeping/prolog/epilog gets stuck on a hung file system

When systemd is enabled, the new sdmon module joins the 'sdmon.idle' group on startup. If any flux units are running, the join is delayed until those units are no longer running.

Change the resource module so that it monitors sdmon.idle instead of broker.online when systemd is enabled. This withholds "busy" nodes from the scheduler until they become idle.

Fixes flux-framework#6590
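The group selection described in this message can be illustrated with a small Python model. This is only a sketch of the idea, not the resource module's code: the function names (`monitored_group`, `rank_is_idle`, `schedulable_ranks`), the plain dict of group membership, and the example unit name are all hypothetical, and no Flux API is used.

```python
# Hypothetical model of the behavior described above; not Flux code.

def monitored_group(systemd_enabled: bool) -> str:
    """Pick the group to watch: sdmon.idle when systemd is enabled,
    otherwise broker.online."""
    return "sdmon.idle" if systemd_enabled else "broker.online"


def rank_is_idle(running_flux_units: list) -> bool:
    """A rank is treated as idle only when it has no running flux units."""
    return len(running_flux_units) == 0


def schedulable_ranks(groups: dict, systemd_enabled: bool) -> set:
    """Return the ranks that would be offered to the scheduler,
    i.e. the members of the monitored group."""
    return set(groups.get(monitored_group(systemd_enabled), set()))


# Example: rank 2 is online but still has a leftover unit (made-up name),
# so in this model it has not yet joined sdmon.idle.
units_by_rank = {0: [], 1: [], 2: ["leftover-example.service"], 3: []}
groups = {
    "broker.online": set(units_by_rank),
    "sdmon.idle": {r for r, u in units_by_rank.items() if rank_is_idle(u)},
}
```

In this model, rank 2 is excluded while systemd is enabled and its leftover unit is running, and it becomes schedulable again once that unit exits and the rank is added to `sdmon.idle`.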