feat: requirements and hints #595

jdidion · 2023-12-11T06:38:16Z

This PR adds the new requirements and hints sections, and deprecates runtime.

There are a few additions here that were not in the original spec for this feature but which I think are needed to make it complete:

Added the fpga requirement - similar to gpu, it is a Boolean that specifies whether the task requires a FPGA. The popularity of DRAGEN, and the increasing number of other FPGA accelerated bioinformatics tools, make this essential.
Added the gpu and fpga hints to enable more detailed requests for hardware accelerators (e.g., number and type).
Added the disks hint to enable more detailed requests for disk volumes (e.g., class and permissions).
Stipulated compute environment-specific hints can override task-specific hints
Deprecated the previously allowed behavior implied by setting allowNestedInputs: true where required task/subworkflow inputs could be left unsatisfied. Now all inputs either need to have a default value or have their value specified in the call inputs. Only optional task/subworkflow inputs that are not explicitly set in the call inputs may have their value set at runtime.

Checklist

Pull request details were added to CHANGELOG.md

Closes issues: #540, #541 #543

rhpvorderman

Excellent write-up. As a general discussion point, I strongly disagree with not allowing expressions in the hints section.

The SLURM cluster for instance requires jobs to be given a maximum runtime. This can be dynamically derived from the filesize when it is an expression. Currently we use time_minutes in the runtime section. Under the new scheme this would classify as a hint and thus expressions cannot be used.
Another solution is to add time to the requirements section, but I think this strategy is untenable. There are quite a lot of execution backends and as a consequence there will be unforeseen requirements that need to be solved in the hints section. Therefore allowing expressions in the hints section is preferable, so hints can be tuned to the input data. Since I work with widely disparate filesizes, that would make my life a lot easier. Furthermore, the machinery for expression evaluation is already present in the execution engines, so why not use it?

jdidion · 2023-12-11T15:28:20Z

@rhpvorderman I agree with you. I don't really remember the argument for hints being like meta rather than runtime but from an implementation perspective I don't feel like it would be adding that much additional complexity, and as you say it would make hints much more useful.

It would be great if others can weigh in on this PR, especially @mlin and someone from the Cromwell/Terra team (@cjllanwarne?). Hopefully we can discuss at the next meeting if there is disagreement.

jdidion · 2023-12-20T18:26:10Z

Two challenges with allowing expressions in the hints section:

Unlike requirements, some hints are expected to have object values. The inputs and outputs hints are the best example. If we are deprecating the Object type, then how does the user actually create these? Maybe we allow object literals only within hints?
Hints can be specified at the workflow level. Do we allow expressions there? Are expressions evaluated in the context of the workflow, or the task that inherits them?

jdidion · 2023-12-21T17:40:24Z

I addressed this by introducing a (yet another) new concept: scoped types. These are similar to hidden types in that they may only be declared by the execution engine, but different in that a) they are only declared in a specific scope and b) they may be instantiated by the user.

I introduced three types scoped to the hints section: input, output, and hints. The use of reserved words for naming is intentional, so that they don't conflict with existing user-defined types.

SPEC.md

markjschreiber

Looks great. Might be some places where we can be more explicit that I have mentioned in comments.

SPEC.md

markjschreiber

Request for minor rewording around the maxCPu andMaxMemory hints.

…nd hints types

…nt fail regardless of the value

jdidion requested review from mlin, geoffjentry, patmagee, cjllanwarne, illusional and rhpvorderman December 11, 2023 06:38

rhpvorderman reviewed Dec 11, 2023

View reviewed changes

jdidion self-assigned this Dec 17, 2023