-
Notifications
You must be signed in to change notification settings - Fork 308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: requirements and hints #595
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent write-up. As a general discussion point, I strongly disagree with not allowing expressions in the hints section.
The SLURM cluster for instance requires jobs to be given a maximum runtime. This can be dynamically derived from the filesize when it is an expression. Currently we use time_minutes
in the runtime
section. Under the new scheme this would classify as a hint and thus expressions cannot be used.
Another solution is to add time to the requirements section, but I think this strategy is untenable. There are quite a lot of execution backends and as a consequence there will be unforeseen requirements that need to be solved in the hints section. Therefore allowing expressions in the hints section is preferable, so hints can be tuned to the input data. Since I work with widely disparate filesizes, that would make my life a lot easier. Furthermore, the machinery for expression evaluation is already present in the execution engines, so why not use it?
@rhpvorderman I agree with you. I don't really remember the argument for It would be great if others can weigh in on this PR, especially @mlin and someone from the Cromwell/Terra team (@cjllanwarne?). Hopefully we can discuss at the next meeting if there is disagreement. |
Two challenges with allowing expressions in the
|
I addressed this by introducing a (yet another) new concept: scoped types. These are similar to hidden types in that they may only be declared by the execution engine, but different in that a) they are only declared in a specific scope and b) they may be instantiated by the user. I introduced three types scoped to the hints section: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Might be some places where we can be more explicit that I have mentioned in comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Request for minor rewording around the maxCPu andMaxMemory hints.
…nt fail regardless of the value
966983a
to
74b1ca0
Compare
This PR adds the new
requirements
andhints
sections, and deprecatesruntime
.There are a few additions here that were not in the original spec for this feature but which I think are needed to make it complete:
fpga
requirement - similar togpu
, it is aBoolean
that specifies whether the task requires a FPGA. The popularity of DRAGEN, and the increasing number of other FPGA accelerated bioinformatics tools, make this essential.gpu
andfpga
hints to enable more detailed requests for hardware accelerators (e.g., number and type).disks
hint to enable more detailed requests for disk volumes (e.g., class and permissions).allowNestedInputs: true
where required task/subworkflow inputs could be left unsatisfied. Now all inputs either need to have a default value or have their value specified in the call inputs. Only optional task/subworkflow inputs that are not explicitly set in the call inputs may have their value set at runtime.Checklist
Closes issues: #540, #541 #543