array.jobs supported for SLURM systems? #23
Comments
Unfortunately, array jobs are currently not supported by the future.batchtools backends; each future is submitted as an individual job. I am aware that this is less than ideal, and support for array jobs would be awesome. However, how the awareness/concept of an array of jobs should be brought into the Future API is not 100% clear, i.e. it's a design decision that holds us back from supporting it. The main objective is that any feature added to the Future API should work the same regardless of backend. If this is not possible, it may be added as an optional feature, but the concept of optional features is yet to be designed.

The way I can see it being introduced is via higher-level future APIs. E.g., for futures that today are created as:

    a1 <- future(1)
    a2 <- future(2)
    b1 <- future(1)
    b2 <- future(2)

Maybe it can be done as:

    a1 <- future(1, lazy = TRUE)
    a2 <- future(2, lazy = TRUE)
    a <- c(a1, a2)
    b1 <- future(1, lazy = TRUE)
    b2 <- future(2, lazy = TRUE)
    b <- c(b1, b2)

where the concatenation (or some other mechanism) will trigger the futures to be launched.
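For comparison, here is a minimal sketch of what lazy futures do today, assuming the future package is installed and using the sequential backend purely for illustration. It does not implement the proposed c() mechanism or array jobs; it only shows that lazy futures can be grouped in a list and are launched when their values are requested.

```r
library(future)
plan(sequential)  # assumption: a local backend, chosen only for this sketch

# Lazy futures are created but not launched at this point
a1 <- future(1, lazy = TRUE)
a2 <- future(2, lazy = TRUE)

# Group them in a plain list; no Future API extension is involved
a <- list(a1, a2)

# Requesting each value launches the corresponding future and collects its result
vals <- lapply(a, value)
```

Each future here is still launched one at a time, so on a batchtools backend this would still mean one job per future.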
Would it be possible to do a "simple" approach where we set a max array size parameter someplace, and have future.batchtools spread what would have been individual jobs across the array instead? E.g. given a max array size of 100, with 250 iterations needed, and setting something like array = TRUE, it would create three array jobs (1:100, 101:200, 201:250)? It's just a matter of distributing the iterations across array instances instead of individual jobs. As I'm sure you are aware, many schedulers dislike (and often limit) a large number of individual jobs, but with array jobs you can functionally achieve lots of jobs. With batchtools, we:
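The splitting proposed above (250 iterations, max array size 100) can be sketched in base R. The helper name chunk_for_arrays and its arguments are hypothetical; nothing like it is claimed to exist in future.batchtools or batchtools, and the sketch only computes the index ranges without submitting anything.

```r
# Hypothetical helper: split iteration indices 1..n into consecutive
# blocks of at most `max_array_size` elements, one block per array job.
chunk_for_arrays <- function(n, max_array_size) {
  stopifnot(n >= 1, max_array_size >= 1)
  idx <- seq_len(n)
  # Block number for each index: 1 for 1..max, 2 for the next block, etc.
  block <- ceiling(idx / max_array_size)
  unname(split(idx, block))
}

chunks <- chunk_for_arrays(250, 100)

# Summarize each block as "first:last"; gives 1:100, 101:200, 201:250
ranges <- vapply(chunks, function(ix) sprintf("%d:%d", min(ix), max(ix)),
                 character(1))
print(ranges)
```

The last block is simply shorter (50 elements), so no padding or special casing is needed.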
So, this is a tricky design problem that is hard to solve any time soon; it needs a lot of thought and work. But for your everyday work, are you aware of:

    plan(batchtools_slurm, workers = 100L)

It will cause y <- future.apply::future_lapply(X, ...) to chunk up
I'm trying to use future.batchtools with a SLURM system that heavily suggests using array jobs, e.g. makeClusterFunctionsSlurm(..., array.jobs = TRUE), but I can't seem to figure out how to pass that option via future.batchtools. Any ideas?