file_in/out within dynamic targets #1140

Closed
2 tasks done
jennysjaarda opened this issue Jan 20, 2020 · 5 comments
@jennysjaarda

Prework

Proposal

Hi Will,

Will file_in() and file_out() be supported within dynamic targets in the future, or is this too difficult to implement? I am writing figures to different subfolders that are defined within the dynamic target, and the files are not saved to one directory, so I cannot simply track the entire directory.

Thanks in advance,
Jenny

@wlandau
Member

wlandau commented Jan 20, 2020

I knew someone would ask for this eventually. I did consider it, but it would be an arduous uphill battle against the original design of drake, and I do not think we would gain much from it in the end.

If you put all your sub-folders under one monolithic folder, you can achieve a similar effect.

drake_plan(
  x = settings_vector,
  y = target(
    visualize(x, figures = file_out("all_figures")),
    dynamic = map(x)
  )
)

It is not perfect, because any change to any figure in all_figures/ invalidates all the dynamic sub-targets of y. However, if you leave those files alone, drake responds to changes in settings_vector more efficiently, keeping some sub-targets up to date while rebuilding others.

library(drake)
fs::dir_create("all_figures")

write_file <- function(x, dir) {
  dir <- file.path(dir, x)
  writeLines("lines", dir)
}

plan <- drake_plan(
  file_names = c(1L, 2L, 3L, 4L),
  write_files = target(
    write_file(file_names, dir = file_out("all_figures")),
    dynamic = map(file_names)
  )
)

make(plan)
#> target file_names
#> dynamic write_files
#> subtarget write_files_0b3474bd
#> subtarget write_files_b2a5c9b8
#> subtarget write_files_71f311ad
#> subtarget write_files_98cf3c11
#> aggregate write_files

# Break a file
writeLines("broken", "all_figures/1")

# Rebuild all the sub-targets
make(plan)
#> dynamic write_files
#> subtarget write_files_0b3474bd
#> subtarget write_files_b2a5c9b8
#> subtarget write_files_71f311ad
#> subtarget write_files_98cf3c11
#> aggregate write_files

# Plan for a different set of output files
plan <- drake_plan(
  file_names = c(1L, 2L, 5L, 6L),
  write_files = target(
    write_file(file_names, dir = file_out("all_figures")),
    dynamic = map(file_names)
  )
)

# Rebuild only some of the sub-targets
make(plan)
#> target file_names
#> dynamic write_files
#> subtarget write_files_0a86c9cb
#> subtarget write_files_cb15b01f
#> aggregate write_files

# Plan for *all* the files built so far. 
plan <- drake_plan(
  file_names = c(1L, 2L, 3L, 4L, 5L, 6L),
  write_files = target(
    write_file(file_names, dir = file_out("all_figures")),
    dynamic = map(file_names)
  )
)

# The directory gets re-hashed,
# but no sub-targets need to build.
make(plan)
#> target file_names
#> dynamic write_files
#> aggregate write_files

Created on 2020-01-20 by the reprex package (v0.3.0)

Related: #1127.

@wlandau wlandau closed this as completed Jan 20, 2020
@wlandau
Member

wlandau commented Jan 20, 2020

Hmm... maybe that reprex is not such a good way to do things. It is possible to trick drake into accepting an old set of files.

library(drake)
fs::dir_create("all_figures")

write_file <- function(x, dir) {
  dir <- file.path(dir, x)
  writeLines("lines", dir)
}

plan <- drake_plan(
  file_names = c(1L, 2L, 3L, 4L),
  write_files = target(
    write_file(file_names, dir = file_out("all_figures")),
    dynamic = map(file_names)
  )
)

make(plan)
#> target file_names
#> dynamic write_files
#> subtarget write_files_0b3474bd
#> subtarget write_files_b2a5c9b8
#> subtarget write_files_71f311ad
#> subtarget write_files_98cf3c11
#> aggregate write_files

# Change what the files are supposed to contain
write_file <- function(x, dir) {
  dir <- file.path(dir, x)
  writeLines("lines2", dir)
}

plan <- drake_plan(
  file_names = c(1L, 2L),
  write_files = target(
    write_file(file_names, dir = file_out("all_figures")),
    dynamic = map(file_names)
  )
)

make(plan)
#> target file_names
#> dynamic write_files
#> subtarget write_files_0b3474bd
#> subtarget write_files_b2a5c9b8
#> aggregate write_files

# Go back to all the files.
plan <- drake_plan(
  file_names = c(1L, 2L, 3L, 4L),
  write_files = target(
    write_file(file_names, dir = file_out("all_figures")),
    dynamic = map(file_names)
  )
)

make(plan)
#> target file_names
#> dynamic write_files
#> aggregate write_files

# wrong contents
readLines("all_figures/3")
#> [1] "lines"

Created on 2020-01-20 by the reprex package (v0.3.0)

@wlandau
Member

wlandau commented Jan 20, 2020

For what it's worth, file_in() should be safe for dynamic targets. But again, any change to anything in file_in() will invalidate all the sub-targets.
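
A minimal sketch of the file_in() case, for illustration only. The directory raw_data/ and the helper read_input() are hypothetical names that do not appear elsewhere in this thread. Each sub-target reads one file from a directory that is tracked as a whole, so editing any file in that directory should invalidate every sub-target, as described above.

```r
library(drake)

# Hypothetical input directory with one small file per element.
dir.create("raw_data", showWarnings = FALSE)
for (i in 1:3) {
  writeLines(paste("input", i), file.path("raw_data", i))
}

# Hypothetical helper: read the single file that belongs to one sub-target.
read_input <- function(x, dir) {
  readLines(file.path(dir, x))
}

plan <- drake_plan(
  file_names = c(1L, 2L, 3L),
  contents = target(
    # file_in() tracks the whole directory, not the individual files.
    read_input(file_names, dir = file_in("raw_data")),
    dynamic = map(file_names)
  )
)

make(plan)
```

Because the dependency is the directory rather than each file, this trades fine-grained invalidation for safety: a stale input cannot slip through, but a change to one file rebuilds all sub-targets of contents.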

@jennysjaarda
Author

For anyone else looking for a workaround for file_out() with dynamic targets, this is the solution we came up with. Sometimes writing files is necessary, for example when the results are passed to a non-R program as input. The problem with the approach below is that the writing is done in sequence, not in parallel, which can take a long time with many dynamic targets. I think it would be a fantastic feature to be able to write files from within the dynamic target itself, but I understand that it may simply be too much work and not the intended design of drake! Regardless, here's the workaround:

library(drake)

dir.create("all_figures")

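make_lines <- function() {
  # Return the content for one file.
  "lines"
}

write_file <- function(content, file_names, dir) {
  # Write each element of content to its own file, one after another.
  for (i in seq_along(file_names)) {
    # Named `path` so that it does not shadow drake's file_out().
    path <- file.path(dir, paste0(file_names[i], ".txt"))
    writeLines(content[[i]], path)
  }
}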

plan <- drake_plan(
  file_names = c(1L, 2L, 3L, 4L),
  lines = target(
    make_lines(), dynamic = map(file_names)
  ),
  write_lines = write_file(lines, file_names, dir = file_out("all_figures"))
)

make(plan)
#> target file_names
#> dynamic lines
#> subtarget lines_0b3474bd
#> subtarget lines_b2a5c9b8
#> subtarget lines_71f311ad
#> subtarget lines_98cf3c11
#> aggregate lines
#> target write_lines

readLines("all_figures/1.txt")
#> [1] "lines"

make(plan)
#> All targets are already up to date.

@wlandau
Member

wlandau commented Feb 22, 2020

Update: I suggest taking a look at #1178.
