Get dependencies of knitr reports automatically. #9
Maybe |
Interface: the
> plan(report.md = "report.Rmd",
    file_targets = TRUE, strings_in_dots = "filenames")
  target                  command
1 'report.md'             'report.Rmd'
then drake would do a preprocessing step to turn that into
  target                  command
1 'report.md'             drake::knit_drake('report.Rmd', report.Rmd_dependencies)
2 report.Rmd_dependencies c("this_target", "that_target", "'other_file'")
The preprocessing step should happen inside |
The most reliable way to get the code chunks will probably be to dig into the internals of |
At the very least, |
|
To be clear, I plan to use |
I forgot to mention: this is a funny situation where imports (e.g.
plan(report.md = knit('report.Rmd'), strings_in_dots = "filenames", file_targets = TRUE)
##   target      command
## 1 'report.md' knit('report.Rmd')
In the workflow graph, |
Almost forgot: need to exclude code chunks with 'eval = FALSE'. I wonder if CodeDepends knows how. |
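One way to leave out chunks with eval = FALSE before any dependency scan is to filter on the chunk headers. This is only a rough sketch in base R: `active_chunk_code()` and `fence` are hypothetical names (not part of drake, knitr, or CodeDepends), and it assumes that eval is written as a literal FALSE or F in the chunk header, so headers where eval is an expression are not handled.

```r
# Hypothetical helper, not part of drake, knitr, or CodeDepends.
# Three backticks, built programmatically to keep this example readable.
fence <- strrep("`", 3)

# Return the code lines of an R Markdown document, skipping every chunk
# whose header literally sets eval = FALSE (or eval = F).
active_chunk_code <- function(lines) {
  open_pattern <- paste0("^\\s*", fence, "+\\s*\\{[rR][ ,}]")
  close_pattern <- paste0("^\\s*", fence, "+\\s*$")
  code <- character(0)
  in_chunk <- FALSE
  keep <- FALSE
  for (line in lines) {
    if (!in_chunk && grepl(open_pattern, line, perl = TRUE)) {
      # Chunk header: decide whether the chunk would be evaluated.
      in_chunk <- TRUE
      keep <- !grepl("eval\\s*=\\s*(FALSE|F)\\b", line, perl = TRUE)
    } else if (in_chunk && grepl(close_pattern, line, perl = TRUE)) {
      in_chunk <- FALSE
    } else if (in_chunk && keep) {
      code <- c(code, line)
    }
  }
  code
}

doc <- c(
  "Intro text",
  paste0(fence, "{r a}"), "x <- readd(large)", fence,
  paste0(fence, "{r b, eval = FALSE}"), "y <- readd(small)", fence
)
active_chunk_code(doc) # "x <- readd(large)"
```

The surviving lines could then be passed to parse(text = ...) for whatever dependency detection is used afterwards.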
As I said, we may be able to solve this issue with CodeDepends. From the code chunks of dynamic reports, I want to
Strings are candidates for file targets, and inputs are candidates for other targets. For non-file targets, it is vitally important to only detect inputs to
library(drake) # Should detect nothing.
var <- 10 # Should detect nothing
# Strings: "could_be_a_file", "so_could_this"
var2 <- list(17, var, "could_be_a_file", "so_could_this")
# Inputs: large.
f(readd(large) + var)
# Strings: "small".
print(drake::readd(target = "small", character_only = TRUE, path = "subdir") + var)
# Strings: "small".
print(drake:::readd("small", character_only = TRUE, cache = NULL) + var)
# Inputs: regression2_large, small. Strings: large
f(loadd(regression2_large, small, list = "large"), var)
First attempt at an input collector:
library(CodeDepends)
library(magrittr) # Provides %>%
drake_handler <- function(e, collector, ...){
  args <- as.list(e)[-1] # Arguments to readd() or loadd()
  # Arguments passed to ... in loadd()
  include <- !nchar(names(args))
  if(!any(include)){
    dots <- args[[1]]
  } else {
    dots <- args[include]
  }
  candidates <- c(args[["target"]], dots)
  collector$vars(as.character(args), input = TRUE)
}
string_handler <- function(name){
  browser()
  strings <<- c(strings, name)
}
col <- inputCollector(
  readd = drake_handler,
  loadd = drake_handler,
  string = string_handler
)
x <- readScript("report.Rmd") %>%
  getInputs(collector = col)
So far, there are too many inputs in too many places, particularly |
A first attempt without
# From https://github.com/duncantl/CodeDepends/blob/master/R/sweave.R#L15
get_tangled_frags <- function(doc, txt = readLines(doc)) {
  in.con <- textConnection(txt)
  out.con <- textConnection("bob", "w", local = TRUE)
  on.exit({
    close(in.con)
    close(out.con)
  })
  knitr::knit(in.con, output = out.con, tangle = TRUE, quiet = TRUE)
  code <- textConnectionValue(out.con)
  parse(text = code)
}
wide_deparse <- function(x){
  paste(deparse(x), collapse = "")
}
library(magrittr)
find_targets <- function(expr, targets = character(0)){
  if (is.function(expr)){
    return(find_targets(body(expr), targets = targets))
  } else if (is.call(expr) & length(this_call <- as.list(expr)) > 1){
    if(deparse(this_call[[1]]) %in% c("readd", "loadd")){
      symbols <- Filter(this_call[-1], f = is.symbol)
      targets <- c(targets, deparse(symbols)) %>%
        unlist %>%
        unique
    }
    deepen_search <- sapply(this_call, function(x){
      grepl("readd|loadd", wide_deparse(x))
    })
    lapply(this_call[deepen_search], find_targets, targets = targets)
  } else if (is.recursive(expr)){
    v <- lapply(as.list(expr), find_targets, targets = targets)
    targets <- unique(c(targets, unlist(v)))
  }
  targets
}
x <- get_tangled_frags("test.Rmd")
find_targets(x) # incorrectly returns `character(0)`
With
|
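For comparison, here is a minimal sketch of a recursive walker over parsed code that collects the arguments of readd() and loadd() calls. It is not the code that ended up in drake. `find_knitr_targets()` is a hypothetical name, and the sketch makes simplifying assumptions: it only looks at unnamed arguments plus the named `target` and `list` arguments, it ignores anything that is not a bare symbol or a single string (so `list = c("a", "b")` is skipped), and it does not separate symbols from strings.

```r
# Hypothetical helper, not drake's actual implementation.
# Walk parsed R code and collect the symbols and strings passed to
# readd() or loadd(), including drake::readd() and drake:::readd().
find_knitr_targets <- function(expr) {
  found <- character(0)
  if (is.call(expr)) {
    fun <- expr[[1]]
    # Unwrap pkg::fun and pkg:::fun so that only the function name remains.
    if (is.call(fun) && length(fun) == 3 && is.symbol(fun[[1]]) &&
        as.character(fun[[1]]) %in% c("::", ":::")) {
      fun <- fun[[3]]
    }
    if (is.symbol(fun) && as.character(fun) %in% c("readd", "loadd")) {
      args <- as.list(expr)[-1]
      arg_names <- names(args)
      if (is.null(arg_names)) {
        arg_names <- rep("", length(args))
      }
      # Unnamed arguments plus the named `target` and `list` arguments.
      candidates <- args[arg_names %in% c("", "target", "list")]
      keep <- vapply(
        candidates,
        function(x) is.symbol(x) || is.character(x),
        logical(1)
      )
      found <- vapply(
        candidates[keep], as.character, character(1), USE.NAMES = FALSE
      )
    }
  }
  # Recurse into nested calls, expression vectors, and function formals.
  if (is.call(expr) || is.expression(expr) || is.pairlist(expr)) {
    nested <- lapply(as.list(expr), find_knitr_targets)
    found <- c(found, unlist(nested, use.names = FALSE))
  }
  unique(found[nzchar(found)])
}

code <- parse(text = c(
  "var <- 10",
  "f(readd(large) + var)",
  "f(loadd(regression2_large, small, list = \"large\"), var)"
))
find_knitr_targets(code) # "large" "regression2_large" "small"
```

Because every nested call is visited, there is no need to deparse subexpressions and grep for "readd|loadd" to decide where to search.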
I now have some code in the |
An easier approach than the current code:
|
The master branch now has a fix. Now, if |
Sorry it took me a bit to look at this. One thing that jumps out is here:
So you created candidates, but you're using the full args when you register variables with the walker. It's hard to be more specific without the Rmd you want to operate on so that I can actually play with it. |
Thank you for your input, @gmbecker! I think that mistake is definitely part of the problem. However, I am still incorrectly seeing
Just so you know, I am not in a rush. I do want to use this example to get better at
library(CodeDepends)
library(magrittr)
drake_handler <- function(e, collector, ...){
  args <- as.list(e)[-1] # Arguments to readd() or loadd()
  # Arguments passed to ... in loadd()
  include <- !nchar(names(args))
  if(!length(include)){
    dots <- args[[1]]
  } else if(!any(include)){
    dots <- args[[1]]
  } else {
    dots <- args[include]
  }
  candidates <- as.character(c(args[["target"]], dots))
  collector$vars(candidates, input = TRUE)
}
string_handler <- function(name){
  strings <<- c(strings, name)
}
col <- inputCollector(
  readd = drake_handler,
  loadd = drake_handler,
  string = string_handler
)
x <- readScript("report.Rmd")
getInputs(x, collector = col)
|
It should be possible to detect reports by their .Rmd or .Rnw file extension and then scan their code chunks for calls to readd() or loadd(). But this would miss external files. On the other hand, scanning for any mention of any target in any code chunk might be too aggressive.
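The file extension check could look something like the sketch below. `is_knitr_source()` is a hypothetical name, and stripping the surrounding single quotes assumes file targets are written the way they appear in the plans in this thread (e.g. 'report.Rmd').

```r
# Hypothetical helper: TRUE for paths ending in .Rmd or .Rnw,
# ignoring case and any surrounding single quotes.
is_knitr_source <- function(path) {
  path <- gsub("^'|'$", "", path)
  grepl("\\.(rmd|rnw)$", path, ignore.case = TRUE)
}

is_knitr_source(c("'report.Rmd'", "report.md", "paper.Rnw")) # TRUE FALSE TRUE
```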