Helper function for creating file targets with multiple files #257
Comments
@tiernanmartin It's a good point. I was actually trying to solve this sort of problem once and for all in #232, but as I explain here, it is extremely difficult to make
EDIT: 2018-02-24: I modified the next bit to be FAQ-friendly. The "first solution" that @tiernanmartin refers to next is actually the solution for Solution for
|
Thanks for the explanation and for pointing me toward the In the meantime, I'll use the first approach you recommended. Quick question: I notice that the file targets in the first solution you demonstrated lack file extensions (e.g.,
Is there a way to convert those |
Sorry, I forgot about the way |
Is there a reason why a target cannot be a directory? In this example, I realized that the command
Having But when I tried implementing it I see the following error:
Reprex
library(drake)
library(sf)
spatial_data <- st_read(system.file("shape/nc.shp", package = "sf"))
plan <- drake_plan(
spatial_data = st_write(spatial_data, "spatial_data", driver = "ESRI Shapefile"),
strings_in_dots = "literals",
file_targets = TRUE
)
plan
## # A tibble: 1 x 2
## target command
## <chr> <chr>
## 1 'spatial_data' "st_write(spatial_data, \"spatial_data\", driver = \"ESR~
make(plan)
## cache C:/Users/UrbanDesigner/AppData/Local/Temp/Rtmp40h7BI/.drake
## connect 2 imports: plan, spatial_data
## connect 1 target: 'spatial_data'
## check 2 items: spatial_data, st_write
## check 1 item: 'spatial_data'
## target 'spatial_data'
## Writing layer `spatial_data' to data source `spatial_data' using driver `ESRI Shapefile'
## features: 100
## fields: 14
## geometry type: Multi Polygon
## Error: The specified pathname is not a file: spatial_data |
I believe the
library(digest)
> digest("~/projects/", file = TRUE)
Error: The specified pathname is not a file: /home/landau/projects/
> file.exists("~/projects")
[1] TRUE
Directory targets would be nice to have, but I do not think it is |
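As a rough illustration of why directory hashing needs extra work, here is a minimal sketch that fingerprints a directory by hashing each file inside it. It uses only base R (`tools::md5sum()`) rather than `digest`, and `hash_directory()` is a hypothetical helper made up for this example; it is not part of drake or digest, and it is not necessarily how either package would approach the problem.

```r
# Hypothetical helper (not part of drake or digest): fingerprint a directory
# by hashing every file inside it with base R's tools::md5sum().
hash_directory <- function(dir) {
  stopifnot(dir.exists(dir))
  # Relative paths in a locale-independent order, so the result does not
  # depend on the order in which the file system lists the files.
  files <- sort(
    list.files(dir, recursive = TRUE, all.files = TRUE),
    method = "radix"
  )
  hashes <- tools::md5sum(file.path(dir, files))
  # Write "relative/path hash" lines to a temporary manifest file and hash
  # the manifest, which yields a single fingerprint for the whole directory.
  manifest <- tempfile()
  on.exit(unlink(manifest))
  writeLines(paste(files, unname(hashes)), manifest)
  unname(tools::md5sum(manifest))
}

# Usage: the fingerprint changes when a file is added to the directory.
demo_dir <- tempfile("hash-demo-")
dir.create(demo_dir)
writeLines("a", file.path(demo_dir, "one.txt"))
before <- hash_directory(demo_dir)
writeLines("b", file.path(demo_dir, "two.txt"))
after <- hash_directory(demo_dir)
identical(before, after)
```

Note that this reads every file on every call, so it could be slow for large directories.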
Oops: forgot this issue was about more than just directory hashes. Reopening. |
I just updated #257 (comment) to be more FAQ-friendly, and this thread is now part of our automatically-generated FAQ. I think we can close. We should discuss potential further development on #12. |
FYI: the best practices guide now has detailed guidance on output file targets, including the main drawback and main alternative to the workaround we talked about earlier in the thread. |
@wlandau awesome work implementing this feature 🎉 You asked for a shapefile workflow so I did my best to put something together:
Drake Shapefile Example
# SETUP -------------------------------------------------------------------
library(tibble)
library(purrr)
library(sf)
library(drake) # devtools::install_github("ropensci/drake")
# PLAN --------------------------------------------------------------------
make_place <- function(Name, Latitude, Longitude){
tibble(Name = Name,
Latitude = Latitude,
Longitude = Longitude) %>%
st_as_sf(coords = c("Longitude", "Latitude")) %>%
st_set_crs(4326)
}
st_write_multiple <- function(..., file_outputs){
pwalk(list(...), st_write)
}
u_auckland_plan <- drake_plan(u_auckland = make_place(Name = "University of Auckland", Latitude = -36.8521369, Longitude = 174.7688785),
u_aukland_shapefile = st_write_multiple(list(u_auckland), dsn = file_out("u-auckland.shp"), driver = "ESRI Shapefile",delete_dsn=TRUE,
file_outputs = file_out(c("u-auckland.prj","u-auckland.shx","u-auckland.dbf"))),
strings_in_dots = "literals")
u_auckland_plan
make(u_auckland_plan)
# TEST --------------------------------------------------------------------
file.remove("u-auckland.shp")
make(u_auckland_plan)
file.remove("u-auckland.prj")
make(u_auckland_plan)
file.remove("u-auckland.dbf")
make(u_auckland_plan)
It is more complicated than I hoped it would be. The Perhaps someone else who is experimenting with using |
@tiernanmartin Thanks for the quick start on this example! I am optimistic.
library(drake)
pkgconfig::set_config("drake::strings_in_dots" = "literals")
u_auckland_plan <- drake_plan(
u_auckland = make_place(
Name = "University of Auckland",
Latitude = -36.8521369,
Longitude = 174.7688785
),
u_aukland_shapefile = {
file_out("u-auckland.prj", "u-auckland.shx", "u-auckland.dbf")
st_write(
obj = u_auckland,
dsn = file_out("u-auckland.shp"),
driver = "ESRI Shapefile",
delete_dsn = TRUE
)
}
)
I am not sure my changes are totally correct because
Warning: target u_aukland_shapefile warnings:
GDAL Error 1: u-auckland.shp does not appear to be a file or directory.
On the other hand, the four files do appear, including |
Ah, I didn't realize that commands were so flexible. Your code looks good to me! The GDAL error is annoying but not a deal breaker. The reason it shows up is that I set |
I just noticed there is a typo - here's the complete version with your suggested revisions:
library(tibble)
library(sf)
library(drake)
pkgconfig::set_config("drake::strings_in_dots" = "literals")
make_place <- function(Name, Latitude, Longitude){
tibble(Name = Name,
Latitude = Latitude,
Longitude = Longitude) %>%
st_as_sf(coords = c("Longitude", "Latitude")) %>%
st_set_crs(4326)
}
u_auckland_plan <- drake_plan(
u_auckland = make_place(
Name = "University of Auckland",
Latitude = -36.8521369,
Longitude = 174.7688785
),
u_auckland_shapefile = {
file_out("u-auckland.prj", "u-auckland.shx", "u-auckland.dbf")
st_write(
obj = u_auckland,
dsn = file_out("u-auckland.shp"),
driver = "ESRI Shapefile",
delete_dsn = TRUE
)
}
)
make(u_auckland_plan)
file.remove("u-auckland.shp")
make(u_auckland_plan)
file.remove("u-auckland.prj")
make(u_auckland_plan)
file.remove("u-auckland.dbf")
make(u_auckland_plan) |
Thanks, @tiernanmartin. This is nice inspiration for a chapter in the docs. |
FYI: effective #795, you can write |
I have a command that creates multiple files every time it runs. The command writes a spatial object in the shapefile format which results in the creation of four files:
All shapefiles need these four file types in order to work properly (actually, they need 3 of the 4 but that's irrelevant to this example).
This creates issues for any plan that includes this st_write() command. For instance, if I have the following plan:
The plan is only tracking one of the four necessary file targets (spatial_data.shp), and if I were to delete any of the untracked files and re-run the plan it would tell me that all targets are already up to date.
Could there be a function that allows users to create a list of file targets that come from a single command?
Thanks!
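To make the problem concrete, here is a minimal base-R sketch of the kind of check being asked for: given the path to a `.shp` file, list its companion files and report which ones are missing. Both `shapefile_parts()` and `missing_parts()` are hypothetical names invented for this illustration, not drake or sf functions, and the set of extensions is an assumption taken from the file names used elsewhere in this thread (`.shp`, `.shx`, `.dbf`, `.prj`).

```r
# Hypothetical helpers, not part of drake or sf.

# Return the paths of all files that are assumed to make up one shapefile.
shapefile_parts <- function(shp, exts = c("shp", "shx", "dbf", "prj")) {
  base <- tools::file_path_sans_ext(shp)
  paste0(base, ".", exts)
}

# Return the component files that do not currently exist on disk.
missing_parts <- function(shp) {
  parts <- shapefile_parts(shp)
  parts[!file.exists(parts)]
}

# Usage: create empty stand-in files in a temporary directory, delete one,
# and see that the deletion is detected.
shp <- file.path(tempdir(), "spatial_data.shp")
file.create(shapefile_parts(shp))
file.remove(sub("\\.shp$", ".dbf", shp))
basename(missing_parts(shp))
```

A check like this only detects missing files; it does not notice when a companion file has been modified.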