How to debug a data loader #853
Comments
Was there anything in the logs of the preview (or build) script?
In what way did it fail? How did you run it locally? Did you try running it with Rscript in the same directory? (That’s all that Framework does, so it’s surprising that it works outside of Framework but not inside it.)
Well, things failed because I was reading some local configuration. I had to discover which directory Rscript was being executed from in order to specify the correct pathnames. I now have a working data loader that loads a subset of the airports in OurAirports, as in the code snippet below: The file
the relevant data loader
library(dplyr)
library(readr)
library(here)
# put a marker file `.here` in root of the framework example
moni <- readr::read_csv(here::here("docs", "data", "monitored-airports.csv")) |>
dplyr::pull(icao)
apts_url <- 'https://raw.githubusercontent.com/davidmegginson/ourairports-data/main/airports.csv'
ctrs_url <- 'https://raw.githubusercontent.com/davidmegginson/ourairports-data/main/countries.csv'
regs_url <- 'https://raw.githubusercontent.com/davidmegginson/ourairports-data/main/regions.csv'
ctrs <- readr::read_csv(ctrs_url, na = c(""))
regs <- readr::read_csv(regs_url, na = c(""))
apts <- readr::read_csv(apts_url, na = c("")) |>
dplyr::filter(ident %in% moni) |>
dplyr::left_join(ctrs, by = c("iso_country" = "code"), suffix = c("", "_country")) |>
dplyr::left_join(regs, by = c("iso_region" = "code"), suffix = c("", "_region")) |>
dplyr::select(
id,
ident,
code_icao = ident,
iata_code,
type,
name,
latitude = latitude_deg,
longitude = longitude_deg,
elevation = elevation_ft,
iso_country,
name_country,
iso_region,
name_region,
continent,
) |>
dplyr::mutate(
name_continent = dplyr::case_when(
continent == "AF" ~ "Africa",
continent == "AN" ~ "Antarctica",
continent == "AS" ~ "Asia",
continent == "EU" ~ "Europe",
continent == "NA" ~ "North America",
continent == "OC" ~ "Oceania",
continent == "SA" ~ "South America",
.default = NA_character_
)
)
apts |>
readr::write_csv(stdout())
A simpler R data loader, which retrieves Italian medium and large airports, is the following:
library(dplyr)
library(readr)
apts_url <- 'https://raw.githubusercontent.com/davidmegginson/ourairports-data/main/airports.csv'
ctrs_url <- 'https://raw.githubusercontent.com/davidmegginson/ourairports-data/main/countries.csv'
regs_url <- 'https://raw.githubusercontent.com/davidmegginson/ourairports-data/main/regions.csv'
ctrs <- readr::read_csv(ctrs_url, na = c(""))
regs <- readr::read_csv(regs_url, na = c(""))
apts <- readr::read_csv(apts_url, na = c("")) |>
dplyr::left_join(ctrs, by = c("iso_country" = "code"), suffix = c("", "_country")) |>
dplyr::left_join(regs, by = c("iso_region" = "code"), suffix = c("", "_region")) |>
dplyr::filter(iso_country == "IT", type %in% c("medium_airport", "large_airport")) |>
dplyr::select(
id,
ident,
code_icao = ident,
iata_code,
type,
name,
latitude = latitude_deg,
longitude = longitude_deg,
elevation = elevation_ft,
iso_country,
name_country,
iso_region,
name_region,
continent,
) |>
dplyr::mutate(
name_continent = dplyr::case_when(
continent == "AF" ~ "Africa",
continent == "AN" ~ "Antarctica",
continent == "AS" ~ "Asia",
continent == "EU" ~ "Europe",
continent == "NA" ~ "North America",
continent == "OC" ~ "Oceania",
continent == "SA" ~ "South America",
.default = NA_character_
)
)
apts |>
readr::write_csv(stdout())
The critical info for the documentation, at least for an R data loader, is where |
Thanks for all the additional context! Where do you expect (or want) |
The way it is done currently is ok, I think it should just be documented. Data loader implementers will need to have this knowledge to eventually navigate their filesystem and retrieve complementary data as in my first example (reading a local file).
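For anyone debugging a similar failure, here is a minimal sketch of how a loader could report where it is running. It assumes, as this thread suggests, that Framework treats only the loader's standard output as data, so diagnostics written to standard error should not end up in the CSV. `log_debug` is a hypothetical helper name, not part of Framework or of any package.

```r
# Hypothetical helper: message() writes to stderr, so stdout stays
# reserved for the data that the loader emits.
log_debug <- function(...) {
  message("[loader] ", ...)
}

# Report the directory the script is executed from and how it was invoked.
log_debug("working directory: ", getwd())
log_debug("invoked as: ", paste(commandArgs(trailingOnly = FALSE), collapse = " "))
```

Running the loader directly with Rscript from the project root and then from another directory should show how the reported working directory changes.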
It might make more sense to cd to the docs root rather than stay in the code root?
My first example using the here("docs", "data", "monitored-airports.csv") picks |
No, I don’t think we should do that. (By analogy with JavaScript, dependencies are installed into node_modules at the project root, not within the source root.)
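Assuming the loader keeps running from the project root, one way to read a complementary local file without the here package is to build the path from the working directory and fail with a message that names it. This is only a sketch: `resolve_input` is a hypothetical helper, and the `docs/data/monitored-airports.csv` location is taken from the first example in this thread.

```r
# Hypothetical helper: build a path under `root` (the working directory by
# default) and stop with a descriptive error if the file is missing.
resolve_input <- function(..., root = getwd()) {
  path <- file.path(root, ...)
  if (!file.exists(path)) {
    stop(
      "input file not found: ", path,
      " (working directory: ", getwd(), ")"
    )
  }
  path
}

# Possible usage inside a loader, assuming the file exists in the project:
# moni_path <- resolve_input("docs", "data", "monitored-airports.csv")
# moni <- readr::read_csv(moni_path) |> dplyr::pull(icao)
```

The error text goes to standard error, so a wrong working directory would show up in the preview or build logs rather than as an empty dataset.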
I've been having exactly the same issue this morning. So +1 for documentation of this :).
* document data loaders' cwd and import.meta.url
  closes #853
* format code
* Update loaders.md

Co-authored-by: Mike Bostock <mbostock@gmail.com>
I have tried to implement an R data loader.
It works as a script but it errors in Observable Framework.
I was looking for ways/suggestions on how to debug data loaders in the documentation but failed.
Maybe it could be a useful addition...