SDDF round 6 #49

Open
djhurio opened this issue Feb 7, 2021 · 5 comments

Comments

@djhurio
djhurio commented Feb 7, 2021

Something strange is going on with the SDDF files for round 6. Importing them used to work, but it no longer does.

For example, import_sddf_country(country = "Albania", rounds = 6) returns an error:

> import_sddf_country(country = "Albania", rounds = 6)
Downloading ESS6
  |=====================================================================================| 100%
Error in vapply(dir_download, list.files, pattern = paste0(format_ext,  : 
  values must be length 1,
 but FUN(X[[1]]) result is length 2

However, I do not see anything wrong with the data file itself, since downloading it and reading it directly works:

> download_sddf_country(country = "Albania", rounds = 6, output_dir = "~/data")
Downloading ESS6
  |=====================================================================================| 100%
All files saved to /home/djhurio/data
> haven::read_stata(file = "~/data/ESS_Albania/ESS6/ESS6_AL_SDDF.dta")
# A tibble: 1,201 x 6
   cntry  idno   psu samppoin stratify     prob
   <chr> <dbl> <dbl>    <dbl> <chr>       <dbl>
 1 AL        1    57       57 6        0.000730
 2 AL        2    80       80 9        0.000418
 3 AL        3   198      198 22       0.000193
 4 AL        4    57       57 6        0.000584
 5 AL        5    72       72 8        0.000427
 6 AL        6   185      185 20       0.0118  
 7 AL        7    49       49 5        0.000445
 8 AL        8    72       72 8        0.000427
 9 AL        9    49       49 5        0.000594
10 AL       10    55       55 6        0.000733
# … with 1,191 more rows

Thanks!

@cimentadaj
Contributor

Weird, I can download this fine with both the latest CRAN and GitHub versions.

library(essurvey)
set_email("email here")
res <- import_sddf_country(country = "Albania", rounds = 6)
#> Downloading ESS6

head(res)
#> # A tibble: 6 x 6
#>   cntry  idno   psu samppoin stratify     prob
#>   <chr> <dbl> <dbl>    <dbl> <chr>       <dbl>
#> 1 AL        1    57       57 6        0.000730
#> 2 AL        2    80       80 9        0.000418
#> 3 AL        3   198      198 22       0.000193
#> 4 AL        4    57       57 6        0.000584
#> 5 AL        5    72       72 8        0.000427
#> 6 AL        6   185      185 20       0.0118

Can you reinstall and try this?

@djhurio
Author

djhurio commented Feb 7, 2021

For some reason the import in the SPSS format is failing for Albania R6:

> import_sddf_country(country = "Albania", rounds = 6, format = "spss")
Downloading ESS6
  |======================================================================================| 100%
Error in foreign::read.spss(file = x, to.data.frame = TRUE, stringsAsFactors = FALSE,  : 
  error reading portable-file dictionary

Because the process stopped with an error, the downloaded files were left in the temp folder.

Now, if I switch to the Stata format, I get the error I originally posted:

> import_sddf_country(country = "Albania", rounds = 6, format = "stata")
Downloading ESS6
  |======================================================================================| 100%
Error in vapply(dir_download, list.files, pattern = paste0(format_ext,  : 
  values must be length 1,
 but FUN(X[[1]]) result is length 2

This is because the corresponding temp folder now contains the data files in both the SPSS and Stata formats.
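The mechanism behind that error message can be reproduced in isolation. This is only a sketch under assumptions: the folder name, the file names, and the pattern below are made up for illustration, and the actual pattern essurvey builds from format_ext is truncated in the output above, so it is not reproduced here. It only shows that vapply() with a length-1 template fails as soon as list.files() finds two matching files in one directory.

```r
# Hypothetical reproduction; the directory and file names are invented.
dir_download <- file.path(tempdir(), "vapply_demo")
dir.create(dir_download, showWarnings = FALSE)
file.create(file.path(dir_download, c("demo_a.dta", "demo_b.dta")))

# One directory, two matching files: the result of list.files() has
# length 2, which does not fit the character(1) template given to vapply().
res <- tryCatch(
  vapply(dir_download, list.files, pattern = "\\.dta$",
         FUN.VALUE = character(1)),
  error = function(e) conditionMessage(e)
)
print(res)

# Clean up the demo directory.
unlink(dir_download, recursive = TRUE)
```

With a single matching file in the directory, the same call returns that file name instead of failing.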

After cleaning the temp folder, the import in Stata format works:

> unlink(file.path(tempdir(), dir(tempdir())), recursive = TRUE)
> import_sddf_country(country = "Albania", rounds = 6, format = "stata")
Downloading ESS6
  |======================================================================================| 100%
# A tibble: 1,201 x 6
   cntry  idno   psu samppoin stratify     prob
   <chr> <dbl> <dbl>    <dbl> <chr>       <dbl>
 1 AL        1    57       57 6        0.000730
 2 AL        2    80       80 9        0.000418
 3 AL        3   198      198 22       0.000193
 4 AL        4    57       57 6        0.000584
 5 AL        5    72       72 8        0.000427
 6 AL        6   185      185 20       0.0118  
 7 AL        7    49       49 5        0.000445
 8 AL        8    72       72 8        0.000427
 9 AL        9    49       49 5        0.000594
10 AL       10    55       55 6        0.000733
# … with 1,191 more rows

So the problem is with data in the SPSS format.

@briatte
Contributor

briatte commented Apr 12, 2021

Do you think we should wrap the read.spss call in tryCatch(), so that we can clean up afterwards (as you did) if something goes wrong?
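A minimal sketch of that idea, assuming a hypothetical helper named read_with_cleanup() that is not part of essurvey. The reader argument stands in for whatever function actually reads the file, and returning NULL with a warning is just one possible way to handle the failure.

```r
# Hypothetical helper, not part of essurvey.
read_with_cleanup <- function(file, reader, download_dir) {
  # Remove the download directory whether or not reading succeeds,
  # so that leftover files cannot confuse a later call.
  on.exit(unlink(download_dir, recursive = TRUE), add = TRUE)
  tryCatch(
    reader(file),
    error = function(e) {
      warning("Could not read ", basename(file), ": ",
              conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Usage with a made-up file and a reader that always fails.
demo_dir <- file.path(tempdir(), "cleanup_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_file <- file.path(demo_dir, "broken.por")
file.create(demo_file)
failing_reader <- function(f) stop("error reading portable-file dictionary")
out <- suppressWarnings(read_with_cleanup(demo_file, failing_reader, demo_dir))
```

Using on.exit() for the cleanup means the directory is removed on success as well as on failure. If successful downloads should be kept, the unlink() call could instead go inside the error handler.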

@djhurio
Author

djhurio commented Apr 12, 2021

Yes, cleaning up the temp directory after an error would solve the problem.

@briatte
Contributor

briatte commented Apr 13, 2021

The PR above is a quick-and-dirty fix that returns NULL (with a warning message) instead of throwing an error.

FYI, the same file also fails with Stata 16:

. import spss using ~/Downloads/ESS6_AL_SDDF.spss/ESS6_AL_SDDF.por
error reading file
r(692);
