looping with lubridate and dplyr causes r to crash. #330

Closed
cam333 opened this issue Jun 26, 2015 · 10 comments

cam333 commented Jun 26, 2015

I can consistently get R to crash with the following code.

See the Stack Overflow question that led to this issue: http://stackoverflow.com/questions/30925313/dplyr-and-lubridate-chain-crashing-r?noredirect=1#comment49933044_30925313

library(dplyr)
library(lubridate)


dates <- data.frame(date = seq(ymd('2015-01-01'),ymd_hms('2019-06-20 23:00:00'),by = "hour"))

# Run this section a few times (simulate many calls in a shiny app)
for(i in 1:10)
  test_df <- dates %>%
    mutate(month = month(date),
           year = year(date))

Session info:

R version 3.1.3 (2015-03-09)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lubridate_1.3.3 dplyr_0.4.2    

loaded via a namespace (and not attached):
 [1] assertthat_0.1 DBI_0.3.1      digest_0.6.8   magrittr_1.5   memoise_0.2.1  parallel_3.1.3 plyr_1.8.3     R6_2.0.1       Rcpp_0.11.6   
[10] stringi_0.4-1  stringr_1.0.0  tools_3.1.3  

RStudio log file error:

LOGGED FROM: void {anonymous}::rCleanup(bool) C:\Users\Administrator\rstudio\src\cpp\session\SessionMain.cpp:2311
18 Jun 2015 21:20:22 [rsession-cmohan] ERROR r error 4 (R code execution error) [errormsg=Error: cannot allocate vector of size 4.0 Gb|||]; OCCURRED AT: rstudio::core::Error rstudio::r::exec::{anonymous}::evaluateExpressionsUnsafe(SEXP, SEXP, SEXPREC**, rstudio::r::sexp::Protect*) C:\Users\Administrator\rstudio\src\cpp\r\RExec.cpp:149
Member

vspinu commented Jun 26, 2015

I cannot reproduce. You are using an old version of lubridate. Please try the GitHub version, and please also try in plain R.

@bensoltoff

I can reproduce this even with development versions of lubridate and dplyr. Happens in both 32 and 64 bit R, as well as in RStudio and base R. I started having this problem a couple weeks ago, possibly after updating one of the packages. One thing I noticed is that rsession.exe stays constant around 17% CPU usage and memory usage increases at a steady rate of about 200 KB/sec. It takes a few minutes before the R session actually crashes on me.
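
The steady memory growth described above could also be logged from inside R rather than read off Task Manager. This is only a rough sketch: `mem_used_mb` is a hypothetical helper name, both endpoints are built with `ymd_hms()` so that `seq()` works on date-times regardless of lubridate version, and `gc()` only reports memory managed by R itself, so memory allocated elsewhere in compiled code may not show up here.

```r
library(dplyr)
library(lubridate)

# Same hourly sequence as in the original report; both endpoints are
# date-times (an assumption made for portability across lubridate versions).
dates <- data.frame(
  date = seq(ymd_hms("2015-01-01 00:00:00"),
             ymd_hms("2019-06-20 23:00:00"),
             by = "hour")
)

# Hypothetical helper: total memory currently used by R objects, in Mb.
# Column 2 of the gc() matrix is the "(Mb)" column for used memory.
mem_used_mb <- function() sum(gc()[, 2])

# Record memory use after each iteration of the reported loop.
mem_log <- numeric(10)
for (i in seq_along(mem_log)) {
  test_df <- dates %>%
    mutate(month = month(date),
           year = year(date))
  mem_log[i] <- mem_used_mb()
}

print(round(mem_log, 1))
```

If the logged values stay flat while the process memory shown by the operating system keeps climbing, that would point at memory held outside R's own heap.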

Session info:

R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lubridate_1.4.0.9500 dplyr_0.4.2.9000    

loaded via a namespace (and not attached):
[1] magrittr_1.5   R6_2.0.1       assertthat_0.1 parallel_3.2.1 DBI_0.3.1      tools_3.2.1    Rcpp_0.11.6    stringi_0.5-5 
[9] stringr_1.0.0 

@bensoltoff

Per a suggestion in the original Stack Overflow thread, I reinstalled Rcpp. Voila! It works!

As long as I don't close the R session. As soon as I restart it, the code causes a crash again.

@bensoltoff

Modifying the code to perform the operation using data.table eliminates the error for me.

library(lubridate)
library(dplyr)
library(data.table)

dates <- data.table(date = seq(ymd('2015-01-01'),ymd_hms('2019-06-20 23:00:00'),by = "hour"))

# Run this section a few times (simulate many calls in a shiny app)
for(i in 1:10)
  test_df <- dates %>% .[,`:=`(
    month = month(date),
    year = year(date)
  )]

Session info:

R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.4     dplyr_0.4.2.9000     lubridate_1.4.0.9500

loaded via a namespace (and not attached):
 [1] plyr_1.8.3     R6_2.0.1       assertthat_0.1 magrittr_1.5   parallel_3.2.1 DBI_0.3.1      tools_3.2.1    reshape2_1.4.1
 [9] Rcpp_0.11.6    stringi_0.5-5  stringr_1.0.0  chron_2.3-47 

Member

vspinu commented Jun 30, 2015

Could you please isolate the error?

Do I understand correctly that the crash happens in the for loop? If so, and if the problem is on the lubridate side, then either year or month is to blame. Could you please check whether it occurs with only year or only month?
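
One way to narrow this down might look like the sketch below. It is not a confirmed diagnosis, just the original loop split into three variants. The object names `test_month` and `test_year` are made up for illustration, and both endpoints use `ymd_hms()` so that `seq()` operates on date-times regardless of lubridate version.

```r
library(dplyr)
library(lubridate)

# Same hourly sequence as in the original report.
dates <- data.frame(
  date = seq(ymd_hms("2015-01-01 00:00:00"),
             ymd_hms("2019-06-20 23:00:00"),
             by = "hour")
)

# Variant 1: lubridate only, no dplyr involved.
for (i in 1:10) {
  m <- month(dates$date)
  y <- year(dates$date)
}

# Variant 2: dplyr::mutate() with month() only.
for (i in 1:10) {
  test_month <- dates %>% mutate(month = month(date))
}

# Variant 3: dplyr::mutate() with year() only.
for (i in 1:10) {
  test_year <- dates %>% mutate(year = year(date))
}
```

If variant 1 runs cleanly and both mutate() variants crash, the problem is more likely in how dplyr evaluates the call than in either lubridate function.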

@bensoltoff

I don't think the error is in lubridate. I downgraded dplyr to 0.4.1 and the code ran without a problem. Possibly related to this problem (Stack Overflow report)?

R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lubridate_1.4.0.9500 dplyr_0.4.1         

loaded via a namespace (and not attached):
[1] lazyeval_0.1.10 magrittr_1.5    assertthat_0.1  parallel_3.2.1  DBI_0.3.1       tools_3.2.1     Rcpp_0.11.6    
[8] stringi_0.5-5   stringr_1.0.0

@daltonhance

I second the conclusion that the problem is in dplyr. I found this thread because I also thought I had a problem with lubridate. I was getting weird hard crashes, and knitting a Word document in R Markdown consistently failed on the same dplyr chain containing a call to lubridate. Following bensoltoff to the issue report in dplyr led me to downgrade dplyr, which also fixed it for me.


tyokota commented Jul 17, 2015

I'm getting the same problem. It appears that lubridate and dplyr are frenemies at this point. I end up having to keep them separate: I can't include functions from both packages in the same pipe.

@adeldaoud

Member

hadley commented Aug 2, 2015

This is likely related to dplyr, not lubridate. Get the dev version of dplyr.

hadley closed this as completed Aug 2, 2015