Update data.table dependency to version 1.14.1 once memory leakage problem is fixed #415

Closed
hongyuanjia opened this issue Mar 4, 2021 · 2 comments · Fixed by #542 or #543

Comments

@hongyuanjia
Owner

See Rdatatable/data.table#3292. Reading IDF files is a very common operation in eplusr, so this leak makes R's memory usage grow steadily with every read.

# Write a one-column, 1e6-row CSV to use as the test file
dt <- data.table::data.table(letter = sample(LETTERS, 1e6, TRUE))
data.table::fwrite(dt, "test.csv")

for (i in 1:10) {
    cat("Before: ", sep = "")
    print(pryr::mem_used())
    dt <- data.table::fread("test.csv", header = FALSE)
    rm(dt)
    gc()
    cat("After: ", sep = "")
    print(pryr::mem_used())
}
#> Before:
#> Registered S3 method overwritten by 'pryr':
#>   method      from
#>   print.bytes Rcpp
#> 53.5 MB
#> After: 46.8 MB
#> Before: 46.8 MB
#> After: 47.6 MB
#> Before: 47.6 MB
#> After: 48.4 MB
#> Before: 48.4 MB
#> After: 49.2 MB
#> Before: 49.2 MB
#> After: 50 MB
#> Before: 50 MB
#> After: 50.8 MB
#> Before: 50.8 MB
#> After: 51.6 MB
#> Before: 51.6 MB
#> After: 52.4 MB
#> Before: 52.4 MB
#> After: 53.2 MB
#> Before: 53.2 MB
#> After: 54 MB

for (i in 1:10) {
    cat("Before: ", sep = "")
    print(pryr::mem_used())
    df <- read.csv("test.csv", header = FALSE)
    rm(df)
    gc()
    cat("After: ", sep = "")
    print(pryr::mem_used())
}
#> Before: 54 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB
#> Before: 54.2 MB
#> After: 54.2 MB

Created on 2021-03-04 by the reprex package (v1.0.0)

@microsat2022

My data.table version was 1.14.2, but the issue does not seem to be solved:
Before: 322 MB
After: 333 MB
Before: 333 MB
After: 343 MB
Before: 343 MB
After: 354 MB
Before: 354 MB
After: 365 MB
Before: 365 MB
After: 375 MB
Before: 375 MB
After: 386 MB
Before: 386 MB
After: 396 MB
Before: 396 MB
After: 407 MB
Before: 407 MB
After: 418 MB
Before: 418 MB
After: 428 MB

@hongyuanjia
Owner Author

Yes, that's expected. Unfortunately, the PR Rdatatable/data.table#3292 had still not been merged in data.table v1.14.2.
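
For anyone who wants to re-check this against a newer data.table release, the loops above can be condensed into a small helper. This is only a sketch: `mem_growth` is a hypothetical name, not part of eplusr or data.table, and it assumes that the "used (Mb)" column reported by base R's `gc()` is a reasonable stand-in for `pryr::mem_used()`.

```r
# Hypothetical helper (not part of eplusr or data.table): estimate how much
# R's used memory grows, in MB, after calling a reader function n times.
mem_growth <- function(read_fun, path, n = 10) {
    # gc() returns a matrix; column 2 is the memory in use in Mb for
    # Ncells and Vcells. Calling gc() also triggers a collection first.
    used_mb <- function() sum(gc()[, 2])

    before <- used_mb()
    for (i in seq_len(n)) {
        x <- read_fun(path)
        rm(x)
    }
    used_mb() - before
}
```

Comparing `mem_growth(data.table::fread, "test.csv")` with `mem_growth(read.csv, "test.csv")` should show whether the growth is still specific to `fread()`, and `packageVersion("data.table")` reports which version was actually loaded.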
