Segfault on large tables with character columns for merge() or unique() when getDTthreads() >> 100 #5186

Closed
wwang-walleye opened this issue Sep 30, 2021 · 3 comments

Comments

@wwang-walleye

When running data.table on a machine with a lot of cores, we encountered a sneaky segfault issue that turned out to be threading related. The issue seems to crop up when calling unique or merging with a data.table that has a character column, unless OMP_THREAD_LIMIT is set appropriately low. I'm documenting below mostly in case some other poor soul encounters the same issue. This is trivial to work around otherwise, of course.
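For anyone who cannot set OMP_THREAD_LIMIT before R starts, a rough in-session equivalent is to cap the thread count right after loading the package. This is only a sketch: `cap_dt_threads` and the default of 64 are made up for illustration (the session below only shows that 16, 64 and 100 threads worked and 126 crashed on 1.14.2), while `getDTthreads()` and `setDTthreads()` are the regular data.table functions.

```r
library(data.table)

# Hypothetical helper: lower data.table's thread count if it exceeds a cap.
# The default cap of 64 is an assumption, not a documented safe limit.
cap_dt_threads <- function(max_threads = 64L) {
  current <- getDTthreads()
  if (current > max_threads) {
    setDTthreads(max_threads)
  }
  # Return the previous setting invisibly so it can be restored later.
  invisible(current)
}

old <- cap_dt_threads(64L)
getDTthreads()  # now at most 64
```

Calling `setDTthreads(old)` afterwards restores the earlier setting.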

CentOS Linux release 7.8.2003 (Core)
AMD EPYC 7H12 64-Core Processor
R 4.0.2
data.table 1.14.2

Reproducible code below:

[wwang@...]$ OMP_THREAD_LIMIT=128 R

R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(data.table)
data.table 1.14.2 using 128 threads (see ?getDTthreads).  Latest news: r-datatable.com
> n <- 1000
> m <- 200
> df <- data.table(a=as.character(rep(1:n,m)),y=sample(1:m, n*m, TRUE))
> setDTthreads(16)
> head(unique(df))
   a  y
1: 1 69
2: 2 50
3: 3 47
4: 4 53
5: 5 61
6: 6 21
> setDTthreads(64)
> head(unique(df))
   a  y
1: 1 69
2: 2 50
3: 3 47
4: 4 53
5: 5 61
6: 6 21
> setDTthreads(100)
> head(unique(df))
   a  y
1: 1 69
2: 2 50
3: 3 47
4: 4 53
5: 5 61
6: 6 21
> setDTthreads(126)
> head(unique(df))
*** Error in `/srv/walleye/local/lib64/R/bin/exec/R': double free or corruption (!prev): 0x00000000050effd0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81299)[0x7f71ff3eb299]
/srv/walleye/local/share/rlib/data.table/libs/datatable.so(+0x2150e)[0x7f71fa91c50e]
/srv/walleye/local/share/rlib/data.table/libs/datatable.so(forder+0x999)[0x7f71fa91fd79]

@ben-schwen
Member

Since unique.data.table calls forderv, this might be related to #5077. Could you check whether the error still appears on the latest dev version, 1.14.3 (you can update with update.dev.pkg())?
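A quick way to see whether the installed build predates the dev version mentioned above is to compare version numbers. This is a sketch, assuming 1.14.3 is the first version that includes the fix, as the comment suggests; the variable names are illustrative only.

```r
library(data.table)

# Compare the installed version against the dev version named in this thread.
installed <- packageVersion("data.table")
needs_update <- installed < "1.14.3"

if (needs_update) {
  message("data.table ", installed, " is older than 1.14.3; consider updating.")
  # Updating as suggested in the thread (left commented out: it installs a package):
  # data.table::update.dev.pkg()
} else {
  message("data.table ", installed, " is at or past 1.14.3.")
}
```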

@wwang-walleye
Author

wwang-walleye commented Sep 30, 2021

Nice! That newer dev version appears to solve this issue. Sorry about that!

OMP_THREAD_LIMIT=128 R

> library(data.table)
data.table 1.14.3 IN DEVELOPMENT built 2021-09-27 20:38:42 UTC; root using 128 threads (see ?getDTthreads).  Latest news: r-datatable.com
> n <- 1000
> m <- 200
> df <- data.table(a=as.character(rep(1:n,m)),y=sample(1:m, n*m, TRUE))
> setDTthreads(100)
> head(unique(df))
   a   y
1: 1  67
2: 2 154
3: 3  75
4: 4  87
5: 5  68
6: 6 152
> setDTthreads(128)
> head(unique(df))
   a   y
1: 1  67
2: 2 154
3: 3  75
4: 4  87
5: 5  68
6: 6 152

@MichaelChirico
Member

No worries, thanks for taking the effort to make it reproducible and for reporting! This is the third time the bug has shown up, in a few different contexts -- glad it's fixed & glad our users are putting data.table to such good stress tests 😎
