Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

always unlock jval #3791

Merged
merged 5 commits into from
Aug 27, 2019
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -234,7 +234,9 @@

30. `groupingsets` functions now properly handle alone special symbols when using an empty set to group by, [#3653](https://github.com/Rdatatable/data.table/issues/3653). Thanks to @Henrik-P for the report.

31. Some operations in `j` could leave the output with `.data.table.locked=TRUE`, preventing mutation of the table downstream, [#2245](https://github.com/Rdatatable/data.table/issues/2245). Thanks @grayskripko for raising.
31. A `data.table` created using `setDT()` on a `data.frame` containing identical columns referencing each other would cause `setkey()` to return incorrect results, [#3496](https://github.com/Rdatatable/data.table/issues/3496) and [#3766](https://github.com/Rdatatable/data.table/issues/3766). Thanks @kirillmayantsev and @alex46015 for reporting, and @jaapwalhout and @Atrebas for helping to debug and isolate the issue.

32. `x[, round(.SD, 1)]` and similar operations on the whole of `.SD` could return a locked result, incorrectly preventing `:=` on the result, [#2245](https://github.com/Rdatatable/data.table/issues/2245). Thanks @grayskripko for raising.

#### NOTES

Expand Down
7 changes: 2 additions & 5 deletions R/data.table.R
Original file line number Diff line number Diff line change
Expand Up @@ -1265,6 +1265,8 @@ replace_order = function(isub, verbose, env) {
}

jval = eval(jsub, SDenv, parent.frame())
.Call(Csetattrib, jval, '.data.table.locked', NULL) # in case jval inherits .SD's lock, #1341 #2245. Use .Call not setattr() to avoid bumping jval's MAYBE_REFERENCED.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh thanks. i was just scratching my head about what's going on here, would've taken me quite some time to figure it out 👍


# copy 'jval' when required
# More speedup - only check + copy if irows is NULL
# Temp fix for #921 - check address and copy *after* evaluating 'jval'
Expand All @@ -1280,11 +1282,6 @@ replace_order = function(isub, verbose, env) {
jval = copy(jval) # fix for #1212
}
}
# #2245 jval can be data.table even if is.null(irows); unlock then as well
if (is.data.table(jval)) {
setattr(jval, '.data.table.locked', NULL) # fix for #1341
if (!truelength(jval)) alloc.col(jval)
Copy link
Member

@mattdowle mattdowle Aug 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit (b13fdb4) also removed this alloc.col() which seems redundant. But it must have been covered since we have 100% R coverage, and the R coverage detects uncovered branches on the same line. So to find how it was covered I added it back with a stop() in commit b30e0cf. One test 2074.05 hit that stop. Which was a coverage test added fairly recently thanks to Michael. That coverage test does trigger the alloc.col but there isn't any need for the alloc.col in that test. So let's remove it for now since I don't see why the alloc.col is needed. Returning a data.table should already be over allocated which is why the coverage test needed to manually construct an invalid data.table to return.

}

if (!is.null(lhs)) {
# TODO?: use set() here now that it can add new columns. Then remove newnames and alloc logic above.
Expand Down
22 changes: 19 additions & 3 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -15771,9 +15771,25 @@ test(2086.04, DT[ , sum(a), keyby = list()], data.table(V1=55L))
test(2086.05, DT[ , sum(a), by = character()], data.table(V1=55L))
test(2086.06, DT[ , sum(a), keyby = character()], data.table(V1=55L))

## #2245 unset .data.table.locked even if is.null(irows)
x <- data.table(a=c(0.85, -0.38, 1.19), b=c(0.56, 0.63, -1.30))
test(2087, x[ , round(.SD, 1)][ , c := 8], data.table(a=c(.8, -.4, 1.2), b=c(.6,.6,-1.3), c=8))
# simple queries can create tables with columns sharing the same address, #3766
x = data.table(a=1L, b=c(1L, 4L, 2L, 3L), c=4:1)
test(2087.1, x[a == 1L, .(b, b2=b)][ , identical(address(b), address(b2))])
# setkey detects and copies shared address columns, #3496
x = data.frame(a=paste0(2:1), stringsAsFactors=FALSE)
x$b = x$a
setDT(x)
test(2087.2, setkey(x, a, verbose=TRUE), data.table(a=paste0(1:2), b=paste0(1:2), key="a"),
output='Found and copied 1 column with a shared memory address')
x = data.frame(a=paste0(2:1), stringsAsFactors=FALSE)
x$b = x$a
x$c = x$a
setDT(x)
test(2087.3, setkey(x, a, verbose=TRUE), data.table(a=paste0(1:2), b=paste0(1:2), c=paste0(1:2), key="a"),
output='Found and copied 2 columns with a shared memory address')

# clear '.data.table.locked' even when is.null(irows), #2245
x = data.table(a=c(0.85, -0.38, 1.19), b=c(0.56, 0.63, -1.30))
test(2088, x[, round(.SD, 1)][, c:=8.88], data.table(a=c(.8, -.4, 1.2), b=c(.6,.6,-1.3), c=8.88))


###################################
Expand Down
6 changes: 3 additions & 3 deletions src/assign.c
Original file line number Diff line number Diff line change
Expand Up @@ -473,12 +473,12 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values)
(TYPEOF(values)!=VECSXP && i>0) // assigning the same values to a second column. Have to ensure a copy #2540
) {
if (verbose) {
Rprintf("RHS for item %d has been duplicated because NAMED is %d, but then is being plonked. length(values)==%d; length(cols)==%d)\n",
i+1, NAMED(thisvalue), length(values), length(cols));
Rprintf("RHS for item %d has been duplicated because NAMED==%d MAYBE_SHARED==%d, but then is being plonked. length(values)==%d; length(cols)==%d)\n",
i+1, NAMED(thisvalue), MAYBE_SHARED(thisvalue), length(values), length(cols));
}
thisvalue = duplicate(thisvalue); // PROTECT not needed as assigned as element to protected list below.
} else {
if (verbose) Rprintf("Direct plonk of unnamed RHS, no copy.\n"); // e.g. DT[,a:=as.character(a)] as tested by 754.3
if (verbose) Rprintf("Direct plonk of unnamed RHS, no copy. NAMED==%d, MAYBE_SHARED==%d\n", NAMED(thisvalue), MAYBE_SHARED(thisvalue)); // e.g. DT[,a:=as.character(a)] as tested by 754.3
MichaelChirico marked this conversation as resolved.
Show resolved Hide resolved
}
SET_VECTOR_ELT(dt, coln, thisvalue); // plonk new column in as it's already the correct length
setAttrib(thisvalue, R_NamesSymbol, R_NilValue); // clear names such as DT[,a:=mapvector[a]]
Expand Down
You are viewing a condensed version of this merge commit. You can view the full changes here.