Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird consequence (bug?) of using setattr() #1281

Closed
tdeenes opened this issue Aug 20, 2015 · 7 comments
Closed

Weird consequence (bug?) of using setattr() #1281

tdeenes opened this issue Aug 20, 2015 · 7 comments
Assignees
Labels
Milestone

Comments

@tdeenes
Copy link
Member

tdeenes commented Aug 20, 2015

TRUE and FALSE are not affected by setattr(), as expected:

library(data.table)
out <- TRUE
setattr(out, "THIS", "is as expected")
out
# [1] TRUE
# attr(,"THIS")
# [1] "is as expected"
TRUE
# [1] TRUE
!FALSE
# [1] TRUE

However, consider the following (open a new console to exclude the possibility of any interference):

library(data.table)
out <- 3 > 0
out
# [1] TRUE
setattr(out, "WHY", "does it affect !FALSE?")
out # gets extra attribute, fine
# [1] TRUE
# attr(,"WHY")
# [1] "does it affect !FALSE?"
TRUE # no extra attribute, as expected
# [1] TRUE
!FALSE # ???, if TRUE is the return value of the `!` negation operator, it is "polluted"
# [1] TRUE
# attr(,"WHY")
# [1] "does it affect !FALSE?"

NOTE: the whole problem disappears if we ask for an explicit copy before the setattr() call (just insert a line: out <- copy(out) before the setattr() command).

My session info (with the very latest github version of data.table; the same problem also occurs on the CRAN version):

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint LMDE

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.5

loaded via a namespace (and not attached):
[1] tools_3.2.1  chron_2.3-47
@tdeenes
Copy link
Member Author

tdeenes commented Aug 20, 2015

Oops, it must be related to the R internals. I just realized that the setattr() function in the bit package shows the very same behaviour. Any thoughts on this?

@eantonya
Copy link
Contributor

This is a really cool find :) It appears that TRUE does not have a fixed address, while !FALSE does (same goes for FALSE and !TRUE):

> address(TRUE)
[1] "00000000105DEAC8"
> address(TRUE)
[1] "00000000105E1370"
> address(!FALSE)
[1] "00000000003E0F20"
> address(!FALSE)
[1] "00000000003E0F20"

And that address of !FALSE is also the same as the address of any other logical that returns TRUE:

> address(3 > 0)
[1] "00000000003E0F20"

So that's what you're seeing, but I have no idea what the underlying design decisions are here.

@DavidArenburg
Copy link
Member

Maybe worth posting this in bugzilla

@arunsrinivasan
Copy link
Member

Not a bug in R. We've to account for length-1 vectors. I think this was a change in Rv3.1

@eantonya
Copy link
Contributor

@arunsrinivasan what does that mean?

@tdeenes
Copy link
Member Author

tdeenes commented Aug 20, 2015

Arun, you are right. The bug is restricted to length-1 vectors, for example calling setattr() on out <- c(1, 2) > 0 does not affect !FALSE or !c(FALSE, FALSE).

@arunsrinivasan arunsrinivasan added this to the v1.9.8 milestone Sep 22, 2015
@mattdowle
Copy link
Member

+1 @tdeenes nice find.
Yes, R internals added a global TRUE value that is reused for efficiency rather than create a new length-1 vector over and over again for the very often occurring value of a single TRUE. Similarly for FALSE. It's a nice and welcome change to R. When you type "TRUE" and "FALSE" at the prompt, though, R still allocates a new length-1 vector. That's just because they haven't got to that yet, if I remember the conversation correctly. Oddly, typing TRUE[1] and FALSE[1] return R's global addresses.
Aside: that R change broke data.table's := by reference on a single row table containing a logical column. But that was tested in the test suite and so it was caught and fixed that before R 3.1 was released.
At C level we can compare the input to setattr() to R's internal global TRUE and then i) return a copy with the attribute set and ii) issue a warning that it hasn't done it by reference.
Note that R core team plan to make the same efficiency change for common integer singletons too e.g. 0, 1, 2 and maybe up to 10. So we should add tests now in anticipation.
ScalarLogical(TRUE) at C level returns R's global TRUE value to compare to in setattr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants