-
Notifications
You must be signed in to change notification settings - Fork 978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unclear class for argument and return value of DT #5559
Comments
I hesitate to comment here, because I'm a newbie to I wholeheartedly agree with #OfekShilon: I too "... don't expect methods of one class to operate on another - on the contrary, I expect the class system to help me by isolating their respective functionalities." I tentatively suggest that the I'd also suggest the Introduction to data.table vignette be amended, so that it include a warning against manipulating I hope it is clear that the above are only suggestions from a newbie to R who isn't your usual sort of newbie -- because I have extensive experience with other languages, I prefer to develop in a strongly-typed language, I am happy to "hack" throwaway code in a weakly-typed scripting language, and my only prior experience with statistical programming was way back in the early 1990s -- when I used S and emacs! I'm hugely impressed with the current status of R. After my initial foray into emacs/ess, RStudio was a huge relief... although I'm still struggling with the basics e.g. I can't be bothered to figure out how to ensure that Thanks to the lovely documentation by Hadley Wickham, and so many years of well-directed behind-the-scenes work by core R developers over the years, I'm now making great progress on my rev of vote_2.3.2! |
@cthombor If I understand your intention correctly there is already an attempt to give something like "a warning against manipulating data.table objects in interpretive environments which don't include the data.table library" - check cedta, which is checked in the beginning of most (all?) data.table method calls. This is a different matter but I do feel there is some commonality - maybe the underlying bias is to try and help everyone with everything they try to do with data.tables. Which is very generous, but often leads to fragile design. Anyway you'd probably draw better attention for your problem if you open a new issue with concrete examples. Can you give examples of silent promotions of data.frames to data.tables? |
@OfekShilon thanks for your response. I have no idea whether there are any silent promotions of data.frames to data.tables; but their possible existence is one of the (many!) gaps in my knowledge which makes it difficult for me to discover the root cause(s) of the invalid Another gap in my knowledge: does And yes I believe we're on the same page with the difficulties (both to users and to its devs!) of supporting use-cases for |
@OfekShilon thanks for the pointer to cedta. However I won't expect every base method which could copy (or corrupt) a The cheat sheet for DT lists only two ways to create a Warnings can be very confusing to a newbie like me -- indeed quite distressing, if it's unclear what one can do to reliably avoid triggering such warnings in the future -- but I doubt any of my stumblings with .SD syntax and semantics would have included an attempt to use a data.frame as a value for .SD, so deprecating this particular promotion with a warning wouldn't have bothered me. Hmmm... maybe a search for |
You can use the non-exported |
While experimenting with a potential fix for #5286 this test of
DT
surfaced:This line in
[.data.table
:Seems to indicate that the desired behavior is to return a data.table. However
ans
isn't the variable that is returned directly, later on the code restores the class to data.frame before returning:The returned object has class data.frame but still has data.table attributes (namely
.internal.selfref
).Before suggesting fixes I believe some design decisions need to be explicitly discussed:
DT
operates freely on non-data-tables. The only discussion I could find of it is this single comment, and I'm not sure this is the best design.[.data.table
, as the name implies, is a method of data.table. It might accidently work in some (even most) scenarios on data.frames, but is a hefty piece of code that was designed to use data.table attributes and trying to open it would mean an avalanche of bugs and a world of pain. In general, as a user I don't expect methods of one class to operate on another - on the contrary, I expect the class system to help me by isolating their respective functionalities.In particular, rejecting non-data.table's would make
shallow
adds .internal.selfref to an input data.frame #5286 (and probably others, both current and future) moot.[.data.table
and other package methods I'd suggest usingsetDF
orsetDT
instead of settingclass
directly - assetDF
cleans up data.table-only attributes, that manual-class-set misses.The text was updated successfully, but these errors were encountered: