Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix names_sd rev dep issue 6033 #6238

Merged
merged 6 commits into from
Jul 13, 2024
Merged

Conversation

ColeMiller1
Copy link
Contributor

Closes #6033 .

The root cause was that the caller would make a call as dt[, substitute(col)]. This meant our jsub would be substitute(col) which would only evaluate to the name col instead of the value of col. I was unable to figure out why #4163 caused this issue to come to light.

This is a quick fix meant for follow-up. The plan would be to:

  1. Make PRs at rev deps that have similar issues. There are less than 10 that Michael spotted in GitHub.
  2. Upgrade to a warning in 1.17.0
  3. Upgrade to an error in 1.18.0.
  4. Remove logic in 1.19.0.

I will start looking at rev deps again to fix.

Copy link

github-actions bot commented Jul 11, 2024

Comparison Plot

Generated via commit 1024792

Download link for the artifact containing the test results: ↓ atime-results.zip

Time taken to finish the standard R installation steps: 12 minutes and 0 seconds

Time taken to run atime::atime_pkg on the tests: 3 minutes and 44 seconds

@tdhock
Copy link
Member

tdhock commented Jul 11, 2024

hi Cole thanks very much
this looks good to me, what do you think about this plan @MichaelChirico ?

R/data.table.R Outdated Show resolved Hide resolved
@MichaelChirico
Copy link
Member

I'm not clear what's being deprecated, exactly -- the use of j=substitute(...)? I don't immediately see why we'd want to disallow that.

The fix here looks good but I wonder if we need to change anything in the longer-term, i.e., it looks fine to maintain this per my current understanding.

@ColeMiller1
Copy link
Contributor Author

That's a good point. Maybe deprecation is not needed.

I was leaning that way because we need to add additional NSE to support a use-case that probably doesn't make sense. We don't see iris[, substitute(my_cols)] in the wild and while dt[, substitute(my_cols) := 3] used to work, my guess is that it was more undocumented behavior than something intended.

My inclination is that we should not support that use case - the caller should substitute prior to it getting to us. But I would also say it's easier just to keep the check to determine if the lhs is substitute and it's not really that bad.

@MichaelChirico
Copy link
Member

We don't see iris[, substitute(my_cols)] in the wild

Isn't that what caused the revdep breakage though? Think I'm missing something

#6033 revdep. Slowly deprecate in 1.17.0. Caller has given us `dt[, substitute(names(.SD))]` which means
# jsub is actually substitute(names(.SD)) instead of just names(.SD)
if (lhs %iscall% 'substitute')
lhs = eval(lhs, parent.frame(), parent.frame())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to double-eval()? Or can we do if (lhs %iscall% 'substitute') lhs = eval() else lhs = eval()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my understanding which is why I never understood how it used to work.

my_col = "a"
jsub = substitute(substitute(my_col))
eval(jsub, parent.frame(), parent.frame())
#  my_col
eval(eval(jsub))
# [1] "a"

At the same time, there's this head scratcher:

eval(jsub, list(my_col = "b"), parent.frame())
# [1] "b"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Head scratcher indeed... OK I'm more of the opinion we shouldn't allow j=substitute(...), indeed it's hard to reason about what jsub = substitute(subsitute(...)) might do. We can start a deprecation process & see if there's a good use case, my guess is, the end user would be better served by some other approach though.

@ColeMiller1 ColeMiller1 changed the title fix names_sd rev dev issue 6033 fix names_sd rev dep issue 6033 Jul 12, 2024
@ColeMiller1
Copy link
Contributor Author

We don't see iris[, substitute(my_cols)] in the wild

Isn't that what caused the revdep breakage though? Think I'm missing something

I was unclear, my bad. A lot of times we compare data.table to data.frame or other packages. data.frame doesn't work with substitute.

Regardless, it seems like we're on the same page. I'll make a follow-up issue after this is merged to start deprecating the allowance of j = substitute(...).

@MichaelChirico MichaelChirico merged commit e7e09d8 into master Jul 13, 2024
3 of 4 checks passed
@MichaelChirico MichaelChirico deleted the names_sd_revdep_followup branch July 13, 2024 01:13
@MichaelChirico MichaelChirico added this to the 1.16.0 milestone Jul 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

revdep msmtools (substitute in j) multiple failures after names(.SD) fix
4 participants