Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

revdep msmtools (substitute in j) multiple failures after names(.SD) fix #6033

Closed
tdhock opened this issue Mar 29, 2024 · 18 comments · Fixed by #6238
Closed

revdep msmtools (substitute in j) multiple failures after names(.SD) fix #6033

tdhock opened this issue Mar 29, 2024 · 18 comments · Fixed by #6238
Assignees
Labels
revdep Reverse dependencies
Milestone

Comments

@tdhock
Copy link
Member

tdhock commented Mar 29, 2024

since merging #4163 we get the following new check failures, not sure why, @ColeMiller1 @MichaelChirico could you please investigate?


* checking examples ... ERROR
Running examples in 'msmtools-Ex.R' failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: augment
> ### Title: A fast and general method for building augmented data
> ### Aliases: augment
> 
> ### ** Examples
> 
> # loading data
> data( hosp )
> 
> # 1.
> # augmenting hosp
> hosp_augmented = augment( data = hosp, data_key = subj, n_events = adm_number,
+                           pattern = label_3, t_start = dateIN, t_end = dateOUT,
+                           t_cens = dateCENS )
Warning in augment(data = hosp, data_key = subj, n_events = adm_number,  :
  no t_death has been passed. Assuming that dateCENS contains both censoring and death times
-------------------------------------
# # # # setting everything up # # # #
-------------------------------------
checking monotonicity of adm_number
Ok, adm_number is monotonic
---
checking label_3 and defining patterns
Ok, detected 3 values
---
augmenting data
Ok, data have been augmented
---
defining dimesions
Ok, dimensions computed
---
adding status flag
status flag has been added successfully 
---
adding numeric status flag
numeric status has been added successfully 
---
adding sequential status flag
sequential status flag has been added successfully 
---
adding variable augmented as new time variable
Error: LHS of := must be a symbol, or an atomic vector (column names or positions).
Execution halted
* checking for unstated dependencies in 'tests' ... OK
* checking tests ...
  Running 'testthat.R'
 ERROR
Running the tests in 'tests/testthat.R' failed.
Last 13 lines of output:
   7.   +-final[status == state[[1]], `:=`(substitute(t_augmented), get(t_start))]
   8.   \-data.table::`[.data.table`(...)
   9.     \-data.table:::stopf("LHS of := must be a symbol, or an atomic vector (column names or positions).")
  -- Error ('test_dataset_return.R:18:1'): (code run outside of `test_that()`) ---
  Error: LHS of := must be a symbol, or an atomic vector (column names or positions).
  Backtrace:
      x
   1. \-msmtools::augment(...) at test_dataset_return.R:18:1
   2.   +-final[status == state[[1]], `:=`(substitute(t_augmented), get(t_start))]
   3.   \-data.table::`[.data.table`(...)
   4.     \-data.table:::stopf("LHS of := must be a symbol, or an atomic vector (column names or positions).")
  
  [ FAIL 2 | WARN 3 | SKIP 0 | PASS 1 ]
  Error: Test failures
  Execution halted
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes ... OK
* checking re-building of vignette outputs ... ERROR
Error(s) in re-building vignettes:
  ...
--- re-building 'msmtools.Rmd' using rmarkdown_notangle

Quitting from lines 80-87 [running_augment] (msmtools.Rmd)
Error: processing vignette 'msmtools.Rmd' failed with diagnostics:
LHS of := must be a symbol, or an atomic vector (column names or positions).
--- failed re-building 'msmtools.Rmd'

SUMMARY: processing the following file failed:
  'msmtools.Rmd'

Error: Vignette re-building failed.
@tdhock tdhock added the revdep Reverse dependencies label Mar 29, 2024
@ColeMiller1
Copy link
Contributor

The error is reproducible with:

dt = data.table(a = 1)
my_col = "a"
dt[, substitute(my_col) := .(3)]
## Error: LHS of := must be a symbol, or an atomic vector (column names or positions).

AFAIU their code, the most straightforward fix would be to patch their code to use (my_col) instead of using substitute(my_col). By the time of evaluating, my_col should be a character string and no substitution would be necessary.

https://github.com/contefranz/msmtools/blob/6788523ad2d7f749771d2d290c64dfb05b533984/R/augment.R#L617

I will research more this weekend to figure out why the change caused the failures. Maybe there's an easy tweak on our side.

@ColeMiller1 ColeMiller1 self-assigned this Mar 29, 2024
@jangorecki
Copy link
Member

I would say to fix it in DT, those are only pkgs on CRAN, there are many more.

@MichaelChirico
Copy link
Member

@ColeMiller1
Copy link
Contributor

I installed 1.15.2 and see the same error when I run the reprex. The code change for #4163 was small - from
eval(lhs, parent.frame(), parent.frame() to eval(lhs, list(.SD = {things_for_names_SD_to_work}), parent.frame()). At the same time, msmtools works with 1.15.2 but not with dev. I am not sure why the change causes the error.

I plan to make a change to check to see if lhs is substitute() and eval once. I also plan on setting this up as a warning to allow for deprecation - I am not sure we would support this if there weren't existing examples in the wild.

Regarding those examples - there are only 2 unique examples that are uncommented and 2 unique examples that are commented out. There were some duplicates with forks / mirrors and then there were also some that were better use cases of substitute like substitute(dt[, (something) :=, ...]). I'll likely reach out to those users, but I want to get the PR filed first (likely tomorrow or Sunday).

@tdhock tdhock added this to the 1.16.0 milestone Jun 5, 2024
@ben-schwen
Copy link
Member

@ColeMiller1 any update on this?

@ColeMiller1
Copy link
Contributor

@ben-schwen working on it now. I'll get it ASAP.

@tdhock
Copy link
Member Author

tdhock commented Jul 16, 2024

I thought this was supposed to be fixed by #6238? It is still showing up on revdep checker, https://rcdata.nau.edu/genomic-ml/data.table-revdeps/analyze/2024-07-16/

@tdhock tdhock reopened this Jul 16, 2024
@ColeMiller1
Copy link
Contributor

Thanks @tdhock I'll look again - I thought my MRE adequately reproduced the issue but it appears not.

@ColeMiller1
Copy link
Contributor

ColeMiller1 commented Jul 16, 2024

@tdhock are we sure there is still a problem? On master, I manually ran the vignettes and examples and didn't see any data.table issues. When I try to build msmtools, I get the same first error as the log:

--- re-building 'msmtools.Rmd' using rmarkdown_notangle
Quitting from lines 300-318 [multistate_model] (msmtools.Rmd)
Error: processing vignette 'msmtools.Rmd' failed with diagnostics:
object 'augmented_int' not found
--- failed re-building 'msmtools.Rmd'

@MichaelChirico
Copy link
Member

I've got it narrowed down to this line:

https://github.com/contefranz/msmtools/blob/6788523ad2d7f749771d2d290c64dfb05b533984/R/augment.R#L625C5-L625C104

I'm really struggling to come up with an MRE though. On CRAN data.table, that step produces a column named augmented_int, on dev the column is named t_augmented_int. Every similar example I can come up with similarly produces t_augmented_int on both CRAN and dev.

@ColeMiller1

This comment was marked as resolved.

@MichaelChirico
Copy link
Member

Hmm, but the downstream code doesn't error in the := call -- a column is created, with the "wrong" name. Later code assuming the "right" name is what causes the error. So I'm not convinced that's an MRE.

@ColeMiller1

This comment was marked as outdated.

@ColeMiller1
Copy link
Contributor

I have looked some - could I make a PR on their side changing the line that errors? I've done it on my local and the vignette does not produce the error. Whatever made CRAN version of data.table work with this, I am not sure, but I believe this is more of an issue on their side.

# old
final[ , paste( substitute( t_augmented ), '_int', sep = '' ) := as.integer( get( t_augmented ) ) ]

# new
final[ , paste0(t_augmented, '_int') := as.integer( get( t_augmented ) ) ]

A few lines up from what you found, there is also this code. It doesn't seem to be related but if t_augmented was provided, then I think the variable would always be added as t_augmented_int due to as.character(substitute(t_augmented)).

  if ( missing( t_augmented ) ) {
    t_augmented = 'augmented'
  } else {
    t_augmented = as.character( substitute( t_augmented ) )
  }

@tdhock
Copy link
Member Author

tdhock commented Jul 17, 2024

yes please send them a PR. You can use this issue template, https://github.com/Rdatatable/data.table/wiki/revdep-issue-template
Does that change our plan for what changes we need to make to data.table?
Do we need to change the tests or code added in #6238?

@tdhock tdhock changed the title revdep msmtools multiple failures revdep msmtools (substitute in j) multiple failures after names(.SD) fix Jul 17, 2024
@ColeMiller1
Copy link
Contributor

No additional changes need to be made on our side. The #6238 is still needed because that was an earlier error we encountered with this.

I will make the PR over there this evening. Thanks for the link to the template.

@MichaelChirico
Copy link
Member

I am not sure, but I believe this is more of an issue on their side.

I agree, the code looks unusual to me. Should work fine without substitute(); env= can be used if really needed.

@MichaelChirico
Copy link
Member

Downstream PR is filed and I just gave notice of the plan to release in August; closing here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
revdep Reverse dependencies
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants