Keep IPCW results in the list column format produced by the predict() methods #937

Merged
merged 19 commits into main from nested-ipcw on Apr 1, 2023

Conversation

@topepo topepo (Member) commented Mar 22, 2023

The current code unnests the data. This is a problem if we are also using static predictions (such as the predicted event time).

This PR keeps the list format and adds the required columns to each tibble within the list.
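To illustrate the idea (the object and column names below are assumptions for the sketch, not the PR's actual code), keeping the list column means each nested prediction tibble can gain the extra columns with something like purrr::map():

```r
# Sketch only: add an IPCW weight column to each nested prediction tibble,
# leaving the list column structure returned by predict() intact.
# `preds`, `.pred`, `.pred_censored`, and `.weight_censored` are assumed names.
library(dplyr)
library(purrr)

preds <- preds |>
  mutate(
    .pred = map(
      .pred,
      \(x) mutate(x, .weight_censored = 1 / .pred_censored)
    )
  )
```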

@topepo topepo marked this pull request as ready for review March 22, 2023 21:59
@topepo topepo requested a review from hfrick March 22, 2023 21:59
topepo added a commit to tidymodels/extratests that referenced this pull request on Mar 23, 2023: (could be `NULL` but not `FALSE`)
@hfrick hfrick (Member) left a comment

I like that it just expects the output of predict(); that seems sensible. I fell down a bit of a rabbit hole while looking at this code, so this review also contains elements that are likely out of scope for this PR. I wanted to jot them down, though, so we don't lose them.

Specifically on the unnesting in this PR:

The .filter_eval_time() function isn't used anymore, which means infinite time points can get through to the prodlim function that returns the probabilities of being censored (which makes it error). I suppose add_graf_weights_vec() would be a good place to deal with the eval_time values.

Generally:

We should look at a consistent handling of "non-standard" values for eval_time, i.e., -Inf, negative values generally, Inf (and NA). Most predictions from censored are less restrictive than what's outlined in .filter_eval_time(), which I think is nice. survival and prodlim work (at least in parts) with negative values for time, so I don't think it's such a big philosophical stretch to get to infinite values. Regardless, if we decide to restrict the possible values for eval_time rather than pad the results appropriately, we should at least warn when we do so.
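A minimal sketch of the kind of guard being discussed (this is not the package's .filter_eval_time(); the function name and the exact policy are assumptions):

```r
# Sketch only: keep finite, non-negative eval_time values and warn when any
# are dropped before they reach the censoring-probability code (prodlim).
filter_eval_time_sketch <- function(eval_time) {
  keep <- is.finite(eval_time) & eval_time >= 0
  if (!all(keep)) {
    warning("Removed non-finite or negative `eval_time` values.", call. = FALSE)
  }
  eval_time[keep]
}
```

Whether negative values should be allowed at all is exactly the open question above; the sketch only shows the "restrict and warn" option.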

The other thing I noticed is that we are slightly modifying the censoring probabilities in two places:

  • predict.censoring_model_reverse_km() adds an epsilon to the probabilities; the amount is data-derived, not set via an argument

and then, after that:

  • trunc_probs() prevents very small probabilities, i.e., it raises them so they are at least trunc

So depending on how the data look inside predict.censoring_model_reverse_km(), we might be changing the probabilities by more than trunc, so we might want to make the amount an argument to predict.censoring_model_reverse_km() to give us control over the final/total amount.
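For reference, a rough sketch of the truncation being described (this is illustrative, not the actual trunc_probs(); the default floor is an assumption):

```r
# Sketch only: floor very small censoring probabilities so the inverse
# weights 1 / prob stay bounded.
trunc_probs_sketch <- function(prob, trunc = 0.05) {
  pmax(prob, trunc)
}
```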

@topepo topepo merged commit 2fa853e into main Apr 1, 2023
@topepo topepo deleted the nested-ipcw branch April 1, 2023 00:21
topepo added a commit to tidymodels/tune that referenced this pull request Apr 3, 2023
topepo added a commit to tidymodels/tune that referenced this pull request Apr 3, 2023
@github-actions

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 15, 2023