Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interval support functions: add/remove impute (#379) #384

Open
wants to merge 42 commits into
base: main
Choose a base branch
from

Conversation

Gero1999
Copy link
Contributor

@Gero1999 Gero1999 commented Feb 5, 2025

Contributes to solve #379

New interval supporting functions acting to add/remove imputations from the impute column in PKNCAdata$intervals

Add imputation

interval_add_impute(data, target_impute, after = Inf, target_params = NULL, target_groups = NULL)

impute is the imputation character string to add, matching the behavior of PKNCAdata()
after follows similar behavior to the after argument of base::append(); 0 indicates it will be added as the first imputation method; Inf (or any number greater than the number of methods currently specified) indicates that it will be added as the last imputation method;
If there is already an imputation:

the imputation will be separated by commas (i.e. strsplit(current_impute, split = "[, ]"))
the new imputation will be added at the correct place (defined by after), and
The final imputation method will be put back together separated by commas (i.e. vapply(X = new_impute, FUN = paste, collapse = ",", FUN.VALUE = ""))

Remove imputation

interval_remove_impute(data, target_impute, target_params = NULL, target_groups = NULL)

split any imputation from the intervals and remove it
warn if the imputation was not found in any of the intervals

Note
For both functions optional arguments can be used to specify on which parameters (target_param, character vector) or at which specific interval rows (target_groups; list of column names and values) the action takes place

@Gero1999 Gero1999 changed the title Interval support functions (#379) Interval support functions: add/remove impute (#379) Feb 5, 2025
Copy link
Owner

@billdenney billdenney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not yet reviewed the tests since I requested some bigger changes in the other code.

#' @param data A PKNCAdata object containing the intervals and data components.
#' @param target_impute A character string specifying the imputation method to be removed.
#' @param target_params A character vector specifying the parameters to be targeted (optional). If missing, all TRUE in the intervals are taken.
#' @param target_groups A named list specifying the intervals to be targeted (optional). If missing, all relevant groups are considered.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the concept of how you've implemented this. Please make this into a data.frame so that it can work more simply with the way people are accustomed to working with intervals.

Copy link
Contributor Author

@Gero1999 Gero1999 Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey data will also be the intervals dataframe. I guess in this case we cannot guess the impute column always, sobwould need another arg

Indeed I think using a dataframe as an input would make a lot of sense also to match the intervals for the target_groups arg, which we can call instead something like target_intervals. Here is then an example:

If this is my initial interval table:

USUBJID ANALYTE cmax tmax aucinf.pred impute
001 Analyte1 TRUE TRUE TRUE start_predose

I would then like to remove only for cmax and tmax the imputation, while keeping it for the other parameters:

interval_remove_impute(PKNCAdata$intervals,
                       target_impute = "start_predose",
                       target_params = c("cmax", "tmax"),
                       target_intervals = data.frame(ANALYTE="Analyte1") )

This produces:

USUBJID ANALYTE cmax tmax aucinf.pred impute
001 Analyte1 FALSE FALSE TRUE start_predose
001 Analyte1 TRUE TRUE FALSE NA

That way we keep it intuitive: target_intervals indicates the rows to be considered (even if it can also contain parameter or impute specifications) while target_params the columns for which the removal applies.

@Gero1999
Copy link
Contributor Author

Gero1999 commented Feb 9, 2025

I think all the demanded changes were implemented. The tests should also be complete. I added additional arguments to control the specific behavior of the functions:

new_rows_after_original to control if the new intervals at the position of their respective original ones or at the end of the data frame.

allow_duplication for interval_add_impute to ask the user that if in the case of an impute method being already present in the interval they want to add it on top or not.

Feel free to express any concern, propose a renaming/removal of arguments or re-specify default behaviors.

@Gero1999 Gero1999 requested a review from billdenney February 9, 2025 16:57
@Gero1999 Gero1999 marked this pull request as ready for review February 9, 2025 20:44
Copy link
Owner

@billdenney billdenney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are making good progress! My kids are waking up as I completed the review of the R/ code, so I didn't have a chance to review the tests yet. Please take a look at these and let me know what you think.


#' @export
interval_remove_impute.PKNCAdata <- function(data, target_impute, target_params = NULL, target_groups = NULL, impute_column = NULL, new_rows_after_original = TRUE) {
if (is.null(impute_column) && !is.na(data$impute)) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For removing impute, please have this behave differently than adding impute. If impute is given via data$impute please only remove from there rather than creating a column.

@billdenney
Copy link
Owner

I think all the demanded changes were implemented. The tests should also be complete. I added additional arguments to control the specific behavior of the functions:

new_rows_after_original to control if the new intervals at the position of their respective original ones or at the end of the data frame.

allow_duplication for interval_add_impute to ask the user that if in the case of an impute method being already present in the interval they want to add it on top or not.

Feel free to express any concern, propose a renaming/removal of arguments or re-specify default behaviors.

I think that default behavior of the new rows showing up immediately after the original rows should be fine. I don't think that we need to control sort order for the users more than that (if they want to manage the sort order, then they can do that with other R tools rather than building it into PKNCA).

I don't see a use case where we would want duplicated imputation methods. Can you think of one? If so, we can keep those options. If not, we should not add the feature.

@Gero1999
Copy link
Contributor Author

I don't see a use case where we would want duplicated imputation methods. Can you think of one? If so, we can keep those options. If not, we should not add the feature.

@billdenney I had the same thought but wasn’t sure. However, what do you think the default behavior should be? My concern is that if the user specifies an existing impute for an integer position and we don’t allow duplication, it’s unclear whether or not the default behavior should replace the original impute—and if so, where (should the after consider the old impute or not?)

@billdenney
Copy link
Owner

I think that we should remove imputation first when duplicated and then put it at the location requested.

In general, the imputation method order should not cause conflict in real cases. The different imputation methods will modify different parts of the curve (usually the start and end), so there wouldn't be conflicts.

@Gero1999
Copy link
Contributor Author

Gero1999 commented Feb 12, 2025

Hopefully now it is better @billdenney ! Some comments on the implementation:

  1. Regarding this comment I am only changing directly the data$impute when the user does not specify on which intervals to perform the action (target_groups, target_params). Otherwise I create the column to only remove it on those.

Note: Won't it make sense to also have this behavior for interval_add_impute? Or is not possible to have PKNCAdata$impute = "start_conc0, start_predose"?

  1. I changed all dplyr lines for base R, knowing that is preferred. This also includes the tests.

  2. I refactored the code, although I did not use for loops. I think now all looks more clear, but f you feel that for this it is still needed I can refactor again without problem!

  3. I am considering that when parameters are not calculated they can be NAs/FALSE in the input. New rows in the output will only produce NAs.

  4. I used the function you provided me here as I liked it more. I just made some small changes to prevent problems with NA or empty vectors. Feel free to give more feedback if you can think of anything else

Let me know your thoughts, hopefully we are getting closer to the solution! 😉

@Gero1999 Gero1999 requested a review from billdenney February 13, 2025 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants