Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decrease dependency #618

Closed
donyunardi opened this issue Sep 28, 2022 · 26 comments · Fixed by #981
Closed

Decrease dependency #618

donyunardi opened this issue Sep 28, 2022 · 26 comments · Fixed by #981
Assignees
Labels

Comments

@donyunardi
Copy link
Contributor

donyunardi commented Sep 28, 2022

 Imports includes 26 non-default packages.
  Importing from so many packages makes the package vulnerable to any of
  them becoming unavailable.  Move as many as possible to Suggests and
  use conditionally.

Let us create an issue for the number of imports as cran will complain about it (similar to insightsengineering/teal.modules.general#456)

Originally posted by @mhallal1 in #602 (review)

@donyunardi donyunardi mentioned this issue Jan 13, 2024
39 tasks
@shajoezhu shajoezhu mentioned this issue Jan 15, 2024
38 tasks
@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

Yes, CRAN might complain but I don't think it's a deal breaker for the release on CRAN

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

And now it's 31 packages : )

   Imports includes 31 non-default packages.
   Importing from so many packages makes the package vulnerable to any of
   them becoming unavailable.  Move as many as possible to Suggests and
   use conditionally.

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

One of my propositions is to either go with assertthat or checkmate. This package currently uses both, which have the same purpose.

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

rmarkdown might be moved to Suggests. It is only used in vignettes and as one-liner in R/tm_t_pp_laboratory.R

rmarkdown::html_dependency_bootstrap("default")

Maybe this function can be copied to the package or we can put a requireNamespace('rmarkdown') stop to review this package is installed before this function is used.

@chlebowa
Copy link
Contributor

One of my propositions is to either go with assertthat or checkmate. This package currently uses both, which have the same purpose.

How about the other way around? Core packages use checkmate throughout.

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

We also have dplyr and magrittr in Imports. magrittr is only imported to use %>%. We have the same operator in dplyr, so maybe we can get rid of magrittr at all and only load this from dplyr.

@shajoezhu
Copy link
Contributor

we should get rid of "%>%" all together. But I think we should do it

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

ggrepel (an addition upon ggplot) is used in 2 functions: R/tm_g_pp_adverse_events.R and in R/tm_g_pp_patient_timeline.R - we can also move to Suggests and add a test that verifies if this package is installed for those functions to be used.

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

rlang used only in below cases

Image

We can create our own function for is_empty

> rlang::is_empty
function (x) 
length(x) == 0

and for .data and := we would need to decide if we can rewrite statements like below in a modern fashion that does not require := nor .data

dplyr::mutate(param_char := clean_description(.data[[param_char]]))

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

scales used in 3 places, maybe we can copy functions and provide custom substitution:

Image

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

oooh and a good one

styler used only in R/utils.R

#' teal.modules.clinical:::styled_expr(expr)
styled_expr <- function(expr) { # nolint
  styler::style_text(text = deparse(expr))
}

that is not used anywhere so can be removed and dependencies can be removed

Image

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

tidyr used only in R/tm_t_pp_laboratory.R in pivot_wider

Image

Maybe there is a way we can use base function for that stats::reshape is the less comfortable equivalent for tidyr::spread / tidyr::pivot_wider

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

vistime the same as ggrepel is only used in one function R/tm_g_pp_patient_timeline.R. Maybe we can add the same check that reviews if this package is installed with requireNamespace. Package normally would be in Suggests.

Image

@averissimo
Copy link
Contributor

Not only on this package, but we might consider re-exporting all functions from {logger} in {teal.logger} so that we can avoid having both in all packages.

(or just a subset of those)

@m7pr m7pr self-assigned this Jan 26, 2024
@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

Will try to prepare initial PR and let's see where we can go with the limitation

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

the tidyr::pivot_wider substitution can be done with base::reshape() function, check the example below

library(teal.modules.clinical)
data <- 
  tmc_ex_adlb %>%
  dplyr::select(ADY, PARAMCD, PARAM, AVAL, AVALU, ANRIND) %>%
  dplyr::arrange(ADY) %>%
  dplyr::select(-ADY) %>%
  dplyr::group_by(PARAMCD, PARAM) %>%
  dplyr::mutate(INDEX = dplyr::row_number()) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(aval_anrind = paste(AVAL, ANRIND)) %>%
  dplyr::select(-c(AVAL, ANRIND))

data_pivot <-
  data %>%
  tidyr::pivot_wider(names_from = INDEX, values_from = aval_anrind)

data_reshape <-
  data %>%
    as.data.frame() %>%
    reshape(
      direction = "wide", 
      idvar = c("PARAMCD", "PARAM", "AVALU"),
      v.names = "aval_anrind",
      timevar = "INDEX"
    )

colnames(data_reshape)[-c(1:3)] <- unique(data$INDEX)
attr(data_reshape, "reshape") <- NULL
attr(data_reshape, "reshapeWide") <- NULL

dim(data_pivot)
#> [1]    3 1003
dim(data_reshape)
#> [1]    3 1003

identical(
  as.data.frame(data_pivot)[1:3, 1:1003],
  data_reshape[1:3, 1:1003]
)
#> [1] TRUE

Created on 2024-01-26 with reprex v2.1.0

Final objects just differ in terms of order of remaining attributes (row.names, names and class)

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

rlang := and .data in cases like below (param_char is a variable holding column name in its value)

labor_table_raw <- labor_table_raw %>%
          dplyr::mutate(param_char := clean_description(.data[[param_char]]))

can be substituted with

labor_table_raw[[param_char]] <- clean_description(labor_table_raw[[param_char]])

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

The same for rlang := in the evaluation of the column which named is passed by a variable, can be substituted from

anl <- anl %>% dplyr::mutate(`:=`(hlt, as.character(hlt)))

to

anl[[hlt]] <- as.character(anl[[hlt]])

and we can get rid of rlang complitely

@m7pr
Copy link
Contributor

m7pr commented Jan 26, 2024

For asserthat package substitution with checkmate package - next sprint the whole core team will review all modules/functions in the package. We will split and analyze each module by one developer. I will ask the team to rewrite asserthat:: to checkmate:: equivalents during those verifications.

@m7pr
Copy link
Contributor

m7pr commented Jan 29, 2024

@shajoezhu @pawelru @donyunardi would the split of teal.modules.clinical into two separate packages teal.modules.clinical.tables and teal.modules.clinical.graphs be a good way to maintain dependencies? This package contains two groups of functions with 2 separate types of output. Maybe a split could be the way to trim down dependencies, or at least make it easier to maintain as this package has 30+ functions? +20 functions that reshape some input into teal.transform format

@m7pr
Copy link
Contributor

m7pr commented Jan 29, 2024

Hey, a summary of the work for the reduction of dependencies is below

Look at the stuff from the end-user perspective.
Usually environment set-up is the very first thing to do before moving on to the analysis. I would like it to be a finish-able action that is once set-up, you never come back to this again. You might of course modify it as the scope of the analysis changes but in general that should remain true.
If you do install package deps as you use a given package - then that won't be the case because (assuming extreme state) this cannot be completed. If you run yet unused function it might ask you for a new lib. I also doubt it is a pleasant user experience to be often halted to do dep install action.

@shajoezhu
Copy link
Contributor

shajoezhu commented Jan 29, 2024

@shajoezhu @pawelru @donyunardi would the split of teal.modules.clinical into two separate packages teal.modules.clinical.tables and teal.modules.clinical.graphs be a good way to maintain dependencies? This package contains two groups of functions with 2 separate types of output. Maybe a split could be the way to trim down dependencies, or at least make it easier to maintain as this package has 30+ functions? +20 functions that reshape some input into teal.transform format

hi @m7pr , i am so glad you raise this again. I considered this back in 2021 for both tern, and tmc. for tern we have started doing some. but not exactly the split of table and graphs.

for tmc, it is more complex, i would like to rethink, and consider merging teal.goshawk and teal.osperay before the new split

@m7pr m7pr mentioned this issue Jan 29, 2024
4 tasks
@Melkiades
Copy link
Contributor

@shajoezhu @pawelru @donyunardi would the split of teal.modules.clinical into two separate packages teal.modules.clinical.tables and teal.modules.clinical.graphs be a good way to maintain dependencies? This package contains two groups of functions with 2 separate types of output. Maybe a split could be the way to trim down dependencies, or at least make it easier to maintain as this package has 30+ functions? +20 functions that reshape some input into teal.transform format

This makes sense if they do not overlap. I believe there are a couple of tables with graphs, though; we need to check that those do not overlap with the function dep split. For the pkg names, we could make them more compact (e.g. tmc.tables or teal.clinical.tables).

@m7pr
Copy link
Contributor

m7pr commented Jan 31, 2024

I like teal.clinical.tables and teal.clinical.graphs. @shajoezhu any way you can point to the discussion about merging teal.goshawk and teal.osperay? Would you like to merge them together, or merge them into teal.modules.clinical?

@pawelru
Copy link
Contributor

pawelru commented Jan 31, 2024

My 2c in the discussion. I propose to:

  • not touch osprey and goshawk and leave them as is as a community packages that we only maintain (no active development etc.)
  • leave the topic of tmc split into a separate discussion / issue. As Joe and Davide pointed out - it's a little bit more complex and require more thinking. I feel it's very unlikely (if not impossible) to complete it before tmc release.

Let's not distract ourselves with these topics. I'm very much looking forward for this PR to happen and I feel that discussing these here might introduce significant delays.

@m7pr
Copy link
Contributor

m7pr commented Jan 31, 2024

Sure!

@m7pr m7pr closed this as completed in #981 Feb 1, 2024
m7pr added a commit that referenced this issue Feb 1, 2024
Close #618 

Moved from `Imports` to `Suggests`
- `styler` d42dbfc because of
#618 (comment)
and usage in vignettes 8a9a80e

Removed
- `magrittr` 772cf9d because of
#618 (comment)
- `tidyr` e585e0a because of
#618 (comment)
and
#618 (comment)
- `rlang` f2c0bfd
04537f1
a94b84b because of
#618 (comment)
and
#618 (comment)

---------

Signed-off-by: Marcin <133694481+m7pr@users.noreply.github.com>
Signed-off-by: Davide Garolini <davide.garolini@roche.com>
Co-authored-by: Davide Garolini <davide.garolini@roche.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants