Simplify package #498

gogonzo · 2023-12-13T08:15:56Z

WIP

fixes

Remove access to data from UI modules #132 (together with the same branch in teal)
Fix reactive link between filter panel and modules teal#903
Implement bindCache wherever possible to avoid unnecessary reactivity on "unfiltered" data. #62

This is a proposition to simplify FilteredData as much as possible (for me today). Some information are written in vignette for developers and in R/methods-documentation.R file.

Changes:

removed FilteredDataset and FilterStates
FIlteredData is the class to keep and manage all states
Instead of making rigid class structures, I propose little dependency injection where different datasets can be handled by set of S3 methods (get_code, get_slice_vector, module_add, module_active, module_overview etc.). Thanks to this change there is a possibility to make arbitrary nested objects structure. For example SummarizedExperiment can be standalone dataset and could be nested inside MultiAssayExperiment object. Same could apply to lists, "spatial" objects and other data classes composed of other smalled structures. This can support any nesting dept
Approach also allows to override any S3 method and write own - for example to change how ui looks like or replace dplyr::filter call with data.frame. Theoretically, we should export only xyz.default method so everyone can override methods for specific classes (for example data.frame). If we register S3 method for data.frame it will be not possible to override data.frame method. Confirmed here Change dispatch method to S3 instead of custom (target: @132_filter_panel_ui@main) #501

testing app

options(teal.log_level = "TRACE")
library(shiny)
library(cond)
data(airway)
pkgload::load_all("teal.slice")
pkgload::load_all("teal")

data <- teal_data() |>
  within({
    set.seed(1)
    library(scda)
    library(MultiAssayExperiment)
    data(miniACC, envir = environment())
    data(airway, envir = environment())
    ADSL <- synthetic_cdisc_data("latest")$adsl
    ADSL$categorical <- sample(letters[1:3], size = nrow(ADSL), replace = TRUE, prob = c(.1, .3, .6))
    ADTTE <- synthetic_cdisc_data("latest")$adtte
    ADRS <- synthetic_cdisc_data("latest")$adrs
  })

datanames(data) <- c("ADSL", "ADTTE", "ADRS", "miniACC", "airway")
join_keys(data) <- default_cdisc_join_keys[c("ADSL", "ADTTE", "ADRS")]

default_filters <- teal.slice::teal_slices(
  teal_slice(dataname = "ADSL", varname = "categorical", selected = c("a"), id = "categorical", multiple = FALSE),
  teal_slice(id = "SE", title = "Safety-Evaluable", dataname = "ADSL", expr = "SAFFL == 'Y'"),
  teal_slice(dataname = "miniACC", varname = "years_to_birth", selected = c(30, 50), keep_na = TRUE, keep_inf = FALSE),
  teal_slice(dataname = "miniACC", varname = "gender", selected = "female", keep_na = TRUE),
  teal_slice(dataname = "miniACC", varname = "ARRAY_TYPE", selected = "protein_level", experiment = "RPPAArray", arg = "subset"),
  teal_slice(dataname = "miniACC", varname = "Genes", selected = "G6PD", experiment = "RPPAArray", arg = "subset"),
  count_type = "none"
)

app <- init(
  data = data,
  modules = modules(
    modules(
      label = "tab1",
      example_module("funny"),
      example_module("funny2", datanames = "ADTTE") # will limit datanames to ADTTE and ADSL (parent)
    )
  ),
  filter = default_filters
)

runApp(app)

replace classes

pawelru · 2023-12-20T10:50:38Z

R/FilterState.R

@@ -575,15 +562,17 @@ FilterState <- R6::R6Class( # nolint
    # @description
    # Return variable name prefixed by `dataname` to be evaluated as extracted object,
    # for example `data$var`
+    # @param dataname `character(1)` name of a dataset
+    # @param extract_type `character(1)` type of extraction, one of `character(0)`, `"list"`, `"matrix"`


I kind of dislike mixing character(0) with character(1). Usually there is a support of "none" value instead of character(0).

If you do the above you can then very easily validate values with match.arg(). Current implementation is rather: "list", "matrix" and everything else. (don't know if this is checked elsewhere - I'm looking at this method only)

I don't know exactly how to call each way of extraction:

character(0) means that variables in a subset call are not prefixed (data |> subset(x == "a"))

list means that variables are extracted by $ in a subset call (data |> subsetByColData(data$x == "a"))

matrix means that variables are extracted by [[ in a subset call (data |> subset(data[, "x"] == "a"))

Somebody has better terms for above cases?

Suggested change

# @param extract_type `character(1)` type of extraction, one of `character(0)`, `"list"`, `"matrix"`

# @param extract_type `character(1)` type of extraction, one of `"nse"`, `"list"`, `"matrix"`

Perhaps change the name of the argument to var_prefix or something like that and then c("none", "list", "matrix") will be proper options?

Perhaps change the name of the argument to var_prefix or something like that and than c("none", "list", "matrix") will be proper options?

Yes exactly.

I have one more thought but I'm not yet super convinced to this. I'm interested in hearing your thoughts on this.
Instead of a string which is then matched to sprintf pattern, we can allow providing sprintf pattern or the whole sprintf call.
Taking this idea further - we are actually talking about an expression and not message / warning / anything else that requires a character type of data. Hance, instead of a sptrintf that produced a char which is then parsed to expression -> move to substitute that returns expression directly without string based step. I mean - operate on the right data type.

Note this is an internal class and the user never provides anything, unless you count senior classes as users.

We have gone through operating on strings and language objects throughout teal.slice back in Spring. Trust me, the code to build this as a string is much, much simpler than that required to build expression. This is currently (I believe) the only departure from building expressions from language objects in the package.

I have one more thought but I'm not yet super convinced to this. I'm interested in hearing your thoughts on this.
Instead of a string which is then matched to sprintf pattern, we can allow providing sprintf pattern or the whole sprintf call.

Theoretically you can just provide get_call(varname = "dataname$varname"), but this doesn't make sense, because dataname and varname is an information kept inside of FilterState class. When gathering calls from all FilterState object, parent (method which gathers) doesn't know anything about varname.

Providing sprintf pattern is not enough neither, because:

sprintf("%s$%s", dataname, varname)

sprintf('%s[, "%s"]', dataname, varname)

`sprintf("%s", varname)

As you can see, dataname in first two is a first argument and varname second. In case of (3) dataname is not an argument for sprintf at all and varname is second.

Pattern is hard do handle because different arguments are taken in all cases.

Just so this thread doesn't get buried and forgotten, I suggest to change extract_type to
var_prefix = c("none", "list", "matrix")
which could then be handled with
var_prefix <- match.arg(var_prefix)
and
ans <- switch(var_prefix, ...)

pawelru · 2023-12-20T10:59:16Z

R/FilteredData-utils.R

@@ -9,7 +9,7 @@
 #' library(shiny)
 #' datasets <- teal.slice::init_filtered_data(list(iris = iris, mtcars = mtcars))
 #' @export
-init_filtered_data <- function(x, join_keys = teal.data::join_keys(), code, check) { # nolint
+init_filtered_data <- function(x = list(), join_keys = teal.data::join_keys(), code, check) { # nolint


I don't think x should have a default value. Usually a default value is a signal that a given argument is not obligatory which is not the case.

If we are talking about this line - there are code and check argumnets that are both deprecated and without a default value. I think we can safely add default values. Note that missing() used below only checks if a value was explicitely specified in the function call.

foo <- \(x = NULL) {`if`(missing(x), 0, 1)} foo() #> [1] 0 foo(10) #> [1] 1

^{Created on 2023-12-20 with reprex v2.0.2}

pawelru · 2023-12-20T11:21:36Z

Can you please provide some more explanation behind new dispatching? I can't stop having a feeling that it's a step back from what we are having currently.

You raised a data.table case which I think should be addressed by a new S3 method for data.table instead of overwriting existing one for data.frame. That should be true for anything else that extends data.frame. I personally don't follow the requirement for nesting in MAE and SE.

chlebowa · 2023-12-20T11:37:53Z

If we register S3 method for data.frame it will be not possible to override data.frame method.

This is debatable and under investigation.

Can you please provide some more explanation behind new dispatching? I can't stop having a feeling that it's a step back from what we are having currently.

Rather than keeping R6 classes for every supported class of dataset, we have FilteredData calling generic functions on datasets (e.g., ui_active, get_code) and S3 methods to support particular classes. It is much easier to add a set of S3 methods than a set of R6 classes and there will be instructions specifying what is strictly required.

You raised a data.table case which I think should be addressed by a new S3 method for data.table instead of overwriting existing one for data.frame.

I think what @gogonzo was not talking about a dataset being a data.table but rather: we use dplyr::filter to apply filtering to data.frame datasets, which is encoded in the data.frame method here, but if someone wants to use data.table filtering instead, one can define their own method for data.frame to do just that.

I personally don't follow the requirement for nesting in MAE and SE.

MAE is composed of experiments and therefore is a nested structure. Experiments can be data.frame, SummarizedExperiment and other classes. This nesting has always been an issue, the difference here is in how it is handled.

By creating standalone methods (method suites, actually) for any class that we expect to be an experiment in MAE we allow them to be standalone datasets themselves. The methods for MAE simply dispatch to methods relevant for particular experiment classes.

pawelru · 2023-12-20T12:19:41Z

Thanks @chlebowa for the context.

I agree with you that extending should be done with a new S3 method which is much easier than a new R6 class. I was more referring to the big if statement (that is conceptually more a switch than if) in foo.default that is calling foo_bar and foo_baz instead of having foo.bar and foo.baz directly. The latter is much easier to extend compared to the former one that would require additional value in that switch statement.
I think it's connected to the overwriting S3 methods investigation so I'm looking forward for the outcome.

I think what @gogonzo was not talking about a dataset being a data.table but rather: we use dplyr::filter to apply filtering to data.frame datasets, which is encoded in the data.frame method here, but if someone wants to use data.table filtering instead, one can define their own method for data.frame to do just that.

I think we are in agreement here - this should be a new method. Please just note that data.table way of filtering works only if a filtered object is a data.table. It does not work for a data.frame - the function used is data.table:::[.data.table. Because of above I think this should be a separate foo.data.table S3 method.

chlebowa · 2023-12-20T12:25:31Z

I think it's connected to the overwriting S3 methods investigation so I'm looking forward for the outcome.

Yup, exactly.

Because of above I think this should be a separate foo.data.table S3 method.

One can wish to handle data.frame x by as.data.frame(data.table::as.data.table(x)[... and that would be a method for dataset of class data.frame. If the input dataset is a data.table, yes, a full suite* of methods for that class should be provided.

(*) Since data.table inherits from data.frame, a limited suite can be sufficient.

gogonzo · 2023-12-21T07:46:38Z

I personally don't follow the requirement for nesting in MAE and SE.

Currently (on main) SummarizedExperiment, Matrix are handled only as a subitem of MAE. In this PR it doesn't matter if the this is a part of a larger-nested structure or not. It opens possibility to apply filter-panel for objects with any nested structure.

I was more referring to the big if statement (that is conceptually more a switch than if) in foo.default that is calling foo_bar and foo_baz instead of having foo.bar and foo.baz directly. The latter is much easier to extend compared to the former one that would require additional value in that switch statement.

I've checked that foo.bar registered in a package is prioritized before foo.bar outside of the package. Doesn't seem legit, I agree, this is why @averissimo double checks my conclusion. If we register foo.bar in the teal.slice then generic foo would not use foo.bar registered outside of the package. This is why I made foo.default in a weird way. We will confirm soon if it is true.

You raised a data.table case which I think should be addressed by a new S3 method for data.table instead of overwriting existing one for data.frame.

My mistake. I don't know why i wrote data.table. It should be data.frame. And yes, overriding defaults for data.frame should be done through registering a new S3 method for data.table.

chlebowa

I love the idea but I think it needs a lot of refinement.

get_filter_call requires exposing the FilterState class, and the FilteredData$state_list member, both of which were hitherto private, and for good reason.
Extending support to a new class would require deep knowledge of the entire FilterPanel, probably more than I have myself. I find a third party writing their own srv_add and srv_active functions unrealistic.

Some specific suggestions for get_filter_call:

should take predicates, not FilterStates
should use hard-coded subsetting function to build subsetting expression
could take dataname as argument
perhaps move varname preixing from FilterState$get_call to this function?
-- keep_na and keep_inf could be tough to handle
-- perhaps use something akin to tmc::substitute_names?
final stages of filter call construction are esoteric

rhs <- as.call(c(str2lang("dplyr::filter"), c(list(dataname_lang), combined_predicate)))
substitute(
  expr = dataname_lang <- rhs,
  env = list(dataname_lang = dataname_lang, rhs = rhs)
)

-- they will be looked at by people implementing their own methods
-- perhaps introduce a helper function like get_final_call <- function(dataname, predicates(prefixed), FUN)

We can still accept this as a good direction of change even if all the mechanisms remain internal and the Core Dev team takes on responsibility for adding support for all requested classes.

chlebowa · 2023-12-28T11:02:05Z

R/calls_combine_by.R

@@ -22,9 +22,9 @@
 #' @keywords internal
 calls_combine_by <- function(calls, operator) {
  checkmate::assert_list(calls)
-  if (length(calls) > 0L) checkmate::assert_list(calls, types = c("call", "name"))
  checkmate::assert_string(operator)


Perhaps we could limit operator to valid operators & and | (possibly xor)?

chlebowa · 2023-12-28T14:43:41Z

R/FilterState.R

@@ -575,15 +562,17 @@ FilterState <- R6::R6Class( # nolint
    # @description
    # Return variable name prefixed by `dataname` to be evaluated as extracted object,
    # for example `data$var`
+    # @param dataname `character(1)` name of a dataset
+    # @param extract_type `character(1)` type of extraction, one of `character(0)`, `"list"`, `"matrix"`


Just so this thread doesn't get buried and forgotten, I suggest to change extract_type to
var_prefix = c("none", "list", "matrix")
which could then be handled with
var_prefix <- match.arg(var_prefix)
and
ans <- switch(var_prefix, ...)

chlebowa · 2023-12-28T14:44:41Z

R/FilterStateExpr.R

@@ -124,8 +124,10 @@ FilterStateExpr <- R6::R6Class( # nolint
    #' for selected variable type.
    #' Method is using internal reactive values which makes it reactive
    #' and must be executed in reactive or isolated context.
+    #' @param extract_type (`character(0)`, `character(1)`)\cr
+    #'  ignored


Suggested change

#' ignored

#' for compatibility with `FilterState$get_call`; ignored

chlebowa · 2023-12-28T15:26:14Z

R/methods-documentation.R

+#' @param id (`character`) Module id
+#' @param data (`object`) Object of any class
+#' @param data_filtered (`object`) Object of any class but must be the same class as `data`
+#' @param dataname (`character`) Name of the dataset
+#' @param filtered_data (`FilteredData`) Object of class `FilteredData`.
+#' @param states_list (`list`) List of `FilterState` objects


Suggested change

#' @param id (`character`) Module id

#' @param data (`object`) Object of any class

#' @param data_filtered (`object`) Object of any class but must be the same class as `data`

#' @param dataname (`character`) Name of the dataset

#' @param filtered_data (`FilteredData`) Object of class `FilteredData`.

#' @param states_list (`list`) List of `FilterState` objects

#' @param id (`character`) `shiny` module id.

#' @param data Data object of any class.

#' @param data_filtered Subset of `data`.

#' @param dataname (`character`) Name of the dataset.

#' @param filtered_data (`FilteredData`) Object of class `FilteredData`.

#' @param states_list (`list`) List of `FilterState` objects.

chlebowa · 2023-12-28T15:33:22Z

R/methods-documentation.R

+#'
+#' `teal.slice` provides set of methods to manage filters for different data types.
+#' # todo: write more about [srv_add()] and consequences of creating `teal_slice` for certain dataset
+#' #     - how slice defines location of the variable


This is covered in ?teal_slice.

chlebowa · 2023-12-28T16:04:21Z

R/methods-documentation.R

Documentation says one of the supported class is "matrix" but the functions use "array". This latter should be changed as we support matrices but not (n-dimensional) arrays.

I think we should provide methods for DataFrame because they (DataFrames) are found in SE and MAE slots and they do not inherit from data.frame.

chlebowa · 2023-12-28T16:45:18Z

R/FilteredData.R

FilteredData $format and $print methods were lost.

donyunardi · 2024-01-19T18:39:54Z

Just a reminder not to merge this before our teal release. Pushing this PR to later sprint.

m7pr · 2024-02-19T14:22:56Z

@donyunardi is this the proper sprint for this card?

donyunardi · 2024-02-19T16:33:58Z

@donyunardi is this the proper sprint for this card?

Maybe not as we still have other priorities.
I'll remove it from the board for now.

gogonzo added 3 commits December 13, 2023 09:13

WIP

1a0f7b7

replace classes

remove old

7346ae6

remove unnecessary

2b18539

gogonzo added the core label Dec 13, 2023

gogonzo added 6 commits December 13, 2023 13:45

WIP

237af9c

Merge remote-tracking branch 'origin/main' into 132_filter_panel_ui@main

7a5ae7b

supporting nested objects

e840664

fix remove datasets and single filter state

82f84fa

some asserts

9445f38

fix cyclic reactivity

c9a8341

gogonzo linked an issue Dec 13, 2023 that may be closed by this pull request

Remove access to data from UI modules #132

Closed

gogonzo added 9 commits December 15, 2023 07:50

support unsupported

d2e0095

unsupported datasets

753f86b

tidyup documentation

b2feb19

WIP

7b1b82f

WIP

09f15b8

some fixes

772eeed

wip

11703ea

further simplifications

f505d13

fix vignette chunks. Example App is working

1a77820

gogonzo changed the title ~~WIP~~ Simplify package Dec 20, 2023

gogonzo mentioned this pull request Dec 20, 2023

[CRAN Release]: v0.5.0 #499

Closed

34 tasks

pawelru reviewed Dec 20, 2023

View reviewed changes

fix subset for metrices

c4270fa

chlebowa assigned averissimo and chlebowa Dec 21, 2023

chlebowa suggested changes Dec 28, 2023

View reviewed changes

gogonzo mentioned this pull request Jan 8, 2024

507 reevaluate variable_types #508

Merged

chlebowa mentioned this pull request Jan 30, 2024

clean up after pre-release-cleanup #537

Merged

chlebowa mentioned this pull request Mar 25, 2024

Expose FilterStates objects out of teal.slice #102

Open

gogonzo closed this Aug 6, 2024

github-actions bot locked and limited conversation to collaborators Aug 6, 2024

gogonzo deleted the 132_filter_panel_ui@main branch August 6, 2024 08:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify package #498

Simplify package #498

gogonzo commented Dec 13, 2023 •

edited

Loading

pawelru Dec 20, 2023

gogonzo Dec 21, 2023

chlebowa Dec 21, 2023 •

edited

Loading

pawelru Dec 21, 2023

chlebowa Dec 21, 2023

gogonzo Dec 21, 2023

chlebowa Dec 28, 2023 •

edited

Loading

pawelru Dec 20, 2023

pawelru commented Dec 20, 2023

chlebowa commented Dec 20, 2023

pawelru commented Dec 20, 2023

chlebowa commented Dec 20, 2023 •

edited

Loading

gogonzo commented Dec 21, 2023

chlebowa left a comment

chlebowa Dec 28, 2023

chlebowa Dec 28, 2023 •

edited

Loading

chlebowa Dec 28, 2023

chlebowa Dec 28, 2023

chlebowa Dec 28, 2023

chlebowa Dec 28, 2023 •

edited

Loading

chlebowa Dec 28, 2023 •

edited

Loading

donyunardi commented Jan 19, 2024

m7pr commented Feb 19, 2024

donyunardi commented Feb 19, 2024

	# @param extract_type `character(1)` type of extraction, one of `character(0)`, `"list"`, `"matrix"`
	# @param extract_type `character(1)` type of extraction, one of `"nse"`, `"list"`, `"matrix"`

	#' ignored
	#' for compatibility with `FilterState$get_call`; ignored

Simplify package #498

Simplify package #498

Conversation

gogonzo commented Dec 13, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chlebowa Dec 21, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chlebowa Dec 28, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pawelru commented Dec 20, 2023

chlebowa commented Dec 20, 2023

pawelru commented Dec 20, 2023

chlebowa commented Dec 20, 2023 • edited Loading

gogonzo commented Dec 21, 2023

chlebowa left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chlebowa Dec 28, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chlebowa Dec 28, 2023 • edited Loading

Choose a reason for hiding this comment

chlebowa Dec 28, 2023 • edited Loading

Choose a reason for hiding this comment

donyunardi commented Jan 19, 2024

m7pr commented Feb 19, 2024

donyunardi commented Feb 19, 2024

gogonzo commented Dec 13, 2023 •

edited

Loading

chlebowa Dec 21, 2023 •

edited

Loading

chlebowa Dec 28, 2023 •

edited

Loading

chlebowa commented Dec 20, 2023 •

edited

Loading

chlebowa Dec 28, 2023 •

edited

Loading

chlebowa Dec 28, 2023 •

edited

Loading

chlebowa Dec 28, 2023 •

edited

Loading