Dplyr+janitor+one column matrices in filter() #6679

Closed
larry77 opened this issue Feb 2, 2023 · 2 comments · Fixed by #6706

larry77 commented Feb 2, 2023

Hello,

I am reporting this here (as instructed by the warning in the reprex below), but I think it is an issue with the excellent janitor package rather than with dplyr.

In any case, it would be a pity if in the future this broke my workflow, so can anyone look into it?

Many thanks!

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(janitor)
#> 
#> Attaching package: 'janitor'
#> The following objects are masked from 'package:stats':
#> 
#>     chisq.test, fisher.test

df <- structure(list(member_state_3_letter_codes = c("AUT", "AUT", 
"AUT", "AUT", "AUT", "AUT", "AUT", "AUT", "AUT", "AUT", "AUT", 
"AUT", "AUT", "AUT", "AUT", "AUT", "AUT", "AUT", "AUT", "AUT"
), procedure_name = c("General Block Exemption Regulation", "Notified Aid", 
"Notified Aid", "Notified Aid", "Notified Aid", "Notified Aid", 
"Notified Aid", "Notified Aid", "General Block Exemption Regulation", 
"General Block Exemption Regulation", "General Block Exemption Regulation", 
"General Block Exemption Regulation", "General Block Exemption Regulation", 
"General Block Exemption Regulation", "General Block Exemption Regulation", 
"General Block Exemption Regulation", "General Block Exemption Regulation", 
"General Block Exemption Regulation", "General Block Exemption Regulation", 
"Notified Aid")), row.names = c(NA, -20L), class = c("tbl_df", 
"tbl", "data.frame"))


df2 <- df |> tabyl(procedure_name)
#> Warning: Using one column matrices in `filter()` was deprecated in dplyr 1.1.0.
#> ℹ Please use one dimensional logical vectors instead.
#> ℹ The deprecated feature was likely used in the dplyr package.
#>   Please report the issue at <https://github.com/tidyverse/dplyr/issues>.

df2
#>                      procedure_name  n percent
#>  General Block Exemption Regulation 12     0.6
#>                        Notified Aid  8     0.4

sessionInfo()
#> R version 4.2.2 (2022-10-31)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 11 (bullseye)
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.13.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
#>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
#>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] janitor_2.1.0 dplyr_1.1.0  
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_4.2.2    pillar_1.8.1      highr_0.9         R.methodsS3_1.8.2
#>  [5] R.utils_2.12.1    tools_4.2.2       digest_0.6.30     lubridate_1.9.0  
#>  [9] evaluate_0.17     lifecycle_1.0.3   tibble_3.1.8      R.cache_0.16.0   
#> [13] timechange_0.1.1  pkgconfig_2.0.3   rlang_1.0.6       reprex_2.0.2     
#> [17] cli_3.6.0         yaml_2.3.6        xfun_0.34         fastmap_1.1.0    
#> [21] withr_2.5.0       styler_1.8.0      stringr_1.5.0     knitr_1.40       
#> [25] generics_0.1.3    fs_1.5.2          vctrs_0.5.2       tidyselect_1.2.0 
#> [29] glue_1.6.2        snakecase_0.11.0  R6_2.5.1          fansi_1.0.4      
#> [33] rmarkdown_2.17    purrr_1.0.1       tidyr_1.3.0       magrittr_2.0.3   
#> [37] htmltools_0.5.3   utf8_1.2.2        stringi_1.7.8     R.oo_1.25.0

Created on 2023-02-02 with reprex v2.0.2

sfirke (Contributor) commented Feb 2, 2023

I'm just starting to look at this but I wonder if it's https://github.com/sfirke/janitor/blob/main/R/tabyl.R#L122 ? I can explore that further and refactor if so. I don't understand why this didn't pop up in my testing & revdep testing though.

DavisVaughan added this to the 1.1.1 milestone Feb 7, 2023

DavisVaughan (Member) commented Feb 9, 2023

@sfirke the warning doesn't show up for me with CRAN janitor, but it does show up for janitor 2.1.0, so I think that is why you didn't see it.

In 2.1.0 the line you pointed to is

result %>% dplyr::filter(!is.na(.[1]))

rather than

result %>% dplyr::filter(!is.na(.[,1]))

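and that comma makes the difference when result is a bare data frame (which it is for janitor here). With the comma, the drop = TRUE default of [ turns the single column into a vector, so you don't get a warning. Without it, .[1] stays a one-column data frame, so is.na() returns a one-column logical matrix, which is what filter() now warns about.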

Do you know the name of that column, though? I'd encourage you to use result %>% dplyr::filter(!is.na(dat)) if that is always the name of the column, or result %>% dplyr::filter(!is.na(.[[1]])), which is slightly safer and would also work if result were a tibble (which defaults to drop = FALSE for type stability).
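
The difference between the three indexing forms can be checked in base R alone, without dplyr or janitor. This is a minimal sketch, assuming a small made-up data frame in place of janitor's internal result. The column names dat and n and the values are illustrative only, not taken from janitor's source.

```r
# Hypothetical stand-in for janitor's intermediate `result`:
# a bare data frame (not a tibble) whose first column contains an NA.
result <- data.frame(dat = c("a", NA, "b"), n = c(2L, 1L, 3L))

# No comma: `[` keeps a one-column data frame, and is.na() on a
# data frame returns a logical matrix (here 3 rows by 1 column).
m <- is.na(result[1])
is.matrix(m)   # TRUE
dim(m)         # 3 1

# With the comma: drop = TRUE collapses the column to a plain vector,
# so is.na() returns an ordinary logical vector.
v <- is.na(result[, 1])
is.matrix(v)   # FALSE

# `[[` extracts the column itself and does not depend on `drop`.
w <- is.na(result[[1]])
identical(v, w)  # TRUE

# Base-R equivalent of filtering out the NA row with the vector form.
filtered <- result[!w, , drop = FALSE]
```

Only the first form produces a matrix, which matches the form that triggers the deprecation warning. The [[ form is the one that does not rely on the drop default.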
