Skip to content

Reduce use of pipe in codebase + refactor tests + linting #1666

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
May 18, 2024

Conversation

olivroy
Copy link
Collaborator

@olivroy olivroy commented May 15, 2024

Summary

Just some linting around the package

  • Refactors some dplyr code into base R (improve performance)
  • Linting using lintr::lint()
  • regular expressions matching with fixed = TRUE
  • Less magrittr pipe (performance / readability)

Checklist

Note: a render_unit() test should go on R CMD check, as I made a crucial mistake, but only test-coverage.yaml (and local testing) saved me

@rich-iannone
Copy link
Member

This is great! On a related note, one thing I've been trying to do (slowly) is get rid of magrittr pipes in the codebase. There's not too many of them left and the advantage is that we could remove one dependency from the package. If you wanted to take that on here (or in another PR) it would be a great improvement.

@olivroy
Copy link
Collaborator Author

olivroy commented May 15, 2024

I have exactly this locally, I targeted especially the utils_xml file! Will update the PR in the next few days.

Copy link
Collaborator Author

@olivroy olivroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gotta say I don't really like (and am not super talented) fiddling with this type of code. (base R data subsetting and string manipulation)

but the performance improvements seem worth it

library(magrittr)
bench::mark(
  mtcars[mtcars$vs == 1,],
  mtcars |> dplyr::filter(vs == 1),
  mtcars %>% dplyr::filter(vs == 1)
)
#> # A tibble: 3 × 6
#>   expression                            min  median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                        <bch:t> <bch:t>     <dbl> <bch:byt>    <dbl>
#> 1 mtcars[mtcars$vs == 1, ]             87µs  95.2µs     9687.   15.42KB    15.1 
#> 2 dplyr::filter(mtcars, vs == 1)      806µs 849.6µs     1072.    3.81MB    10.4 
#> 3 mtcars %>% dplyr::filter(vs == 1)   813µs 849.9µs     1076.    7.45KB     8.28

Created on 2024-05-16 with reprex v2.1.0

For the review, I'd recommend you to take a look at commits individually.

The tests were incredibly useful as a safeguard. They caught my mistakes

I made mistakes in the commits that resulted in CI failures, so I'd say that these three commits were made in a hurry:
c6fa2a6, b153108, and 7c92c8d, but the rest of the commits should all be good to look at individually!

@olivroy olivroy marked this pull request as ready for review May 16, 2024 21:00
@olivroy olivroy changed the title First round of lint Reduce use of pipe in codebase + refactor tests + linting May 16, 2024
Co-authored-by: olivroy <52606734+olivroy@users.noreply.github.com>
Copy link
Member

@rich-iannone rich-iannone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, and thank you so much for your hard work on this!

@rich-iannone
Copy link
Member

rich-iannone commented May 18, 2024

I looked through everything in this PR and agree with all of it. Also, didn't know that select() and distinct() could be replaced by one well-crafted distinct()! Will merge once CI fully passes.

@rich-iannone rich-iannone merged commit 5ccf14d into rstudio:master May 18, 2024
12 checks passed
@olivroy
Copy link
Collaborator Author

olivroy commented May 18, 2024

Thanks for looking it over!

Yes, but be careful distinct() (and arrange) is like mutate() (I have been bitten by it before), which means:

# doesn't work
mtcars |>
  distinct(x  = "vs")
# works
mtcars |> 
  select(x = "vs")
mtcars |>
  distinct(x = .data$vs)
mtcars |>
  arrange(.data$vs)
# deprecated
mtcars |>
  select(.data$vs)
# doesn't work
mtcars |>
  distinct(all_of("vs"))
# solution (with across's little friend)
mtcars |>
  distinct(pick(vs)) # works quoted too.
mtcars |> 
  distinct(pick(all_of("vs")))
# pick doesn't allow renaming with selection 
# doesn't work
  mtcars |> 
  distinct(pick(all_of(c(x = "vs"))))

The more you know!

@rich-iannone
Copy link
Member

Wow, that’s potentially dangerous! But I trust you fully with any new-school distinct() work. :)

@olivroy olivroy deleted the lint branch May 20, 2024 11:45
@olivroy olivroy mentioned this pull request Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants