Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request: Add a .by argument to slice_ functions to match main verbs #6989

Closed
orgadish opened this issue Feb 10, 2024 · 3 comments
Closed

Request: Add a .by argument to slice_ functions to match main verbs #6989

orgadish opened this issue Feb 10, 2024 · 3 comments

Comments

@orgadish
Copy link

The slice_X() verbs have a by argument while slice() and the other main verbs have the new .by argument, but as far as I can tell none of the documentation explain why this has to be this way.

I know it probably doesn't make sense to make a breaking change and replace the argument, but can a redundant .by be added instead to support new dplyr 1.1 use?

(This is one of my only pain points in the otherwise wonderful update that is the new dplyr 1.1!)

I think it would also be worth considering adding .by for the joins, but I've been using those for so long that I don't get those confused.

@orgadish
Copy link
Author

I just saw the discussion in #6647, but I think that in that discussion @DavisVaughan was weighing more heavily the comparison between the arguments within slice_, as opposed to the comparison between different functions in a pipeline.

I think the latter is a mental hurdle that people (like me) will come across all the time, despite the many times I've seen the very helpful error.

In my opinion, in this case, it's more important to sacrifice consistency within slice_ and the consistency of why arguments are dot-prefixed, in favor of consistency across the functions. However, I understand, of course, that this is a subjective design decision.

@orgadish
Copy link
Author

PS I was inspired to submit this request after watching @mine-cetinkaya-rundel's talk at Monash from mid 2023 in which she talked about this as a major pain point she hopes will be addressed, and I didn't see any features/discussion here about changing it.

@DavisVaughan
Copy link
Member

We are satisfied with our current by vs .by naming decisions, and we think that our error message in the slice_*() family is pretty good when you do make the typo (autocompletion in RStudio does make this somewhat rare)

library(dplyr, warn.conflicts = FALSE)

mtcars %>%
  slice_head(n = 2, .by = cyl)
#> Error in `slice_head()`:
#> ! Can't specify an argument named `.by` in this verb.
#> ℹ Did you mean to use `by` instead?

The docs do note that it is "a technical difference" https://dplyr.tidyverse.org/reference/dplyr_by.html#supported-verbs. We don't go into detail because that isn't the point of those docs, but as you saw, I talked about the reason here #6647 (comment).

I also still think mtcars %>% slice_head(n = 2, .by = cyl) looks quite weird with n vs .by in the same function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants