Improve pivot_longer(names_pattern) syntax #1417

Open
hadley opened this issue Nov 7, 2022 · 4 comments
Labels
feature a feature request or enhancement pivoting ♻️ pivot rectangular data to different "shapes"

Comments

@hadley
Member

hadley commented Nov 7, 2022

To match new separate_wider_regex()

@DavisVaughan
Member

DavisVaughan commented Jan 3, 2023

Does that look something like this?

Previously:

who %>% pivot_longer(
  cols = new_sp_m014:newrel_f65,
  names_to = c("diagnosis", "gender", "age"),
  names_pattern = "new_?(.*)_(.)(.*)",
  values_to = "count"
)

New: (where it makes names_to unnecessary?)

who %>% pivot_longer(
  cols = new_sp_m014:newrel_f65,
  names_pattern = c("new_?", diagnosis = ".*", "_", gender = ".", age = ".*"),
  values_to = "count"
)
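To make the proposed form concrete, here is a minimal sketch of how a named character vector might be lowered onto the existing names_to / names_pattern pair. The helper compile_names_pattern() is hypothetical and not part of tidyr. It assumes named pieces become capture groups and unnamed pieces are matched but dropped, which is how separate_wider_regex() treats its patterns argument.

```r
# Hypothetical helper, not part of tidyr: turn a separate_wider_regex()-style
# pattern vector into a names_to vector plus a single regular expression.
compile_names_pattern <- function(patterns) {
  nms <- names(patterns)
  if (is.null(nms)) {
    nms <- rep("", length(patterns))
  }
  named <- !is.na(nms) & nms != ""

  # Named pieces are captured; unnamed pieces are wrapped in a
  # non-capturing group so they match without producing a column.
  pieces <- ifelse(
    named,
    paste0("(", patterns, ")"),
    paste0("(?:", patterns, ")")
  )

  list(
    names_to = nms[named],
    names_pattern = paste0(pieces, collapse = "")
  )
}

spec <- compile_names_pattern(
  c("new_?", diagnosis = ".*", "_", gender = ".", age = ".*")
)

# Apply the compiled pattern to two of the `who` column names with base R.
cols <- c("new_sp_m014", "newrel_f65")
m <- regmatches(cols, regexec(spec$names_pattern, cols, perl = TRUE))
parts <- do.call(rbind, lapply(m, function(x) x[-1]))
colnames(parts) <- spec$names_to
parts
```

Under these assumptions, spec$names_to and spec$names_pattern could be passed to today's pivot_longer() unchanged, which suggests the new syntax may be implementable as a thin translation layer.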

It looks like maybe we could also use features from separate_wider_delim() and separate_wider_position()?

  • separate_wider_position() could probably use the same idea as names_pattern above, but applied to names_sep when it is supplied as numeric positions
  • separate_wider_delim() is basically already supported through a string names_sep. Our current internal usage of str_separate() doesn't allow stringr modifiers like stringr::regex(), so maybe we could relax that? This one would still need names_to to be specified, just as separate_wider_delim() has names.
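As a rough illustration of the positional idea, the sketch below splits strings by a named vector of widths, keeping named pieces and dropping unnamed ones, in the spirit of the widths argument of separate_wider_position(). The helper split_by_widths() and the example column names are made up for illustration; this is not how tidyr implements it.

```r
# Hypothetical helper, not part of tidyr: split strings into fixed-width
# pieces. Named widths become columns; unnamed widths are skipped.
split_by_widths <- function(x, widths) {
  nms <- names(widths)
  if (is.null(nms)) {
    nms <- rep("", length(widths))
  }

  ends <- unname(cumsum(widths))
  starts <- ends - unname(widths) + 1
  keep <- which(!is.na(nms) & nms != "")

  out <- lapply(keep, function(i) substr(x, starts[i], ends[i]))
  names(out) <- nms[keep]
  as.data.frame(out, stringsAsFactors = FALSE)
}

# Made-up column names where every piece sits at a fixed position.
res <- split_by_widths(
  c("sp_m014", "ep_f65"),
  c(diagnosis = 2, 1, gender = 1, age = 3)
)
res
```

Note that substr() silently truncates when a string is shorter than the requested widths, whereas separate_wider_position() has explicit handling for that case, so a real implementation would need to decide how to report it.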

@DavisVaughan
Member

I think the toughest part is deciding when to use the old separate()/extract()-like behavior versus the new separate_wider_*() style behavior.

Possibly, if names_pattern is a named character vector we use the new behavior, and if it is an unnamed character vector we use the old behavior? Feels a little risky but might work. We could soft-deprecate the unnamed character vector behavior.
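The dispatch rule could be sketched roughly as follows. The function name names_pattern_style() and its return values are hypothetical; the only assumption taken from the discussion is that the presence of any non-empty name selects the new behavior.

```r
# Hypothetical sketch: decide which code path a names_pattern value would take.
names_pattern_style <- function(names_pattern) {
  stopifnot(is.character(names_pattern))

  nms <- names(names_pattern)
  has_names <- !is.null(nms) && any(!is.na(nms) & nms != "")

  if (has_names) {
    "separate_wider_regex"  # new behavior: named pieces become columns
  } else {
    "legacy"                # old behavior: single regex plus names_to
  }
}

names_pattern_style("new_?(.*)_(.)(.*)")
names_pattern_style(c("new_?", diagnosis = ".*", "_", gender = ".", age = ".*"))
```

One case this leaves open is an unnamed character vector with more than one element, which matches neither style cleanly and would probably need to be an error.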

@DavisVaughan
Member

The other tough part is that we probably want to use the algorithm of str_separate_wider_regex() and friends, but we don't want their error messages, because they refer to too_few = and other separate_wider_*()-specific args.
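One generic way to reuse an algorithm while replacing its error messages is to catch the inner error and rethrow it in the caller's terms. The base R sketch below shows only that pattern; with_pivot_error() and the message wording are hypothetical, and an actual tidyr implementation would more likely use rlang's condition tools than stop().

```r
# Hypothetical wrapper: run `expr`, and if it fails, rethrow the error with a
# message phrased in terms of pivot_longer() arguments.
with_pivot_error <- function(expr) {
  tryCatch(
    expr,
    error = function(cnd) {
      stop(
        "Can't parse column names with `names_pattern`: ",
        conditionMessage(cnd),
        call. = FALSE
      )
    }
  )
}

# Stand-in for the shared splitting algorithm failing on some input.
msg <- tryCatch(
  with_pivot_error(stop("expected 3 pieces, got 2")),
  error = function(cnd) conditionMessage(cnd)
)
msg
```

Simply prefixing the inner message, as done here, would still leak wording such as too_few = if the inner error mentions it, so the real fix probably requires the shared algorithm to expose structured error information rather than finished messages.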

@DavisVaughan
Member

We could also introduce 3 new arguments, names_delim, names_patterns, and names_positions, that align with separate_wider_*(), and soft-deprecate names_sep and names_pattern.

@hadley hadley added feature a feature request or enhancement pivoting ♻️ pivot rectangular data to different "shapes" labels Nov 1, 2023