pivot_longer combined with names_sep = "|" pipe symbol gives unexpected results #1503
Minimum reprex:

tidyr:::str_split_n(c("foo;bar", "a;b"), ";")
#> [[1]]
#> [1] "foo" "bar"
#>
#> [[2]]
#> [1] "a" "b"
tidyr:::str_split_n(c("foo|bar", "a|b"), "|")
#> [[1]]
#> [1] "" "f" "o" "o" "|" "b" "a" "r"
#>
#> [[2]]
#> [1] "" "a" "|" "b"
# escaped backslash
tidyr:::str_split_n(c("foo|bar", "a|b"), "\\|")
#> [[1]]
#> [1] "foo" "bar"
#>
#> [[2]]
#> [1] "a" "b"

So if you use Somewhat related to #1417 (comment) I think
If I understand correctly, the So imho
EDIT: Even if there is some real-life edge case where you would want to use "|" as regex, you should probably use something like
I think this is pretty clearly documented:
It's pretty hard to change this sort of behaviour without affecting a lot of existing code and given the seeming rarity of it being a problem, I unfortunately don't think it's worth the effort. (But this is certainly something I imagine we would change if at some point in the distant future we take another stab at these functions.)
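For anyone landing here later, the regex behaviour discussed in this thread can be sketched with base R alone. This is only an illustration using strsplit(), not tidyr's internal str_split_n, and the fixed = TRUE argument shown belongs to base strsplit(), not to names_sep. An unescaped | is a regex alternation between two empty patterns, so it matches between every character; escaping it or wrapping it in a character class makes it a literal pipe.

```r
x <- c("foo|bar", "a|b")

# Unescaped: "|" is a regex alternation of two empty patterns,
# so the split happens between individual characters.
unescaped <- strsplit(x, "|")

# Three ways to treat the pipe as a literal separator in base R.
escaped   <- strsplit(x, "\\|")               # escaped metacharacter
charclass <- strsplit(x, "[|]")               # character class
literal   <- strsplit(x, "|", fixed = TRUE)   # no regex at all

print(unescaped[[1]])
print(escaped[[1]])
```

Of these, the escaped form "\\|" is the one shown working in the reprex above, so it is probably the safest choice for names_sep.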
I was working on pivoting Excel sheets with labdata and ran into an issue with pivot_longer, also see this question on stackoverflow.

When using pivot_longer you can give a names_sep parameter to split the header names into multiple columns when pivoting the data. However, when I use names_sep = "|" it gives unexpected results compared to when using other separator characters. I suspect it has to do with the "names_sep" parameter being interpreted as a regular expression instead of a separator character. See example code below.

In the resulting dataframe the columns Tube, TubePos, WeightFact contain the header names split on individual characters, so character for character, see xls_data_1 below:

This only seems to happen when using the pipe symbol |. When I use other separators like comma , or semicolon ; then it works fine and the result is as expected, see xls_data_2 below:

Btw it could be that pivot_wider also has this same issue with names_sep, I haven't checked. And here's my system info should it be needed:
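A possible workaround, sketched under assumptions: the data frame below and its column names (Sample, A|1|0.5, B|2|1.0, Value) are made up for illustration and are not the original Excel data. Since names_sep is treated as a regular expression, escaping the pipe as "\\|" should split the headers on the literal character.

```r
library(tidyr)

# Hypothetical lab data; headers encode Tube|TubePos|WeightFact.
xls_data <- data.frame(
  Sample = c("s1", "s2"),
  `A|1|0.5` = c(10, 20),
  `B|2|1.0` = c(30, 40),
  check.names = FALSE  # keep the pipes in the column names
)

# Escape the pipe so it is matched literally rather than as alternation.
long <- pivot_longer(
  xls_data,
  cols = -Sample,
  names_to = c("Tube", "TubePos", "WeightFact"),
  names_sep = "\\|",
  values_to = "Value"
)

print(long)
```

With the escaped separator, each header is split into exactly three pieces, one per name in names_to, instead of one piece per character.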