unnest_longer/unnest inconsistency with list columns of more lists? #1584

Open
oliverbeagley-pgg opened this issue Dec 12, 2024 · 0 comments


I have some code whose input, depending on earlier processing, is either a zero-row data frame with the correct columns and types, or a data frame with rows in which some columns are lists of further lists. For example, it could be something like:

foo <- function(x) {
  list(
    list(a = x * 10),
    list(a = x * 100)
  )
}

df_empty <- tibble::tibble(x = integer()) |>
  dplyr::mutate(y = lapply(x, foo))

df_empty
# # A tibble: 0 × 2
# # ℹ 2 variables: x <int>, y <list>

df_valued <- tibble::tibble(x = 1:3) |>
  dplyr::mutate(y = lapply(x, foo))

df_valued
# # A tibble: 3 × 2
#       x y         
#   <int> <list>    
# 1     1 <list [2]>
# 2     2 <list [2]>
# 3     3 <list [2]>

I'm trying to write code that works with either input and still produces properly typed columns, but I'm having issues getting unnest_longer to handle the empty data frame the way unnest does. (I'd prefer to use unnest_longer, as I believe it makes clearer what it is doing.)

With unnest, both cases work:

df_empty |> tidyr::unnest("y", ptype = list())
# # A tibble: 0 × 2
# # ℹ 2 variables: x <int>, y <list>

df_valued |> tidyr::unnest("y", ptype = list())
# # A tibble: 6 × 2
#       x y               
#   <int> <list>          
# 1     1 <named list [1]>
# 2     1 <named list [1]>
# 3     2 <named list [1]>
# 4     2 <named list [1]>
# 5     3 <named list [1]>
# 6     3 <named list [1]>

With the equivalent unnest_longer call, however, the empty data frame errors:

df_empty |> tidyr::unnest_longer("y", ptype = list())
# Error in `tidyr::unnest_longer()`:
# ! Can't convert `x` <logical> to <list>.
# Run `rlang::last_trace()` to see where the error occurred.

df_valued |> tidyr::unnest_longer("y", ptype = list())
# # A tibble: 6 × 2
#       x            y
#   <int> <list<list>>
# 1     1          [1]
# 2     1          [1]
# 3     2          [1]
# 4     2          [1]
# 5     3          [1]
# 6     3          [1]

I can see from the tibble info that the outputs differ slightly (<list> versus <list<list>>), but the downstream operations I'm using don't seem to care about this. For example, tacking on |> tidyr::hoist("y", "a", .ptype = list(a = integer())) to either call works for df_valued.

I've tried a bunch of unnest_longer's arguments without much success. Is there something I'm missing, or is this a limitation of unnest_longer? As mentioned, I can use unnest, so I have a workaround, but it would be nice to use unnest_longer instead.
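
For reference, a possible stopgap is sketched below. It is not a fix for the behavior described above: it simply skips the unnest_longer call when there are no rows. The helper name unnest_longer_if_rows is hypothetical (it is not part of tidyr), and the sketch assumes that returning the empty data frame unchanged is acceptable downstream. Note that the empty result keeps y as a plain list column, so its column class may differ from the one unnest_longer produces for non-empty input.

```r
# Hypothetical helper, not part of tidyr: only call unnest_longer() when
# there is at least one row to unnest.
unnest_longer_if_rows <- function(df, col) {
  if (nrow(df) == 0) {
    # Nothing to unnest; return the input as-is so `col` stays a list column.
    return(df)
  }
  # Same call as in the report above, with the column name passed as a string.
  tidyr::unnest_longer(df, dplyr::all_of(col), ptype = list())
}

# Same setup as in the report above.
foo <- function(x) {
  list(
    list(a = x * 10),
    list(a = x * 100)
  )
}

df_empty <- tibble::tibble(x = integer()) |>
  dplyr::mutate(y = lapply(x, foo))

df_valued <- tibble::tibble(x = 1:3) |>
  dplyr::mutate(y = lapply(x, foo))

res_empty <- unnest_longer_if_rows(df_empty, "y")
res_valued <- unnest_longer_if_rows(df_valued, "y")
```

Branching on nrow() keeps the non-empty path identical to the unnest_longer call shown earlier, so only the zero-row case is special-cased.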
