-
Notifications
You must be signed in to change notification settings - Fork 884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Enable structs of lists in the new row lexicographic comparator #11672
Comments
For comparing lists, we need to generate dremel data. However:
So in order to fix this completely, we need both to generate+consider dremel data for nested lists columns as well as fixing the issue with type dispatcher above. |
Fixing dremel issue is easy, just a few LOC. For fixing the type dispatcher, I can't find any way to avoid using such
By doing so, we can avoid dispatching |
This fixes the lexicographic comparator that cannot handle the input having structs of lists. The new implementation mainly changes the helper functions `decompose_structs`. In particular: * If a structs column has its first child is a lists column, the first column of the result table will no longer be `Struct<Struct<...<List<SomeType>...>` (i.e., nested structs ultimately having one child). * Instead, the first output column will be nested empty structs: `Struct<...Struct<>>...>`. The innermost child column `List<SomeType>` is output as the second column in the result table. Depends on: * #12995 Closes #11672. Authors: - Nghia Truong (https://github.com/ttnghia) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Divye Gala (https://github.com/divyegala) - Bradley Dice (https://github.com/bdice) URL: #13005
Is your feature request related to a problem? Please describe.
#11129 enables comparisons of list dtypes with the new comparator. In principle, this code should (in conjunction with prior work like #10164) be sufficient to enable structs of lists (but not lists of structs) via the struct decomposition approach used to preprocess structs. Currently, we forbid comparison of lists of structs in the comparator. During development of #11129 I attempted to remove this restriction, but I observed that this led to unexpected test failures of groupby_structs_test/lists_are_unsupported. These failures were not caused by a failure of
EXPECT_THROW
(which would be natural since I removed a restriction), but were actual failures in the computed outputs. I was unable to reproduce these failures locally, indicating that there is perhaps some unknown indeterminacy in the ordering of some step in the comparison.Describe the solution you'd like
The struct of list restriction in
check_lex_compatibility
should be removed and the failing test should be fixed such that structs of lists are properly supported.The text was updated successfully, but these errors were encountered: