Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide field and schema metadata missing on distinct aggregations. #12691

Merged
merged 2 commits into from
Oct 1, 2024

Conversation

wiedld
Copy link
Contributor

@wiedld wiedld commented Oct 1, 2024

Which issue does this PR close?

Closes #12687

Rationale for this change

A PR merged (and released in 42.0.0) has added a check that the logical and physical plan input schemas (to an aggregate node) are the same.

Once released into the wild, it began being triggered. One example case is issue #12687 where field metadata is in the logical plan, but missing from the physical plan.

What changes are included in this PR?

commit 1 = made regression tests, which are failing.
commit 2 = fixed those tests.

Are these changes tested?

Yes

Are there any user-facing changes?

No

@github-actions github-actions bot added physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt) labels Oct 1, 2024
@wiedld wiedld marked this pull request as ready for review October 1, 2024 03:29
Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me!

@alamb alamb merged commit 84ac4f9 into apache:main Oct 1, 2024
24 checks passed
@alamb
Copy link
Contributor

alamb commented Oct 1, 2024

Thanks @jayzhan211 for the review and @wiedld for the fix

I verified that the test fails without the code fix:

Running "metadata.slt"
External error: query failed: DataFusion error: Internal error: Physical input schema should be the same as the one converted from logical input schema..
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
[SQL] select count(distinct name) from table_with_metadata;
at test_files/metadata.slt:64

🐛 🥾

wiedld added a commit to influxdata/arrow-datafusion that referenced this pull request Oct 2, 2024
…pache#12691)

* test(12687): reproducer of missing metadata bug

* fix(12687): minimum change needed to fix the missing metadata
wiedld added a commit to influxdata/arrow-datafusion that referenced this pull request Oct 4, 2024
…pache#12691)

* test(12687): reproducer of missing metadata bug

* fix(12687): minimum change needed to fix the missing metadata
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
3 participants