Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect return type on aggregate functions implemented by AggregateUDF when upgrading to latest DataFusion #511

Closed
viirya opened this issue Jun 4, 2024 · 1 comment · Fixed by #518
Assignees
Labels
bug Something isn't working
Milestone

Comments

@viirya
Copy link
Member

viirya commented Jun 4, 2024

Describe the bug

While working on #403, one new test error happens:

- first/last *** FAILED *** (291 milliseconds)
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 17002.0 failed 1 times, most recent failure: Lost task 0.0 in stage 17002.0 (TID 49989) (1d885a59c17c executor driver): org.apache.comet.CometNativeException: Arrow error: Invalid argument error: column types must match schema types, expected Null but found Int32 at column index 0

The root cause of it is new design of aggregate expressions in DataFusion. Some aggregate expressions are now implemented using AggregateUDF. To create the physical aggregate expression (i.e., AggregateFunctionExpr ) for them, we need to call create_aggregate_expr API. But there is a circular relationship on the required arguments: apache/datafusion#10785. It makes Comet cannot determine correct input_phy_exprs so an incorrect return type is determined during the call to create_aggregate_expr.

To fix it, we may need to revamp the design in DataFusion.

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

@viirya
Copy link
Member Author

viirya commented Jun 4, 2024

Actually, let me try to fix this in Scala code. If it works, we may not need to change DataFusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants