Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-48016][SQL] Fix a bug in try_divide function when with decimals #46286

Closed
wants to merge 2 commits into from

Conversation

gengliangwang
Copy link
Member

What changes were proposed in this pull request?

Currently, the following query will throw DIVIDE_BY_ZERO error instead of returning null

SELECT try_divide(1, decimal(0)); 

This is caused by the rule DecimalPrecision:

case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
  (left, right) match {
 ...
    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType] &&
        l.dataType.isInstanceOf[IntegralType] &&
        literalPickMinimumPrecision =>
      b.makeCopy(Array(Cast(l, DataTypeUtils.fromLiteral(l)), r)) 

The result of the above makeCopy will contain ANSI as the evalMode, instead of TRY.
This PR is to fix this bug by replacing the makeCopy method calls with withNewChildren

Why are the changes needed?

Bug fix in try_* functions.

Does this PR introduce any user-facing change?

Yes, it fixes a long-standing bug in the try_divide function.

How was this patch tested?

New UT

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Apr 29, 2024
@gengliangwang
Copy link
Member Author

There is another approach at #46251, which fails a test case in the CanonicalizeSuite.
Using withNewChildren is better here.

@dongjoon-hyun
Copy link
Member

cc @yaooqinn too

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

@dongjoon-hyun
Copy link
Member

BTW, thank you for adding new test coverage, @gengliangwang .

@gengliangwang
Copy link
Member Author

gengliangwang commented Apr 29, 2024

@HyukjinKwon @dongjoon-hyun @yaooqinn Thanks for the review. Merging to master/branch-3.5

gengliangwang added a commit that referenced this pull request Apr 29, 2024
 Currently, the following query will throw DIVIDE_BY_ZERO error instead of returning null
 ```
SELECT try_divide(1, decimal(0));
```

This is caused by the rule `DecimalPrecision`:
```
case b  BinaryOperator(left, right) if left.dataType != right.dataType =>
  (left, right) match {
 ...
    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType] &&
        l.dataType.isInstanceOf[IntegralType] &&
        literalPickMinimumPrecision =>
      b.makeCopy(Array(Cast(l, DataTypeUtils.fromLiteral(l)), r))
```
The result of the above makeCopy will contain `ANSI` as the `evalMode`, instead of `TRY`.
This PR is to fix this bug by replacing the makeCopy method calls with withNewChildren

Bug fix in try_* functions.

Yes, it fixes a long-standing bug in the try_divide function.

New UT

No

Closes #46286 from gengliangwang/avoidMakeCopy.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit 3fbcb26)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
gengliangwang added a commit to gengliangwang/spark that referenced this pull request Apr 29, 2024
 Currently, the following query will throw DIVIDE_BY_ZERO error instead of returning null
 ```
SELECT try_divide(1, decimal(0));
```

This is caused by the rule `DecimalPrecision`:
```
case b  BinaryOperator(left, right) if left.dataType != right.dataType =>
  (left, right) match {
 ...
    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType] &&
        l.dataType.isInstanceOf[IntegralType] &&
        literalPickMinimumPrecision =>
      b.makeCopy(Array(Cast(l, DataTypeUtils.fromLiteral(l)), r))
```
The result of the above makeCopy will contain `ANSI` as the `evalMode`, instead of `TRY`.
This PR is to fix this bug by replacing the makeCopy method calls with withNewChildren

Bug fix in try_* functions.

Yes, it fixes a long-standing bug in the try_divide function.

New UT

No

Closes apache#46286 from gengliangwang/avoidMakeCopy.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit 3fbcb26)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
dongjoon-hyun added a commit that referenced this pull request May 1, 2024
### What changes were proposed in this pull request?

This is a follow-up of SPARK-48016 to update the missed Java 21 golden file.
- #46286

### Why are the changes needed?

To recover Java 21 CIs:
- https://github.com/apache/spark/actions/workflows/build_java21.yml
- https://github.com/apache/spark/actions/workflows/build_maven_java21.yml
- https://github.com/apache/spark/actions/workflows/build_maven_java21_macos14.yml

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual tests. I regenerated all in Java 21 and this was the only one affected.
```
$ SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46313 from dongjoon-hyun/SPARK-48016.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented May 1, 2024

Hi, @gengliangwang .

It's a little weird because

  • This PR doesn't contain sql/core/src/test/resources/log4j2.properties.
  • master and branch-3.4 commit also does.
  • However, branch-3.5 commit contains sql/core/src/test/resources/log4j2.properties whose content is suspicious.
- logger.parquet_outputcommitter.name = org.apache.parquet.hadoop.ParquetOutputCommitter
+ logger.parquet_outputcommitter.name = org.sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scalaapache.parquet.hadoop.ParquetOutputCommitter

And, the above suspicious change broke branch-3.5.

@dongjoon-hyun
Copy link
Member

Let me revert branch-3.5 commit first to recover the CI. Could you make a PR to branch-3.5, @gengliangwang ?

@gengliangwang
Copy link
Member Author

@dongjoon-hyun Thanks a lot. I will create a backport for 3.5.

gengliangwang added a commit to gengliangwang/spark that referenced this pull request May 1, 2024
 Currently, the following query will throw DIVIDE_BY_ZERO error instead of returning null
 ```
SELECT try_divide(1, decimal(0));
```

This is caused by the rule `DecimalPrecision`:
```
case b  BinaryOperator(left, right) if left.dataType != right.dataType =>
  (left, right) match {
 ...
    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType] &&
        l.dataType.isInstanceOf[IntegralType] &&
        literalPickMinimumPrecision =>
      b.makeCopy(Array(Cast(l, DataTypeUtils.fromLiteral(l)), r))
```
The result of the above makeCopy will contain `ANSI` as the `evalMode`, instead of `TRY`.
This PR is to fix this bug by replacing the makeCopy method calls with withNewChildren

Bug fix in try_* functions.

Yes, it fixes a long-standing bug in the try_divide function.

New UT

No

Closes apache#46286 from gengliangwang/avoidMakeCopy.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit 3fbcb26)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
### What changes were proposed in this pull request?

 Currently, the following query will throw DIVIDE_BY_ZERO error instead of returning null
 ```
SELECT try_divide(1, decimal(0));
```

This is caused by the rule `DecimalPrecision`:
```
case b  BinaryOperator(left, right) if left.dataType != right.dataType =>
  (left, right) match {
 ...
    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType] &&
        l.dataType.isInstanceOf[IntegralType] &&
        literalPickMinimumPrecision =>
      b.makeCopy(Array(Cast(l, DataTypeUtils.fromLiteral(l)), r))
```
The result of the above makeCopy will contain `ANSI` as the `evalMode`, instead of `TRY`.
This PR is to fix this bug by replacing the makeCopy method calls with withNewChildren

### Why are the changes needed?

Bug fix in try_* functions.

### Does this PR introduce _any_ user-facing change?

Yes, it fixes a long-standing bug in the try_divide function.

### How was this patch tested?

New UT

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#46286 from gengliangwang/avoidMakeCopy.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
### What changes were proposed in this pull request?

This is a follow-up of SPARK-48016 to update the missed Java 21 golden file.
- apache#46286

### Why are the changes needed?

To recover Java 21 CIs:
- https://github.com/apache/spark/actions/workflows/build_java21.yml
- https://github.com/apache/spark/actions/workflows/build_maven_java21.yml
- https://github.com/apache/spark/actions/workflows/build_maven_java21_macos14.yml

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual tests. I regenerated all in Java 21 and this was the only one affected.
```
$ SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#46313 from dongjoon-hyun/SPARK-48016.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants