[SPARK-38211][SQL][DOCS] Add SQL migration guide on restoring loose upcast from string to other types #35519

Closed
manuzhang
Contributor

What changes were proposed in this pull request?

Add doc on restoring loose upcast from string to other types (behavior before 2.4.1) to SQL migration guide.

Why are the changes needed?

After SPARK-24586, loose upcasting from string to other types is not allowed by default. Users can still set spark.sql.legacy.looseUpcast=true to restore the old behavior, but this is not documented.
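
As a rough illustration of the setting described above, the following sketch toggles the flag on a local session and attempts the conversion quoted in the migration guide. It assumes Spark 3.0 or later on the classpath; the object name `LooseUpcastSketch`, the app name, and the `local[1]` master are placeholders chosen for this example and do not come from the PR. The attempts are wrapped in `Try` so that the outcome is printed rather than assumed.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.Try

// Hypothetical object name, used only for this sketch.
object LooseUpcastSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder local session; adjust master and appName as needed.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("loose-upcast-sketch")
      .getOrCreate()
    import spark.implicits._

    // With default settings, the migration guide says this String-to-Boolean
    // upcast is rejected during analysis in Spark 3.0.
    val strictAttempt = Try(Seq("str").toDS.as[Boolean])
    println(s"default settings, as[Boolean] accepted: ${strictAttempt.isSuccess}")

    // Opt back in to the legacy loose upcast for this session.
    spark.conf.set("spark.sql.legacy.looseUpcast", "true")

    // The same conversion, attempted again under the legacy flag.
    // Only the Dataset is constructed here; no action is executed on it.
    val looseAttempt = Try(Seq("str").toDS.as[Boolean])
    println(s"looseUpcast=true, as[Boolean] accepted: ${looseAttempt.isSuccess}")

    spark.stop()
  }
}
```

The flag can also be passed at launch, for example with `--conf spark.sql.legacy.looseUpcast=true` on `spark-submit` or `spark-shell`. Note that the migration guide text describes the legacy conversion as throwing an NPE during execution, so accepting it at analysis time does not make the resulting Dataset safe to evaluate.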

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Only doc change.

@HyukjinKwon
Member

Merged to master, branch-3.2, branch-3.1 and branch-3.0.

HyukjinKwon pushed a commit that referenced this pull request Feb 15, 2022
…pcast from string to other types

### What changes were proposed in this pull request?
Add doc on restoring loose upcast from string to other types (behavior before 2.4.1) to SQL migration guide.

### Why are the changes needed?
After [SPARK-24586](https://issues.apache.org/jira/browse/SPARK-24586), loose upcasting from string to other types is not allowed by default. Users can still set `spark.sql.legacy.looseUpcast=true` to restore the old behavior, but this is not documented.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Only doc change.

Closes #35519 from manuzhang/spark-38211.

Authored-by: tianlzhang <tianlzhang@ebay.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 78514e3)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
@@ -420,7 +420,7 @@ license: |
need to specify a value with units like "30s" now, to avoid being interpreted as milliseconds; otherwise,
the extremely short interval that results will likely cause applications to fail.

-  - When turning a Dataset to another Dataset, Spark will up cast the fields in the original Dataset to the type of corresponding fields in the target DataSet. In version 2.4 and earlier, this up cast is not very strict, e.g. `Seq("str").toDS.as[Int]` fails, but `Seq("str").toDS.as[Boolean]` works and throw NPE during execution. In Spark 3.0, the up cast is stricter and turning String into something else is not allowed, i.e. `Seq("str").toDS.as[Boolean]` will fail during analysis.
+  - When turning a Dataset to another Dataset, Spark will up cast the fields in the original Dataset to the type of corresponding fields in the target DataSet. In version 2.4 and earlier, this up cast is not very strict, e.g. `Seq("str").toDS.as[Int]` fails, but `Seq("str").toDS.as[Boolean]` works and throw NPE during execution. In Spark 3.0, the up cast is stricter and turning String into something else is not allowed, i.e. `Seq("str").toDS.as[Boolean]` will fail during analysis. To restore the behavior before 2.4.1, set `spark.sql.legacy.looseUpcast` to `true`.
Member

You modified the item for the migration to 2.4.1. Does 2.4.1 have the config spark.sql.legacy.looseUpcast? It seems not; see https://github.com/apache/spark/blob/v2.4.1-rc9/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
