[SPARK-47900] Fix check for implicit (UTF8_BINARY) collation #46116

Closed
wants to merge 2 commits into from

stefankandic
Contributor

What changes were proposed in this pull request?

Fix method name and logic for cases where we want to check if the string has the UTF8 binary (implicit) collation.

Why are the changes needed?

#45592 introduced session-level collation, which means the default collation is no longer necessarily the same as the implicit collation. A method in SchemaUtils had its name and logic changed to check whether the collation is binary orderable, but that was not its original intent. It should check whether the collation is not the implicit UTF8 binary collation (id != 0).
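
To illustrate the distinction, here is a minimal, self-contained sketch. It is not the code from this patch: the class, field, and method names below are hypothetical stand-ins, and the only detail taken from the description is that id 0 denotes the implicit UTF8 binary collation. The point is that a "binary orderable" check and an "id != 0" check can disagree for a collation that has a non-zero id but still supports binary ordering.

```java
public class CollationCheckSketch {
    // Assumption from the PR description: id 0 is the implicit UTF8 binary collation.
    static final int UTF8_BINARY_COLLATION_ID = 0;

    // Hypothetical stand-in for a collation entry; not Spark's real class.
    static final class Collation {
        final int id;
        final boolean supportsBinaryOrdering;

        Collation(int id, boolean supportsBinaryOrdering) {
            this.id = id;
            this.supportsBinaryOrdering = supportsBinaryOrdering;
        }
    }

    // The unintended meaning: "is this collation not binary orderable?"
    static boolean isNotBinaryOrderable(Collation c) {
        return !c.supportsBinaryOrdering;
    }

    // The intended meaning: "is this collation anything other than the implicit one?"
    static boolean hasNonUtf8BinaryCollation(Collation c) {
        return c.id != UTF8_BINARY_COLLATION_ID;
    }

    public static void main(String[] args) {
        Collation implicit = new Collation(0, true);
        // Hypothetical collation: non-zero id, yet still binary orderable.
        Collation explicitButOrderable = new Collation(7, true);

        System.out.println(isNotBinaryOrderable(implicit));
        System.out.println(hasNonUtf8BinaryCollation(implicit));
        // The two checks disagree here, which is the case the fix targets.
        System.out.println(isNotBinaryOrderable(explicitButOrderable));
        System.out.println(hasNonUtf8BinaryCollation(explicitButOrderable));
    }
}
```

Under these assumptions, only the id-based check flags the second collation as non-implicit, which matches the intent stated above.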

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing unit tests.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Apr 18, 2024
@stefankandic stefankandic changed the title [SPARK-47900] Fix check for implicit collation [SPARK-47900] Fix check for implicit (UTF8_BINARY) collation Apr 18, 2024
@stefankandic
Contributor Author

@cloud-fan can we merge this?

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in b20356e Apr 22, 2024
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
Closes apache#46116 from stefankandic/fixBinaryCheckLogic.

Authored-by: Stefan Kandic <stefan.kandic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>