Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: null character not permitted in chr function #513

Merged
merged 11 commits into from
Jun 10, 2024

Conversation

vaibhawvipul
Copy link
Contributor

@vaibhawvipul vaibhawvipul commented Jun 4, 2024

Which issue does this PR close?

Closes #479 .

Rationale for this change

Improves compatibility with apache spark.

What changes are included in this PR?

Spark compatible implementation of chr scala func.

How are these changes tested?

New test case added to check null character. All test case pass.

@vaibhawvipul vaibhawvipul changed the title bug: adding chr null test bug: null character not permitted in chr function Jun 4, 2024
@vaibhawvipul vaibhawvipul marked this pull request as ready for review June 5, 2024 07:07
@vaibhawvipul
Copy link
Contributor Author

@andygrove this is ready for review.

val table = "test0"
withTable(table) {
sql(s"create table $table(c9 int, c4 int) using parquet")
sql(s"insert into $table values(0, 0), (66, null), (null, 70), (null, null)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a wider range of values being tested here such as large numbers and negative numbers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, ignore that. We have a separate issue related to large/negative numbers: #480

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @vaibhawvipul. I will plan on merging this early next week if there is no other feedback.

@kazuyukitanimura kazuyukitanimura changed the title bug: null character not permitted in chr function fix: null character not permitted in chr function Jun 7, 2024
@vaibhawvipul vaibhawvipul requested a review from viirya June 9, 2024 04:47
.iter()
.map(|integer: Option<i64>| {
integer
.map(|integer| match core::char::from_u32(integer as u32) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spark Chr expression has the behavior:

  1. If input < 0, return empty char.
  2. If input % 0xFF == 0, return '\u0000'.
  3. Otherwise, return input & 0xFF as char.

Does core::char::from_u32 follow all these behaviors?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this is for the issue of null char, it is okay to fix it in follow up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes i am tracking that, #480 is the one I will work on next. That one handles the cases you mentioned above!

@codecov-commenter
Copy link

codecov-commenter commented Jun 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 34.37%. Comparing base (7ba5693) to head (b42d9e3).
Report is 31 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #513      +/-   ##
============================================
+ Coverage     34.05%   34.37%   +0.31%     
+ Complexity      859      808      -51     
============================================
  Files           116      106      -10     
  Lines         38679    38548     -131     
  Branches       8567     8573       +6     
============================================
+ Hits          13173    13250      +77     
+ Misses        22745    22544     -201     
+ Partials       2761     2754       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@andygrove andygrove merged commit 487053d into apache:main Jun 10, 2024
43 checks passed
himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
* adding chr null test

* removing comments

* repro bug

* adding chr scalarfunc compatible with spark

* remove ANSI configs, as it is not needed for chr

* removing redundant test

* Update core/src/execution/datafusion/expressions/scalar_funcs/chr.rs

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

* adding support for scalar values

* fix clippy errs

* simplifying the scalar/array handling fn

* fix fmt errors

---------

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: null character not permitted in chr function
5 participants