[GLUTEN-7475][VL] fix: remove unnecessary trim function in CAST, cuz velox does it #7476

Open: 2 commits to be merged into base branch main


Henry2SS

What changes were proposed in this pull request?

Remove the unnecessary trim node in CAST when the input type is VARCHAR, because Velox already trims whitespace itself.

(Fixes: #7475)

How was this patch tested?

Integration tests passed locally. The performance is 1.x times faster.

@github-actions github-actions bot added CORE works for Gluten Core VELOX labels Oct 11, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}



Run Gluten Clickhouse CI

@Henry2SS Henry2SS changed the title [VL] remove unnecessary trim function in CAST, cuz velox does it [GLUTEN-7476][VL] remove unnecessary trim function in CAST, cuz velox does it Oct 11, 2024
@Henry2SS Henry2SS changed the title [GLUTEN-7476][VL] remove unnecessary trim function in CAST, cuz velox does it [GLUTEN-7476][VL] fix: remove unnecessary trim function in CAST, cuz velox does it Oct 11, 2024

#7476

@Henry2SS Henry2SS changed the title [GLUTEN-7476][VL] fix: remove unnecessary trim function in CAST, cuz velox does it [GLUTEN-7475][VL] fix: remove unnecessary trim function in CAST, cuz velox does it Oct 11, 2024

#7475


Run Gluten Clickhouse CI

@PHILO-HE
Contributor

@Henry2SS, thanks for your PR! This piece of Scala code exists to keep Gluten/Velox consistent with Spark. It seems the current Velox code does not use a whitespace definition that is consistent with Spark's. Could you check further?

@Henry2SS
Author

Henry2SS commented Oct 11, 2024

> @Henry2SS, thanks for your PR! This piece of Scala code exists to keep Gluten/Velox consistent with Spark. It seems the current Velox code does not use a whitespace definition that is consistent with Spark's. Could you check further?

Thanks for your reply!
There is a method called isUnicodeWhiteSpace in velox/functions/lib/string/StringImpl.h.

I also noticed that in facebookincubator/velox#7377 the author handles different corner cases for Presto and Spark in CAST, including removeWhiteSpaces.

Unfortunately, I don't know this area well. Could you please confirm?
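
To make the concern about whitespace definitions concrete, here is a minimal Java sketch. It does not reproduce the actual Spark, Gluten, or Velox logic. It only uses two JDK methods whose whitespace sets are known to differ (`String.trim()` strips every character up to U+0020, `String.strip()` strips what `Character.isWhitespace` accepts) as stand-ins for "engine A" and "engine B". The class and method names are hypothetical.

```java
import java.util.OptionalInt;

// Hypothetical illustration only: two trim rules with different whitespace
// definitions, followed by the same string-to-int cast.
public class WhitespaceCastSketch {

    // Stand-in for one engine: strips all chars <= U+0020 (includes control chars).
    static String trimAsciiControl(String s) {
        return s.trim();
    }

    // Stand-in for another engine: strips chars accepted by Character.isWhitespace.
    static String trimUnicodeWhitespace(String s) {
        return s.strip(); // requires Java 11+
    }

    // A cast that yields "null" (empty) when the trimmed text is not a valid int.
    static OptionalInt castToInt(String trimmed) {
        try {
            return OptionalInt.of(Integer.parseInt(trimmed));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        String[] inputs = {
            " 12 ",      // plain spaces: both rules agree
            "\u000112",  // leading control char U+0001: only trim() removes it
            "\u200512"   // leading U+2005 (four-per-em space): only strip() removes it
        };
        for (String in : inputs) {
            System.out.println(
                castToInt(trimAsciiControl(in)) + " vs " + castToInt(trimUnicodeWhitespace(in)));
        }
    }
}
```

If the two rules disagree on even one character, the same input casts to a value under one rule and to null under the other. That is why removing the trim node is only safe if Velox's own trimming in CAST matches Spark's definition exactly.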

@Henry2SS
Author

> @Henry2SS, thanks for your PR! This piece of Scala code exists to keep Gluten/Velox consistent with Spark. It seems the current Velox code does not use a whitespace definition that is consistent with Spark's. Could you check further?

Also, do we have unit tests in CI that cover CAST from VARCHAR?
If so, could a full CI run confirm whether this change is safe?

Successfully merging this pull request may close these issues.

[VL] remove unnecessary trim node in CAST, cuz velox does it