Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

functions: Add lower support #3521

Merged
merged 4 commits into from
Dec 17, 2021
Merged

functions: Add lower support #3521

merged 4 commits into from
Dec 17, 2021

Conversation

Xuanwo
Copy link
Member

@Xuanwo Xuanwo commented Dec 17, 2021

Signed-off-by: Xuanwo github@xuanwo.io

I hereby agree to the terms of the CLA available at: https://databend.rs/policies/cla/

Summary

Add lower support

Checklist

  • Implement functios
  • Add stateless tests
  • Add website docs

Changelog

  • New Feature

Related Issues

Test Plan

Unit Tests

Stateless Tests

Signed-off-by: Xuanwo <github@xuanwo.io>
@databend-bot databend-bot added the pr-feature this PR introduces a new feature to the codebase label Dec 17, 2021
@databend-bot
Copy link
Member

Thanks for the contribution!
I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

Signed-off-by: Xuanwo <github@xuanwo.io>
Signed-off-by: Xuanwo <github@xuanwo.io>
Signed-off-by: Xuanwo <github@xuanwo.io>
@Xuanwo Xuanwo marked this pull request as ready for review December 17, 2021 07:18
@Xuanwo Xuanwo requested a review from BohuTANG as a code owner December 17, 2021 07:18
@Xuanwo
Copy link
Member Author

Xuanwo commented Dec 17, 2021

/review @sundy-li @PsiACE

@databend-bot
Copy link
Member

Take the reviewer to sundy-li @PsiACE

@sundy-li
Copy link
Member

BTW Upper can be addressed too.

@Xuanwo
Copy link
Member Author

Xuanwo commented Dec 17, 2021

BTW Upper can be addressed too.

Yep, I will add in the next PR 😄

@@ -38,6 +38,7 @@ hex = "0.4.3"
base64 = "0.13.0"
itertools = "0.10.3"
num-format = "0.4"
bstr = "0.2.17"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this crate for?

Copy link
Member Author

@Xuanwo Xuanwo Dec 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bstr is a byte string library. I use the ByteSlice trait so that we don't need to implement s.char_indices() on [u8] again.

Copy link
Member

@PsiACE PsiACE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

impl StringOperator for Lower {
#[inline]
fn apply_with_no_null<'a>(&'a mut self, s: &'a [u8], buffer: &mut [u8]) -> usize {
buffer.copy_from_slice(&s.to_lowercase());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some research, I found:

  • to_lowercase_into requires a &mut Vec<u8> and we can't fit the requirement.

One possible solution is to implement the to_lowercase_into by hand. What's your idea?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use the codes inside to_lowercase_into:

 #[inline]
    fn to_lowercase_into(&self, buf: &mut Vec<u8>) {
        // TODO: This is the best we can do given what std exposes I think.
        // If we roll our own case handling, then we might be able to do this
        // a bit faster. We shouldn't roll our own case handling unless we
        // need to, e.g., for doing caseless matching or case folding.

        // TODO(BUG): This doesn't handle any special casing rules.

        buf.reserve(self.as_bytes().len());
        for (s, e, ch) in self.char_indices() {
            if ch == '\u{FFFD}' {
                buf.push_str(&self.as_bytes()[s..e]);
            } else if ch.is_ascii() {
                buf.push_char(ch.to_ascii_lowercase());
            } else {
                for upper in ch.to_lowercase() {
                    buf.push_char(upper);
                }
            }
        }
    }

@codecov-commenter
Copy link

Codecov Report

Merging #3521 (bb66818) into main (e3dc408) will decrease coverage by 0%.
The diff coverage is 97%.

Impacted file tree graph

@@          Coverage Diff          @@
##            main   #3521   +/-   ##
=====================================
- Coverage     60%     60%   -1%     
=====================================
  Files        622     623    +1     
  Lines      34840   34846    +6     
=====================================
- Hits       21244   21240    -4     
- Misses     13596   13606   +10     
Impacted Files Coverage Δ
common/meta/types/src/user_stage.rs 100% <ø> (ø)
query/src/configs/config.rs 57% <90%> (ø)
metasrv/src/configs/config.rs 95% <96%> (ø)
query/src/configs/config_storage.rs 87% <96%> (ø)
common/functions/src/scalars/strings/lower.rs 100% <100%> (ø)
common/functions/src/scalars/strings/string.rs 97% <100%> (+<1%) ⬆️
common/meta/raft-store/src/config.rs 75% <100%> (ø)
common/meta/types/src/cluster.rs 88% <100%> (ø)
common/meta/types/src/user_info.rs 87% <100%> (ø)
common/meta/types/src/user_quota.rs 80% <100%> (ø)
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7cb8185...bb66818. Read the comment docs.

@databend-bot
Copy link
Member

CI Passed
Reviewers Approved
Let's Merge
Thank you for the PR @Xuanwo

@databend-bot databend-bot merged commit 20c53f8 into databendlabs:main Dec 17, 2021
@Xuanwo
Copy link
Member Author

Xuanwo commented Dec 17, 2021

Oh, the pr already merged...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-feature this PR introduces a new feature to the codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LCASE() | Synonym for LOWER() LOWER() | Return the argument in lowercase
5 participants