Fix like matching with newlines and no wildcard #23404

Merged: 1 commit into prestodb:master from the like-newline branch, Aug 12, 2024

@rschlussel rschlussel commented Aug 8, 2024

Description

LIKE expressions did not look past a newline when the pattern contained no wildcard. As a result, a pattern could incorrectly match an input whose text after the first newline did not match the pattern. This change fixes that behavior. For example:

SELECT 'foo\nbar' LIKE 'foo';

Previously this query returned true. Now it returns false.
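
To illustrate this class of bug, here is a minimal sketch using java.util.regex. It is not Presto's implementation (Presto compiles LIKE patterns with its own regex library), and the class and method names are hypothetical. It assumes the failure mode is analogous to anchoring the translated pattern at line boundaries rather than at the boundaries of the whole input, and it ignores escape characters.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in Presto.
final class LikeNewlineSketch
{
    private LikeNewlineSketch() {}

    // Translate a LIKE pattern into a regex body: '%' becomes ".*",
    // '_' becomes ".", and every other character is quoted literally.
    // Escape characters are deliberately not handled in this sketch.
    static String toRegexBody(String likePattern)
    {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            }
            else if (c == '_') {
                regex.append(".");
            }
            else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    // Line-anchored: with MULTILINE, '^' and '$' match at line boundaries,
    // so a match can end at the first newline.
    static boolean lineAnchoredLike(String value, String likePattern)
    {
        Pattern pattern = Pattern.compile(
                "^" + toRegexBody(likePattern) + "$",
                Pattern.MULTILINE | Pattern.DOTALL);
        return pattern.matcher(value).find();
    }

    // Whole-input anchored: \A and \z match only at the very start and
    // very end of the input, so the entire value must match.
    static boolean wholeInputLike(String value, String likePattern)
    {
        Pattern pattern = Pattern.compile(
                "\\A" + toRegexBody(likePattern) + "\\z",
                Pattern.DOTALL);
        return pattern.matcher(value).find();
    }

    public static void main(String[] args)
    {
        System.out.println(lineAnchoredLike("foo\nbar", "foo"));   // true (the buggy result)
        System.out.println(wholeInputLike("foo\nbar", "foo"));     // false (the expected result)
        System.out.println(wholeInputLike("foo\nbar", "foo_bar")); // true, '_' matches the newline
        System.out.println(wholeInputLike("foo\nbar", "foo%"));    // true, '%' spans the newline
    }
}
```

In this sketch, patterns that contain a wildcard covering the newline behave the same under both anchoring styles; only wildcard-free patterns such as 'foo' change result, which matches the impact described below.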

Motivation and Context

Fixes #23281

Fixes a bug where LIKE expressions matched only up to the first newline when the pattern had no wildcard.

Impact

LIKE expressions whose patterns do not end in a wildcard may now return false where they previously returned true.

Test Plan

unit tests
verification on our production workload

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Fix a bug where like predicates would only match to a newline when there was no wildcard at the end. :pr:`23404`


jp-sivaprasad commented Aug 8, 2024

Suggested additional tests

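@Test
public void testLikeNewlineAtMatch()
{
    // '_' matches exactly one character, so it should also match the newline.
    Regex regex = likePattern(utf8Slice("foo_bar"));
    assertTrue(likeVarchar(utf8Slice("foo\nbar"), regex));
}

@Test
public void testLikeNewlineInMatchWithSingle()
{
    // '_' matches the second 'o'; the trailing '%' should match the rest, including the newline.
    Regex regex = likePattern(utf8Slice("fo_%"));
    assertTrue(likeVarchar(utf8Slice("foo\nbar"), regex));
}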

cc @tdcmeehan

steveburnett previously approved these changes Aug 8, 2024

@steveburnett steveburnett left a comment

LGTM! (docs)

Pull branch, new local doc build, looks good. Thanks!

@steveburnett

Nit: I suggest adding the PR number to the release note entry.

== RELEASE NOTES ==

General Changes
* Fix a bug where like predicates would only match to a newline when there was no wildcard at the end. :pr:`23404`

@rschlussel

@jp-sivaprasad Thanks for the review! I added the first test. For the second case, I felt the existing newline tests with wildcards already provide enough coverage, so I left it out. Also, I see now that you had assigned this issue to yourself. Sorry for not coordinating with you first!

@facebook-github-bot

@rschlussel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@rschlussel rschlussel marked this pull request as ready for review August 8, 2024 23:51

@hantangwangd hantangwangd left a comment

Looks good to me, just a few small nits.

Two review threads on presto-docs/src/main/sphinx/functions/comparison.rst (outdated, resolved)
@rschlussel rschlussel merged commit 9db7e1b into prestodb:master Aug 12, 2024
58 of 59 checks passed
@rschlussel rschlussel deleted the like-newline branch August 12, 2024 19:55
@tdcmeehan tdcmeehan mentioned this pull request Aug 23, 2024
34 tasks
Successfully merging this pull request may close these issues.

LIKE matches till new line
7 participants