forked from facebookincubator/velox
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix url_extract_* functions for malformed URLs (facebookincubator#7668)
Summary: Fixes facebookincubator#7038

Malformed URLs that contain invalid escape sequences (%xx) used to throw in Velox, but not in Presto. Also, for relative URI references with no scheme or authority, url_extract_path used to return NULL when it should return the path; e.g., url_extract_path('foo') should return 'foo'.

Fix this by making the scheme/authority/path regex compliant with the URI RFC (https://www.rfc-editor.org/rfc/rfc3986#appendix-A).

Some examples of the new behavior:
```
> SELECT url_extract_path('https://www.ucu.edu.uy/agenda/evento/%%UCUrlCompartir%%');
Before: throws exception. After: returns NULL.
> SELECT url_extract_path('foo');
Before: returns NULL. After: returns 'foo'.
```

Pull Request resolved: facebookincubator#7668
Reviewed By: mbasmanova
Differential Revision: D51487947
Pulled By: kgpai
fbshipit-source-id: 3c80e196b1512f62cfcb4f3465ac75fc96b8482c
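The behavior described in the summary can be illustrated with a small standalone C++ sketch. This is not the code from the commit: the names `hasValidEscapes` and `extractPath` are hypothetical, the sketch uses the generic component-splitting regex from RFC 3986 Appendix B instead of the Appendix A grammar that the commit targets, and it assumes that only the path component is checked for invalid `%xx` sequences. `std::nullopt` stands in for SQL NULL.

```cpp
#include <cctype>
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper (not part of Velox): returns true if every '%' in `s`
// is followed by exactly two hexadecimal digits.
bool hasValidEscapes(const std::string& s) {
  for (std::size_t i = 0; i < s.size(); ++i) {
    if (s[i] != '%') {
      continue;
    }
    if (i + 2 >= s.size() ||
        !std::isxdigit(static_cast<unsigned char>(s[i + 1])) ||
        !std::isxdigit(static_cast<unsigned char>(s[i + 2]))) {
      return false;
    }
  }
  return true;
}

// Hypothetical stand-in for url_extract_path: returns the raw (undecoded)
// path, or std::nullopt when the path holds an invalid escape sequence.
std::optional<std::string> extractPath(const std::string& url) {
  // Generic URI regex from RFC 3986, Appendix B. Group 5 is the path.
  static const std::regex kUriRegex(
      R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
  std::smatch match;
  if (!std::regex_match(url, match, kUriRegex)) {
    return std::nullopt;
  }
  const std::string path = match[5].str();
  if (!hasValidEscapes(path)) {
    return std::nullopt; // Malformed input yields NULL instead of throwing.
  }
  return path;
}
```

Under these assumptions, `extractPath("foo")` yields `"foo"` because the scheme and authority groups are optional, and the `%%UCUrlCompartir%%` URL from the commit message yields `std::nullopt` because `%` is followed by a non-hex character.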
1 parent b69844d, commit 70320dd
Showing 4 changed files with 164 additions and 35 deletions.