-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ListingTableUrl
to decode percent
#3750
Conversation
@@ -80,6 +80,7 @@ ordered-float = "3.0" | |||
parking_lot = "0.12" | |||
parquet = { version = "24.0.0", features = ["arrow", "async"] } | |||
paste = "^1.0" | |||
percent-encoding = "2.2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
datafusion/core
uses url
crate by servo.
percent-encoding
is also published by the same repository.
@@ -108,7 +109,8 @@ impl ListingTableUrl { | |||
|
|||
/// Creates a new [`ListingTableUrl`] from a url and optional glob expression | |||
fn new(url: Url, glob: Option<Pattern>) -> Self { | |||
let prefix = Path::parse(url.path()).expect("should be URL safe"); | |||
let decoded_path = percent_encoding::percent_decode_str(url.path()).decode_utf8_lossy(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the url.path()
, it is still percent encoded string.
percent_decode_str
decodes it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will cause the parse call below to fail in the event of non-ASCII characters.
I think it should just be a case of change Path::parse
to Path::from
@alamb PTAL? |
@@ -246,6 +248,9 @@ mod tests { | |||
let url = ListingTableUrl::parse("file:///foo").unwrap(); | |||
let child = Path::parse("/foob/bar").unwrap(); | |||
assert!(url.strip_prefix(&child).is_none()); | |||
|
|||
let url = ListingTableUrl::parse("file:///with space/foo/bar").unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we get a test of a non-ASCII character such as a percent encoded emoji or something
We should also test non-URL safe characters like a percent encoded ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I added some test cases at 439f6dc
if it's not enough or wrong to test, please tell me about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me 👍
I think this just needs a cargo fmt and then we can get it in 😃 |
@tustvold Oh sorry, I added a commit for that😺 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @unvalley
Benchmark runs are scheduled for baseline = ac1631a and contender = 2352f3e. 2352f3e is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #3589
Rationale for this change
What changes are included in this PR?
percent_encoding
crate todatafusion/core
to decode percent encoded url path.Are there any user-facing changes?