-
Notifications
You must be signed in to change notification settings - Fork 334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass "new" tests #537
Pass "new" tests #537
Conversation
f3fbe89
to
3e4557c
Compare
The goal would be to merge this branch into the tests one, before merging both I suppose. |
bc05da8
to
47fbd08
Compare
47fbd08
to
8dd3dbd
Compare
Rebased into master, and fixed the merge issues. |
8dd3dbd
to
3ec8a66
Compare
I don't understand this part. |
My bad, I finally changed the target branch to master. the sentence is not relevant anymore |
Could you maybe squash some of the commits together to make the tryouts commits and whatnot disappear, to help with review? |
Sure, would you rather have one huge commit, or several smaller ones ? |
Small ones are ok as long as they are self-contained, for example the commits for setter fixes are ok, but the ones with commented-out assertions aren't. |
03df7ad
to
6aa53cc
Compare
I've narrowed it down to 5-6 commits, not sure I can do more though :/ |
That's great already! Thanks a lot for that. |
6aa53cc
to
20e7a17
Compare
src/host.rs
Outdated
fn from(host: Host<S>) -> HostInternal { | ||
match host { | ||
Host::Domain(ref s) if s.to_string().is_empty() => HostInternal::None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly is this doing? This allocates a new string every time. Do we really need the generics here? Couldn't it be From<Host<String>>
or From<Host<&'_ str>>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually don't need the generics here.
The main goal of this change is to make sure HostInternal::None is returned when an empty domain is given.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can it be empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a list of the failing tests that occur if I don't add this check:
failures:
"file://hi/x".host = ""
"file://hi/x".hostname = ""
"file://y/".host = "loc%41lhost"
"file://y/".hostname = "loc%41lhost"
Maybe the fix should occur somewhere else, ie after the host parsing ?
........................................................................................................................................................................thread '"file://hi/x".hostname = "" ' panicked at 'called `Result::unwrap()` on an `Err` value: "!( !host_str.is_empty() ) for URL \"file:///x\""', src/libcore/result.rs.:999:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
..thread '"file://hi/x".host = "" ' panicked at 'called `Result::unwrap()` on an `Err` value: "!( !host_str.is_empty() ) for URL \"file:///x\""', src/libcore/result.rs:999:5
.......F.F...thread '"file://y/".host = "loc%41lhost" ' panicked at 'called `Result::unwrap()` on an `Err` value: "!( !host_str.is_empty() ) for URL \"file:///\""thread '', "file://y/".hostname = "loc%41lhost" src/libcore/result.rs' panicked at ':called `Result::unwrap()` on an `Err` value: "!( !host_str.is_empty() ) for URL \"file:///\""999', :src/libcore/result.rs5:
999:5
..F.F............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the <From<Host<S>>>::from
call done in those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I m not really sure this is the question you're asking me, but the conversion I'm trying to work on is host.into()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a complete review yet.
src/lib.rs
Outdated
let mut has_host = true; // FIXME | ||
parser.parse_path_start(scheme_type, &mut has_host, parser::Input::new(path)); | ||
if scheme_type.is_file() { | ||
parser::trim_path(&mut parser.serialization, path_start); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the spec for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to dig quite a bit to figure out how to pass the path tests.
All I could find was this discussion.
Here's a list of test that fail when the path isn't trimmed:
running 713 tests
..........................................................................................................................................................thread '"file:///unicorn".pathname = "//\\/" File URLs and (back)slashes' panicked at 'assertion failed: `(left == right)`
left: `"file://////"`,
right: `"file:///"`.', tests/data.rs:192:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
thread '"file:///unicorn".pathname = "//monkey/..//" File URLs and (back)slashes' panicked at 'assertion failed: `(left == right)`
left: `"file://///"`,
right: `"file:///"`.', tests/data.rs:192:5
..........FF.......thread '"file://monkey/".pathname = "\\\\" File URLs and (back)slashes' panicked at 'assertion failed: `(left == right)`
left: `"file://monkey//"`,
right: `"file://monkey/"`', .tests/data.rs:192:5
.F.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
failures:
failures:
"file:///unicorn".pathname = "//\\/" File URLs and (back)slashes
"file:///unicorn".pathname = "//monkey/..//" File URLs and (back)slashes
"file://monkey/".pathname = "\\\\" File URLs and (back)slashes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not a spec issue, please link to a spec commit.
If I understand correctly, you trim the path only after it was built and not trimmed? Couldn't it just not be pushing empty segments on repeated slashes in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just found https://bugzilla.mozilla.org/show_bug.cgi?id=1351603, but I still cannot find any spec commit :/
I ll have a look at parse_path_start and try to not push the empty segments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this was the spec bug for the tests: whatwg/url#278
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If url’s scheme is "file" and c is the EOF code point, U+003F (?), or U+0023 (#),
then while url’s path’s size is greater than 1 and url’s path[0] is the empty string,
validation error, remove the first item from url’s path.
is probably the culprit ! thanks for the hint @valenting :D
There's probably a way to avoid pushing the empty segments, which I ll work on now that I see the spec :)
} | ||
|
||
pub fn trim_tab_and_newlines(original_input: &'i str) -> Self { | ||
let input = original_input.trim_matches(ascii_tab_or_new_line); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmmh, when trimming those, are they no longer syntax violations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh my bad, they are !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand anymore what this code is supposed to be doing anymore. Why are we not trimming c0_control_or_space
anymore? This method looks oddly like with_log
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As defined in the basic url parser spec, removing C0 control or space from input only occurs when If url is not given.
Removing tabs and newlines occurs either way though.
This means we need to use different parser::Input
s.
I think It would be more explicit if with_log / trim_tab_and_newlines mentionned that, ie
parser::Input::with_base_url / parser::Input::no_base_url or something, or having a factory that would decide on the best parser to build depending on having a base_url.
Setters would use the former, and the default parser would use the latter.
What do you think ?
src/parser.rs
Outdated
return; | ||
} | ||
// If url’s scheme is "file", path’s size is 1, and path[0] is a normalized Windows drive letter, then return. | ||
let segments: Vec<&str> = self.serialization[path_start..] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems bad to me that we allocate a vector all the time here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, I'll try to figure something else out :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found it ! I'm pushing it as a fixup, will do a rebase once we're both safisfied with most of the fixes
c83a24a
to
63c69e8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's wait for #566 to be closed and we can merge this.
Reviewed 2 of 3 files at r9.
Reviewable status: all files reviewed, 1 unresolved discussion
Nox doesn't have time for the review. I've done it instead.
The two json files were taken from web-platform-tests/wpt@e69af82 > test result: FAILED. 624 passed; 89 failed; 0 ignored; 0 measured
> test result: FAILED. 637 passed; 76 failed; 0 ignored; 0 measured
> test result: FAILED. 640 passed; 73 failed; 0 ignored; 0 measured
> test result: FAILED. 642 passed; 71 failed; 0 ignored; 0 measured
> test result: FAILED. 650 passed; 63 failed; 0 ignored; 0 measured
3b394f6
to
4464840
Compare
@valenting Just rebased ! 🤞 |
Issue servo#537 fixed a large number of small bugs that affected spec compliance. We should publish a new crate version with these changes.
PR servo#537 fixed a large number of issues which affected compliance with the URL spec. We should release a new crate version with these changes.
Update version to 2.1.1 PR #537 fixed a large number of issues which affected compliance with the URL spec. We should release a new crate version with these changes.
current status :
Please note I tried my best to follow the spec, but it sometimes (often?) felt like I was just hacking my way until tests passed.
I know the PR is huge, I would love to split it (and then squash it) if it makes sense.
Mentioning @nox , @SimonSapin and @valenting because they're the contributors I've seen the most in the commit logs :)
Please let me know if there's anything I can do to improve the PR, and thank you for your time!
This change is