Pass "new" tests #537

o0Ignition0o · 2019-08-02T22:21:31Z

Current status:

test result: ok. 63 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Please note I tried my best to follow the spec, but it sometimes (often?) felt like I was just hacking my way until tests passed.

I know the PR is huge, I would love to split it (and then squash it) if it makes sense.

Mentioning @nox, @SimonSapin and @valenting because they're the contributors I've seen the most in the commit logs :)

Please let me know if there's anything I can do to improve the PR, and thank you for your time!

o0Ignition0o · 2019-08-03T22:31:46Z

The goal would be to merge this branch into the tests one, before merging both I suppose.
I can of course also make a pull request directly to master if need be

o0Ignition0o · 2019-08-16T08:27:35Z

Rebased onto master and fixed the merge issues.
I think a lot of us are on PTO right now, please let me know when you're back :)

nox · 2019-08-19T10:46:17Z

> The goal would be to merge this branch into the tests one, before merging both I suppose.
> I can of course also make a pull request directly to master if need be

I don't understand this part.

o0Ignition0o · 2019-08-19T10:49:09Z

> I don't understand this part.

My bad, I ended up changing the target branch to master, so that sentence is no longer relevant.

nox · 2019-08-19T10:50:40Z

Could you maybe squash some of the commits together to make the tryouts commits and whatnot disappear, to help with review?

o0Ignition0o · 2019-08-19T10:52:36Z

Sure, would you rather have one huge commit, or several smaller ones?
I know the PR is pretty big and probably a PITA to review. I'm willing to do anything I can to ease the review process, I just don't know how to make it easier to read.

nox · 2019-08-19T10:53:24Z

Small ones are ok as long as they are self-contained, for example the commits for setter fixes are ok, but the ones with commented-out assertions aren't.

o0Ignition0o · 2019-08-19T13:18:35Z

I've narrowed it down to 5-6 commits, not sure I can do more though :/

nox · 2019-08-19T13:19:03Z

That's great already! Thanks a lot for that.

nox · 2019-08-19T13:48:03Z

src/host.rs

    fn from(host: Host<S>) -> HostInternal {
        match host {
+            Host::Domain(ref s) if s.to_string().is_empty() => HostInternal::None,


What exactly is this doing? This allocates a new string every time. Do we really need the generics here? Couldn't it be From<Host<String>> or From<Host<&'_ str>>?

We actually don't need the generics here.
The main goal of this change is to make sure HostInternal::None is returned when an empty domain is given.

How can it be empty?

Here's a list of the failing tests that occur if I don't add this check:

failures:
    "file://hi/x".host = ""
    "file://hi/x".hostname = ""
    "file://y/".host = "loc%41lhost"
    "file://y/".hostname = "loc%41lhost"

Maybe the fix should happen somewhere else, i.e. after host parsing?

    thread '"file://hi/x".hostname = "" ' panicked at 'called `Result::unwrap()` on an `Err` value: "!( !host_str.is_empty() ) for URL \"file:///x\""', src/libcore/result.rs:999:5
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
    thread '"file://hi/x".host = "" ' panicked at 'called `Result::unwrap()` on an `Err` value: "!( !host_str.is_empty() ) for URL \"file:///x\""', src/libcore/result.rs:999:5
    thread '"file://y/".host = "loc%41lhost" ' panicked at 'called `Result::unwrap()` on an `Err` value: "!( !host_str.is_empty() ) for URL \"file:///\""', src/libcore/result.rs:999:5
    thread '"file://y/".hostname = "loc%41lhost" ' panicked at 'called `Result::unwrap()` on an `Err` value: "!( !host_str.is_empty() ) for URL \"file:///\""', src/libcore/result.rs:999:5

Where is the <From<Host<S>>>::from call done in those?

I'm not really sure this answers your question, but the conversion I'm working on is host.into()
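For illustration, here is a minimal sketch of how the empty-domain check discussed above could avoid the `to_string()` allocation. The `Host` and `HostInternal` enums below are simplified stand-ins, not the crate's real definitions, and the `AsRef<str>` bound is an assumption made for this sketch (the reviewer's suggestion of concrete `From<Host<String>>` or `From<Host<&str>>` impls would work just as well).

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

// Simplified stand-ins for the crate's types; hypothetical, for illustration only.
#[derive(Debug, PartialEq)]
pub enum Host<S> {
    Domain(S),
    Ipv4(Ipv4Addr),
    Ipv6(Ipv6Addr),
}

#[derive(Debug, PartialEq)]
pub enum HostInternal {
    None,
    Domain,
    Ipv4(Ipv4Addr),
    Ipv6(Ipv6Addr),
}

// Assumption: bounding S by AsRef<str> lets us inspect the domain in place.
impl<S: AsRef<str>> From<Host<S>> for HostInternal {
    fn from(host: Host<S>) -> HostInternal {
        match host {
            // Borrow the domain as &str; no String is allocated for the check.
            Host::Domain(ref s) if s.as_ref().is_empty() => HostInternal::None,
            Host::Domain(_) => HostInternal::Domain,
            Host::Ipv4(addr) => HostInternal::Ipv4(addr),
            Host::Ipv6(addr) => HostInternal::Ipv6(addr),
        }
    }
}
```

With these stand-ins, `HostInternal::from(Host::Domain(""))` yields `HostInternal::None`, while a non-empty domain still maps to `HostInternal::Domain`.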

nox

Not a complete review yet.

nox · 2019-08-19T14:04:17Z

src/lib.rs

                let mut has_host = true; // FIXME
                parser.parse_path_start(scheme_type, &mut has_host, parser::Input::new(path));
+                if scheme_type.is_file() {
+                    parser::trim_path(&mut parser.serialization, path_start);
+                }


Where is the spec for this?

I had to dig quite a bit to figure out how to pass the path tests.
All I could find was this discussion.

Here's a list of tests that fail when the path isn't trimmed:

    running 713 tests
    thread '"file:///unicorn".pathname = "//\\/" File URLs and (back)slashes' panicked at 'assertion failed: `(left == right)` left: `"file://////"`, right: `"file:///"`', tests/data.rs:192:5
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
    thread '"file:///unicorn".pathname = "//monkey/..//" File URLs and (back)slashes' panicked at 'assertion failed: `(left == right)` left: `"file://///"`, right: `"file:///"`', tests/data.rs:192:5
    thread '"file://monkey/".pathname = "\\\\" File URLs and (back)slashes' panicked at 'assertion failed: `(left == right)` left: `"file://monkey//"`, right: `"file://monkey/"`', tests/data.rs:192:5

    failures:
        "file:///unicorn".pathname = "//\\/" File URLs and (back)slashes
        "file:///unicorn".pathname = "//monkey/..//" File URLs and (back)slashes
        "file://monkey/".pathname = "\\\\" File URLs and (back)slashes

That's not a spec issue, please link to a spec commit.

If I understand correctly, you trim the path only after it was built and not trimmed? Couldn't it just not be pushing empty segments on repeated slashes in the first place?

I have just found https://bugzilla.mozilla.org/show_bug.cgi?id=1351603, but I still cannot find any spec commit :/
I'll have a look at parse_path_start and try not to push the empty segments.

I believe this was the spec bug for the tests: whatwg/url#278

If url’s scheme is "file" and c is the EOF code point, U+003F (?), or U+0023 (#), then while url’s path’s size is greater than 1 and url’s path[0] is the empty string, validation error, remove the first item from url’s path.

is probably the culprit! Thanks for the hint @valenting :D

There's probably a way to avoid pushing the empty segments, which I'll work on now that I've seen the spec :)
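The spec step quoted above can be sketched directly on the serialized URL. This is only an illustration, not the code that landed in the PR: the function name `trim_leading_empty_segments` is hypothetical, and it assumes the caller has already checked that the scheme is "file" and that `path_start` is the byte offset of the path's first `/`.

```rust
/// Hypothetical helper: drops leading empty path segments so that at most one
/// `/` remains at the start of the path, mirroring "while url's path's size is
/// greater than 1 and url's path[0] is the empty string, remove the first item".
/// Assumes `serialization[path_start..]` begins with the path.
fn trim_leading_empty_segments(serialization: &mut String, path_start: usize) {
    // Each extra leading '/' corresponds to one leading empty segment.
    let leading = serialization[path_start..]
        .bytes()
        .take_while(|&b| b == b'/')
        .count();
    if leading > 1 {
        // Keep exactly one slash; remove the others in place, without a Vec.
        serialization.replace_range(path_start..path_start + leading - 1, "");
    }
}
```

Applied to `"file://////"` with `path_start = 7`, this leaves `"file:///"`, which is the value the failing pathname tests above expect.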

nox · 2019-08-19T14:06:46Z

src/parser.rs

+    }
+
+    pub fn trim_tab_and_newlines(original_input: &'i str) -> Self {
+        let input = original_input.trim_matches(ascii_tab_or_new_line);


Mmmh, when trimming those, are they no longer syntax violations?

Oh, my bad, they are!

I no longer understand what this code is supposed to be doing. Why are we not trimming c0_control_or_space anymore? This method now looks oddly like with_log.

As defined in the basic URL parser spec, leading and trailing C0 control or space is removed from the input only if url is not given.

Tabs and newlines are removed either way, though.

This means we need different parser::Input constructors.

I think it would be more explicit if with_log / trim_tab_and_newlines said so in their names, i.e.
parser::Input::with_base_url / parser::Input::no_base_url or something, or if a factory decided which parser to build depending on whether there is a base_url.

Setters would use the former, and the default parser would use the latter.
What do you think?
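To make the distinction concrete, here is a rough sketch of the two preprocessing modes described above. The function `preprocess` and its `url_given` flag are hypothetical names chosen for this example, and returning an owned `String` is a simplification for clarity; a real parser input would more likely skip these characters lazily while iterating.

```rust
/// C0 control or space: U+0000 through U+0020 inclusive.
fn c0_control_or_space(c: char) -> bool {
    c <= ' '
}

/// ASCII tab or newline: U+0009, U+000A, U+000D.
fn ascii_tab_or_new_line(c: char) -> bool {
    matches!(c, '\t' | '\n' | '\r')
}

/// Hypothetical helper illustrating the two cases:
/// - `url_given == false`: trim leading/trailing C0 control or space first;
/// - in both cases: remove every ASCII tab or newline from the input.
fn preprocess(input: &str, url_given: bool) -> String {
    let trimmed = if url_given {
        input
    } else {
        input.trim_matches(c0_control_or_space)
    };
    trimmed
        .chars()
        .filter(|&c| !ascii_tab_or_new_line(c))
        .collect()
}
```

For the input `"  a\tb \n"`, the sketch returns `"ab"` when `url_given` is false and `"  ab "` when it is true, so only the first mode strips the surrounding spaces.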

nox · 2019-08-19T14:09:15Z

src/parser.rs

+            return;
+        }
+        // If url’s scheme is "file", path’s size is 1, and path[0] is a normalized Windows drive letter, then return.
+        let segments: Vec<&str> = self.serialization[path_start..]


Seems bad to me that we allocate a vector all the time here.

It is, I'll try to figure something else out :/

Found it! I'm pushing it as a fixup, and will do a rebase once we're both satisfied with most of the fixes.
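One allocation-free way to express the check from the code comment above ("path's size is 1, and path[0] is a normalized Windows drive letter") is to look at the bytes of the serialized path directly. This is a sketch under assumptions, not necessarily the fix that was pushed: the function name is hypothetical, and it expects `path` to be the serialized path starting with `/`.

```rust
/// Hypothetical helper: true when the serialized path consists of exactly one
/// segment that is a normalized Windows drive letter, e.g. "/C:".
/// A normalized drive letter is an ASCII letter followed by ':'.
fn is_single_normalized_drive_letter(path: &str) -> bool {
    let bytes = path.as_bytes();
    // "/X:" is three bytes; anything longer has more segments or characters.
    bytes.len() == 3
        && bytes[0] == b'/'
        && bytes[1].is_ascii_alphabetic()
        && bytes[2] == b':'
}
```

Here `"/C:"` matches, while `"/C:/"` (two segments), `"/C|"` (not normalized) and `"/ab"` do not.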

valenting

Let's wait for #566 to be closed and we can merge this.

Nox doesn't have time for the review. I've done it instead.

o0Ignition0o · 2020-01-07T12:57:02Z

@valenting Just rebased! 🤞

o0Ignition0o mentioned this pull request Aug 2, 2019

Update port on scheme change + host parsing rules to the host setter + hash parsing rules #523

Closed

o0Ignition0o force-pushed the 40_tests_left branch 3 times, most recently from f3fbe89 to 3e4557c Compare August 3, 2019 21:31

o0Ignition0o changed the title ~~WIP: Pass "new" tests~~ Pass "new" tests Aug 3, 2019

o0Ignition0o marked this pull request as ready for review August 3, 2019 22:28

o0Ignition0o force-pushed the 40_tests_left branch from bc05da8 to 47fbd08 Compare August 16, 2019 08:19

o0Ignition0o changed the base branch from tests to master August 16, 2019 08:21

o0Ignition0o force-pushed the 40_tests_left branch from 47fbd08 to 8dd3dbd Compare August 16, 2019 08:26

o0Ignition0o force-pushed the 40_tests_left branch from 8dd3dbd to 3ec8a66 Compare August 16, 2019 08:40

o0Ignition0o force-pushed the 40_tests_left branch 4 times, most recently from 03df7ad to 6aa53cc Compare August 19, 2019 13:17

o0Ignition0o force-pushed the 40_tests_left branch from 6aa53cc to 20e7a17 Compare August 19, 2019 13:20

nox reviewed Aug 19, 2019

nox suggested changes Aug 19, 2019

o0Ignition0o mentioned this pull request Aug 19, 2019

Make sure a windows drive letter segment always ends with a slash. #538

Closed

o0Ignition0o force-pushed the 40_tests_left branch 2 times, most recently from c83a24a to 63c69e8 Compare August 19, 2019 14:40

valenting approved these changes Dec 10, 2019

valenting mentioned this pull request Dec 12, 2019

2019-12-12 meeting notes mozilla-necko/meeting-notes#53

Closed

nox and others added 15 commits January 7, 2020 13:56

Update tests from wpt

1655a76

The two json files were taken from web-platform-tests/wpt@e69af82 > test result: FAILED. 624 passed; 89 failed; 0 ignored; 0 measured

Fix percent encoding of fragments (closes servo#491)

fa9f044

> test result: FAILED. 637 passed; 76 failed; 0 ignored; 0 measured

Refactor parse_file to look more like the spec

412266a

Fix a Windows quirk

e93f999

> test result: FAILED. 640 passed; 73 failed; 0 ignored; 0 measured

Properly copy hosts of base file:// URLs when needed

efe9ab9

> test result: FAILED. 642 passed; 71 failed; 0 ignored; 0 measured

Path and file parsing.

54a158b

Host parsing rules.

0586854

Hash getter and setter.

26ccc0d

Fix scheme setter

7efdc53

> test result: FAILED. 650 passed; 63 failed; 0 ignored; 0 measured

removing unused imports.

736d7bc

Pleasing the 1.33.0 borrow checker.

a9ca033

Make sure a windows drive letter segment always ends with a slash.

8ef4847

trim file paths if needed.

aeef54f

Avoid allocation when checking for windows drive letters.

925ec94

Comments and nits fixups.

4464840

o0Ignition0o force-pushed the 40_tests_left branch from 3b394f6 to 4464840 Compare January 7, 2020 12:56

valenting merged commit 9cd6467 into servo:master Jan 7, 2020

This was referenced Jan 7, 2020

Add url.includes_credentials() and url.is_special(). #520

Open

Avoid reparse issues with non-special URLs #459

Closed

o0Ignition0o deleted the 40_tests_left branch January 7, 2020 14:01

valenting added a commit to valenting/rust-url that referenced this pull request Jan 8, 2020

Bump version to 2.1.1

9e701cd

Issue servo#537 fixed a large number of small bugs that affected spec compliance. We should publish a new crate version with these changes.

valenting added a commit to valenting/rust-url that referenced this pull request Jan 8, 2020

Update version to 2.1.1

1593578

PR servo#537 fixed a large number of issues which affected compliance with the URL spec. We should release a new crate version with these changes.

valenting mentioned this pull request Jan 8, 2020

Update version to 2.1.1 #575

Merged

utsavoza mentioned this pull request May 7, 2020

Investigate failure of URL parsing tests servo/servo#26287

Open

erickt mentioned this pull request Nov 18, 2022

Missing "/" in path when url contains a scheme other than "https" #773

Closed
