Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing error of email addresses with query params #809

Merged
merged 4 commits into from
Nov 5, 2022
Merged

Conversation

mre
Copy link
Member

@mre mre commented Nov 5, 2022

Email addresses with query parameters often get used in
contact forms on websites. They can also be found in
other documents like Markdown.

A common use-case is to add a subject line to the email
as a parameter e.g. mailto:mail@example.com?subject="Hello".

Previously we handled such cases incorrectly by interpreting
them as files. The reason was that our email parsing was too strict
to allow for query parameters at the end of an email address.
With email_address we switched to a more permissive parser.

Note that this does not affect the actual address email checking,
as this is still done with check-if-email-exists, which has more strict
check functionality. For the check we pass the path to check-if-email-exists
and ignore the query params because they aren't part the email address
but rather get passed to the mail client.

This change modifies the URI parsing for all links.
All tests pass, so I hope I didn't break anything, but I'd appreciate some
review/testing before merging this.
As an added benefit I optimized the common path, which is a non-email
URL, so parsing should overall be slightly faster, even though I haven't
done any benchmarks and the performance benefit is probably negligible.

Fixes #653.
@egrieco, @Stargator, @wolf99 fyi

mre added 2 commits November 5, 2022 14:01
Email addresses with query parameters often get used in
contact forms on websites. They can also be found in
other documents like Markdown.

A common use-case is to add a subject line to the email
as a parameter e.g. `mailto:mail@example.com?subject="Hello"`.

Previously we handled such cases incorrectly by recognizing
them as files. The reason was that our email parsing was too strict
to allow for that use-case.
With `email_address` we switched to a more permissive parser.

Note that this does not affect the actual address email checking,
as this is still done `check-if-email-exists`, which has more strict
check functionality.
Skip query params (if exists).
@Stargator
Copy link

Thank you, @mre. Nothing stands out from a code review, but Ruby is not my strongest language.

I'll be able to run the tests later and double check all is good.

@mre
Copy link
Member Author

mre commented Nov 5, 2022

Ruby is not my strongest language

Mine neither. 🤣 But I know you meant Rust.

I'll be able to run the tests later and double check all is good.

Thanks for the support on this one. 👍

@Stargator
Copy link

Mine neither. 🤣 But I know you meant Rust.

Oh...yeah... I didn't even look at the file extension. I just thought the syntax looked ruby-like. Thanks

@Stargator
Copy link

@mre When I run the two new tests the output does not seem to state the tests were actually executed.

Below is an example. All I see is "running 0 tests", am I doing anything wrong?

~/Workspace/GitHub/lychee (mailto *)$ cargo test test_email_html_with_subject
    Finished test [unoptimized + debuginfo] target(s) in 0.44s
     Running unittests src/main.rs (target/debug/deps/lychee-27966ae4bc81e0dc)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 12 filtered out; finished in 0.00s

     Running tests/example_domains.rs (target/debug/deps/example_domains-8d3a6e088dd0d97d)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out; finished in 0.00s

     Running tests/local_files.rs (target/debug/deps/local_files-88fa86338aa53c8d)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out; finished in 0.00s

     Running tests/usage.rs (target/debug/deps/usage-6b8f80821d17bf05)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out; finished in 0.00s

     Running unittests src/lib.rs (target/debug/deps/lychee_lib-4364c2b43d456ab2)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 112 filtered out; finished in 0.00s

@mre
Copy link
Member Author

mre commented Nov 5, 2022

This would be the full command:

cargo test --package lychee --test cli --features check_example_domains -- cli::test_email_html_with_subject --exact --nocapture

@mre
Copy link
Member Author

mre commented Nov 5, 2022

I did test it with the dart repo and it looks fine. Anything else I should test it with?

@Stargator
Copy link

@mre Could you test it with the exercism/problem-specifications repo?

@Stargator
Copy link

This would be the full command

Thank you. I played around with the test, changing the expected values as well as the TEST_EMAIL_QUERY_PARAMS html and md files, just to verify it accepts net and org domains as well as verifying it works with no query params.

Outside of the problem-specifications repo, I think that would be it.

@mre
Copy link
Member Author

mre commented Nov 5, 2022

Very nice. Thanks for taking a look.
The exercism/problem-specifications works, too:

problem-specifications ❯❯❯ lychee .
🔍 94 Total ✅ 94 OK 🚫 0 Errors

Merging.

@mre mre merged commit d61105e into master Nov 5, 2022
@mre mre deleted the mailto branch November 5, 2022 22:40
@mre
Copy link
Member Author

mre commented Nov 9, 2022

New version of lychee-action with the fix is out.

@egrieco
Copy link

egrieco commented Nov 17, 2022

Awesome! You beat me to the fix (unsurprisingly). 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mailto urls with query parameters incorrectly converted to file urls
3 participants