-
Notifications
You must be signed in to change notification settings - Fork 13.6k
get rid of some false negatives in rustdoc::broken_intra_doc_links #132748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
get rid of some false negatives in rustdoc::broken_intra_doc_links #132748
Conversation
r? @notriddle rustbot has assigned @notriddle. Use |
df3d0f6
to
8ae6586
Compare
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good idea, but I have found one problem. Consider a sample like this:
//@ check-pass
#![no_std]
#![deny(rustdoc::broken_intra_doc_links)]
// regression test for https://github.com/rust-lang/rust/issues/54191
// false positives
//! This is not an intra-doc link: [`foobar`]
//!
//! [`foobar`]: /index.html
That test case fails, because it thinks /index.html
is an intra-doc link. You're right that certain kinds of links aren't allowed to be URLs—that's the Unknown type links.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
welp, turns out the standard library itself contains several of these false negatives! |
...or perhaps there's some deeper bug with intra-doc link generation? there seems to be some false positives containing spaces, even though there is a matching reference definition... I guess we can just re-enable unconditionally ignoring links with spaces? |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Ok, this breaks a lot of ui tests. This might actually have a pretty big impact, maybe we should do a crater run or something..? |
It's looks fine to me. Here's the PRs where each of these test cases were added:
The last two are both contrived ICE tests, and all of them are intended to be parsed as intra-doc links. Making it so that they warn seems fine to me. r? @GuillaumeGomez do you also think this is a suitable bug fix? |
I think so. I'm not completely satisfied with the current impact though. Intra-doc links should not be triggered on items that contain characters which can't be in an ident or a path, like |
@GuillaumeGomez what about just amending the current warning to account for other common mistakes? honestly makes me wish we had regardless, i think that should be a separate issue, since the lint output isn't the most helpful in general (it just recommends escaping the brackets, and doesn't mention other common issues, such as misspelled link references) |
Do you have an example of before/after of what you have in mind by any chance? |
well, the most obvious one is some sort of "consider adding a reference: and perhaps we should also mention surrounding code by backticks, since code snippets should generally be in |
Sounds good to me. The distance between two items could be nice too (as a follow-up?). |
Then let's go, thanks everyone! @bors r+ rollup |
…k-warn-more-54191, r=GuillaumeGomez get rid of some false negatives in rustdoc::broken_intra_doc_links rustdoc will not try to do intra-doc linking if the "path" of a link looks too much like a "real url". however, only inline links (`[text](url)`) can actually contain a url, other types of links (reference links, shortcut links) contain a *reference* which is later resolved to an actual url. the "path" in this case cannot be a url, and therefore it should not be skipped due to looking like a url. fixes rust-lang#54191 to minimize the number of false positives that will be introduced, the following heuristic is used: If there's no backticks, be lenient revert to old behavior. This is to prevent churn by linting on stuff that isn't meant to be a link. only shortcut links have simple enough syntax that they are likely to be written accidentlly, collapsed and reference links need 4 metachars, and reference links will not usually use backticks in the reference name. therefore, only shortcut syntax gets the lenient behavior. here's a truth table for how link kinds that cannot be urls are handled: | | is shortcut link | not shortcut link | |--------------|--------------------|-------------------| | has backtick | never ignore | never ignore | | no backtick | ignore if url-like | never ignore |
…k-warn-more-54191, r=GuillaumeGomez get rid of some false negatives in rustdoc::broken_intra_doc_links rustdoc will not try to do intra-doc linking if the "path" of a link looks too much like a "real url". however, only inline links (`[text](url)`) can actually contain a url, other types of links (reference links, shortcut links) contain a *reference* which is later resolved to an actual url. the "path" in this case cannot be a url, and therefore it should not be skipped due to looking like a url. fixes rust-lang#54191 to minimize the number of false positives that will be introduced, the following heuristic is used: If there's no backticks, be lenient revert to old behavior. This is to prevent churn by linting on stuff that isn't meant to be a link. only shortcut links have simple enough syntax that they are likely to be written accidentlly, collapsed and reference links need 4 metachars, and reference links will not usually use backticks in the reference name. therefore, only shortcut syntax gets the lenient behavior. here's a truth table for how link kinds that cannot be urls are handled: | | is shortcut link | not shortcut link | |--------------|--------------------|-------------------| | has backtick | never ignore | never ignore | | no backtick | ignore if url-like | never ignore |
…k-warn-more-54191, r=GuillaumeGomez get rid of some false negatives in rustdoc::broken_intra_doc_links rustdoc will not try to do intra-doc linking if the "path" of a link looks too much like a "real url". however, only inline links (`[text](url)`) can actually contain a url, other types of links (reference links, shortcut links) contain a *reference* which is later resolved to an actual url. the "path" in this case cannot be a url, and therefore it should not be skipped due to looking like a url. fixes rust-lang#54191 to minimize the number of false positives that will be introduced, the following heuristic is used: If there's no backticks, be lenient revert to old behavior. This is to prevent churn by linting on stuff that isn't meant to be a link. only shortcut links have simple enough syntax that they are likely to be written accidentlly, collapsed and reference links need 4 metachars, and reference links will not usually use backticks in the reference name. therefore, only shortcut syntax gets the lenient behavior. here's a truth table for how link kinds that cannot be urls are handled: | | is shortcut link | not shortcut link | |--------------|--------------------|-------------------| | has backtick | never ignore | never ignore | | no backtick | ignore if url-like | never ignore |
Rollup of 15 pull requests Successful merges: - #132748 (get rid of some false negatives in rustdoc::broken_intra_doc_links) - #143374 (Unquerify extern_mod_stmt_cnum.) - #143838 (std: net: uefi: Add support to query connection data) - #144014 (don't link to the nightly version of the Edition Guide in stable lints) - #144094 (Ensure we codegen the main fn) - #144218 (Use serde for target spec json deserialize) - #144221 (generate elf symbol version in raw-dylib) - #144240 (Add more test case to check if the false note related to sealed trait suppressed) - #144247 (coretests/num: use ldexp instead of hard-coding a power of 2) - #144276 (Use less HIR in check_private_in_public.) - #144317 (pass build.npm from bootstrap to tidy and use it for npm install) - #144320 (rustdoc: avoid allocating a temp String for aliases in search index) - #144334 (rustc_resolve: get rid of unused rustdoc::span_of_fragments_with_expansion) - #144335 (Don't suggest assoc ty bound on non-angle-bracketed problematic assoc ty binding) - #144358 (Stop using the old `validate_attr` logic for stability attributes) r? `@ghost` `@rustbot` modify labels: rollup
@bors try jobs=dist-x86_64-linux-alt |
…-54191, r=<try> get rid of some false negatives in rustdoc::broken_intra_doc_links try-job: dist-x86_64-linux-alt
EDIT: nvm there's multiple comments referencing README.md in that file, i was looking at the wrong one. |
This comment has been minimized.
This comment has been minimized.
💔 Test failed (CI). Failed jobs:
|
rustdoc will not try to do intra-doc linking if the "path" of a link looks too much like a "real url". however, only inline links ([text](url)) can actually contain a url, other types of links (reference links, shortcut links) contain a *reference* which is later resolved to an actual url. the "path" in this case cannot be a url, and therefore it should not be skipped due to looking like a url. Co-authored-by: Michael Howell <michael@notriddle.com>
this is in an effort to reduce the amount of code churn caused by this lint triggering on text that was never meant to be a link. a more principled hierustic for ignoring lints is not possible without extensive changes, due to the lint emitting code being so far away from the link collecting code, and the fact that only the link collecting code has access to details about how the link appears in the unnormalized markdown.
collapsed links and reference links have a pretty particular syntax, it seems unlikely they would show up on accident. Co-authored-by: León Orell Valerian Liehr <me@fmease.dev>
2d38996
to
a73d7e3
Compare
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
rustdoc will not try to do intra-doc linking if the "path" of a link looks too much like a "real url".
however, only inline links (
[text](url)
) can actually contain a url, other types of links (reference links, shortcut links) contain a reference which is later resolved to an actual url.the "path" in this case cannot be a url, and therefore it should not be skipped due to looking like a url.
fixes #54191
to minimize the number of false positives that will be introduced, the following heuristic is used:
If there's no backticks, be lenient revert to old behavior.
This is to prevent churn by linting on stuff that isn't meant to be a link.
only shortcut links have simple enough syntax that they
are likely to be written accidentlly, collapsed and reference links
need 4 metachars, and reference links will not usually use
backticks in the reference name.
therefore, only shortcut syntax gets the lenient behavior.
here's a truth table for how link kinds that cannot be urls are handled: