Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rustdoc: only report broken ref-style links once #77859

Merged
merged 1 commit into from
Jan 3, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 17 additions & 7 deletions src/librustdoc/html/markdown.rs
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,9 @@ use crate::doctest;
use crate::html::highlight;
use crate::html::toc::TocBuilder;

use pulldown_cmark::{html, BrokenLink, CodeBlockKind, CowStr, Event, Options, Parser, Tag};
use pulldown_cmark::{
html, BrokenLink, CodeBlockKind, CowStr, Event, LinkType, Options, Parser, Tag,
};

#[cfg(test)]
mod tests;
Expand Down Expand Up @@ -327,8 +329,6 @@ impl<'a, I: Iterator<Item = Event<'a>>> Iterator for LinkReplacer<'a, I> {
type Item = Event<'a>;

fn next(&mut self) -> Option<Self::Item> {
use pulldown_cmark::LinkType;

let mut event = self.inner.next();

// Replace intra-doc links and remove disambiguators from shortcut links (`[fn@f]`).
Expand Down Expand Up @@ -1123,7 +1123,13 @@ crate fn plain_text_summary(md: &str) -> String {
s
}

crate fn markdown_links(md: &str) -> Vec<(String, Range<usize>)> {
crate struct MarkdownLink {
pub kind: LinkType,
pub link: String,
pub range: Range<usize>,
}

crate fn markdown_links(md: &str) -> Vec<MarkdownLink> {
if md.is_empty() {
return vec![];
}
Expand Down Expand Up @@ -1163,7 +1169,11 @@ crate fn markdown_links(md: &str) -> Vec<(String, Range<usize>)> {

let mut push = |link: BrokenLink<'_>| {
let span = span_for_link(&CowStr::Borrowed(link.reference), link.span);
links.borrow_mut().push((link.reference.to_owned(), span));
links.borrow_mut().push(MarkdownLink {
kind: LinkType::ShortcutUnknown,
link: link.reference.to_owned(),
range: span,
});
None
};
let p = Parser::new_with_broken_link_callback(md, opts(), Some(&mut push)).into_offset_iter();
Expand All @@ -1174,10 +1184,10 @@ crate fn markdown_links(md: &str) -> Vec<(String, Range<usize>)> {
let iter = Footnotes::new(HeadingLinks::new(p, None, &mut ids));

for ev in iter {
if let Event::Start(Tag::Link(_, dest, _)) = ev.0 {
if let Event::Start(Tag::Link(kind, dest, _)) = ev.0 {
debug!("found link: {}", dest);
let span = span_for_link(&dest, ev.1);
links.borrow_mut().push((dest.into_string(), span));
links.borrow_mut().push(MarkdownLink { kind, link: dest.into_string(), range: span });
}
}

Expand Down
126 changes: 74 additions & 52 deletions src/librustdoc/passes/collect_intra_doc_links.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,8 @@ use rustc_span::symbol::Symbol;
use rustc_span::DUMMY_SP;
use smallvec::{smallvec, SmallVec};

use pulldown_cmark::LinkType;

use std::borrow::Cow;
use std::cell::Cell;
use std::convert::{TryFrom, TryInto};
Expand All @@ -34,7 +36,7 @@ use std::ops::Range;
use crate::clean::{self, utils::find_nearest_parent_module, Crate, Item, ItemLink, PrimitiveType};
use crate::core::DocContext;
use crate::fold::DocFolder;
use crate::html::markdown::markdown_links;
use crate::html::markdown::{markdown_links, MarkdownLink};
use crate::passes::Pass;

use super::span_of_attrs;
Expand Down Expand Up @@ -265,8 +267,9 @@ struct LinkCollector<'a, 'tcx> {
/// because `clean` and the disambiguator code expect them to be different.
/// See the code for associated items on inherent impls for details.
kind_side_channel: Cell<Option<(DefKind, DefId)>>,
/// Cache the resolved links so we can avoid resolving (and emitting errors for) the same link
visited_links: FxHashMap<ResolutionInfo, CachedLink>,
/// Cache the resolved links so we can avoid resolving (and emitting errors for) the same link.
/// The link will be `None` if it could not be resolved (i.e. the error was cached).
visited_links: FxHashMap<ResolutionInfo, Option<CachedLink>>,
}

impl<'a, 'tcx> LinkCollector<'a, 'tcx> {
Expand Down Expand Up @@ -901,16 +904,8 @@ impl<'a, 'tcx> DocFolder for LinkCollector<'a, 'tcx> {
};
// NOTE: if there are links that start in one crate and end in another, this will not resolve them.
// This is a degenerate case and it's not supported by rustdoc.
for (ori_link, link_range) in markdown_links(&doc) {
let link = self.resolve_link(
&item,
&doc,
&self_name,
parent_node,
krate,
ori_link,
link_range,
);
for md_link in markdown_links(&doc) {
let link = self.resolve_link(&item, &doc, &self_name, parent_node, krate, md_link);
if let Some(link) = link {
item.attrs.links.push(link);
}
Expand Down Expand Up @@ -942,27 +937,26 @@ impl LinkCollector<'_, '_> {
self_name: &Option<String>,
parent_node: Option<DefId>,
krate: CrateNum,
ori_link: String,
link_range: Range<usize>,
ori_link: MarkdownLink,
) -> Option<ItemLink> {
trace!("considering link '{}'", ori_link);
trace!("considering link '{}'", ori_link.link);

// Bail early for real links.
if ori_link.contains('/') {
if ori_link.link.contains('/') {
return None;
}

// [] is mostly likely not supposed to be a link
if ori_link.is_empty() {
if ori_link.link.is_empty() {
return None;
}

let cx = self.cx;
let link = ori_link.replace("`", "");
let link = ori_link.link.replace("`", "");
let parts = link.split('#').collect::<Vec<_>>();
let (link, extra_fragment) = if parts.len() > 2 {
// A valid link can't have multiple #'s
anchor_failure(cx, &item, &link, dox, link_range, AnchorFailure::MultipleAnchors);
anchor_failure(cx, &item, &link, dox, ori_link.range, AnchorFailure::MultipleAnchors);
return None;
} else if parts.len() == 2 {
if parts[0].trim().is_empty() {
Expand Down Expand Up @@ -1018,7 +1012,7 @@ impl LinkCollector<'_, '_> {
path_str,
disambiguator,
dox,
link_range,
ori_link.range,
smallvec![ResolutionFailure::NoParentItem],
);
return None;
Expand Down Expand Up @@ -1058,7 +1052,7 @@ impl LinkCollector<'_, '_> {
path_str,
disambiguator,
dox,
link_range,
ori_link.range,
smallvec![err_kind],
);
return None;
Expand All @@ -1074,15 +1068,22 @@ impl LinkCollector<'_, '_> {
return None;
}

let key = ResolutionInfo {
module_id,
dis: disambiguator,
path_str: path_str.to_owned(),
extra_fragment,
let diag_info = DiagnosticInfo {
item,
dox,
ori_link: &ori_link.link,
link_range: ori_link.range.clone(),
};
let diag =
DiagnosticInfo { item, dox, ori_link: &ori_link, link_range: link_range.clone() };
let (mut res, mut fragment) = self.resolve_with_disambiguator_cached(key, diag)?;
let (mut res, mut fragment) = self.resolve_with_disambiguator_cached(
ResolutionInfo {
module_id,
dis: disambiguator,
path_str: path_str.to_owned(),
extra_fragment,
},
diag_info,
matches!(ori_link.kind, LinkType::Reference | LinkType::Shortcut),
)?;

// Check for a primitive which might conflict with a module
// Report the ambiguity and require that the user specify which one they meant.
Expand All @@ -1101,7 +1102,7 @@ impl LinkCollector<'_, '_> {
&item,
path_str,
dox,
link_range,
ori_link.range,
AnchorFailure::RustdocAnchorConflict(prim),
);
return None;
Expand All @@ -1111,7 +1112,7 @@ impl LinkCollector<'_, '_> {
} else {
// `[char]` when a `char` module is in scope
let candidates = vec![res, prim];
ambiguity_error(cx, &item, path_str, dox, link_range, candidates);
ambiguity_error(cx, &item, path_str, dox, ori_link.range, candidates);
return None;
}
}
Expand All @@ -1129,14 +1130,22 @@ impl LinkCollector<'_, '_> {
specified.descr()
);
diag.note(&note);
suggest_disambiguator(resolved, diag, path_str, dox, sp, &link_range);
suggest_disambiguator(resolved, diag, path_str, dox, sp, &ori_link.range);
};
report_diagnostic(cx, BROKEN_INTRA_DOC_LINKS, &msg, &item, dox, &link_range, callback);
report_diagnostic(
cx,
BROKEN_INTRA_DOC_LINKS,
&msg,
&item,
dox,
&ori_link.range,
callback,
);
};
match res {
Res::Primitive(_) => match disambiguator {
Some(Disambiguator::Primitive | Disambiguator::Namespace(_)) | None => {
Some(ItemLink { link: ori_link, link_text, did: None, fragment })
Some(ItemLink { link: ori_link.link, link_text, did: None, fragment })
}
Some(other) => {
report_mismatch(other, Disambiguator::Primitive);
Expand Down Expand Up @@ -1179,11 +1188,11 @@ impl LinkCollector<'_, '_> {
if self.cx.tcx.privacy_access_levels(LOCAL_CRATE).is_exported(hir_src)
&& !self.cx.tcx.privacy_access_levels(LOCAL_CRATE).is_exported(hir_dst)
{
privacy_error(cx, &item, &path_str, dox, link_range);
privacy_error(cx, &item, &path_str, dox, &ori_link);
}
}
let id = clean::register_res(cx, rustc_hir::def::Res::Def(kind, id));
Some(ItemLink { link: ori_link, link_text, did: Some(id), fragment })
Some(ItemLink { link: ori_link.link, link_text, did: Some(id), fragment })
}
}
}
Expand All @@ -1192,28 +1201,47 @@ impl LinkCollector<'_, '_> {
&mut self,
key: ResolutionInfo,
diag: DiagnosticInfo<'_>,
cache_resolution_failure: bool,
) -> Option<(Res, Option<String>)> {
// Try to look up both the result and the corresponding side channel value
if let Some(ref cached) = self.visited_links.get(&key) {
bugadani marked this conversation as resolved.
Show resolved Hide resolved
self.kind_side_channel.set(cached.side_channel);
return Some(cached.res.clone());
match cached {
Some(cached) => {
self.kind_side_channel.set(cached.side_channel.clone());
return Some(cached.res.clone());
}
None if cache_resolution_failure => return None,
None => {
// Although we hit the cache and found a resolution error, this link isn't
// supposed to cache those. Run link resolution again to emit the expected
// resolution error.
}
}
}

let res = self.resolve_with_disambiguator(&key, diag);

// Cache only if resolved successfully - don't silence duplicate errors
if let Some(res) = &res {
if let Some(res) = res {
// Store result for the actual namespace
self.visited_links.insert(
key,
CachedLink {
Some(CachedLink {
res: res.clone(),
side_channel: self.kind_side_channel.clone().into_inner(),
},
}),
);
}

res
Some(res)
} else {
if cache_resolution_failure {
// For reference-style links we only want to report one resolution error
// so let's cache them as well.
self.visited_links.insert(key, None);
}

None
}
}

/// After parsing the disambiguator, resolve the main part of the link.
Expand Down Expand Up @@ -1964,13 +1992,7 @@ fn suggest_disambiguator(
}

/// Report a link from a public item to a private one.
fn privacy_error(
cx: &DocContext<'_>,
item: &Item,
path_str: &str,
dox: &str,
link_range: Range<usize>,
) {
fn privacy_error(cx: &DocContext<'_>, item: &Item, path_str: &str, dox: &str, link: &MarkdownLink) {
let sym;
let item_name = match item.name {
Some(name) => {
Expand All @@ -1982,7 +2004,7 @@ fn privacy_error(
let msg =
format!("public documentation for `{}` links to private item `{}`", item_name, path_str);

report_diagnostic(cx, PRIVATE_INTRA_DOC_LINKS, &msg, item, dox, &link_range, |diag, sp| {
report_diagnostic(cx, PRIVATE_INTRA_DOC_LINKS, &msg, item, dox, &link.range, |diag, sp| {
if let Some(sp) = sp {
diag.span_label(sp, "this item is private");
}
Expand Down
20 changes: 20 additions & 0 deletions src/test/rustdoc-ui/reference-link-reports-error-once.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
#![deny(broken_intra_doc_links)]

/// Links to [a] [link][a]
/// And also a [third link][a]
bugadani marked this conversation as resolved.
Show resolved Hide resolved
/// And also a [reference link][b]
///
/// Other links to the same target should still emit error: [ref] //~ERROR unresolved link to `ref`
/// Duplicate [ref] //~ERROR unresolved link to `ref`
///
/// Other links to other targets should still emit error: [ref2] //~ERROR unresolved link to `ref2`
/// Duplicate [ref2] //~ERROR unresolved link to `ref2`
///
/// [a]: ref
//~^ ERROR unresolved link to `ref`
/// [b]: ref2
Copy link
Contributor Author

@bugadani bugadani Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current patch, if this is [b]: ref, the error emitted for [reference link][b] will point to the definition of a. This is because the link cache ignores the name of the reference and handles a and b as if identical. This isn't ideal, but I consider this acceptable: we don't report all errors, but if the user fixes link a, we will report an error for link b, so eventually, every error can be caught :)

We could fix this by adding the span to ResolutionInfo (Optionally, only for Shortcut/Reference links), but I'm not sure if the complexity is worth it. This would also make ResolutionInfo and DiagnosticInfo overlap, if only optionally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I'm confused what those have to do with each other. We never cache the span, so there's nowhere for it to look up the span of a in the first place, the only span available is that of b.

Can you post the error that's wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assume the following:

[a] [b]

[a]: ref
[b]: ref

We would only report an error for a.

//~^ ERROR unresolved link to

/// [ref][]
//~^ ERROR unresolved link
pub fn f() {}
Loading