Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace hoedown with pull in rustdoc #40338

Merged
merged 14 commits into from
Mar 29, 2017

Conversation

GuillaumeGomez
Copy link
Member

@GuillaumeGomez GuillaumeGomez commented Mar 8, 2017

cc @rust-lang/docs

@rust-highfive
Copy link
Collaborator

r? @steveklabnik

(rust_highfive has picked a reviewer for you, use r? to override)

Copy link
Member

@steveklabnik steveklabnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will give this a real review tomorrow, but 🎊

@@ -179,8 +181,8 @@ extern {
fn hoedown_document_free(md: *mut hoedown_document);

fn hoedown_buffer_new(unit: libc::size_t) -> *mut hoedown_buffer;
fn hoedown_buffer_put(b: *mut hoedown_buffer, c: *const libc::c_char,
n: libc::size_t);
/*fn hoedown_buffer_put(b: *mut hoedown_buffer, c: *const libc::c_char,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why leave this in but commented out?

@@ -594,7 +598,7 @@ impl<'a> fmt::Display for MarkdownHtml<'a> {
}

pub fn plain_summary_line(md: &str) -> String {
extern fn link(_ob: *mut hoedown_buffer,
/*extern fn link(_ob: *mut hoedown_buffer,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

let mut register_header = None;
'main: loop {
let next_event = parser.next();
if let Some(event) = next_event {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you made this:

let event = match next_event {
    Some(event) => event,
    None => break,
};
....

This would reduce the indentation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At that point, couldn't this be a:

while let Some(event) = parser.next() {
    ...
}

Copy link
Member

@steveklabnik steveklabnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me, though I agree with @frewsxcv 's comments about nesting

@GuillaumeGomez GuillaumeGomez force-pushed the pulldown-switch branch 4 times, most recently from 376b5dd to 2537b41 Compare March 10, 2017 17:35
@GuillaumeGomez GuillaumeGomez changed the title [WIP] Replace hoedown with pull in rustdoc Replace hoedown with pull in rustdoc Mar 10, 2017
@GuillaumeGomez
Copy link
Member Author

This is now ready to review!

let mut content = String::new();
loop {
let event = parser.next();
if let Some(event) = event {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this could just be a for loop?

Copy link
Member

@killercup killercup Mar 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is for event in parser { match event { … } } (see main review comment)

fn link(parser: &mut Parser, buffer: &mut String, url: &str, mut title: String) {
loop {
let event = parser.next();
if let Some(event) = event {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for loop?

let mut content = String::new();
loop {
let event = parser.next();
if let Some(event) = event {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for loop?

use std::fmt::{self, Write};
use std::slice;
//use std::slice;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented out imports

/// A unit struct which has the `fmt::Display` trait implemented. When
/// formatted, this struct will emit the HTML corresponding to the rendered
/// version of the contained markdown string.
pub struct Markdown<'a>(pub &'a str);
// The second parameter is whether we need a shorter version or not.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "shorter version"?

Maybe this should be a struct with named fields instead of using unnamed fields

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, boolean parameters are usually a code smell

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer an enum MarkdownOutputStyle { Compact, Fancy } (or some other fitting variant names) instead of bool

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@killercup: Good idea!

@@ -26,13 +26,13 @@

#![allow(non_camel_case_types)]

use libc;
//use libc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still use the libc crate in rustdoc?

(b'0' <= c && c <= b'9') ||
c == b'-' || c == b'_' || c == b'.' ||
c == b'~' || c == b'!' || c == b'\'' ||
c == b'(' || c == b')' || c == b'*'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull in rust-url for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not part of the change, however I can do it next up.

}

fn looper<'a>(parser: &'a mut Parser, buffer: &mut String, next_event: Option<Event<'a>>,
toc_builder: &mut Option<TocBuilder>, shorter: bool) -> bool {
if let Some(event) = next_event {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could return early here if this is None

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't change much from my point of view... I think this syntax is actually better.

@steveklabnik
Copy link
Member

/cc @raphlinus ❤️

@steveklabnik
Copy link
Member

So, I'm spot-checking some text to find regressions.

std::collections module page

collections

std::convert module page

convert

prelude:

prelude

some smooshed text

std::sync::LockResult

lock

std::string

string

that's what i found just now. we may need to tweak some text, and it's probably good to go through the whole entire thing once the technical side is good and clear up these kinds of issues.

Copy link
Member

@killercup killercup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see some progress here!

One pattern I immediately saw was all the loop constructs matching the event and tag. I'd:

  1. use Event::* and use Tag::*; to reduce the visual noise

  2. Try to rewrite them as let events = parser.$filter_relevant_events; for event in events { match event {…} }, where $filter_relevant_events are some operations on the iterator to get the events you are interested in (e.g., to use take_while instead of adding break arms).

    FYI, I wrote a macro once for this exact use case to allow events.take_while(not!(end Tag::Paragraph)) and similar. This may help converting most of these loops into nice iterators.

    You can find it and example using it here. You'll probably need to extend it to support multiple end tag variants as used below.

I'd also really try to get rid of the shorter: bool stuff and replace with some semantically meaningful names :)

Switching parser is not something I expect to just work, even when both apparently conform to the CommonMark spec. Before merging this, I'd render a bunch of docs (std and some larger crates), and try to

  • set up a script that opens and takes screenshots of all generated html pages to visually diff them with ones generated by the current stable rustdoc,
  • or let humans look at the rendered pages,
  • or do both 😄

src/Cargo.lock Outdated
@@ -1203,6 +1203,7 @@ dependencies = [
"build_helper 0.1.0",
"gcc 0.3.43 (registry+https://github.com/rust-lang/crates.io-index)",
"log 0.0.0",
"pulldown-cmark 0.0.8 (registry+https://github.com/rust-lang/crates.io-index)",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.0.8… Another lib we should help get to 1.0

}
}
buffer.push_str(&format!("<table>{}{}</table>",
content,
if shorter || rows.is_empty() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the only place where shorter is actually used? The name shorter does not imply "no tbody tag" to me…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In paragraph as well. The point is to display only the first line, so everything that could be on more than one line use it.

/// A unit struct which has the `fmt::Display` trait implemented. When
/// formatted, this struct will emit the HTML corresponding to the rendered
/// version of the contained markdown string.
pub struct Markdown<'a>(pub &'a str);
// The second parameter is whether we need a shorter version or not.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer an enum MarkdownOutputStyle { Compact, Fancy } (or some other fitting variant names) instead of bool

let mut content = String::new();
loop {
let event = parser.next();
if let Some(event) = event {
Copy link
Member

@killercup killercup Mar 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is for event in parser { match event { … } } (see main review comment)

_ => {}
}
} else {
break 'main;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally, a good reason not use a for loop ;)

@GuillaumeGomez
Copy link
Member Author

Most problems should be solved now. I'll write a macro next to allow to remove all these loops.

@ollie27
Copy link
Member

ollie27 commented Mar 11, 2017

Is there any reason you're not using pulldown_cmark::html::push_html to actually render the HTML?

@bors
Copy link
Contributor

bors commented Mar 11, 2017

☔ The latest upstream changes (presumably #40432) made this pull request unmergeable. Please resolve the merge conflicts.

@GuillaumeGomez
Copy link
Member Author

GuillaumeGomez commented Mar 11, 2017

@ollie27: Unfortunately yes: the generation of urls mainly (for play.rust-lang in blockcodes).

"&lt;", "&gt;", "&amp;", "&#39;", "&quot;"];
for sub in repl_sub {
id = id.replace(sub, "");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this mean the comment above goes away too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely!

@GuillaumeGomez GuillaumeGomez force-pushed the pulldown-switch branch 2 times, most recently from c132361 to 8b61716 Compare March 11, 2017 15:41
Copy link
Member

@killercup killercup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks nicer with the macro, though I'd probably make it more generic.

Also, why not use Event::* and use Tag::*?

@@ -104,6 +104,25 @@ thread_local!(pub static PLAYGROUND: RefCell<Option<(Option<String>, String)>> =
RefCell::new(None)
});

macro_rules! event_loop_break {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I'd call it collect_md_events, though, and make calling it a bit more expressive, e.g.

collect_md_events!(from parser into &mut buf (toc_builder, shorter) until End(Header(_)))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is super useful. Anyone else has an opinion on it?

shorter: MarkdownOutputStyle) -> fmt::Result {
fn block(parser: &mut Parser, buffer: &mut String, lang: &str) {
let mut origtext = String::new();
while let Some(event) = parser.next() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also be using the macro (e.g. when calling without toc builder and shorter)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see much interest in expanding the macro to handle this case. But yes it's possible.

@GuillaumeGomez
Copy link
Member Author

Also, why not use Event::* and use Tag::*?

Simply because I like the code being explicit. At least with this syntax, you don't have to think about it, the information is already there.

@ollie27
Copy link
Member

ollie27 commented Mar 12, 2017

@ollie27: Unfortunately yes: the generation of urls mainly (for play.rust-lang in blockcodes).

You can still use push_html and add these links by implementing an iterator adapter around Parser. It would pass most Events through but you could filter and modify the ones you're interested in like CodeBlock and Header. This way you don't have to reimplement all the HTML rendering that pulldown-cmark already provides.

@GuillaumeGomez
Copy link
Member Author

Updated.

Copy link
Member

@frewsxcv frewsxcv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a bunch of nitpicky style things, but they aren't important enough to hold up this PR any longer. Looks good to me!

@GuillaumeGomez
Copy link
Member Author

@frewsxcv, @steveklabnik: Since both of you agreed on the changes, I'll r+ this PR. Don't hesitate to r- if you find anything!

@bors: r=frewsxcv,steveklabnik

@bors
Copy link
Contributor

bors commented Mar 29, 2017

📌 Commit a7c6d3e has been approved by frewsxcv,steveklabnik

@bors
Copy link
Contributor

bors commented Mar 29, 2017

⌛ Testing commit a7c6d3e with merge abf5592...

bors added a commit that referenced this pull request Mar 29, 2017
…veklabnik

Replace hoedown with pull in rustdoc

cc @rust-lang/docs
@bors
Copy link
Contributor

bors commented Mar 29, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: frewsxcv,steveklabnik
Pushing abf5592 to master...

@bors bors merged commit a7c6d3e into rust-lang:master Mar 29, 2017
@ollie27
Copy link
Member

ollie27 commented Mar 29, 2017

Was this actually finished? It's missing support for horizontal rules, footnotes and images. There are also several bugs like missing the <br /> for HardBreaks, the title part of links being rendered as visible text rather than as an attribute on the <a> tag and MarkdownHtml now doing the same as Markdown when it should be escaping raw HTML.

As I previously mentioned, using push_html would have been much easier than trying to reimplement all of the HTML rendering.

@frewsxcv
Copy link
Member

Was this actually finished? It's missing support for horizontal rules, footnotes and images. There are also several bugs like missing the <br /> for HardBreaks, the title part of links being rendered as visible text rather than as an attribute on the <a> tag and MarkdownHtml now doing the same as Markdown when it should be escaping raw HTML.

Have you confirmed any of these? I can bring up these concerns during today's Documentation Team meeting.

@ollie27
Copy link
Member

ollie27 commented Mar 29, 2017

Have you confirmed any of these?

Yes. Try rendering the following in the current rustdoc and after this change:

/// markdown test
///
/// this is a [link].
///
/// [link]: https://example.com "this is a title"
///
/// hard break:  
/// after hard break
///
/// -----------
///
/// a footnote[^footnote].
///
/// [^footnote]: Thing
///
/// ![Rust](https://www.rust-lang.org/logos/rust-logo-128x128-blk-v2.png)
#[deprecated(note = "Struct<T>")]
pub fn f() {}

Before:
Before

After:
After

@steveklabnik
Copy link
Member

Two things:

  1. this landed after last night's nightly, so tonight's is the first night where it lands. I was already planning on trying to publicize "please look at this since it's a big change", so knowing this stuff is good.
  2. I am currently building docs with the commit before and the commit after, and then i'm going to diff them to see exactly what's different.

I don't think that this missing a few things is the end of the world; now that this has landed, fixing stuff up is much easier. We've still got plenty of time before the release; four weeks until beta, and then we could backport anything catastrophic. Getting this out in the wild sooner rather than later is important IMHO.

@GuillaumeGomez GuillaumeGomez deleted the pulldown-switch branch March 29, 2017 14:25
@steveklabnik
Copy link
Member

Here is an un-semantic diff steveklabnik/docdiff@5d17061

@steveklabnik
Copy link
Member

A much prettier diff steveklabnik/docdiff@HEAD~1...master

@steveklabnik
Copy link
Member

I've filed #40912 to track these regressions. 😄

@steveklabnik steveklabnik added the relnotes Marks issues that should be documented in the release notes of the next release. label Apr 1, 2017
@mbrubeck
Copy link
Contributor

This was disabled by default in #41431, and #44229 is tracking the plan to enable it by default.

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Feb 16, 2018
Is it really time? Have our months, no, *years* of suffering come to an end? Are we finally able to cast off the pall of Hoedown? The weight which has dragged us down for so long?

-----

So, timeline for those who need to catch up:

* Way back in December 2016, [we decided we wanted to switch out the markdown renderer](rust-lang#38400). However, this was put on hold because the build system at the time made it difficult to pull in dependencies from crates.io.
* A few months later, in March 2017, [the first PR was done, to switch out the renderers entirely](rust-lang#40338). The PR itself was fraught with CI and build system issues, but eventually landed.
* However, not all was well in the Rustdoc world. During the PR and shortly after, we noticed [some differences in the way the two parsers handled some things](rust-lang#40912), and some of these differences were major enough to break the docs for some crates.
* A couple weeks afterward, [Hoedown was put back in](rust-lang#41290), at this point just to catch tests that Pulldown was "spuriously" running. This would at least provide some warning about spurious tests, rather than just breaking spontaneously.
* However, the problems had created enough noise by this point that just a few days after that, [Hoedown was switched back to the default](rust-lang#41431) while we came up with a solution for properly warning about the differences.
* That solution came a few weeks later, [as a series of warnings when the HTML emitted by the two parsers was semantically different](rust-lang#41991). But that came at a cost, as now rustdoc needed proc-macro support (the new crate needed some custom derives farther down its dependency tree), and the build system was not equipped to handle it at the time. It was worked on for three months as the issue stumped more and more people.
  * In that time, [bootstrap was completely reworked](rust-lang#43059) to change how it ordered compilation, and [the method by which it built rustdoc would change](rust-lang#43482), as well. This allowed it to only be built after stage1, when proc-macros would be available, allowing the "rendering differences" PR to finally land.
  * The warnings were not perfect, and revealed a few [spurious](rust-lang#44368) [differences](rust-lang#45421) between how we handled the renderers.
  * Once these were handled, [we flipped the switch to turn on the "rendering difference" warnings all the time](rust-lang#45324), in October 2017. This began the "warning cycle" for this change, and landed in stable in 1.23, on 2018-01-04.
  * Once those warnings hit stable, and after a couple weeks of seeing whether we would get any more reports than what we got from sitting on nightly/beta, [we switched the renderers](rust-lang#47398), making Pulldown the default but still offering the option to use Hoedown.

And that brings us to the present. We haven't received more new issues from this in the meantime, and the "switch by default" is now on beta. Our reasoning is that, at this point, anyone who would have been affected by this has run into it already.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.