[rustdoc] Add --extract-doctests command-line flag #134531

Open
wants to merge 4 commits into
base: master
19 changes: 19 additions & 0 deletions src/doc/rustdoc/src/unstable-features.md
@@ -664,3 +664,22 @@ Similar to cargo `build.rustc-wrapper` option, this flag takes a `rustc` wrapper
The first argument to the program will be the test builder program.

This flag can be passed multiple times to nest wrappers.

### `--extract-doctests`: outputs doctests in JSON format

* Tracking issue: [#134529](https://github.com/rust-lang/rust/issues/134529)

When this flag is used, it outputs the doctests' original source code alongside information
such as:

* File where they are located.
* Line where they are located.
* Codeblock attributes (more information about this [here](./write-documentation/documentation-tests.html#attributes)).

The output format is JSON.
Comment on lines +675 to +679
Contributor

That's not enough documentation, even for an unstable feature. You need to say what the actual keys are, and what they do (particularly the subkeys of langstr, which are not self-explanatory). Please include a prettified example JSON document with all of the format features used.

Member Author

I'm not too sure we want everything documented, as the list of docblock attributes might get longer. Well, I'll list the current ones and we can think about it later.
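
The keys asked about in this thread can be read off the expected output in `tests/rustdoc-ui/extract-doctests.stdout`. The sketch below mirrors them as plain Rust types. The type names (`ExtractedDoctest`, `LangStr`), the integer type of `line`, and the element type of the list fields are assumptions made for illustration, not rustdoc's internal definitions, and only the two `ignore` values that appear in the sample are modelled.

```rust
/// `ignore` values seen in the sample output ("All" and "None").
/// The enum in the diff is truncated after these two variants, so any
/// further variants are intentionally left out of this sketch.
#[derive(Debug, PartialEq)]
pub enum Ignore {
    All,
    None,
}

/// Mirror of the `langstr` object; one field per JSON key.
#[derive(Debug, PartialEq)]
pub struct LangStr {
    pub original: String, // the code fence info string as written
    pub should_panic: bool,
    pub no_run: bool,
    pub ignore: Ignore,
    pub rust: bool,
    pub test_harness: bool,
    pub compile_fail: bool,
    pub standalone_crate: bool,
    pub error_codes: Vec<String>, // element type assumed; empty in the sample
    pub edition: Option<String>,  // `null` or a string such as "2018"
    pub added_classes: Vec<String>, // element type assumed
    pub unknown: Vec<String>,
}

/// Mirror of one element of the top-level JSON array.
#[derive(Debug, PartialEq)]
pub struct ExtractedDoctest {
    pub filename: String,
    pub line: usize, // integer width assumed
    pub langstr: LangStr,
    pub text: String, // doctest source as written in the documentation
}

/// Builds the second record of the PR's expected test output by hand.
pub fn sample_record() -> ExtractedDoctest {
    ExtractedDoctest {
        filename: "$DIR/extract-doctests.rs".to_string(),
        line: 13,
        langstr: LangStr {
            original: "edition2018,compile_fail".to_string(),
            should_panic: false,
            no_run: true,
            ignore: Ignore::None,
            rust: true,
            test_harness: false,
            compile_fail: true,
            standalone_crate: false,
            error_codes: Vec::new(),
            edition: Some("2018".to_string()),
            added_classes: Vec::new(),
            unknown: Vec::new(),
        },
        text: "let".to_string(),
    }
}
```

In the sample, the `compile_fail` block is also reported with `no_run` set to `true`, which is worth knowing for consumers that key off `no_run` alone.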


Using this flag looks like this:

```bash
$ rustdoc -Zunstable-options --extract-doctests src/lib.rs
```

Contributor

Why is this a CLI option and not an output format?

Member Author

Because I didn't think about it, it's SO MUCH better!

5 changes: 5 additions & 0 deletions src/librustdoc/config.rs
@@ -172,6 +172,9 @@ pub(crate) struct Options {
/// This is mainly useful for other tools that reads that debuginfo to figure out
/// how to call the compiler with the same arguments.
pub(crate) expanded_args: Vec<String>,

/// If `true`, rustdoc will output the doctests in JSON format and exit.
pub(crate) extract_doctests: bool,
}

impl fmt::Debug for Options {
@@ -762,6 +765,7 @@ impl Options {
Ok(result) => result,
Err(e) => dcx.fatal(format!("--merge option error: {e}")),
};
let extract_doctests = matches.opt_present("extract-doctests");

if generate_link_to_definition && (show_coverage || output_format != OutputFormat::Html) {
dcx.struct_warn(
@@ -819,6 +823,7 @@ impl Options {
scrape_examples_options,
unstable_features,
expanded_args: args,
extract_doctests,
};
let render_options = RenderOptions {
output,
43 changes: 40 additions & 3 deletions src/librustdoc/doctest.rs
@@ -26,6 +26,7 @@ use rustc_span::FileName;
use rustc_span::edition::Edition;
use rustc_span::symbol::sym;
use rustc_target::spec::{Target, TargetTuple};
use serde::ser::{Serialize, SerializeStruct, Serializer};
use tempfile::{Builder as TempFileBuilder, TempDir};
use tracing::debug;

@@ -165,6 +166,7 @@ pub(crate) fn run(dcx: DiagCtxtHandle<'_>, input: Input, options: RustdocOptions
let args_path = temp_dir.path().join("rustdoc-cfgs");
crate::wrap_return(dcx, generate_args_file(&args_path, &options));

let extract_doctests = options.extract_doctests;
let CreateRunnableDocTests {
standalone_tests,
mergeable_tests,
@@ -173,7 +175,7 @@ pub(crate) fn run(dcx: DiagCtxtHandle<'_>, input: Input, options: RustdocOptions
unused_extern_reports,
compiling_test_count,
..
- } = interface::run_compiler(config, |compiler| {
+ } = match interface::run_compiler(config, |compiler| {
let krate = rustc_interface::passes::parse(&compiler.sess);

let collector = rustc_interface::create_and_enter_global_ctxt(&compiler, krate, |tcx| {
@@ -189,14 +191,30 @@ pub(crate) fn run(dcx: DiagCtxtHandle<'_>, input: Input, options: RustdocOptions
tcx,
);
let tests = hir_collector.collect_crate();
if extract_doctests {
let stdout = std::io::stdout();
let mut stdout = stdout.lock();
if let Err(error) = serde_json::ser::to_writer(&mut stdout, &tests) {
eprintln!();
return Err(format!("Failed to generate JSON output for doctests: {error:?}"));
}
return Ok(None);
}
tests.into_iter().for_each(|t| collector.add_test(t));

- collector
+ Ok(Some(collector))
});
compiler.sess.dcx().abort_if_errors();

collector
- });
+ }) {
Ok(Some(collector)) => collector,
Ok(None) => return,
Err(error) => {
eprintln!("{error}");
std::process::exit(1);
}
};

run_tests(opts, &rustdoc_options, &unused_extern_reports, standalone_tests, mergeable_tests);
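
The hunk above turns the compiler closure into a three-way result: `Ok(Some(collector))` to keep going, `Ok(None)` when the doctests were only extracted, and `Err(..)` when writing the output failed. A minimal std-only sketch of that control flow follows. `Collector`, `collect`, and `run` are hypothetical stand-ins, the tests are plain strings written one per line rather than JSON, and the error is returned to the caller instead of calling `std::process::exit(1)` as the PR does.

```rust
use std::io::Write;

/// Hypothetical stand-in for the doctest collector.
pub struct Collector {
    pub tests: Vec<String>,
}

/// In extraction mode, writes every test to `out` and returns `Ok(None)`.
/// Otherwise returns the collector so the caller can run the tests.
pub fn collect<W: Write>(
    tests: Vec<String>,
    extract: bool,
    out: &mut W,
) -> Result<Option<Collector>, String> {
    if extract {
        for test in &tests {
            writeln!(out, "{test}")
                .map_err(|error| format!("Failed to write doctest: {error}"))?;
        }
        return Ok(None);
    }
    Ok(Some(Collector { tests }))
}

/// Returns how many tests would be run; zero when they were only extracted.
pub fn run<W: Write>(tests: Vec<String>, extract: bool, out: &mut W) -> Result<usize, String> {
    let collector = match collect(tests, extract, out)? {
        Some(collector) => collector,
        // Extraction mode: the output is already written, nothing to run.
        None => return Ok(0),
    };
    Ok(collector.tests.len())
}
```

Keeping "nothing to run" (`Ok(None)`) distinct from "failed" (`Err`) lets the caller exit quietly in the first case and report an error in the second.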

@@ -725,6 +743,25 @@ pub(crate) struct ScrapedDocTest {
name: String,
}

// This implementation is needed for 2 reasons:
// 1. `FileName` doesn't implement `serde::Serialize`.
// 2. We don't want to output `name`.
impl Serialize for ScrapedDocTest {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        // `4` is the number of fields we output (so all of them except `name`).
        let mut s = serializer.serialize_struct("ScrapedDocTest", 4)?;
        let filename = self.filename.prefer_remapped_unconditionaly().to_string();
        s.serialize_field("filename", &filename)?;
        s.serialize_field("line", &self.line)?;
        s.serialize_field("langstr", &self.langstr)?;
        s.serialize_field("text", &self.text)?;
        s.end()
    }
}

impl ScrapedDocTest {
fn new(
filename: FileName,
31 changes: 30 additions & 1 deletion src/librustdoc/html/markdown.rs
@@ -46,6 +46,7 @@ pub(crate) use rustc_resolve::rustdoc::main_body_opts;
use rustc_resolve::rustdoc::may_be_doc_link;
use rustc_span::edition::Edition;
use rustc_span::{Span, Symbol};
use serde::ser::{Serialize, SerializeStruct, Serializer};
use tracing::{debug, trace};

use crate::clean::RenderedLink;
@@ -836,7 +837,35 @@ pub(crate) struct LangString {
pub(crate) unknown: Vec<String>,
}

- #[derive(Eq, PartialEq, Clone, Debug)]
// This implementation is needed for `Edition` which doesn't implement `serde::Serialize` so
// we need to implement it manually.
impl Serialize for LangString {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        // `12` is the number of fields.
        let mut s = serializer.serialize_struct("LangString", 12)?;

        s.serialize_field("original", &self.original)?;
        s.serialize_field("should_panic", &self.should_panic)?;
        s.serialize_field("no_run", &self.no_run)?;
        s.serialize_field("ignore", &self.ignore)?;
        s.serialize_field("rust", &self.rust)?;
        s.serialize_field("test_harness", &self.test_harness)?;
        s.serialize_field("compile_fail", &self.compile_fail)?;
        s.serialize_field("standalone_crate", &self.standalone_crate)?;
        s.serialize_field("error_codes", &self.error_codes)?;
        let edition = self.edition.map(|edition| edition.to_string());
        s.serialize_field("edition", &edition)?;
        s.serialize_field("added_classes", &self.added_classes)?;
        s.serialize_field("unknown", &self.unknown)?;

        s.end()
    }
}

+ #[derive(Eq, PartialEq, Clone, Debug, serde::Serialize)]
pub(crate) enum Ignore {
All,
None,
3 changes: 2 additions & 1 deletion src/librustdoc/lib.rs
@@ -685,6 +685,7 @@ fn opts() -> Vec<RustcOptGroup> {
"[rust]",
),
opt(Unstable, Flag, "", "html-no-source", "Disable HTML source code pages generation", ""),
opt(Unstable, Flag, "", "extract-doctests", "Output doctests in JSON format", ""),
]
}

@@ -804,7 +805,7 @@ fn main_args(
}
};

- match (options.should_test, config::markdown_input(&input)) {
+ match (options.should_test | options.extract_doctests, config::markdown_input(&input)) {
(true, Some(_)) => return wrap_return(dcx, doctest::test_markdown(&input, options)),
(true, None) => return doctest::run(dcx, input, options),
(false, Some(md_input)) => {
2 changes: 2 additions & 0 deletions tests/run-make/rustdoc-default-output/output-default.stdout
@@ -211,6 +211,8 @@ Options:
more information
--html-no-source
Disable HTML source code pages generation
--extract-doctests
Output doctests in JSON format

@path Read newline separated options from `path`

15 changes: 15 additions & 0 deletions tests/rustdoc-ui/extract-doctests.rs
@@ -0,0 +1,15 @@
// Test to ensure that it generates expected output for `--extract-doctests` command-line
// flag.

//@ compile-flags:-Z unstable-options --extract-doctests
//@ normalize-stdout-test: "tests/rustdoc-ui" -> "$$DIR"
//@ check-pass

Member

We probably want a test for how this interacts with:

  • Implicit/explicit `fn main()`
  • Hiding lines with `/// # use some::path;`

Member Author

True, although currently the output is the code exactly as written in the documentation, with no changes on the rustdoc side (which I think better matches what Rust for Linux wants).

Member Author

Maybe we should provide another field with "rustdoc computed code" where hidden lines and main function wrapping are added by rustdoc. Actually that sounds like a very good idea.

Contributor

> True although currently it's the code as defined in the documentation and no changes on rustdoc side (which I think matches better what rust for linux wants).

Currently we use both hidden lines and the `?` support, i.e. the `# Ok::<..., ...>(...)` syntax (which implies the `fn main()`, from what I understand).

Some of that post-processing may be easy to do by users, like the hidden lines I assume, but if you walk the source or IR or similar to figure out details (like the crate attributes that you move out of `main()`), then it may be harder for end users to replicate that properly without a hack. Perhaps it may be possible to export what rustdoc figured about the test, so that users can replicate the post-processing on their side, but customized to their environment.

Having the rustdoc computed code sounds fine since it may be enough for some use cases, but e.g. we currently need to convert on the fly the `.unwrap()` in the `?`-using tests into a custom `assert!`.

Contributor

So, for instance, if rustdoc tells us "this test uses `?`", "these are the crate attributes I would have moved", etc., then users may be able to easily and reliably construct their own wrappers and e.g. do a check instead of an `.unwrap()`.

Of course, for things like crate attributes, it may be best to remove them so that the end user can re-add them where needed, so it wouldn't be the completely "unaltered" source code. But it could be an "adapted" version. Hidden lines should probably be normal lines in that adapted version.

Perhaps it makes sense to provide all those versions, i.e. the completely unaltered one for those that may want to do something complex or to render the text somewhere, the "adapted" version for customized test environments and the rustdoc computed code for those that can use directly that. In the kernel we would use the "adapted" one, only, I think.
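
As a rough illustration of the "hidden lines become normal lines" step discussed in this thread, here is a small std-only sketch that a consumer of the `text` field could apply on its side. The function name `unhide_lines` is hypothetical, and the rules are an approximation of what the rustdoc book describes (`# ` hides a line, a lone `#` hides an empty line, `##` escapes a literal `#`); it is not rustdoc's own code and it does not handle crate attributes or `fn main()` wrapping.

```rust
/// Returns `text` with rustdoc-style hidden lines turned into ordinary lines.
pub fn unhide_lines(text: &str) -> String {
    text.lines()
        .map(|line| {
            let trimmed = line.trim_start();
            // Leading whitespace is preserved so indentation survives.
            let indent = &line[..line.len() - trimmed.len()];
            if let Some(rest) = trimmed.strip_prefix("##") {
                // `##` escapes a literal `#` at the start of a line.
                format!("{indent}#{rest}")
            } else if let Some(rest) = trimmed.strip_prefix("# ") {
                // `# code` is a hidden line: keep the code, drop the marker.
                format!("{indent}{rest}")
            } else if trimmed == "#" {
                // A lone `#` is a hidden empty line.
                String::new()
            } else {
                // Everything else, including `#[attr]` and `#![attr]`, is kept as-is.
                line.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

With this, a hidden `# Ok::<(), Error>(())` line comes out as plain `Ok::<(), Error>(())`, after which a custom test harness can decide how to wrap or check it.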

//! ```ignore (checking attributes)
//! let x = 12;
//! let y = 14;
//! ```
//!
//! ```edition2018,compile_fail
//! let
//! ```
1 change: 1 addition & 0 deletions tests/rustdoc-ui/extract-doctests.stdout
@@ -0,0 +1 @@
[{"filename":"$DIR/extract-doctests.rs","line":8,"langstr":{"original":"ignore (checking attributes)","should_panic":false,"no_run":false,"ignore":"All","rust":true,"test_harness":false,"compile_fail":false,"standalone_crate":false,"error_codes":[],"edition":null,"added_classes":[],"unknown":[]},"text":"let x = 12;\nlet y = 14;"},{"filename":"$DIR/extract-doctests.rs","line":13,"langstr":{"original":"edition2018,compile_fail","should_panic":false,"no_run":true,"ignore":"None","rust":true,"test_harness":false,"compile_fail":true,"standalone_crate":false,"error_codes":[],"edition":"2018","added_classes":[],"unknown":[]},"text":"let"}]