
Running skeptic tests are slow #8

Open
Marwes opened this issue Jul 31, 2016 · 15 comments

@Marwes
Contributor

Marwes commented Jul 31, 2016

This is perhaps not noticeable with only a few tests, but I currently have 29 tests generated by skeptic, and they now take up a non-negligible amount of time. From what I can tell from the source and the generated tests, this is because all tests are recompiled every time cargo test is run, and each test is compiled individually. Is there any good reason for this, and could it be changed? I'd be happy to implement the change if it is not too problematic.

@brson
Collaborator

brson commented Aug 1, 2016

Hi @Marwes. Thanks for the report.

The first thing that can be done is to have generate_doc_tests emit the cargo:rerun-if-changed directive for each of the input files. See these docs: http://doc.crates.io/build-script.html

That should prevent cargo from regenerating the tests each invocation, and in turn building the generated tests.
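A minimal sketch of what emitting those directives from the build script could look like, assuming the list of markdown inputs is available at that point; `rerun_directives` is a hypothetical helper name, not skeptic's actual API:

```rust
// Build-script sketch: produce one cargo:rerun-if-changed line per markdown
// input, so cargo only re-runs the script when one of those files changes.
// `rerun_directives` is an illustrative helper, not part of skeptic.
fn rerun_directives(doc_files: &[&str]) -> Vec<String> {
    doc_files
        .iter()
        .map(|f| format!("cargo:rerun-if-changed={}", f))
        .collect()
}

fn main() {
    // In a real build.rs these lines go to stdout, where cargo reads them.
    for line in rerun_directives(&["README.md", "book/src/intro.md"]) {
        println!("{}", line);
    }
}
```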

That won't help the scenario where you change one markdown file and all the tests get regenerated and then rebuilt. Fixing that would require some design tweaks: basically, the single skeptic-tests.rs file would need to be turned into multiple test files (probably one per markdown file); in turn, the manually written tests/skeptic.rs file that includes it would need to become multiple files; and finally, the build script would have to detect when it is about to regenerate the exact same file and skip doing so. That way cargo would consider rebuilding each set of tests independently.

Figuring out how to create that design without imposing too much burden on the end-user (i.e. without requiring them to create a tests/*.rs file with that ugly include! line for every single markdown file) could be tricky.
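For reference, the glue file mentioned above is, per skeptic's documented setup, a single include! line in tests/skeptic.rs; under the sketched design, users would need one such entry point per markdown file (this is a non-runnable fragment, shown only to illustrate the burden in question):

```rust
// tests/skeptic.rs — pulls in the test file generated by build.rs.
// Under the multi-file design, there would be one of these per markdown
// input; the path follows skeptic's documented convention.
include!(concat!(env!("OUT_DIR"), "/skeptic-tests.rs"));
```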

@Marwes
Contributor Author

Marwes commented Aug 1, 2016

The first thing that can be done is to have generate_doc_tests emit the cargo:rerun-if-changed directive for each of the input files. See these docs: http://doc.crates.io/build-script.html

Seems simple enough, but I don't think that alone is enough to prevent recompilation when the markdown has not changed: the generated test is actually an executable which itself generates the tests, and those will be recompiled anyway. Look at this excerpt from the generated tests.

#[test] fn readme_0() {
    let ref s = "...";
    skeptic::rt::run_test(r#"C:\Users\Markus\Dropbox\Programming\gluon\c-api\target\debug\build\gluon-a225c2b4679cfb91\out"#, s);
}

When readme_0 runs, run_test writes a new file and recompiles it using rustc (which always recompiles, regardless of whether anything changed).

Why are the tests generated (from the already generated file) instead of just existing in the first file?

That won't help the scenario where you change one markdown file and all the tests get regenerated and then rebuilt ...

Yeah, that seems trickier, and having fast rebuilds after modifying documentation does not seem as important, so I will probably skip that.

@Marwes
Contributor Author

Marwes commented Aug 1, 2016

Or is the reason for generating the tests individually to keep compilation errors from different tests from interfering with each other? Maybe only compile tests should be generated in that case?

@brson
Collaborator

brson commented Aug 1, 2016

Ah, I forgot that the tests recompile dynamically. Yeah, I believe you are right: the reason it is that way is so that failures to compile the test cases are themselves runtime errors that cause test failures, not compile-time failures. AFAIK this is how rustdoc deals with its test cases as well, for the same reason, but I could be misremembering. I wonder if you've noticed a difference in perceived performance between rustdoc and rust-skeptic.

Maybe just compile tests should be generated in that case?

Can you expand on this? I don't quite understand what 'compile tests' or 'that case' refer to here.

If we keep this model where the test cases are recompiled each test run, the simplest way I see to avoid the recompile is to have compile_test_case cache the results of previous compilations. Instead of compiling to a temporary dir it could compile to the out_dir, and to short-circuit the rebuild compare the test_text from the previous run to the current run.
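The caching idea described above could look roughly like this; `needs_recompile` and the file layout are assumptions made for illustration, not skeptic's actual API:

```rust
use std::fs;
use std::path::Path;

// Sketch: keep the previous test_text next to the compiled artifact in
// out_dir, and only signal a recompile when the text differs from last run.
// `needs_recompile` is an illustrative name, not part of skeptic.
fn needs_recompile(out_dir: &Path, test_name: &str, test_text: &str) -> bool {
    let cached = out_dir.join(format!("{}.rs", test_name));
    match fs::read_to_string(&cached) {
        // Same source as last run: reuse the binary already in out_dir.
        Ok(previous) if previous == test_text => false,
        // New or changed test: record the text and request a recompile.
        _ => {
            fs::write(&cached, test_text).expect("failed to write cache entry");
            true
        }
    }
}

fn main() {
    let dir = std::env::temp_dir().join("skeptic_cache_demo");
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir).unwrap();
    assert!(needs_recompile(&dir, "readme_0", "fn main() {}")); // first run
    assert!(!needs_recompile(&dir, "readme_0", "fn main() {}")); // unchanged
    assert!(needs_recompile(&dir, "readme_0", "fn main() { 1; }")); // changed
}
```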

@Marwes
Contributor Author

Marwes commented Aug 1, 2016

Can you expand on this? I don't quite understand what 'compile tests' or 'that case' refer to here.

Tests that check whether code compiles or not, i.e. compile_test instead of run_test.

@Marwes
Contributor Author

Marwes commented Aug 1, 2016

I wonder if you've noticed a difference in perceived performance between rustdoc and rust-skeptic.

Rustdoc does the same thing. I actually remember being annoyed at that in combine, which has 44 doc tests that all get recompiled every time. It was still less noticeable there, however, as that library is much smaller than gluon. In gluon I don't really have that many doc tests at the moment, so I hadn't noticed it.

@brson
Collaborator

brson commented Aug 2, 2016

Tests that check if code compiles or not ie compile_test instead of run_test

Haha, of course.

Does my previous suggestion for how to fix this with caching sound reasonable? Are you interested in implementing?

@Marwes
Contributor Author

Marwes commented Aug 2, 2016

I think I will at least make the change to add cargo:rerun-if-changed which may help at least a little (and it is required for the bigger change anyway).

Caching for compile_test_case is going to be trickier, so that might take a bit longer. For instance, caching on the text alone might wrongly skip recompilation when the rustc/cargo version changes, but I'd have to check exactly how cargo operates to be sure (if it clears out the output directory when it detects a different rustc, for instance, it should be fine). There might be other problems with skipping recompilation that I haven't thought about.
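One way to address the version concern would be to fold the compiler version into the cache key; a sketch using the standard library's hasher, where `cache_key` is an illustrative name rather than anything in skeptic:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::process::Command;

// Sketch: hash the test text together with the rustc version string, so a
// toolchain upgrade changes the key and invalidates cached artifacts.
// `cache_key` is illustrative, not part of skeptic.
fn cache_key(test_text: &str, rustc_version: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    test_text.hash(&mut hasher);
    rustc_version.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // `rustc --version` output differs across toolchains, changing the key.
    let version = Command::new("rustc")
        .arg("--version")
        .output()
        .map(|o| String::from_utf8_lossy(&o.stdout).into_owned())
        .unwrap_or_default();
    println!("cache key: {:016x}", cache_key("fn main() {}", &version));
}
```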

@Marwes
Contributor Author

Marwes commented Aug 11, 2016

Implemented a partial fix in #9.

I do not cache the artifacts generated by the generated test file. The reason is that it would be quite a bit more complicated, since it can't piggyback on the caching that cargo does. Also, a large part of the overhead is that each test is generated separately. I tried changing it so that normal run tests are put directly into the generated file, but unfortunately that does not work with the templates.

#[test] fn readme_0() {
    <test>
}

instead of

#[test] fn readme_0() {
    let ref s = format!(<template string>, <test string>);
    skeptic::rt::run_test(r#"C:\Users\Markus\Dropbox\Programming\gluon\c-api\target\debug\build\gluon-a225c2b4679cfb91\out"#, s);
}

Marwes added a commit to Marwes/little-skeptic that referenced this issue Apr 22, 2018
This fixes two major issues with skeptic's current implementation at the
cost of losing compile-fail tests. For many crates I expect this to be a
worthwhile tradeoff. It is possible to restore compile tests, but they
would be forced to use the same shelling-out approach as before this PR,
so they would suffer from the same problem. As such, I prefer to leave
them out.

No longer shells out to rustc and attempts to emulate cargo's dependency
resolution.

Fixes budziq#18

Tests are now in a normal test module letting cargo cache and reuse
compilation artifacts as normal.

Fixes budziq#8

BREAKING CHANGE

- Handlebars templates are now used instead of `format!`
  Since there is no intermediate step we need a way to dynamically
  template. Handlebars is well known and well supported.

- Compile tests that expect panics are not supported
  Since code is generated inline, these tests would cause test
  compilation to fail
@Phlosioneer
Contributor

Phlosioneer commented Dec 3, 2018

I think an improvement we can make here is to parse the manifest file just once. Currently, every single from_path invocation re-reads and fully parses the manifest file.

(There are at least 3 invocations, and I think there are several more.)
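The parse-once idea could be implemented by memoizing the parsed result; a sketch using std::sync::OnceLock, where the Manifest struct and manifest() accessor are illustrative stand-ins for skeptic's real manifest handling:

```rust
use std::sync::OnceLock;

// Illustrative stand-in for whatever skeptic parses out of the manifest.
struct Manifest {
    package_name: String,
}

static MANIFEST: OnceLock<Manifest> = OnceLock::new();

// Every from_path-style caller goes through this accessor; the expensive
// read-and-parse step runs exactly once per process.
fn manifest() -> &'static Manifest {
    MANIFEST.get_or_init(|| {
        // In the real crate this would read and parse the manifest file;
        // here the expensive step is faked for illustration.
        Manifest { package_name: "example".to_string() }
    })
}

fn main() {
    // Repeated calls hand back the same cached instance.
    assert!(std::ptr::eq(manifest(), manifest()));
    println!("{}", manifest().package_name);
}
```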

@Marwes
Contributor Author

Marwes commented Dec 3, 2018

@Phlosioneer I doubt that is a big problem. As I measured it, the fact that rustc gets invoked once for each skeptic test, generating one executable per test, is what makes it so slow.

@Phlosioneer
Contributor

Alright, as long as it's been checked.

@budziq
Owner

budziq commented Dec 3, 2018

Yep, while the current manifest handling scheme is suboptimal, it does not contribute significantly to the overall test execution time. As @Marwes mentioned, the rustc invocations are the killer here.

But if anyone would like to take a stab at rewriting the manifest handling logic, I'm all for it :)

@brson
Collaborator

brson commented Feb 12, 2019

I've started hacking on a fix for #18 that I thought would fix this, but it looks like it won't.

The "every test case is an executable" behavior of skeptic is something that I value, because it ensures that every test case is something a reader can copy-paste directly into their own main.rs. Maintaining that while also not invoking rustc for every test case is a challenging problem.

The fix in the above link works by turning every test case into its own cargo project, sharing the target-dir with the parent project. That alone probably won't improve anything (and will actually add a cargo invocation on top of the work already being done). One way I can imagine speeding up the compilation of the test cases is by dynamic linking.

Another way I can imagine is by compiling every test case (project) as its own library, then linking them all into one "super-binary" that can masquerade as each test case as though it were an independent binary. It cuts out the exe link time, which tends to be significant. That should work for most cases that expect to be run as standalone executables.

@brson
Collaborator

brson commented Feb 16, 2019

I've posted a branch prototype fix for #18. It doesn't improve testing times yet, but I think it will when it's done.
