Possibility to use drake to run unit tests #206

Closed
lbusett opened this issue Jan 31, 2018 · 8 comments

lbusett commented Jan 31, 2018

Hi,

I was thinking that a nice possible "use case" for drake could be using it to run the unit tests for R packages. Currently, when I have to test a package, I use devtools::test(), which automatically runs ALL tests, even if no changes occurred since the last successful run in the files involved in a particular test. If I want to run tests only on new/modified functions, I need to manually run test_file() or testthis::test_this() on that specific file/function.

I was therefore thinking that it would be nice to be able to run a "command" which automatically skips tests that passed in the past if no changes occurred, and only runs:

  1. new tests;
  2. tests involving modified source scripts/datasets;
  3. previously failing tests

This would considerably speed up testing for packages with an extensive set of unit tests.

Do you think this would be possible using drake? Are there any particular drawbacks/things that would make it difficult?

Thanks in advance for any insights!


wlandau commented Jan 31, 2018

@lbusett This is a fantastic idea! I will have a proper response soon.


wlandau commented Feb 1, 2018

I love this idea. The unit tests for drake itself are fairly long, especially the roughly hour-long testing workflow over all kinds of parallel computing and envir configurations. The automation helps, but there is still so much room to do better. I think drake has the potential to help.

I have not actually tried this approach, but here is how I see it starting to play out in real life.

  1. Load the package being tested. I suspect devtools::load_all() will suffice here. If not, there is a workaround for workflows implemented as R packages. One key point here is to retain the environment of your package for later: e.g. env <- load_all()$env.
  2. Convert all the code in tests/testthat/*.R into a workflow plan data frame. Here, the code should be read and parsed, but not evaluated. Calls to test_that() would become commands, and targets would have informative names based on the names of the individual tests.
  3. Now that you have a workflow plan and an environment, run the tests with make(your_plan, envir = env). Another advantage with drake is that if you follow the directions here, you could also run your tests in parallel, so you could set jobs = 4, for example. (A rough sketch of all three steps follows below.)
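Not a tested implementation, just a rough sketch of how those three steps might fit together. It assumes the test files call test_that() at the top level, and the numeric target names and helper variables are placeholders:

```r
library(drake)

# 1. Load the package being tested and keep its environment.
env <- devtools::load_all()$env

# 2. Parse (without evaluating) the test files and turn each top-level
#    test_that() call into a command in a workflow plan data frame.
test_files <- list.files("tests/testthat", pattern = "^test.*\\.[Rr]$",
                         full.names = TRUE)
tests <- unlist(lapply(test_files, function(f) {
  exprs <- as.list(parse(f))
  keep  <- vapply(exprs, function(e) {
    is.call(e) && identical(e[[1L]], quote(test_that))
  }, logical(1))
  vapply(exprs[keep], function(e) paste(deparse(e), collapse = "\n"),
         character(1))
}))

plan <- data.frame(
  target  = paste0("test_", seq_along(tests)),  # informative names would be better
  command = tests,
  stringsAsFactors = FALSE
)

# 3. Build only the outdated targets, i.e. run only the affected tests,
#    optionally in parallel.
make(plan, envir = env, jobs = 4)
```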

Other notes:

  1. The environment loaded by load_all() must be explicitly passed to make() through the envir argument. Otherwise, drake will not analyze nested functions for dependencies.
  2. Even if this works with devtools::test(), there is still a lot of work to be done for devtools::check() (R CMD check).
  3. I do not think this approach to testing should be part of CRAN's automated package checks. By design, drake under-reacts to changes in dependency packages. (packrat does a much better job with package reproducibility.)

This is already very involved, and I feel strongly that it should be its own separate tool, maybe a package called draketest. That is the only reason why I am closing the issue. I may not be able to contribute for a while yet because drake itself is about to undergo another phase of rapid development, but I am totally supportive of this. Let's keep talking.


lbusett commented Feb 1, 2018

Hi,

thanks for your reply. I agree that a draketest package would be the way to go for this; it could complement (not replace) the usual testing tools.

As far as I understand, the tricky part here would be to build something like a "matrix" of dependencies for each test (i.e., which internal (from the package itself) and imported functions are used when running the test), so that a run can be marked "obsolete" if any of those dependencies changed (e.g., because a new version of an imported function was installed; maybe that could be tracked by saving and comparing sessionInfo() output).

I'll see if I can find some time in the next few weeks to start experimenting with this. I will keep you posted.


wlandau commented Feb 5, 2018

Sounds great, @lbusett. Please keep me posted. If we get it to work, this could be a valuable contribution to the package development process. I am focused on #227 at the moment, and I won't have time to work on draketest myself any time soon.

One of the strengths of drake is that you don't actually need to build a matrix or graph of dependencies yourself. Drake analyzes your environment and commands to understand when to build stuff. For example, see this comment from #227. So if your calls to test_that() are commands, drake will detect and analyze all the functions inside. That way, if you change a function in your package, the test will just rerun.
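To make that concrete, here is a tiny hypothetical example; summarize_data() stands in for a function from the package under test and is not part of any real API:

```r
library(drake)
library(testthat)

# Stand-in for a function from the package under test (hypothetical).
summarize_data <- function(x) mean(x)

# The whole test_that() call is the command, so drake walks its code and
# treats summarize_data() as a dependency of the target.
plan <- drake_plan(
  test_summary = test_that("summarize_data() averages correctly", {
    expect_equal(summarize_data(c(1, 2, 3)), 2)
  })
)

make(plan)  # runs the test
# Edit summarize_data() and call make(plan) again: this target is outdated
# and reruns. With no changes, make(plan) skips it.
```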

The catch here is that this nested analysis stops at functions that look like they come from packages. That means you will have to make the package you're testing not look like a package namespace. @dapperjapper found a hack that does this, and I put it in the caution vignette.


lbusett commented Feb 5, 2018

Hi,

I just started experimenting with this. From my initial tests, however, it seems more feasible/easier to exploit the fact that tests are usually organized in files, and that all tests within a file can be run with test_file(). I already tested parallelizing the test execution by creating a different target for each call to test_file(), and it works quite well (though I am struggling a bit with working directory issues). Obviously it is suboptimal: if one test file takes much longer than the others because it includes more/longer tests, you will not get the full possible parallelization speed-up. However, the advantages are that this workaround does not require parsing the code of the test files to create a command for each test_that() call (which I think was your suggestion, and which I would not know how to do at the moment... ;-) ), and that you can directly use the structured reports from testthat to easily create a testing report.
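Roughly, the plan I am experimenting with looks like the following. This is just a sketch: it assumes a drake version with file_in(), and the auto-generated target names are placeholders.

```r
library(drake)

# Load the package under test and keep its environment, as discussed above.
env <- devtools::load_all()$env

# One target per file in tests/testthat/.
test_files <- list.files("tests/testthat", pattern = "^test.*\\.[Rr]$",
                         full.names = TRUE)

plan <- data.frame(
  target  = make.names(paste0("test_", basename(test_files))),
  command = sprintf('testthat::test_file(file_in("%s"))', test_files),
  stringsAsFactors = FALSE
)

# file_in() marks each test file as a tracked dependency, so editing a file
# only invalidates its own target, and jobs > 1 runs independent files in
# parallel.
make(plan, envir = env, jobs = 2)
```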

Regarding the dependency issue: yes, after reading a bit more of the drake documentation, I understand that this should already be dealt with by drake.

I'll keep you posted.


wlandau commented Feb 6, 2018

That sounds like an excellent proof of concept, and it should give you a lot of mileage immediately. And yes, parsing code is hard, but I think it will be worth your while later on:

  • It will no longer matter how tests are divided among files.
  • Changing the whitespace, comments, and other formatting will not trigger a test.

When you get to that stage, the CodeDepends package by @gmbecker and @duncantl should take care of the heavy lifting for you.
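For instance, something along these lines should list what each top-level test expression depends on (readScript() and getInputs() come from CodeDepends; the file path is just an example):

```r
library(CodeDepends)

# Parse a test file without evaluating it (the path is illustrative).
script <- readScript("tests/testthat/test-summary.R")

# getInputs() returns one ScriptNodeInfo object per top-level expression,
# recording the variables read and the functions called by that expression.
info <- getInputs(script)
lapply(info, slot, "inputs")                  # variables each expression uses
lapply(info, function(x) names(x@functions))  # functions each expression calls
```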

I will try to get to #208 soon, and I look forward to learning how drake-based testing turns out.


lbusett commented Feb 7, 2018

Hi. Sorry for pinging you on a closed issue. Just a quick question: since drake is apparently going through a major overhaul, do you think it would be better to use the GitHub version for my experiments on this (or maybe even delay them to wait for a stable version, I don't know...)?

Thanks in advance.


wlandau commented Feb 7, 2018

I would recommend the development version, especially after I fix #208. I originally meant to work on #237 first, but the benefits of #208 are more immediate, including an easier time developing draketest (or whatever you intend to call it).

When I close an issue, all that means is that I no longer think there is a problem with drake itself. I do not mean to close the discussion. Please continue pinging me with these questions.

wlandau changed the title from "Possibility to use Drake to run unit tests" to "Possibility to use drake to run unit tests" on Aug 17, 2018