Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate each unit test in a self-contained file system #74

Closed
AlexAxthelm opened this issue Aug 24, 2017 · 13 comments
Closed

Isolate each unit test in a self-contained file system #74

AlexAxthelm opened this issue Aug 24, 2017 · 13 comments

Comments

@AlexAxthelm
Copy link
Collaborator

AlexAxthelm commented Aug 24, 2017

I've been trying to chase down why some of the unit tests (consistently) fail on my local machine.

I'm planning a PR with some of the easy, minor fixes, but one that might be trickier is that anything with
r expect_equal(in_progress(), character(0)) fails for me.

My current thought is that dclean() isn't actually scrubbing the old plan, because report.md shows up as in progress, every time (even though it's been unlinked).

My proposal is to wrap the testthat files in something along the lines of

withr::with_dir(tempdir(), {

context("blah, blah blah")
clean()
testthat("the first test", {
#<CODE>
}
testthat("the second test", {
#<CODE>
}
#<so on, and so forth...>
clean()
}

This would allow us to use the actual clean(), and put that through its paces on a real directory, rather than just using dclean(). Additionally, this would get rid of any weird coincidental ordering effects, or unexpected persistence from previous tests.

The three big issues that I can see with this mothod right now are:
1: I'm not sure exactly how you've implemented #58, but I want to avoid conflicts with that.
2: Since tempdir() is the same within a session of R, we would need to make sure to run clean() at the beginning and end of each test. Alternately, we could put a file.remove(tempdir()) at the end of each test file, to completely scrap the tempdir, and force it to recreate for each test. A slight performance hit, but not huge.
3: tempdir() actually will keep the same value for child sessions spawned by parallel which is sort of a mixed bag. Good for testing, since we can still use it with multiple jobs (although I'm not entirely sure about how it would interact with parallelism = Makefile). Mostly the warning would be to not run tests in the same session as an actual make where something important is being made.

@AlexAxthelm
Copy link
Collaborator Author

Alternately, we could use a fictitious but static directory (.testdir/ or something similar), and generate and scrub it each time.

@wlandau-lilly
Copy link
Collaborator

withr::with_dir(tempdir(), {...}) seems like a much better way of doing things anyway. Thank you for logging this issue. I agree that we will have to iron out the parallel computing concerns, and this will be much easier to work on after I push my fix to #58 so you can test multiple parallelism configurations automatically.

In tests/testthat/cache.R, the working directory changes in order to test cached(search = TRUE), so we may have to be careful there.

With expect_equal(in_progress(), character(0)), I am not sure dclean() is at fault since it explicitly removes the entire .drake/ cache with unlink(..., recursive = TRUE). Would you mind posting your test errors?

No matter how many platforms I test on, platform-dependent test errors still come along and surprise me every once in a while. Several months ago, for example, the tests in a previous CRAN release were failing on my personal Ubuntu box because sort() used a different order. The package itself was not broken, but I had to go back into the tests and explicitly sort everything so all the checks would pass.

@wlandau-lilly
Copy link
Collaborator

I wonder, do the unit tests for storr all pass on your machine?

@AlexAxthelm
Copy link
Collaborator Author

I think I fixed it. I tried a clean(destroy = TRUE) from the base drake/ directory, and all the test ran fine (except for other-features @ L51, detailed below). I must have had a .drake/ hanging around from doing some interactive testing where I didn't run dclean() at the end.

I was able to recreate the issue by running a make() in the base directory that failed, so it had an in_progress object still hanging around. then ending my R session, and starting another, I got:

 ⋯  DocumentsdrakeR                                                                             
> drake::in_progress()                                                                                 
[1] "small"                                                                                            
> devtools::test()                                                                                     
Loading drake                                                                                          
Loading required package: testthat                                                                     
Testing drake                                                                                          
basic: ..........................                                                                 
cache: ...1.....................................2......................                                
command-changes: ........                                                                              
edge-cases: .....................                                                                      
envir: .....                                                                                           
examples: ....                                                                                         
full-build: .............................................                                              
generate: ...................                                                                          
import-file: ..........                                                                                
import-object: ..........                                                                              
intermediate-file: ......                                                                              
Makefile: .............................                                                                
namespaced: .........                                                                                  
other-features: 3............4..................................                                       
plan: .............                                                                                                                                                           
                                                                                                       
Failed -------------------------------------------------------------------------                       
1. Failure: cache functions work (@test-cache.R#11) ----------------------------                       
in_progress(search = FALSE) not equal to character(0).                                                 
Lengths differ: 1 vs 0                                                                                 
                                                                                                       
                                                                                                       
2. Failure: cache functions work (@test-cache.R#114) ---------------------------                       
in_progress(search = TRUE, path = s) not equal to character(0).                                        
Lengths differ: 1 vs 0                                                                                 
                                                                                                       
                                                                                                       
3. Failure: in_progress() works (@test-other-features.R#6) ---------------------                       
in_progress() not equal to character(0).                                                               
Lengths differ: 1 vs 0                                                                                 
                                                                                                       
                                                                                                       
4. Failure: deps() correctly reports dependencies of functions and commands (@test-other-features.R#51)
deps(my_plan$command[2]) not equal to c("'tracked_input_file.rds'", "readRDS", "x").                   
2/3 mismatches                                                                                         
x[1]: "readRDS"                                                                                        
y[1]: "'tracked_input_file.rds'"                                                                       
                                                                                                       
x[2]: "'tracked_input_file.rds'"                                                                       
y[2]: "readRDS"                                                                                        
>

So while this may no longer be a bug fix issue, I wonder if it would still be worth implementing after the next round of approvals are up? I know that when I'm looking to write tests for #56, it would be easier for me to not worry about the environment.

Also, just tried storr, and those tests run fine.

Additionally, I have run into the sort() issue on my machine too, suggesting that my Ubuntu is doing something different yet than yours? In particular in the "other features" tests, line 51 fails on my Ubuntu setup, but passes on windows. Strangely enough, the next line (52) which also has multiple elements passes just fund on both. I tried wrapping them in a sort(), and everything passed. I can look for more like that one, where order doesn't matter, when I lint the test files.

@wlandau-lilly wlandau-lilly changed the title Use withr::with_dir for unit tests Isolate unit tests in a self-contained file system Aug 24, 2017
@wlandau-lilly
Copy link
Collaborator

Glad you figured it out! And yes, while not technically a bug, I still think this is a problem because drake should pay no attention to .drake/ caches that are not part of the unit tests. I thought I had fixed this problem long ago.

Also, I think I just added enough sorting on my end to test_that("deps() ...", {...}) (test-other-features.R). That's a relatively new test, so I am not surprised that I forgot to add extra sorting.

@wlandau-lilly wlandau-lilly changed the title Isolate unit tests in a self-contained file system Isolate each unit test in a self-contained file system Aug 24, 2017
@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Aug 25, 2017

I actually decided to create a separate local file system for each test.

test_with_dir <- function(desc, code){
  root <- "workspaces"
  if (!file.exists(root)){
    dir.create(root)
  }
  relative_dir <- base32_encode(desc) %>%
    digest(algo = hash_algorithm)
  dir <- file.path(root, relative_dir)
  dir_empty(dir)
  dir <- normalizePath(dir)
  with_dir(dir, test_that(desc = desc, code = code))
  # Keep dir until the very end. We should check for hash conflicts. If there are, we need to rename 
  # the prose descriptions of the tests.
}

Rationale:

  1. You had mentioned that tempdir() might conflict with a serious make() in an R session.
  2. If there is some way to run all the calls to test_that() (or in our case, test_with_dir()) in parallel R sessions, we are prepared. Wouldn't that be awesome?
  3. I like to run different kinds of testing on different clones of drake on the same machine, and I would rather use separate local file systems instead of all the checks pointing to potentially the same tempdir() (rare but possible).
  4. This approach gets around my strange Linux errors on Red Hat: shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory.

@wlandau-lilly
Copy link
Collaborator

I guess the downside is that drake might still get confused if we test .drake/ caches in parent directories, but I think that can be fixed on a case-by-case basis as the need arises.

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Aug 25, 2017

By the way, I created a nonempty drake/.drake/ cache to try to reproduce the errors you were seeing, but all the tests passed in my local development copy (Red Hat Linux). But in the meantime, the strange shell-init error has been coming back. The tests seem to run fine in spite of it.

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Aug 25, 2017

Never mind about the shell-init errors. I managed to avoid them as long as I retain everything in tests/testthat/workspaces until the very end. This is also a good way of checking collisions: currently, the description of each test has to be unique or else multiple tests could share the same file system.

@AlexAxthelm
Copy link
Collaborator Author

Do you get issues with conflicting caches if you have an unfinished (failed) test, and then try to run it again? Either same session or after a restart? Mostly I'm wondering if it would be worth it to put a line to unlink(dir, recursive = TRUE) so that the directory is fresh before each test run.

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Aug 25, 2017

I did not check, but because of the dir_empty(dir) line, I do not think there will be conflicting caches. I had unlink(dir, recursive = TRUE) before, but removing it seemed to remove the elusive shell-init errors. I am, however, clearing out tests/testthat/workspaces before and after test_check() in testthat.R.

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Aug 25, 2017

The approaches above still gave me problems with the file system, so I decided to go back to tempdir(), but with a unique subfolder for each test. I would very much like to avoid dangling files from any failed/interrupted tests. This seems to work so far:

test_with_dir <- function(desc, code){
  root <- tempdir()
  stopifnot(file.exists(root))
  relative_dir <- base32_encode(desc) %>%
    digest(algo = hash_algorithm)
  dir <- file.path(root, relative_dir)
  dir_empty(dir)
  with_dir(dir, test_that(desc = desc, code = code))
}

@wlandau-lilly
Copy link
Collaborator

Confirmed: everything looks totally clean with the new test_with_dir() (Red Hat and Windows 7, all testing configurations).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants