Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reproducible random numbers #56

Closed
wants to merge 5 commits into from

Conversation

AlexAxthelm
Copy link
Collaborator

This PR is intended to increase reproducible, by having make(plan) preserve .Random.seed when building elements.

Includes test file with tests for comparing built and rebuilt objects, and ensuring that objects with same command (generated by expand) yield distinct results.

Note: I am getting inconsistent results from devtools::test(). Sometimes it will run without any errors, others it will have a lot, even when run sequentially. When I do get errors, it's almost never the same tests as the last run. I'm hoping that it's a quirk of my system, or that I am missing a package that is called in a test, because I get the same behavior on master as well. I'm submitting this PR, and hoping that travis can test better than me.

Alex Axthelm added 4 commits August 9, 2017 09:47
I want to be able to get the same object when I build, so that I can get
the same simulation results when necessary
Using `withr::with_preserve_seed` to ensure that random number gereation
is the same across runs
@codecov-io
Copy link

codecov-io commented Aug 9, 2017

Codecov Report

Merging #56 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff          @@
##           master    #56   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          33     33           
  Lines        1059   1061    +2     
=====================================
+ Hits         1059   1061    +2
Impacted Files Coverage Δ
R/make.R 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f10e1c7...101de07. Read the comment docs.

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Aug 9, 2017

This is a super elegant approach to a problem I thought would be harder to solve.

I think the testing problems may be caused because withr::with_preserve_seed() uses the global environment.

with_preserve_seed
## function (code)
## {
##     old_seed <- get_valid_seed()
##     on.exit(assign(".Random.seed", old_seed, globalenv()), add = TRUE)
##     code
## }

On the other hand, a lot of functions like make() begin with

make = function(..., envir = parent.frame()){
  ...
}

In my testing, envir is a new.env(parent = globalenv()). You may have to define your own version of with_preserve_seed() that uses envir instead of globalenv().

@wlandau-lilly
Copy link
Collaborator

Never mind about my original concern about resetting the seed. I will need to think about this more, but there is only one reset per call to make(), so RNGs should still be pseudo-random throughout the workflow.

@AlexAxthelm
Copy link
Collaborator Author

The main idea is that in a script of vignette, we could have something along the lines of:

set.seed(123)
make(plan)

where plan has random number generation, or (in my case) sampling, and we would get the same results, regardless of where or when it is run.

@AlexAxthelm
Copy link
Collaborator Author

I hadn't considered that envir might have a different .Random.seed. Maybe I can use withr::with_seed() instead, and pull the seed from there.

@AlexAxthelm AlexAxthelm closed this Aug 9, 2017
@AlexAxthelm AlexAxthelm reopened this Aug 9, 2017
@AlexAxthelm
Copy link
Collaborator Author

Upon further investigation of .Random.seed, it seems that there it only exists in globalenv(). Even when I attempt to explicitly set it inside an environment, all the RNG functions call on the one in globalenv()

> ls(globalenv(), all = TRUE)
character(0)
> a <- rnorm(1)
> ls(globalenv(), all = TRUE)
[1] ".Random.seed" "a"
> foo <- .Random.seed
> rnorm(3)
[1] -0.07880674 -1.15956053 -0.09012208
> rnorm(3) # Note this is diefferent
[1] -0.7723668 -1.6987008  0.5537592
> envir <- new.env()
> envir[[".Random.seed"]] <- foo #Explicitly set the environment .Random.seed
> ls(envir, all = TRUE)
[1] ".Random.seed"
> f <- function(){
+ force(envir) # Same as we see in drake::make()
+ rnorm(3) # If calling using the envir .Random.seed, we should see same results as before
+ }
> f() #But no.
[1]  0.8525371 -0.7058269  0.1934947
> .Random.seed <- foo
> rnorm(3)
[1] -0.07880674 -1.15956053 -0.09012208
> .Random.seed <- foo
> f() # even with the force(envir), it's still using global .Random.seed
[1] -0.07880674 -1.15956053 -0.09012208
>

This leads me to believe that with_preserve_seed() is the right way to go. I'm guessing that the testing problems on my local machine are particular to my machine.

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Aug 9, 2017

What are the specs of your system and R version? Maybe I can reproduce the test errors. In my experience, the CRAN check servers do not always work properly, especially R-devel on Windows, so I need to be extra diligent to make sure all the checks pass.

After reading your tests more closely, the problem you are solving is much clearer to me: even if the user does not remember to set the seed beforehand, make() should restore the seed for the next call to make() so that targets can be repaired and recovered with the same seed automatically.

Speaking of different forms of parallelism, almost all of drake's unit tests are configured to use envir = dbug()$envir, parallelism = testopts()$parallelism, and jobs = testopts()$jobs. Everything needs to work under all combinations of these settings, and seeds may not carry over easily, especially with Makefile parallelism in a non-global environment. Would you mind passing these arguments to make() in test-reproducible.R?

Bonus points if you know a clean, automated way to run testthat::test("drake") under different scenarios instead of manually changing R/debug.R and R/test.R each time. (Maybe with mockr somehow?)

@AlexAxthelm
Copy link
Collaborator Author

Working on implementing the arguments into make() commands now.

What different scenarios would you want to test?

As far as my local environments, I'm on Windows10, and I also have a setup with BashOnUbuntuOnWindows

Win 10 (vanilla)

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] fcuk_0.1.21

loaded via a namespace (and not attached):
[1] compiler_3.4.1     magrittr_1.5       parallel_3.4.1     tibble_1.3.3
[5] Rcpp_0.12.11       rlang_0.1.1        stringdist_0.9.4.4 purrr_0.2.2

BashOnWindows

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.0
LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.1

@wlandau-lilly
Copy link
Collaborator

Thanks, @AlexAxthelm. Unfortunately I do not have access to Windows 10. Are the tests failing because the seed is not recovered, or are there other errors?

I think I digressed when I brought up testing configurations. I made a separate issue in #58.

@AlexAxthelm
Copy link
Collaborator Author

AlexAxthelm commented Aug 24, 2017

I've been working on this today, and I think we should take it off of the 4.1.0 milestone. I've tried a couple of different variations on the withr::with_seed() theme, and they all only work when jobs = 1 which is a pretty severe limitation to enforce. Looking through the RNG documentation(page 5) in the parallel package does not leave me hopeful. That doesn't even touch the Makefile parallelism.

I vote we close for now, and turn this into an issue with the "difficult" tag.

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Aug 24, 2017

Useful information. I agree, it seems like the current approach will not work for parallel execution. This makes me think we should handle the seeds on a target-by-target basis. We could wrap withr::with_seed() around the body of build() and use the storr cache to keep track of

  1. The original seed at the start of make().
  2. The ending seed of each target.

For the starting seed of each subsequent target, we could choose among the ending seeds of the dependencies using some deterministic rule that does not depend on execution order.

I think this is doable, but as you say, difficult. I am okay with keeping this PR around if you would like to work on it. Otherwise, we can create a separate issue for another time.

At the moment, my tentative plans for drake are these (plus any production-ready pull requests that come in the meantime)

@AlexAxthelm
Copy link
Collaborator Author

I'll probably try again on this in a week or two. My current line of thought also has me wanting to cache the seed as the first line of make(), but then I think I want to pass that (same value) to each of the parallel sessions.

A thought that I had was to store the .Random.seed at the beginning of make(), and then call that exact value from within each parallel session, which would mean we don't have to worry about tracking before/after conditions. Importantly, I would probably end up using a numeric hash of .Random.seed since the actual value of the seed is not as important as the fact that it can be set/reset/recalled (I'm ignoring the use case where somebody tries to get the same results without drake as they do with). Hashing the seed would let me salt the hash with the target name, ensuring that targets with identical commands would get different (reproducible) seeds, despite having the same "base seed".

Not entirely sure what the best way to implement this will be. I'll take a shot a passing through storr, but my suspicion is that it will end up with as an element of config.

@wlandau-lilly
Copy link
Collaborator

Sounds much cleaner and easier to implement. Do you think we will still keep the promise of pseudo-randomness if we reset the seed this way for each target? We probably do, my concern is mild.

I would just add the seed to either config or the cache's "session" namespace, either should be fine.

@AlexAxthelm AlexAxthelm deleted the reproducible branch September 12, 2017 14:10
@wlandau-lilly
Copy link
Collaborator

@AlexAxthelm what are your plans for this issue? Did you close the PR to start fresh later?

@AlexAxthelm AlexAxthelm restored the reproducible branch September 12, 2017 15:06
@AlexAxthelm
Copy link
Collaborator Author

AlexAxthelm commented Sep 12, 2017

Deleted the branch by mistake while getting a little too happy with cleaning my git branches. I am planning on returning to this, probably after 4.2.0 is done.

@wlandau-lilly
Copy link
Collaborator

Completed/replaced by #109. @AlexAxthelm, sorry to close your PR. The ideas are all great, I simply implemented them.

@AlexAxthelm AlexAxthelm deleted the reproducible branch October 16, 2017 12:51
@AlexAxthelm
Copy link
Collaborator Author

Thank you so much for wrapping this up! I"m glad that I was able to help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants