Reproducible random numbers #56

AlexAxthelm · 2017-08-09T15:02:40Z

This PR is intended to increase reproducible, by having make(plan) preserve .Random.seed when building elements.

Includes test file with tests for comparing built and rebuilt objects, and ensuring that objects with same command (generated by expand) yield distinct results.

Note: I am getting inconsistent results from devtools::test(). Sometimes it will run without any errors, others it will have a lot, even when run sequentially. When I do get errors, it's almost never the same tests as the last run. I'm hoping that it's a quirk of my system, or that I am missing a package that is called in a test, because I get the same behavior on master as well. I'm submitting this PR, and hoping that travis can test better than me.

I want to be able to get the same object when I build, so that I can get the same simulation results when necessary

Using `withr::with_preserve_seed` to ensure that random number gereation is the same across runs

codecov-io · 2017-08-09T15:13:55Z

Codecov Report

Merging #56 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master    #56   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          33     33           
  Lines        1059   1061    +2     
=====================================
+ Hits         1059   1061    +2

Impacted Files	Coverage Δ
R/make.R	`100% <100%> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f10e1c7...101de07. Read the comment docs.

wlandau-lilly · 2017-08-09T15:26:01Z

This is a super elegant approach to a problem I thought would be harder to solve.

I think the testing problems may be caused because withr::with_preserve_seed() uses the global environment.

with_preserve_seed
## function (code)
## {
##     old_seed <- get_valid_seed()
##     on.exit(assign(".Random.seed", old_seed, globalenv()), add = TRUE)
##     code
## }

On the other hand, a lot of functions like make() begin with

make = function(..., envir = parent.frame()){
  ...
}

In my testing, envir is a new.env(parent = globalenv()). You may have to define your own version of with_preserve_seed() that uses envir instead of globalenv().

wlandau-lilly · 2017-08-09T15:28:14Z

Never mind about my original concern about resetting the seed. I will need to think about this more, but there is only one reset per call to make(), so RNGs should still be pseudo-random throughout the workflow.

AlexAxthelm · 2017-08-09T15:34:57Z

The main idea is that in a script of vignette, we could have something along the lines of:

set.seed(123)
make(plan)

where plan has random number generation, or (in my case) sampling, and we would get the same results, regardless of where or when it is run.

AlexAxthelm · 2017-08-09T15:37:09Z

I hadn't considered that envir might have a different .Random.seed. Maybe I can use withr::with_seed() instead, and pull the seed from there.

AlexAxthelm · 2017-08-09T16:42:57Z

Upon further investigation of .Random.seed, it seems that there it only exists in globalenv(). Even when I attempt to explicitly set it inside an environment, all the RNG functions call on the one in globalenv()

> ls(globalenv(), all = TRUE)
character(0)
> a <- rnorm(1)
> ls(globalenv(), all = TRUE)
[1] ".Random.seed" "a"
> foo <- .Random.seed
> rnorm(3)
[1] -0.07880674 -1.15956053 -0.09012208
> rnorm(3) # Note this is diefferent
[1] -0.7723668 -1.6987008  0.5537592
> envir <- new.env()
> envir[[".Random.seed"]] <- foo #Explicitly set the environment .Random.seed
> ls(envir, all = TRUE)
[1] ".Random.seed"
> f <- function(){
+ force(envir) # Same as we see in drake::make()
+ rnorm(3) # If calling using the envir .Random.seed, we should see same results as before
+ }
> f() #But no.
[1]  0.8525371 -0.7058269  0.1934947
> .Random.seed <- foo
> rnorm(3)
[1] -0.07880674 -1.15956053 -0.09012208
> .Random.seed <- foo
> f() # even with the force(envir), it's still using global .Random.seed
[1] -0.07880674 -1.15956053 -0.09012208
>

This leads me to believe that with_preserve_seed() is the right way to go. I'm guessing that the testing problems on my local machine are particular to my machine.

wlandau-lilly · 2017-08-09T19:12:30Z

What are the specs of your system and R version? Maybe I can reproduce the test errors. In my experience, the CRAN check servers do not always work properly, especially R-devel on Windows, so I need to be extra diligent to make sure all the checks pass.

After reading your tests more closely, the problem you are solving is much clearer to me: even if the user does not remember to set the seed beforehand, make() should restore the seed for the next call to make() so that targets can be repaired and recovered with the same seed automatically.

Speaking of different forms of parallelism, almost all of drake's unit tests are configured to use envir = dbug()$envir, parallelism = testopts()$parallelism, and jobs = testopts()$jobs. Everything needs to work under all combinations of these settings, and seeds may not carry over easily, especially with Makefile parallelism in a non-global environment. Would you mind passing these arguments to make() in test-reproducible.R?

Bonus points if you know a clean, automated way to run testthat::test("drake") under different scenarios instead of manually changing R/debug.R and R/test.R each time. (Maybe with mockr somehow?)

AlexAxthelm · 2017-08-09T19:55:16Z

Working on implementing the arguments into make() commands now.

What different scenarios would you want to test?

As far as my local environments, I'm on Windows10, and I also have a setup with BashOnUbuntuOnWindows

Win 10 (vanilla)

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] fcuk_0.1.21

loaded via a namespace (and not attached):
[1] compiler_3.4.1     magrittr_1.5       parallel_3.4.1     tibble_1.3.3
[5] Rcpp_0.12.11       rlang_0.1.1        stringdist_0.9.4.4 purrr_0.2.2

BashOnWindows

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.0
LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.1

wlandau-lilly · 2017-08-10T02:19:15Z

Thanks, @AlexAxthelm. Unfortunately I do not have access to Windows 10. Are the tests failing because the seed is not recovered, or are there other errors?

I think I digressed when I brought up testing configurations. I made a separate issue in #58.

AlexAxthelm · 2017-08-24T20:29:54Z

I've been working on this today, and I think we should take it off of the 4.1.0 milestone. I've tried a couple of different variations on the withr::with_seed() theme, and they all only work when jobs = 1 which is a pretty severe limitation to enforce. Looking through the RNG documentation(page 5) in the parallel package does not leave me hopeful. That doesn't even touch the Makefile parallelism.

I vote we close for now, and turn this into an issue with the "difficult" tag.

wlandau-lilly · 2017-08-24T22:02:36Z

Useful information. I agree, it seems like the current approach will not work for parallel execution. This makes me think we should handle the seeds on a target-by-target basis. We could wrap withr::with_seed() around the body of build() and use the storr cache to keep track of

The original seed at the start of make().
The ending seed of each target.

For the starting seed of each subsequent target, we could choose among the ending seeds of the dependencies using some deterministic rule that does not depend on execution order.

I think this is doable, but as you say, difficult. I am okay with keeping this PR around if you would like to work on it. Otherwise, we can create a separate issue for another time.

At the moment, my tentative plans for drake are these (plus any production-ready pull requests that come in the meantime)

4.1.0: mostly bugfixes for which I have patched issues locally, plus all the accepted PRs so far.
4.1.1: any straightforward, low-risk work on style and cleanup not in 4.1.0: linting, testing, Improve code style #40, lingering bugs, etc.
4.2.0: this issue, plus Get dependencies of knitr reports automatically. #9, Automatically generate a workflow plan data frame from arbitrary R code. #25, and maybe Additional modes of parallelism #42.
4.2.1: bugfixes from 4.2.0

AlexAxthelm · 2017-08-25T13:40:17Z

I'll probably try again on this in a week or two. My current line of thought also has me wanting to cache the seed as the first line of make(), but then I think I want to pass that (same value) to each of the parallel sessions.

A thought that I had was to store the .Random.seed at the beginning of make(), and then call that exact value from within each parallel session, which would mean we don't have to worry about tracking before/after conditions. Importantly, I would probably end up using a numeric hash of .Random.seed since the actual value of the seed is not as important as the fact that it can be set/reset/recalled (I'm ignoring the use case where somebody tries to get the same results without drake as they do with). Hashing the seed would let me salt the hash with the target name, ensuring that targets with identical commands would get different (reproducible) seeds, despite having the same "base seed".

Not entirely sure what the best way to implement this will be. I'll take a shot a passing through storr, but my suspicion is that it will end up with as an element of config.

wlandau-lilly · 2017-08-25T14:36:23Z

Sounds much cleaner and easier to implement. Do you think we will still keep the promise of pseudo-randomness if we reset the seed this way for each target? We probably do, my concern is mild.

I would just add the seed to either config or the cache's "session" namespace, either should be fine.

wlandau-lilly · 2017-09-12T14:30:52Z

@AlexAxthelm what are your plans for this issue? Did you close the PR to start fresh later?

AlexAxthelm · 2017-09-12T15:06:36Z

Deleted the branch by mistake while getting a little too happy with cleaning my git branches. I am planning on returning to this, probably after 4.2.0 is done.

wlandau-lilly · 2017-10-14T22:07:24Z

Completed/replaced by #109. @AlexAxthelm, sorry to close your PR. The ideas are all great, I simply implemented them.

AlexAxthelm · 2017-10-16T12:51:48Z

Thank you so much for wrapping this up! I"m glad that I was able to help.

Alex Axthelm added 4 commits August 9, 2017 09:47

Add some failing tests around reproducibility

3716c5a

I want to be able to get the same object when I build, so that I can get the same simulation results when necessary

Let the test run quietly

372306e

Ensure that the seed is preserved across runs

97206c8

Using `withr::with_preserve_seed` to ensure that random number gereation is the same across runs

add withr to imports

a986b8d

AlexAxthelm closed this Aug 9, 2017

AlexAxthelm reopened this Aug 9, 2017

use standard testing options in reproducible

101de07

wlandau-lilly modified the milestone: v4.1.0 CRAN release Aug 11, 2017

This was referenced Aug 11, 2017

58 test configs #59

Closed

Re-architect unit testing (automate different platforms, mostly) #58

Closed

AlexAxthelm mentioned this pull request Aug 24, 2017

Isolate each unit test in a self-contained file system #74

Closed

wlandau-lilly removed this from the v4.1.0 CRAN release milestone Aug 24, 2017

wlandau-lilly added the difficulty: advanced label Aug 24, 2017

AlexAxthelm closed this Sep 12, 2017

AlexAxthelm deleted the reproducible branch September 12, 2017 14:10

AlexAxthelm restored the reproducible branch September 12, 2017 15:06

AlexAxthelm reopened this Sep 12, 2017

wlandau-lilly assigned AlexAxthelm Oct 14, 2017

wlandau mentioned this pull request Oct 14, 2017

Implemented reproducible random numbers #109

Merged

wlandau-lilly closed this Oct 14, 2017

AlexAxthelm deleted the reproducible branch October 16, 2017 12:51

wlandau-lilly mentioned this pull request Jan 23, 2018

drake (R package) ropensci/software-review#156

Closed

15 tasks

wlandau mentioned this pull request Feb 1, 2018

Pseudo-randomness is not totally reproducible #218

Closed

bart1 mentioned this pull request Mar 10, 2018

Reproducibility with random numbers #313

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reproducible random numbers #56

Reproducible random numbers #56

AlexAxthelm commented Aug 9, 2017

codecov-io commented Aug 9, 2017 •

edited

Loading

wlandau-lilly commented Aug 9, 2017 •

edited

Loading

wlandau-lilly commented Aug 9, 2017

AlexAxthelm commented Aug 9, 2017

AlexAxthelm commented Aug 9, 2017

AlexAxthelm commented Aug 9, 2017

wlandau-lilly commented Aug 9, 2017 •

edited

Loading

AlexAxthelm commented Aug 9, 2017

wlandau-lilly commented Aug 10, 2017

AlexAxthelm commented Aug 24, 2017 •

edited

Loading

wlandau-lilly commented Aug 24, 2017 •

edited

Loading

AlexAxthelm commented Aug 25, 2017

wlandau-lilly commented Aug 25, 2017

wlandau-lilly commented Sep 12, 2017

AlexAxthelm commented Sep 12, 2017 •

edited

Loading

wlandau-lilly commented Oct 14, 2017

AlexAxthelm commented Oct 16, 2017

Reproducible random numbers #56

Reproducible random numbers #56

Conversation

AlexAxthelm commented Aug 9, 2017

codecov-io commented Aug 9, 2017 • edited Loading

Codecov Report

wlandau-lilly commented Aug 9, 2017 • edited Loading

wlandau-lilly commented Aug 9, 2017

AlexAxthelm commented Aug 9, 2017

AlexAxthelm commented Aug 9, 2017

AlexAxthelm commented Aug 9, 2017

wlandau-lilly commented Aug 9, 2017 • edited Loading

AlexAxthelm commented Aug 9, 2017

Win 10 (vanilla)

BashOnWindows

wlandau-lilly commented Aug 10, 2017

AlexAxthelm commented Aug 24, 2017 • edited Loading

wlandau-lilly commented Aug 24, 2017 • edited Loading

AlexAxthelm commented Aug 25, 2017

wlandau-lilly commented Aug 25, 2017

wlandau-lilly commented Sep 12, 2017

AlexAxthelm commented Sep 12, 2017 • edited Loading

wlandau-lilly commented Oct 14, 2017

AlexAxthelm commented Oct 16, 2017

codecov-io commented Aug 9, 2017 •

edited

Loading

wlandau-lilly commented Aug 9, 2017 •

edited

Loading

wlandau-lilly commented Aug 9, 2017 •

edited

Loading

AlexAxthelm commented Aug 24, 2017 •

edited

Loading

wlandau-lilly commented Aug 24, 2017 •

edited

Loading

AlexAxthelm commented Sep 12, 2017 •

edited

Loading