Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler] enable new rng #12139

Merged
merged 27 commits into from
Dec 2, 2022
Merged

[compiler] enable new rng #12139

merged 27 commits into from
Dec 2, 2022

Conversation

patrick-schultz
Copy link
Collaborator

@patrick-schultz patrick-schultz commented Aug 30, 2022

Stacked on #12331 and #12309.

Design doc here.

Comment on lines 139 to 115
The seed can also be set globally using :func:`.set_global_seed`. This sets the
seed globally for all subsequent Hail operations, and a pipeline will be
guaranteed to have the same results if the global seed is set right beforehand:

>>> hl.set_global_seed(0)
>>> hl.eval(hl.array([hl.rand_unif(0, 1), hl.rand_unif(0, 1)])) # doctest: +SKIP_OUTPUT_CHECK
[0.6830630912401323, 0.4035978197966855]
>>> hl.reset_global_randomness()
>>> hl.eval(hl.array([hl.rand_unif(0, 1), hl.rand_unif(0, 1)]))
[0.9828239225846387, 0.49094525115847415]

>>> hl.set_global_seed(0)
>>> hl.eval(hl.array([hl.rand_unif(0, 1), hl.rand_unif(0, 1)])) # doctest: +SKIP_OUTPUT_CHECK
[0.6830630912401323, 0.4035978197966855]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text above references set_global_seed which the code does not.

I think I'm a bit confused. The text above says that the global seed "is fixed for the entire session", but we appear to modify it here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks, meant to remove all mentions of set_global_seed

@danking
Copy link
Contributor

danking commented Nov 14, 2022

I resolved conflicts. There were two issues in ir.py and one in aggregators.py. The former were resolved by: replace if with assert, add a new field about needing randomness. The latter is visible in the diff and switches from frozenset to set.

@danking
Copy link
Contributor

danking commented Nov 16, 2022

Bumping this PR, I'd like it to land so I can nail down exactly why my flags PR is causing BN tests to fail.

Copy link
Contributor

@tpoterba tpoterba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny comments. Python IR and Scala RNG changes are super clear, great work.

hail/python/hail/context.py Outdated Show resolved Hide resolved
hail/python/hail/ir/ir.py Show resolved Hide resolved
static_rng_uid = Env.next_static_rng_uid()
else:
if Env._hc is None or not Env._hc._user_specified_rng_nonce:
warning('To ensure reproducible randomness across Hail sessions, '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning calls through to initialize, I think -- is this intentional? Fine if yes, just good to be aware.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize, but I think that's fine. The warning is saying you should really initialize manually before doing anything else, not to worry about what functions do or don't implicitly initialize.

@@ -383,7 +381,7 @@ def _spectral_moments(A, num_moments, p=None, moment_samples=500, block_size=128
# TODO: When moment_samples > n, we should just do a TSQR on A, and compute
# the spectrum of R.
assert moment_samples < n, '_spectral_moments: moment_samples must be smaller than num cols of A'
G = hl.nd.zeros((n, moment_samples)).map(lambda n: hl.if_else(hl.rand_bool(0.5), -1, 1))
G = hl.rand_unif(-1, 1, size=(n, moment_samples)).map(lambda x: hl.sign(x))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good change.

Env._seed_generator = hail.utils.HailSeedGenerator(seed)
def next_static_rng_uid():
result = Env._static_rng_uid
assert(result <= 0xFFFF_FFFF_FFFF_FFFF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should keep Konrad contained for at least a few years

# Conflicts:
#	hail/python/hail/ir/__init__.py
#	hail/python/hail/ir/ir.py
#	hail/python/test/hail/table/test_table.py
#	hail/python/test/hail/utils/test_google_fs_utils.py
@patrick-schultz
Copy link
Collaborator Author

This is still stacked on #12309 as well

@danking
Copy link
Contributor

danking commented Nov 30, 2022

This seems to be triggering hail initialization somewhere it not to?

@patrick-schultz
Copy link
Collaborator Author

Ah, it was just a new test file that used setUpModule = startTestHailContext. Btw, after this merges, no need to do that anymore. Now pytest fixtures always ensure there's a hail context.

@patrick-schultz
Copy link
Collaborator Author

@tpoterba Tests are passing

@tpoterba
Copy link
Contributor

tpoterba commented Dec 2, 2022

Ship it!

@danking
Copy link
Contributor

danking commented Dec 2, 2022

Thank you both!

@danking danking merged commit 567acfd into hail-is:main Dec 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants