Generic split function #12

curiousleo
Collaborator

This makes concrete the idea discussed in #7 (comment).

@curiousleo force-pushed the generic-split branch 3 times, most recently from 979cb64 to c20ebe5 (March 2, 2020 14:00)
@lehins mentioned this pull request Mar 3, 2020
, bytestring
, primitive
, time
, mtl
, mwc-random
, SHA
Collaborator

random is a very popular package, so we should be very careful when adding an extra dependency.

I am not sure that the addition of binary here is justified, even though I know that it is a core package.

I am also not sure about the SHA dependency here: I quickly benchmarked it against cryptohash-sha256, and SHA was 8 times slower.

Copy link

There's also the problem of GHCJS. I think random is a package that must work on GHCJS. cryptohash-sha256 uses C, so it wouldn't. It is of course possible to use SHA on GHCJS and cryptohash-sha256 otherwise.

Collaborator

Good point. Thanks. I agree that falling back to SHA on GHCJS is a good approach.
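
The fallback agreed on above could be expressed as a conditional dependency in the package description. This is only a sketch and is not taken from the PR: it assumes the choice is made with Cabal's impl(ghcjs) condition inside the library stanza, and it omits version bounds.

```cabal
  -- Sketch only: pick the SHA-256 provider per compiler.
  if impl(ghcjs)
    -- pure Haskell, so it also builds under GHCJS
    build-depends: SHA
  else
    -- C-backed and faster on native GHC
    build-depends: cryptohash-sha256
```

The two packages expose different modules, so the Haskell source would also need a matching conditional import.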

Collaborator Author

I'm going to hold off on implementing these suggestions for now. The purpose of this PR was to show the interface changes required (just initialize on RandomGen, as it turns out) and to show one possible implementation.

I created #18 to discuss seeding more generally; a very different design may emerge from it (for example without split being a method on RandomGen) and we may not actually want to use SHA at all.

Collaborator

That sounds like a good plan. I'll keep this conversation unresolved, just in case.

Owner

It was decided to address the distinction between splittable and non-splittable generators by supplying a really good default implementation for splitting any pure RNG. So far, a proof of concept is implemented in #12.

Poor splitting is one of the problems in the current implementation. Where does the proposal for SHA-1 of 8 random bytes come from?

Collaborator Author

From here: #7 (comment) - I couldn't find an existing Haskell implementation of SpookyHash and thought that SHA-256 is surely at least as good as a mixing function for a proof-of-concept PR.

@idontgetoutmuch
Owner

OK, so the proposal is to have a default split, but the implementor of a given RNG would have to explicitly use it (it's not a default in the class definition)?

There would need to be some good text in the documentation describing the default and we would encourage (write PRs for?) maintainers of e.g. splitmix and tf-random to give details on their splitting approach?

@curiousleo
Collaborator Author

curiousleo commented Mar 4, 2020

OK, so the proposal is to have a default split, but the implementor of a given RNG would have to explicitly use it (it's not a default in the class definition)?

Yes, that is the compromise we worked out in #7 (comment) and #7 (comment). Although as I understood @lehins, he wanted to avoid making defaultSplit the actual default implementation of split because of CPRNGs. But in #18 (comment) (with thumbs-up from @lehins) we decided to explicitly make random about fast PRNGs, and make it clear that it is not suited for security-relevant randomness generation. Perhaps in light of that, we should reconsider making defaultSplit the actual default implementation for split in this proposal.
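
To make the two options concrete, here is a minimal sketch of the compromise, in which defaultSplit is an ordinary function that an instance author opts into explicitly. Everything in it is an assumption for illustration and not the code from this PR: the class is cut down to three methods, initialize takes a single Word64 seed, Lcg is a toy generator, and a 64-bit finalizer (mix64, the one from MurmurHash3) stands in for SHA-256 to keep the example free of dependencies.

```haskell
import Data.Bits (shiftR, xor)
import Data.Word (Word64)

-- Hypothetical, cut-down class: 'split' has NO default implementation.
class RandomGen g where
  genWord64  :: g -> (Word64, g)
  initialize :: Word64 -> g      -- assumed seeding interface
  split      :: g -> (g, g)

-- Stand-in mixing function (MurmurHash3 64-bit finalizer), used here
-- instead of SHA-256.
mix64 :: Word64 -> Word64
mix64 z0 = z2 `xor` (z2 `shiftR` 33)
  where
    z1 = (z0 `xor` (z0 `shiftR` 33)) * 0xff51afd7ed558ccd
    z2 = (z1 `xor` (z1 `shiftR` 33)) * 0xc4ceb9fe1a85ec53

-- Generic split: draw one word, keep the advanced generator as the left
-- half, and seed the right half from the mixed word.
defaultSplit :: RandomGen g => g -> (g, g)
defaultSplit g0 = (g1, initialize (mix64 w))
  where
    (w, g1) = genWord64 g0

-- Toy generator (a 64-bit LCG), purely for illustration.
newtype Lcg = Lcg Word64 deriving (Eq, Show)

instance RandomGen Lcg where
  genWord64 (Lcg s) = (s', Lcg s')
    where s' = s * 6364136223846793005 + 1442695040888963407
  initialize = Lcg
  split = defaultSplit   -- the author opts in explicitly
```

Turning this into a real default would only mean moving split = defaultSplit into the class declaration; the sketch keeps it in the instance so that the choice stays visible to the author of the generator.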

@idontgetoutmuch
Owner

we decided to explicitly make random about fast PRNGs, and make it clear that it is not suited for security-relevant randomness generation

I agree

@lehins
Collaborator

lehins commented Mar 4, 2020

Regardless of this, I would still vote against having a default split implementation. This behavior directly affects the RNG, so selecting a splitting function should be a conscious decision on the author's part. A CPRNG is just an extreme case of that.

Perhaps in light of that, we should reconsider making defaultSplit the actual default implementation for split in this proposal.

@idontgetoutmuch
Owner

Regardless of this, I would still vote against having a default split implementation.

I also vote against as there is a risk that splitting will not behave as expected. This is the current situation.

@curiousleo
Collaborator Author

curiousleo commented Mar 10, 2020

This PR was based on the idea that split would be used a few times at the beginning of a randomness-consuming program to set up various PRNGs. The combination of Haskell being functional and lazy seems to lead to a different usage pattern than in other languages: split is used all the time, e.g. in QuickCheck (see #9 (comment)), so I think that the proposal in this PR just won't cut it. Closing.
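
The usage pattern described here can be illustrated with a simplified generator monad, loosely modelled on the QuickCheck style in which every bind splits the generator. This is a sketch under assumptions and not QuickCheck's actual code: Splittable, Gen, Path, here and three are hypothetical names, and Path is a toy generator that only records which splits led to it.

```haskell
-- Hypothetical minimal interface: all we need is 'split'.
class Splittable g where
  split :: g -> (g, g)

-- A generator of values is a function from a random generator.
newtype Gen g a = Gen { runGen :: g -> a }

instance Functor (Gen g) where
  fmap f (Gen h) = Gen (f . h)

instance Splittable g => Applicative (Gen g) where
  pure x = Gen (const x)
  Gen f <*> Gen x = Gen (\g -> let (g1, g2) = split g in f g1 (x g2))

instance Splittable g => Monad (Gen g) where
  -- Every bind splits: one half feeds the first action, the other half
  -- feeds the continuation.
  Gen m >>= k = Gen (\g -> let (g1, g2) = split g in runGen (k (m g1)) g2)

-- Toy generator: the list of left (False) / right (True) turns taken.
newtype Path = Path [Bool] deriving (Eq, Show)

instance Splittable Path where
  split (Path p) = (Path (p ++ [False]), Path (p ++ [True]))

-- Report the path of the generator this action was run with.
here :: Gen Path [Bool]
here = Gen (\(Path p) -> p)

-- Three draws cost three splits, one per bind.
three :: Gen Path ([Bool], [Bool], [Bool])
three = do
  a <- here
  b <- here
  c <- here
  pure (a, b, c)
```

Evaluating runGen three (Path []) gives ([False],[True,False],[True,True,False]): each value comes from a generator one split deeper, so an expensive hash-based split would be paid on every bind rather than a few times at start-up.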

@curiousleo curiousleo closed this Mar 10, 2020

4 participants