Separate ranges and split #9
CC @curiousleo @idontgetoutmuch and @cartazio |
I think we agreed so far that distinguishing splittable from non-splittable generators would be solved by supplying a really good default implementation for splitting any pure RNG. So far, a proof of concept is implemented in #12 |
Splittable RNGs need to be a subclass. There’s a different state
transformer monad instance for splitting vs. not splitting, and they have
different performance characteristics.
|
@cartazio You are speaking too abstractly. Could you provide some concrete examples? |
In a splitting monad instance, every monadic bind does the split operation.
There are valid reasons to do this for laziness and independent sampling.
One motivation in QuickCheck-like settings is that you can replace a monadic
step with return () and the rest of the computation will still give the
same results for a fixed choice of seed.
The cost is that bind isn’t free and can’t be optimized away.
I kind of think of the difference as: the splitting-bind monad is good for
macro system composition (assuming enough entropy, and that the bind split step
doesn’t dominate computation time), and the State-style monad is best
for inner loops, because those are going to be something like a rejection loop.
|
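A minimal sketch of the reproducibility property described in the comment above, under stated assumptions: `Toy`, `mix`, `nextW` and `splitToy` are hypothetical stand-ins written only for this illustration (they are not part of the `random` package and the toy split is not statistically vetted), and `GenM` / `GenS` are simplified, monomorphic versions of a splitting monad and a state-style monad.

```haskell
import Data.Bits (shiftR, xor)
import Data.Word (Word64)

-- Hypothetical toy generator, for illustration only.
newtype Toy = Toy Word64

-- A 64-bit mixing function (a bijection on Word64).
mix :: Word64 -> Word64
mix z0 = z2 `xor` (z2 `shiftR` 33)
  where
    z1 = (z0 `xor` (z0 `shiftR` 33)) * 0xff51afd7ed558ccd
    z2 = (z1 `xor` (z1 `shiftR` 33)) * 0xc4ceb9fe1a85ec53

-- Draw one word and advance the state.
nextW :: Toy -> (Word64, Toy)
nextW (Toy s) = (mix s', Toy s')
  where
    s' = s + 0x9e3779b97f4a7c15

-- Toy split: derive two new states from one (assumption: good enough for a demo).
splitToy :: Toy -> (Toy, Toy)
splitToy (Toy s) = (Toy (mix (s + 1)), Toy (mix (s + 2)))

-- Splitting monad: every bind splits the generator.
newtype GenM a = GenM { runGenM :: Toy -> a }

instance Functor GenM where
  fmap f (GenM h) = GenM (f . h)

instance Applicative GenM where
  pure = GenM . const
  GenM f <*> GenM h = GenM $ \g -> let (l, r) = splitToy g in f l (h r)

instance Monad GenM where
  GenM f >>= k = GenM $ \g -> let (l, r) = splitToy g in runGenM (k (f l)) r

-- State-style monad: the generator is threaded sequentially.
newtype GenS a = GenS { runGenS :: Toy -> (a, Toy) }

instance Functor GenS where
  fmap f (GenS h) = GenS $ \g -> let (a, g') = h g in (f a, g')

instance Applicative GenS where
  pure a = GenS $ \g -> (a, g)
  GenS f <*> GenS h = GenS $ \g -> let (fa, g1) = f g; (a, g2) = h g1 in (fa a, g2)

instance Monad GenS where
  GenS f >>= k = GenS $ \g -> let (a, g') = f g in runGenS (k a) g'

wordM :: GenM Word64
wordM = GenM (fst . nextW)

wordS :: GenS Word64
wordS = GenS nextW

-- Two pipelines that differ only in whether the first step draws a value.
withDrawM, withoutDrawM :: GenM Word64
withDrawM    = wordM   >> wordM
withoutDrawM = pure () >> wordM

withDrawS, withoutDrawS :: GenS Word64
withDrawS    = wordS   >> wordS
withoutDrawS = pure () >> wordS
```

In this sketch the second step of `withDrawM` and `withoutDrawM` sees the same right-hand half of the split, so both return the same word for a fixed seed. In `GenS` the second step sees whatever state the first step left behind, so replacing the first step changes the result.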
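@cartazio wrote:
In a splitting monad instance, every monadic bind does the split operation.
Sorry if I'm missing something obvious here. Can you point to the code in https://github.com/idontgetoutmuch/random/blob/interface-to-performance/System/Random.hs that calls split whenever a monadic bind takes place? |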
This is about how libraries such as QuickCheck use the random API. See the Gen
monad in QuickCheck:
https://hackage.haskell.org/package/QuickCheck-2.13.2/docs/Test-QuickCheck-Gen.html
|
@cartazio So you are talking about something like this, right?
splitState :: (MonadState g m, RandomGen g) => (g -> (r, g)) -> m r
splitState f = do
  g <- get
  case split g of
    (h1, h2) -> fst (f h2) <$ put h1
I don't see how anything like this can be useful for laziness, but I definitely see how it is useful in use cases like QuickCheck, for example. In fact, I remember wondering how QuickCheck handles reproducibility when some subset of tests is selected to run, so thank you for bringing it up.
Back to your original comment, "Splittable rngs need to be a subclass.": if we all agree on splitting the RandomGen class into two, then I totally agree that unsplittable should be the superclass of splittable, like it was implemented in this PR |
Look more closely: it’s in the definition of bind in the Gen monad
instance in the QuickCheck library. It’s not just a lifting into
MonadState.
|
@cartazio I had to implement another monad in order to see what you mean :)
So, here is a monad that is similar to the one in QuickCheck:
-- (needs the RankNTypes extension for the forall below)
newtype GenM a = GenM
  { runGenM :: forall g. RandomGen g => g -> a
  }
instance Functor GenM where
  fmap f (GenM h) = GenM (f . h)
instance Applicative GenM where
  pure a = GenM (const a)
  (<*>) (GenM f) (GenM h) =
    GenM $ \g ->
      case split g of
        (r1, r2) -> f r1 (h r2)
instance Monad GenM where
  return = pure
  (>>=) (GenM f) gh =
    GenM $ \h ->
      case split h of
        (r1, r2) ->
          case gh (f r1) of
            GenM f' -> f' r2
And here is the stricter one that doesn't rely on splitting:
-- (first comes from Data.Bifunctor)
newtype GenS a = GenS
  { runGenS :: forall g. RandomGen g => g -> (a, g)
  }
instance Functor GenS where
  fmap f (GenS h) = GenS (first f . h)
instance Applicative GenS where
  pure a = GenS ((,) a)
  (<*>) (GenS f) (GenS h) =
    GenS $ \g ->
      case f g of
        (fa, g') ->
          case h g' of
            (b, h') -> (fa b, h')
instance Monad GenS where
  return = pure
  (>>=) (GenS f) gh =
    GenS $ \g ->
      case f g of
        (a, g') ->
          case gh a of
            GenS f' -> f' g'
If we create two actions where one of the values can be very expensive to generate (namely ByteArray in the example):
fooM :: Int -> GenM (ByteArray, Word16)
fooM n = do
  ba <- GenM (fst . genByteArray n)
  w16 <- GenM (fst . genWord16)
  pure (ba, w16)
fooS :: Int -> GenS (ByteArray, Word16)
fooS n = do
  ba <- GenS (genByteArray n)
  w16 <- GenS genWord16
  pure (ba, w16)
Now it is pretty clear that fooS will have to compute ba before w16:
λ> (ba, w) = runGenM (fooM 100000000) (mkStdGen 217)
λ> w
35990
λ> (ba, w) = fst $ runGenS (fooS 100000000) (mkStdGen 217)
λ> w
Interrupted.
The reason why I was a bit confused is that the issue is not that GenS is strict in the value it returns, but that there is a sequential dependency on the generator when splitting is not used:
fooS' :: GenS (ByteArray, Word16)
fooS' = do
  ba <- GenS (\g -> (undefined, g))
  w16 <- GenS genWord16
  pure (ba, w16)
λ> (ba, w) = fst $ runGenS fooS' (mkStdGen 217)
λ> w
63926
That was a fun exercise :)
Side note - the GenS and GenM above can be made instances of MonadRandom |
Yup.
|
An important footnote is that the sequential / strict one has a trivial bind
that can be optimized away, but the splitting one does real computation on
every bind. That is why having a default split could potentially backfire,
depending on the end user.
But yes, an important application of the splitting monad is
constructing lazy data structures.
|
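A small sketch of the "lazy data structures" use case mentioned above, under stated assumptions: `Toy`, `mix`, `nextW` and `splitToy` are hypothetical helpers written only for this illustration (not part of the `random` package, and not a vetted split), and `Tree`, `randomTree` and `valueAt` are made-up names. The point is only that splitting lets each subtree own an independent generator, so an infinite structure can be defined up front and explored on demand.

```haskell
import Data.Bits (shiftR, xor)
import Data.Word (Word64)

-- Hypothetical toy generator, for illustration only.
newtype Toy = Toy Word64

-- A 64-bit mixing function (a bijection on Word64).
mix :: Word64 -> Word64
mix z0 = z2 `xor` (z2 `shiftR` 33)
  where
    z1 = (z0 `xor` (z0 `shiftR` 33)) * 0xff51afd7ed558ccd
    z2 = (z1 `xor` (z1 `shiftR` 33)) * 0xc4ceb9fe1a85ec53

-- Draw one word and advance the state.
nextW :: Toy -> (Word64, Toy)
nextW (Toy s) = (mix s', Toy s')
  where
    s' = s + 0x9e3779b97f4a7c15

-- Toy split (assumption: good enough for a demo).
splitToy :: Toy -> (Toy, Toy)
splitToy (Toy s) = (Toy (mix (s + 1)), Toy (mix (s + 2)))

-- An infinite binary tree of random labels.
data Tree = Node Word64 Tree Tree

-- Each node splits its generator: one part labels the node, the other
-- two seed the children. Nothing is computed until it is demanded.
randomTree :: Toy -> Tree
randomTree g = Node (fst (nextW here)) (randomTree l) (randomTree r)
  where
    (here, rest) = splitToy g
    (l, r)       = splitToy rest

-- Follow a path (False = left, True = right) and return the label found there.
-- Only the nodes along the path are ever evaluated.
valueAt :: [Bool] -> Tree -> Word64
valueAt [] (Node v _ _) = v
valueAt (goRight : path) (Node _ l r) = valueAt path (if goRight then r else l)
```

With a purely sequential generator, the label of a node deep in the right subtree would depend on how many draws the left subtree consumed, which is not well defined for an infinite structure. Here `valueAt [True, True, False] (randomTree (Toy 217))` touches only four nodes.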
So are we now proposing something like:
|
@lehins thank you so much for working through this in #9 (comment). I still had to draw this (on a literal napkin in this case) to really get it. In In |
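@idontgetoutmuch I propose that, if we do split the RandomGen class into two concepts, "splittable" and "not-splittable", we do it the way it was implemented in this PR.
In particular, it was number 3 from this comment (#7 (comment)):
class NoSplitGen g where
  ...
  genWord64 :: g -> (Word64, g)
class NoSplitGen g => RandomGen g where
  split :: g -> (g, g)
because it results in less breakage.
But, if we do decide not to split the RandomGen class, I would be fine with it as well, and I wouldn't oppose an implementation of a helper function (e.g. defaultSplit or basicSplit) that can be used as a split implementation, but only if it is really acceptable in the industry to use such a split function for common non-splittable generators. People seem to suggest it is possible to implement such a function; I just don't know, since it is not my area of expertise. |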
Splitting must be a child subclass.
|
Since this PR is closed, can we carry on the discussion on this issue: #7? |
@cartazio That is exactly what is in this example:
class NoSplitGen g where
  ...
  genWord64 :: g -> (Word64, g)
class NoSplitGen g => RandomGen g where
  split :: g -> (g, g)
Or am I missing something you are trying to say here? |
Misread
|
This PR implements ideas from #8 and #7