Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add rand(::Pair) #28704

Closed
wants to merge 1 commit into from
Closed

add rand(::Pair) #28704

wants to merge 1 commit into from

Conversation

rfourquet
Copy link
Member

This is similar to #25278, which adds rand(::Tuple). Pair has the attributes of a collection, so there is no reason to exclude it from randable objects.

@rfourquet rfourquet added the domain:randomness Random number generation and the Random stdlib label Aug 16, 2018
@Keno
Copy link
Member

Keno commented Aug 16, 2018

I'm not sure I entirely like this. I think of pairs as having more semantics than just a two element collection of its elements. Frankly, I'm not sure what else it would mean in this context, but it seems like we may want to have it available for the future.

@rfourquet
Copy link
Member Author

rfourquet commented Aug 16, 2018

Thanks for raising this point. I for one really wanted to use this syntax in #24912, for combining distribution. For example, rand(Pair => (1:10, Normal())) would mean create a random Pair, whose first element is from 1:10 and second one is from the normal distribution. Similarly for e.g. rand(Complex => (1:10, Normal()). Unfortunately, Complex => (1:10, Normal()) and Pair => (1:10, Normal()) have the same type, hence the rand function would be type unstable for such inputs. I then proposed instead Combine(Pair, 1:10, Normal()) in that PR and in RandomExtensions.jl.
I currently don't see another way of using the x => y syntax in rand than the obvious one (this PR), but I'm fine to wait'n see.

@Keno
Copy link
Member

Keno commented Aug 16, 2018

I say wait and see. If you want this functionality you can always splat the pair into a tuple.

@mschauer
Copy link
Contributor

Turning the argument upside down, shouldn't a 2-Tuple be used in #24912 (which can be extended to tuples at some point), as there is no more "semantics than just a two element collection of its elements" to the use?

@rfourquet
Copy link
Member Author

@mschauer do you mean something like in #25278, which implements rand(t::Tuple) to pick up randomly an element from t ?

@mschauer
Copy link
Contributor

Ah, yes, perfect.

@rfourquet
Copy link
Member Author

rfourquet commented Jun 4, 2019

FWIW, I implemented the proposition from my comment above in RandomExtension.jl, e.g.

  • rand(Complex => Int) produces a random Complex{Int}
  • rand(Pair => (1:9, Char) produces a Pair{Int, Char} whose first member is in 1:9.

As said above, it's not well inferred, but it's not that bad thanks to inlining. It's just a surface API on top of a tightly inferred one.

I'm in favor of still not merging this PR, and wait and see.

@rfourquet rfourquet mentioned this pull request Jun 4, 2019
2 tasks
@tpapp
Copy link
Contributor

tpapp commented Jun 4, 2019

I don't see the advantage of this semantics over just constructing an element from randomly generated values (the alternative proposed above) or picking an element from any collection (but a Pair; the original proposal). Eg I would consider using rand(a => b) instead of rand((a, b)) bad style.

I like to think of rand calls as drawing from a distribution, and would need a compelling reason to introduce alternative call syntaxes for what is essentially the same distribution (eg uniform over a collection).

@rfourquet
Copy link
Member Author

I don't see the advantage of this semantics over just constructing an element from randomly generated values

I'm not sure I understand this... if you mean that e.g. rand(Complex => 1:9) is not better than Complex(rand(1:9), rand(1:9)), that's debatable, but when you think "collections", it's not easy to replace something like e.g. rand(Complex => 1:9, 10). But here Complex => 1:9 is just a way to specify how to construct a "scalar" value, it doesn't have to use Pair, it was just a possibility that I found expressive.

I like to think of rand calls as drawing from a distribution, and would need a compelling reason to introduce alternative call syntaxes for what is essentially the same distribution

The Pair API proposed in my previous comment implements something which is not available in Random, so it's not an "alternative"... or did I misunderstand?

@tpapp
Copy link
Contributor

tpapp commented Jun 5, 2019

Just to clarify:

  1. I would support neither and leave rand(::Pair) undefined in Base and Random,

  2. Yes, I think that Complex(rand(1:9), rand(1:9)) is better and rand(Complex => 1:9) is opaque and too clever. Also, I don't think this is a very common use case in general; for specific uses (eg unit testing) one can just define a function to generate values like this.

Also, from a broader perspective, we are essentially talking about elements of a DSL to describe distributions over Julia types. This is a pretty complex task and IMO should be first hammered out in a package, to see how people use it and clear up the design issues. So while I don't see the use case for the rand(Complex => 1:9) syntax, I think that experimenting with it in your package is the right choice. The only problem is type piracy, perhaps use a custom type would help? Of course then you lose the infix syntax.

@rfourquet
Copy link
Member Author

Ok, thanks for the clarifications!
So, to be clear myself, I'm not at all advocating this syntax here, merely mentioning the possibility, and I totally agree that it's best tested in a package first.

The only problem is type piracy, perhaps use a custom type would help? Of course then you lose the infix syntax.

Type piracy is not a problem here, as this is intended as an experimental feature, which may be included in Random. Also, there is already a custom type there, but the Pair API seems just nicer to use.

@StefanKarpinski
Copy link
Sponsor Member

What is the idea that rand(Complex => 1:9) would mean? Complex(rand(1:9), rand(1:9))? That seems too clever and complex by far. What would the general rule be? How is writing rand(Pair => (1:9, Char)) easier or clearer than writing rand(1:9) => rand(Char)? I'm completely missing the motivation for wanting this behavior...

@rfourquet
Copy link
Member Author

I'm completely missing the motivation for wanting this behavior...

It's for composability (and sometimes efficiency). This is the equivalent of calling the Distribution constructor in #24912. As this is explained elsewhere, and I'm not particularly pushing this specific API here (merely suggesting the possibility of it, as a point in favor of not merging this PR hastily), I'm reluctant to talk too much about the details here, but here is the idea.

When we want to generate a random "scalar" of type T, while providing some parameters to rand, the typical way is to create a special-purpose type encapsulating these parameters. E.g. in the Ditributions package there is Normal(mu, sigma).

For generating a string we have e.g. randstring("123", 10). To generate an array of String, you currently have to write a for-loop. The next step would be to create a type to encapsulate the parameters with e.g. MakeString("123", 10). To call

rand(MakeString("123", 10), 100)

would be equivalent to

[randstring("123", 10) for _ in 1:100]

(but more efficient in this case as it uses the Sampler machinery).

Similarly, a MakePair and a MakeComplex can be created. At that point, it's annoying to create a new type for each type we want to construct just to store parameters. So we create a Make type to rule them all, wich is invoked as e.g. Make(String, "123", 10), Make(Pair, 1:9, Int8), etc. The Pair API suggested above is just light sugar above Make, which I find more readable in particular when nested, e.g.

Make(Pair, Make(String, 4), Make(String, "123", 10))

vs

Pair => (String => 4, String => ("123", 10))

String => ("123", 10) is a sort of "order": create a String, with the ingredients "123" and 10. @mschauer suggested that reversing the order may read better (("123", 10) => String).

Note that this generic Make makes probably less sense for numeric types, for which there are loads of different distributions, so that calling the distribution by its name (e.g. Normal) is more convenient.
Also while I'm conviced of the advantage of the composability of the Make approach (which I didn't develop much here), I'm far from sold on the Pair API.

@StefanKarpinski
Copy link
Sponsor Member

as a point in favor of not merging this PR hastily

Can't argue with that! I was confused because this is your PR. Let's just not define rand(::Pair) 😃

@rfourquet
Copy link
Member Author

(Just to be sure this was clear, this PR and my previous post define two competing meaning for rand(::Pair), the second one with "more semantics than just a two element collection of its elements", and maybe indeed no defintion at all is better!)

@tpapp
Copy link
Contributor

tpapp commented Jun 7, 2019

(but more efficient in this case as it uses the Sampler machinery)

I think that's a reasonable goal, but I find the that just creating a Sampler and using that explicitly is not much of an extra hassle and makes the code very readable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:randomness Random number generation and the Random stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants