-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement unique_combinations. #363
base: master
Are you sure you want to change the base?
Conversation
/// vec![1, 2], // Note: these are the same | ||
/// vec![1, 2], // Note: these are the same | ||
/// vec![1, 2], | ||
/// vec![2, 2], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep these examples in the documentation of combinations
, to highlight the difference to unique_combinations
? (Or am I misreading the diff?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think this one got removed with the rebase, thow now might be the time to have them there. I am not sure on the matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might create a table with all types of combinations that shows the same iterator and different output for each of them.
@@ -575,6 +576,17 @@ fn combinations() { | |||
vec![3, 4], | |||
]); | |||
|
|||
let it = vec![1, 2, 2, 3, 4].into_iter().unique_combinations(3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we assert that unique_combinations
yields the "same" (modulo ordering) elements that would be obtained by manually filtering elements from combinations
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this will not always be true, as combinations explore all combinations before getting a new number for the iterator. We see all the numbers as fast as possible as we want to find the end. This is why the LasyBuffer
is used. to not need that many next calls from the iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if I understand you correctly. But shouldn't unique_combinations
give the same elements that you would get by manually removing duplicates from combinations
(possibly sorting the yielded permutations)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is true as long as the iterator that combinations is done over is sorted in the first place. Might be a good test with quicksort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some thinking is think we should make thise, however without garanteeing the order of the items that comes out in either cases. So sort bought on input and output. Will implement that this weekend
I did a rebase so diffs should now make more sence to the current codebase. There have happend a lot of changes to combinations since the initial release of this pr, so we should see if we can use some of the techniques applied to combinations. As of the LazyBuffer type, i think if this should be merged with unique combinations, this is to major of a rewrite and think it should be a separate pr if so. Thanks @phimuemue for reviewing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool that you could address some of the issues. Still, I think we would profit from bringing UniqueCombinations
more in line with Combinations
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the improvements.
As a side note: I see you improved readability considerably by changing some line breaks and adjusting some whitespace. I think we surely want these (independent of the combinations
). It might simplify things if we had these in a separate PR, though.
@@ -575,6 +576,17 @@ fn combinations() { | |||
vec![3, 4], | |||
]); | |||
|
|||
let it = vec![1, 2, 2, 3, 4].into_iter().unique_combinations(3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if I understand you correctly. But shouldn't unique_combinations
give the same elements that you would get by manually removing duplicates from combinations
(possibly sorting the yielded permutations)?
i have autoformat in my editorconfig on save. I can create an pr with formating for all the files. |
Some questions that needs to be answared: * Document differently? * Any optimalisations? * The algorithm only requires equal Items to be grouped beside each other. This is achived by sorting. However if we there is an effisient way to group equals then we can have this require `Eq` insted of `Ord`
position -> indices next_none -> done remove len (use indices.len() instead) removed a comment that did not make sence
renamed local variables to what they are representing
@phimuemue is there any other blockers than getting formating done on master for this PR (besides the unit testing)? |
As always, @jswrenn has the last word. But I still think that the implementation could/should be closer to Combinations::next. I tried to see if my intuition is wrong, but taking your implementation and transforming piece by piece leads to something very similar to Combinations::next (see my sketch meltinglava/itertools@master...phimuemue:uniq_comb2). (One thing I like about Combinations::next is that the early-outs are only for the |
#[test] | ||
fn combinations_and_unique_combinations_has_all_unique_values() { | ||
let mut rng = &mut rand::thread_rng(); | ||
let a = [1, 2, 3, 4, 5]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool to have a test for this. Just not sure if having non-determinism such a good idea there, especially if the random test is only run once. (I may be wrong on this one though, if we already have precedence for this kind of test.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My biggest problem here was that i did not find out how you made sure quickcheck made vectors that was garanteed to contain duplicates.
I agree to the theory here, thow technicly the The reason I am hessitent to do it that way, is that I do not feel like the code is better in terms of readability and logic. If I am not the only one that think this is true, then it is my opinion that I have found that it takes quite alot of time to figgure out what an algorithm does when its all based on one mutable number that deals with all the cases. Here are the pros and cons of each as i see it. Pros unique:
Cons:
Neither one:
Do anyone have anything to add? |
Thanks for your feedback.
Regarding the
we know that the loop initiated in line 73 (
You may have a valid point there: I may be biased (because I just compared your new code to the existing one), so another opinion on this could very well inform our choice there.
I agree: If
You're right: These algorithms are surprisingly tricky to formulate in simple terms. But isn't the essence as follows:
I think current |
Just thought how hard it would be doing this with rust iterators. It is qute descriptive if when comments are added. Here we have almost all of combinations, thow i could not get the .iter_mut to return mutable referances. Do also think what I did inside the for loop is wrong. thow this is just an example. // search for bumpable index
return self
.indices
.iter()
.rev()
.enumerate()
.find(|(offset, &index)| self.pool[index] != self.pool[pool_len - 1 - offset])
.map(|(offset, &index)| (offset, index))
.clone()
// when found bump that index and increase the following values
.map(|(offset, index)| {
for (o, &mut i) in self
.indices
.iter_mut()
.rev()
.take(offset + 1)
.rev()
.enumerate()
{
*i = index + offset + o;
}
}) What i like about this version is that its only the |
Some questions that needs to be answared:
other. This is achived by sorting. However if we there is an
effisient way to group equals then we can have this require
Eq
insted of
Ord