-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Calculate capacity when collecting into Option and Result #52910
Conversation
r? @sfackler (rust_highfive has picked a reviewer for you, use r? to override) |
We don't know that the iterator doesn't contain a @rfcbot fcp merge |
Team member @sfackler has proposed to merge this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Even if there are any |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
@bors r+ |
📌 Commit 77aa031 has been approved by |
Isn't this creating an iterator that might return less elements than the lower bound of its |
Yep |
Yes it is a protocol violation, but since the iterator is private there is no "victim". |
The iterator is private but it can be passed to anything implementing |
That's true but it seems like the benefits outweigh the costs here, right? I don't really see there being FromIterator implementations that would explode if the lower bound estimate is wrong. |
How about |
@ollie27 I'm not sure about your example, but |
Even |
Is there any way around this? We wouldn't want to be forced to using push loops, which are both less idiomatic and less performant (though a lot less than using the existing implementation of |
@ollie27 What about the following alternative implementation?
It makes both your playground examples compile and is about as fast as a manual push loop:
I'm not too happy about that intermediate |
Calculate capacity when collecting into Option and Result I was browsing the [perf page](http://perf.rust-lang.org) to see the impact of my recent changes (e.g. rust-lang#52697) and I was surprised that some of the results were not as awesome as I expected. I dug some more and found an issue that is the probable culprit: [Collecting into a Result<Vec<_>> doesn't reserve the capacity in advance](rust-lang#48994). Collecting into `Option` or `Result` might result in an empty collection, but there is no reason why we shouldn't provide a non-zero lower bound when we know the `Iterator` we are collecting from doesn't contain any `None` or `Err`. We know this, because the `Adapter` iterator used in the `FromIterator` implementations for `Option` and `Result` registers if any `None` or `Err` are present in the `Iterator` in question; we can use this information and return a more accurate lower bound in case we know it won't be equal to zero. I [have benchmarked](https://gist.github.com/ljedrz/c2fcc19f6260976ae7a46ae47aa71fb5) collecting into `Option` and `Result` using the current implementation and one with the proposed changes; I have also benchmarked a push loop with a known capacity as a reference that should be slower than using `FromIterator` (i.e. `collect()`). The results are quite promising: ``` test bench_collect_to_option_new ... bench: 246 ns/iter (+/- 23) test bench_collect_to_option_old ... bench: 954 ns/iter (+/- 54) test bench_collect_to_result_new ... bench: 250 ns/iter (+/- 25) test bench_collect_to_result_old ... bench: 939 ns/iter (+/- 104) test bench_push_loop_to_option ... bench: 294 ns/iter (+/- 21) test bench_push_loop_to_result ... bench: 303 ns/iter (+/- 29) ``` Fixes rust-lang#48994.
Using an intermediate I'm not sure what can really be done here. Unfortunately, this may be a case where you just have to manually use I'll take this off the queue, hopefully I've made my point. @bors r- |
I rather like the just add it to Iterator idea from eddyb earlier, since I've found no evidence that the hint range is ever actually used, so maybe |
Ping from triage! Could someone summarize the status of this PR? |
The PR looks plausible to me, FWIW. I think it needs @ljedrz to rebase it to resolve the conflicts, then probably a try+perf to confirm it's effective? |
@scottmcm Rebased; please feel free to do a perf run if you feel this can work. |
☔ The latest upstream changes (presumably #57974) made this pull request unmergeable. Please resolve the merge conflicts. |
@ljedrz Sorry, can you rebase again? Seems like something else stepped on you again 🙁 |
@scottmcm no prob; rebased. |
Let's see if I have permissions to @bors try |
[WIP] Calculate capacity when collecting into Option and Result I was browsing the [perf page](http://perf.rust-lang.org) to see the impact of my recent changes (e.g. #52697) and I was surprised that some of the results were not as awesome as I expected. I dug some more and found an issue that is the probable culprit: [Collecting into a Result<Vec<_>> doesn't reserve the capacity in advance](#48994). Collecting into `Option` or `Result` might result in an empty collection, but there is no reason why we shouldn't provide a non-zero lower bound when we know the `Iterator` we are collecting from doesn't contain any `None` or `Err`. We know this, because the `Adapter` iterator used in the `FromIterator` implementations for `Option` and `Result` registers if any `None` or `Err` are present in the `Iterator` in question; we can use this information and return a more accurate lower bound in case we know it won't be equal to zero. I [have benchmarked](https://gist.github.com/ljedrz/c2fcc19f6260976ae7a46ae47aa71fb5) collecting into `Option` and `Result` using the current implementation and one with the proposed changes; I have also benchmarked a push loop with a known capacity as a reference that should be slower than using `FromIterator` (i.e. `collect()`). The results are quite promising: ``` test bench_collect_to_option_new ... bench: 246 ns/iter (+/- 23) test bench_collect_to_option_old ... bench: 954 ns/iter (+/- 54) test bench_collect_to_result_new ... bench: 250 ns/iter (+/- 25) test bench_collect_to_result_old ... bench: 939 ns/iter (+/- 104) test bench_push_loop_to_option ... bench: 294 ns/iter (+/- 21) test bench_push_loop_to_result ... bench: 303 ns/iter (+/- 29) ``` Fixes #48994.
☀️ Test successful - checks-travis |
@rust-timer build 8b32f79 |
Success: Queued 8b32f79 with parent 23d8d0c, comparison URL. |
Finished benchmarking try commit 8b32f79 |
Hmm, perf doesn't seem to have detected any difference. |
ping from triage @ljedrz you have conflicts to resolve. |
ping from triage @ljedrz any updates? |
@Dylan-DPC Actually I'm not sure how to proceed at this point - the specialization doesn't seem to be kicking in. I think we can close this for the time being and perhaps get back to it some other time. |
@ljedrz no issues. Closing this as per your comment. Thanks for taking the time to contribute :) |
I was browsing the perf page to see the impact of my recent changes (e.g. #52697) and I was surprised that some of the results were not as awesome as I expected. I dug some more and found an issue that is the probable culprit: Collecting into a Result<Vec<_>> doesn't reserve the capacity in advance.
Collecting into
Option
orResult
might result in an empty collection, but there is no reason why we shouldn't provide a non-zero lower bound when we know theIterator
we are collecting from doesn't contain anyNone
orErr
.We know this, because the
Adapter
iterator used in theFromIterator
implementations forOption
andResult
registers if anyNone
orErr
are present in theIterator
in question; we can use this information and return a more accurate lower bound in case we know it won't be equal to zero.I have benchmarked collecting into
Option
andResult
using the current implementation and one with the proposed changes; I have also benchmarked a push loop with a known capacity as a reference that should be slower than usingFromIterator
(i.e.collect()
). The results are quite promising:Fixes #48994.