Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove FromIterator impl for ~[T], and the ensuing fallout #13963

Closed

Conversation

lilyball
Copy link
Contributor

@lilyball lilyball commented May 6, 2014

With ~[T] no longer growable, the FromIterator impl for ~[T] doesn't make
much sense. Not only that, but nearly everywhere it is used is to convert from
a Vec<T> into a ~[T], for the sake of maintaining existing APIs. This turns
out to be a performance loss, as it means every API that returns ~[T], even a
supposedly non-copying one, is in fact doing extra allocations and memcpy's.
Even &[T].to_owned() is going through Vec<T> first.

Remove the FromIterator impl for ~[T], and adjust all the APIs that relied
on it to start using Vec<T> instead. This includes rewriting
&[T].to_owned() to be more efficient, among other performance wins.

Also add a new mechanism to go from Vec<T> -> ~[T], just in case anyone
truly needs that, using the new trait FromVec.

[breaking-change]

@lilyball
Copy link
Contributor Author

lilyball commented May 6, 2014

I marked up the first commit as [breaking-change]. There are other commits scattered in here that make API changes as fallout from the initial change. Should they be marked up as well? I'd also be happy to adjust the PR description to list all the API changes and include [breaking-change] but the original ML post suggested filtering out bors merge commits.

@alexcrichton
Copy link
Member

Let's please hold off on r+'ing this for a bit, I would like to make sure that we do it for the right reasons.

FromIterator was the only way to construct a variable sized ~[T] right now. It is slow, but that's because we don't have DST. The performance hit (in its current state) is not a reason to migrate.

If we decide to migrate towards Vec<T>, then this is more than welcome, but that decision has yet to be made (I want to talk about this at tomorrow's meeting).

@lilyball
Copy link
Contributor Author

lilyball commented May 6, 2014

@alexcrichton I agree that we should hold off on r+'ing immediately, although I don't want to wait too long as this is likely to bitrot fast. But I do think it's something we need to do.

FromIterator was the only way to construct a variable sized ~[T] right now.

Assuming you mean without cloning T. Correct. But ~[T] isn't growable any more, and yet that's precisely what FromIterator was: a method of growing a ~[T].

The performance hit (in its current state) is not a reason to migrate.

The problem is not just the performance hit on constructing ~[T]. The problem is this performance hit spread throughout disparate parts of the standard libraries. Even methods that claimed they converted into a ~[T] without copying were in fact copying memory (doing it twice, no less! Because the FromIterator impl had to copy into a Vec<T> and then copy from there into a ~[T]).

I want to talk about this at tomorrow's meeting

Please do. My impression was, with the movement of I/O toward Vec<T>, that it had tentatively been decided to go ahead and migrate APIs to Vec<T>, with the understanding that they could be migrated back if it was determined that ~[T] really is the correct type long-term (which I really don't think it is; converting from Vec<T> to ~[T] post-DST is not free, nor is converting back if the API client wants a Vec<T> after all).

My belief is that this is the right move, and in fact we should think about moving even more APIs away from ~[T] (although this PR only concerned itself with APIs that broke when the FromIterator impl was removed).

@nrc
Copy link
Member

nrc commented May 6, 2014

I'd be really happy not to land this before I land my current DST branch because I think they will interact badly. Kinda just selfishness though...

@lilyball
Copy link
Contributor Author

lilyball commented May 6, 2014

@nick29581 What, removing uses of ~[T]? That should let you delete code ;)

@alexcrichton
Copy link
Member

We talked about this at the meeting today, and the decision was to remove all current usage of ~[T] wherever possible. It sounded like @pcwalton was going to try to tackle that, so you'll likely want to coordinate with him.

@nrc
Copy link
Member

nrc commented May 6, 2014

We didn't actually discuss from_iterator - I don't understand why we would want to remove this, necessarily (I think I'm missing some context). Just because we preference Vec over ~[T], if someone chooses to use ~[T] why wouldn't they want a from_iterator method for it?

@alexcrichton
Copy link
Member

This kinda falls under the purview of "how do I construct Box<[T]>", but the same question can be asked of Rc and Gc, and I imagine that they should all have very similar solutions.

@nick29581, do you know if with DST we'll have to add a custom constructor for all these types?

@lilyball
Copy link
Contributor Author

lilyball commented May 6, 2014

@nick29581 FromIterator by definition requires a growable container. The fact that it was implemented on ~[T] was a huge hack; it actually went through Vec<T> first.

If you want a way to create ~[T], then write some way to go from Vec<T> to ~[T] directly. But that way is not FromIterator,

@nrc
Copy link
Member

nrc commented May 6, 2014

@alexcrichton converting from fixed length [] will be a coercion, so won't need a custom constructor.

Converting from a Vec or iterator is an open question. At the moment the only way to do it is to use FromIterator. So I'm opposed to removing it without some kind of replacement.

@lilyball
Copy link
Contributor Author

lilyball commented May 6, 2014

After discussing with @nick29581 in IRC, my plan is to add a trait

pub trait FromVec<T> {
    fn from_vec(v: Vec<T>) -> Self
}

impl<T> FromVec<T> for ~[T] {
    fn from_vec(v: Vec<T>) -> ~[T] {
        // ..
    }
}

This will be awkward to use (which I view as a bonus; nobody should actually be using this), but it will suffice to convert a Vec<T> into a ~[T] without Clone. This same trait can then be reused for smart pointers once DST lands (e.g. for Rc<[T]>).

@lilyball
Copy link
Contributor Author

lilyball commented May 7, 2014

After considering this some more, FromVec is overkill. What we actually want is

impl<T> Vec<T> {
    fn into_unsized_slice(self) -> [T] { ... }
}

Then clients will say box v.into_unsized_slice(), or box (Rc) v.into_unsized_slice() (or whatever the syntax for that is).

Naturally we can't implement that today, but we can use .into_owned_slice() for now, then once DST lands we deprecate it and add .into_unsized_slice().

@lilyball
Copy link
Contributor Author

lilyball commented May 7, 2014

Ugh nevermind. Even post-DST you can't return [T]. I didn't think this through. I guess we stick with FromVec (assuming we want some way to produce Rc<[T]> or other such smart pointers).

@lilyball
Copy link
Contributor Author

lilyball commented May 7, 2014

r? @alexcrichton @nick29581

@lilyball
Copy link
Contributor Author

lilyball commented May 7, 2014

@alexcrichton First commit message rewritten, and the one new test is fixed.

@@ -451,19 +451,6 @@ impl<E, S:Encoder<E>,T:Encodable<S, E>> Encodable<S, E> for ~[T] {
}
}

impl<E, D:Decoder<E>,T:Decodable<D, E>> Decodable<D, E> for ~[T] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any need to remove this? If it is just for tidiness, then I would prefer to leave it in until we can implement for [T]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It used FromIterator.

But more generally, decoding into a ~[T] only makes sense if users are expected to define data structures using ~[T]. But we've already decided that the appropriate data structure is Vec<T>. Trying to include support for ~[T] both suggests that it's an appropriate type to use, and it also inherently needs to go through Vec<T> anyway (as it requires a growable vector). If someone really, truly wants to use a ~[T], they should just manually implement Decodable on their own type and decode into a Vec<T>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you should be able to keep it and use FromVec, right? If not then FromVec hasn't really achieved its goals.

I don't think we should be so aggressively opinionated about the use of ~[T] - I don't think we should use it, but I don't think we need to make life unnecessarily difficult for others to use it.

Once we have DST, we will want to support [T] so that it can be used in data structures, that will naturally allow for Box<[T]>. In the meantime, I don't want to take away functionality and then return it, that is unnecessary churn and can cause pain for developers. I believe that compatibility with the standard libs will be enough to motivate people to use Vec rather than ~[T], we don't need to cripple ~[T] to make that happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we going to implement Decodable for arbitrary SomePtr<[T]>? My belief is that it will have to be implemented independently for each pointer type.

I do think we should be aggressively opinionated about the use of ~[T]. Basically, I think we need to approach these APIs with the question: if we didn't have ~[T] before today, would we have chosen to implement this? And I think the answer is no, we would not have implemented Decodable for ~[T], because we don't think people should be using it and therefore we should not expend the effort and maintenance burden to add this to the standard libraries.

FromVec I was willing to add because I recognized the validity of your point that we should have some way of producing a ~[T], even if we don't think people should be using it, so that at a basic level people have some way to create instances of this type.

All that said, since you disagree with me about this, I'm willing to hold back and restore Decodable for now. But I do think we should reconsider this later after we've finished migrating off of ~[T] completely.

@pnkfelix pnkfelix changed the title Remote FromIterator impl for ~[T], and the ensuing fallout Remove FromIterator impl for ~[T], and the ensuing fallout May 7, 2014
match realargs::take() {
realstd::option::Some(a) => Some(a),
realstd::option::Some(v) => Some(unsafe{ ::cast::transmute(v) }),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the transmute here? Is it due to different versions of args between test and not test cfgs? Could we get around that by using cfg(stage0)? If that's not the reason and its permanent, then I am scared because I think it relies on Vec and ~[T] having the same representation which will not be the case soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because of realstd. The #[test] versions of these methods (which you've flagged here) call into realstd::rt:args. That wasn't a problem when the type was ~[T] because that was a compiler-defined type, but with Vec, realargs::take() returns realstd::vec::Vec but this function needs to return vec::Vec.

Luckily, since the only difference between the types is that one is from the current crate and one is from realstd, we know for a fact that they have the same layout (and the same implementation), so we can just transmute them.

@nrc
Copy link
Member

nrc commented May 7, 2014

@alexcrichton, @pcwalton I'm mostly happy with this now and happy to r+ once the few comments are addressed. Did either of you want to look at this or have anything to add before we land it?

lilyball added 7 commits May 8, 2014 12:06
As part of the shift from ~[T] to Vec<T>, recently ~[T] was made
non-growable. However, the FromIterator implementation for ~[T] was left
intact (albeit implemented inefficiently), which basically provided a
loophole to grow a ~[T] despite its non-growable nature. This is a
problem, both for performance reasons and because it encourages APIs to
continue returning ~[T] when they should return Vec<T>. Removing
FromIterator forces these APIs to adopt the correct type.

Furthermore, during today's weekly meeting it was decided that we should
remove all instances of ~[T] from the standard libraries in favor of
Vec<T>. Removing the FromIterator impl makes sense to do as a result.

This commit only includes the removal of the FromIterator impl. The
subsequent commits involve handling all of the breakage that results,
including changing APIs to use Vec<T> instead of ~[T]. The precise API
changes are documented in the subsequent commit messages, but each
commit is not individually marked as a breaking change.

Finally, a new trait FromVec is introduced that provides a mechanism to
convert Vec<T> back into ~[T] if truly necessary. It is a bit awkward to
use by design, and is anticipated that it will be more useful in a
post-DST world to convert to an arbitrary Foo<[T]> smart pointer.

[breaking-change]
This used to create a Vec<T> and then call .move_iter().collect() to
convert to a ~[T]. We can't do that anymore, so construct the ~[T] in
place instead. This has the added benefit of avoiding an unnecessary
memory copy (from the Vec<T> to the ~[T]).
A few methods in slice that used to return ~[T] now return Vec<T>:

- VectorVector.concat/connect_vec() returns Vec<T>
- slice::unzip() returns (Vec<T>, Vec<U>)
- ImmutableCloneableVector.partitioned() returns (Vec<T>, Vec<T>)
- OwnedVector.partition() returns (Vec<T>, Vec<T>)
unzip() has nothing to do with slices, so it belongs in vec.
Change from_buf_raw() to return a Vec<T> instead of a ~[T]. As such, it
belongs in vec, in the newly-created vec::raw module.
- StrSlice.to_utf16() now returns a Vec<u8>.
- Other miscellaneous fallout in std::str.
lilyball added 18 commits May 8, 2014 12:06
API changes:

- OwnedAsciiCast returns Vec<Ascii> instead of ~[Ascii]
- OwnedAsciiCast is implemented on Vec<u8>
- AsciiStr.to_lower/upper() returns Vec<Ascii>
- IntoBytes::into_bytes() returns Vec<u8>
- float_to_str_bytes_common() returns (Vec<u8>, bool)
API changes:

- UnsafeArc::newN() returns Vec<UnsafeArc<T>>
os::env(), os::args(), and related functions now use Vec<T> instead of
~[T].
API Changes:

- get_host_addresses() returns IoResult<Vec<IpAddr>>
- Process.extra_io is Vec<Option<io::PipeStream>>
Adding two vectors now results in a Vec<T> instead of a ~[T].

Implement Add on Vec<T>.
API Changes:

- from_base64() returns Result<Vec<u8>, FromBase64Error>
- from_hex() returns Result<Vec<u8>, FromHexError>
- json::List is a Vec<Json>
- Decodable is no longer implemented on ~[T] (but Encodable still is)
- DecoderHelpers::read_to_vec() returns a Result<Vec<T>, E>
API Changes:

- GetAddrInfoRequest::run() returns Result<Vec<..>, ..>
- Process::spawn() returns Result(.., Vec<..>), ..>
API Changes:

- GetAddrInfoRequest::run() returns Result<Vec<..>, ..>
- Process::spawn() returns Result<(.., Vec<..>), ..>
os::args() no longer auto-borrows to &[~str].
Tweak the tutorial's section on vectors and strings, to slightly clarify
the difference between fixed-size vectors, vectors, and slices.
Add a new trait FromVec with one self-less method from_vec(). This is
kind of like FromIterator, but it consumes a Vec<T>. It's only
implemented for ~[T], but the idea is post-DST it can be implemented for
any Boxed<[T]>.
API Changes:

- &[T] and ~[T] no longer support the addition operator (+)
Bring back the Decodable impl for ~[T], this time using FromVec. It's
still not recommended that anyone use this, but at least it's available
if necessary.
There was no reason to remove them from slice. They're testing methods
defined in slice, so that's where they belong.

Leave vec with copies of the partition/partitioned tests because it has
its own implementation of those methods.
@lilyball
Copy link
Contributor Author

lilyball commented May 8, 2014

@nick29581 @alexcrichton @pcwalton It's rebased on top of latest master, and includes the requested changes.

The one additional breaking change was removing &[T]/~[T] addition, as that cannot be implemented to return Vec with Add being defined in libcore. It's still defined for Vec<A> + T:Vector<A> (although I'm of the opinion that even that should be removed, but it's not applicable to this PR).

This has passed a full make check locally.

bors added a commit that referenced this pull request May 9, 2014
…r=pcwalton

With `~[T]` no longer growable, the `FromIterator` impl for `~[T]` doesn't make
much sense. Not only that, but nearly everywhere it is used is to convert from
a `Vec<T>` into a `~[T]`, for the sake of maintaining existing APIs. This turns
out to be a performance loss, as it means every API that returns `~[T]`, even a
supposedly non-copying one, is in fact doing extra allocations and memcpy's.
Even `&[T].to_owned()` is going through `Vec<T>` first.

Remove the `FromIterator` impl for `~[T]`, and adjust all the APIs that relied
on it to start using `Vec<T>` instead. This includes rewriting
`&[T].to_owned()` to be more efficient, among other performance wins.

Also add a new mechanism to go from `Vec<T>` -> `~[T]`, just in case anyone
truly needs that, using the new trait `FromVec`.

[breaking-change]
@bors bors closed this May 9, 2014
@lilyball lilyball deleted the remove_owned_vec_from_iterator branch May 9, 2014 07:35
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 13, 2023
Unconditionally enable location links in inlay hints again

While the goto functionality still doesn't work in VSCode, the hover part actually does. Also the way this was gated before, one only had to update their config while r-a was running to have the links enabled automatically due to the check only living in the startup code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants