Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slice with size known at compile time should be coerced to array #1772

Closed
est31 opened this issue Oct 16, 2016 · 17 comments
Closed

Slice with size known at compile time should be coerced to array #1772

est31 opened this issue Oct 16, 2016 · 17 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@est31
Copy link
Member

est31 commented Oct 16, 2016

You should be able to write stuff like:

fn do_sum(a: [u8; 2]) -> u8 {
    a[0] + a[1]
}

fn main() {
    let arr = [1, 2, 3, 4, 5, 6, 7, 8];
    println!("The sum is {}", do_sum(arr[0..2]));
}

If subslice notation ([] with a range inside) is used on a slice or array with the borders known at compile time, it should be coerced to an array.

There is a stackoverflow thread with a list of workarounds, and someone has even done a crate for the problem, but it should be solved inside the language, as it doesn't really feel natural.

Also, if one end is unspecified, it should allow for values only known at runtime for the specified end, and infer the unspecified end from the type of the function. Example code:

fn do_sum(a: [u8; 2]) -> u8 {
    a[0] + a[1]
}

fn main() {
    let arr = [1, 2, 3, 4, 5, 6, 7, 8];
    for i in 0 .. 8 {
        println!("The sum is {}", do_sum(arr[i..]));
    }
}
@nrc nrc added the T-lang Relevant to the language team, which will review and decide on the RFC. label Oct 17, 2016
@oli-obk
Copy link
Contributor

oli-obk commented Oct 17, 2016

I've been thinking about a more general solution to this. &arr[0..2] should be &[T; 2]. If T is Copy then the semantics work like in your first suggestion. I'm not entirely sure if it's backwards compatible, but since &[T; N] coerces to &[T], it should work once generics allow coercions.

Also, if one end is unspecified, it should allow for values only known at runtime for the specified end, and infer the unspecified end from the type of the function.

That's a little surprising in my opinion.

@eddyb
Copy link
Member

eddyb commented Oct 17, 2016

For &arr[0..2] to be &[T; 2], <[T; n] as Index<const i..j>>::Output has to be [T; j - i].
To be able to actually express that in the types, you'd have a feature which allows for much more powerful things, such as tuple indexing (e.g. (A, B) implementing Index<0usize> and Index<1usize>).

I call it "value refinement", that is, you've refined a type to a single value (any pattern can technically be a refinement and you can even have full subtyping with "pattern-refined types" if you wanted to).

But it's one of those typesystem extensions that are "less mainstream" than, say, HKT, or even GADTs.
I'm curious if someone knows how to do something like that in Haskell, if it's at all possible.

@est31
Copy link
Member Author

est31 commented Oct 17, 2016

@eddyb will the typesystem extensions of #1657 be enough?

@eddyb
Copy link
Member

eddyb commented Oct 17, 2016

@est31 No, those are nowhere near allowing pair[0] and pair[1] to have different types (!!).

@burdges
Copy link

burdges commented Oct 18, 2016

I'm happy to learn about the arrayref crate because lacking this gets annoying.

@burdges
Copy link

burdges commented Oct 18, 2016

I think should go read about Index to better understand your comments here @eddyb .

@est31
Copy link
Member Author

est31 commented Oct 18, 2016

@burdges I've wondered about it as well and talked with @eddyb on IRC. In fact, not the Index trait is used here, but the Range struct. The problem is that the Range struct only has fields for the borders, whose value is only known at runtime. @eddyb is proposing that the issue is left until there is a generic way to work even with these values.

However, he confirmed that something like ConstRange would do the job as well be needed, where the borders are constant and fixed at compile time (as of #1657 merged). @eddyb didn't like that approach, due to the added complexity.

@burdges
Copy link

burdges commented Oct 18, 2016

I see. It's an issue of there being too many things like this, so they should be dealt with using something more general like #1657, as opposed to many little syntax hacks.

@VictorKoenders
Copy link

I'd love to have this feature, especially with firmware development where compile-time checks like this are crucial

@SoniEx2
Copy link

SoniEx2 commented Oct 13, 2018

So uh, impl<T> From<&[T]> for Option<&[T; N]>?

@VictorKoenders
Copy link

VictorKoenders commented Oct 14, 2018

It's more like a specialization for indexing a slice by RangeInclusive and Range where const N: usize = range.end - range.end;

&slice[3..5] returns &[_; 2] but &slice[3..] would still return a &[_]

@JAicewizard
Copy link

this would be really useful for reading known size ints from a byte slice/vector.

@burdges
Copy link

burdges commented Aug 10, 2019

We'd ideally want &mut a[i+2] to return a &mut [T; 2] too, but Range cannot express this.

I think refinement types sound pretty handy for many things, and we should explore them, but not sure this feature drives that well.

Imho, we should simply push for arrayref-like tooling to be considered idiomatic rust, including convenience methods like droundy/arrayref#10

I've noticed many projects avoid arrayref for silly reasons like it's internal usage of unsafe, or that it panics exactly like slicing does, so maybe an RFC adding arrayref and friends to core makes sense.

I think const generics change how arrayref should look, which gives good reason for inclusion in core. I think you're two fundamental methods become

impl<T> [T] {
    pub fn array<const N: usize, T>(&self) -> Option<&[T; N]> {
        if self.len() < N { return None; }
        Some(unsafe { &*(self.as_ptr() as *const [T; N]) })
    }
    pub fn head<const N: usize, T>(&self) -> Option<(&[T; N], &[T])> { ... }
    pub fn tail<const N: usize, T>(&self) -> Option<(&[T], &[T; N])> { ... }

    pub fn array_mut<const N: usize, T>(&mut self) -> Option<&mut [T; N]> {
        if self.len() < N { return None; }
        Some(unsafe { &*(self.as_mut_ptr() as *mut [T; N]) })
    }
    pub fn head_mut<const N: usize, T>(&mut self) -> Option<(&mut [T; N], &mut [T])> { ... }
    pub fn tail_mut<const N: usize, T>(&mut self) -> Option<(&mut [T], &mut [T; N])> { ... }
}

There are also mut reference to slices variants that come in pretty handy, like they partially replace the arrayref multi-slicing macros.

impl<'heap,T> &'heap [T] {
    pub fn take_prefix(&mut self, len: usize) -> Option<&'heap [T]> {
        if self.len() < len { return None; }
        let tmp: &'heap [T] = ::core::mem::replace(&mut *self, &[]);
        let (head, tmp) = tmp.split_at(len);
        *self = tmp;
        Some(head)
    }
    pub fn take_suffix(&mut self, len: usize) -> Option<&'heap [T]> { ... }
    pub fn take_head<const N: usize>(&mut self) -> Option<&[T; N]> { ... }
    pub fn take_tail<const N: usize>(&mut self) -> Option<&[T; N]> { ... }
}

impl<'heap,T> &'heap mut [T] {
    pub fn take_prefix_mut<const N: usize>(&mut self, len: usize) -> Option<&'heap mut [T]> {
        if self.len() < len { return None; }
        let tmp: &'heap mut [T] = ::std::mem::replace(&mut *heap, &mut []);
        let (head, tmp) = tmp.split_at_mut(len);
        *self = tmp;
        Some(head)
    }
    pub fn take_suffix_mut(&mut self, len: usize) -> Option<&'heap mut [T]> { ... }
    pub fn take_head_mut<const N: usize>(&mut self) -> Option<&'heap mut [T; N]> { ... }
    pub fn take_tail_mut<const N: usize>(&mut self) -> Option<&'heap mut [T; N]> { ... }
}

I've written these with Option because first returns an Option, but actually folks would likely prefer panics, so that users write out one explicit length check beforehand.

@burdges
Copy link

burdges commented Aug 10, 2019

In fact, there is a fairly simple approach by adding 4 or 5 new range types that express the constness.

pub struct RangeFixed<Idx, const LEN: Idx> { }
pub struct RangeArray<Idx, const LEN: Idx> {
    pub start: Idx,
}
pub struct RangeHead<Idx, const LEN: Idx> { }
pub struct RangeTail<Idx, const LEN: Idx> { }
// RangeHeadinclusive?

impl<const LEN: usize,T> SliceIndex<[T]> for RangeFixed<usize,LEN> {
    type Output = [T; LEN];
    fn get(self, slice: &[T]) -> Option<&Self::Output> {
        if self.len() != N { return None; }
        Some(unsafe { &*(self.as_ptr() as *const [T; N]) })
    }
    fn get_mut(self, slice: &mut [T]) -> Option<&mut Self::Output> {
        if self.len() != N { return None; }
        Some(unsafe { &*(self.as_mut_ptr() as *mut [T; N]) })
    }
    ...
}

impl<const LEN: usize,T> SliceIndex<[T]> for RangeArray<usize,LEN> {
    type Output = [T; LEN];
    fn get(self, slice: &[T]) -> Option<&Self::Output> {
        if self.len() != N { return None; }
        let slice = slice.split_at(self.start).1.split_at(LEN).0;
        Some(&slice[RangeFixed<usize,LEN>]) // RangeFixed<usize,LEN>.get(slice)
    }
    fn get_mut(self, slice: &mut [T]) -> Option<&mut Self::Output> {
        if self.len() != N { return None; }
        let slice = slice.split_at_mut(self.start).1.split_at_mut(LEN).0;
        Some(&mut slice[RangeFixed<usize,LEN>]) // RangeFixed<usize,LEN>.get_mut(slice)
    }
    ...
}
impl<const LEN: usize,T> SliceIndex<[T]> for RangeHead<usize,LEN> {
    type Output = [T; LEN];
    fn get(self, slice: &[T]) -> Option<&Self::Output> {
        if self.len() != N { return None; }
        let slice = slice.split_at(LEN).0;
        &slice[RangeFixed<usize,LEN>] // RangeFixed<usize,LEN>.get(slice)
    }
    fn get_mut(self, slice: &mut [T]) -> Option<&mut Self::Output> {
        if self.len() != N { return None; }
        let slice = slice.split_at_mut(LEN).0;
        &mut slice[RangeFixed<usize,LEN>] // RangeFixed<usize,LEN>.get_mut(slice)
    }
    ...
}
impl<const LEN: usize,T> SliceIndex<[T]> for RangeTail<usize,LEN> {
    type Output = [T; LEN];
    fn get(self, slice: &[T]) -> Option<&Self::Output> {
        if self.len() != N { return None; }
        let slice = slice.split_at(self.len() - LEN).1;
        &slice[RangeFixed<usize,LEN>] // RangeFixed<usize,LEN>.get(slice)
    }
    fn get_mut(self, slice: &mut [T]) -> Option<&mut Self::Output> {
        if self.len() != N { return None; }
        let slice = slice.split_at_mut(self.len() - LEN).1;
        &mut slice[RangeFixed<usize,LEN>] // RangeFixed<usize,LEN>.get_mut(slice)
    }
    ...
}

I believe this already addresses the current issue because &a[RangeHead<5>] work just fine. We might always represent these new const range types with some special syntax, not sure what though. We should investigate inferring the const vs non-const range type so that the existing range operators work. At first blush, inference sounds less problematic than adding fancy type system features.

We can abstract the split and take machinery as well.

pub trait SliceSplit<T> : SliceIndex<T> {
    fn split(self, slice: &T) -> Option<(&<Self as SliceIndex<T>>::Output,&T)>;
    fn split_mut(self, slice: &mut T) -> Option<(&mut <Self as SliceIndex<T>>::Output,&mut T)>;
    ...
}

impl<T> [T] {
    fn split<R: SliceSplit<[T]>>(&self, range: R)
     -> Option<(&<Self as SliceIndex<T>>::Output,&T)>;
    fn split_mut<R: SliceSplit<[T]>>(&mut self, range: R)
     -> Option<(&mut <Self as SliceIndex<T>>::Output,&mut T)>;
    ...
}

impl<'s,T> &'s [T] {
    /// Return a subslice of a slice while adjusting the remaining slice safely.
    pub fn take<R: SliceSplit<[T]>>(&mut self, range: R) -> Option<&'s <Self as SliceIndex<[T]>>::Output> {
        let tmp: &'s [T] = ::core::mem::replace(&mut *self, &[]);
        if let (taken, tmp) = range.split(tmp) {
            *self = tmp;
            Some(taken)      
        } else {
            *self = tmp;
            None
        }
    }
}

impl<'s,T> &'s mut [T] {
    /// Return a mutable subslice of a mutable slice while adjusting the remaining slice safely.
    pub fn take<R: SliceSplit<[T]>>(&mut self, range: R) -> Option<&'s mut <Self as SliceIndex<[T]>>::Output> {
        let tmp: &'s [T] = ::core::mem::replace(&mut *self, &[]);
        if let (taken, tmp) = range.split_mut(tmp) {
            *self = tmp;
            Some(taken)      
        } else {
            *self = tmp;
            None
        }
    }
}

We implement SliceSplit for the existing half ranges, etc., but not Range itself.

impl<const LEN: usize,T> SliceSplit<[T]> for RangeFull {
    fn split(self, slice: &[T]) -> Option<(&[T],&[T])> {
        Some((slice, &[]))
    }
    fn split_mut(self, slice: &mut [T]) -> Option<(&mut [T],&mut [T])> {
        Some((slice, &mut []))
    }
    ...
}
impl<const LEN: usize,T> SliceSplit<[T]> for RangeFrom<usize> {
    fn split(self, slice: &[T]) -> Option<(&[T],&[T])> {
        if slice.len() < self.start { return None; }
        Some(slice.split_at(self.start))
    }
    fn split_mut(self, slice: &mut [T]) -> Option<(&mut [T],&mut [T])> {
        if slice.len() < self.start { return None; }
        Some(slice.split_at(self.start))
    }
    ...
}
impl<const LEN: usize,T> SliceSplit<[T]> for RangeTo<usize> {
    fn split(self, slice: &[T]) -> Option<(&[T],&[T])> {
        if slice.len() < self.start { return None; }
        let (x,y) = slice.split_at(self.end);
        Some((y,x))
    }
    fn split_mut(self, slice: &mut [T]) -> Option<(&mut [T],&mut [t])> {
        if slice.len() < self.start { return None; }
        let (x,y) = slice.split_at_mut(self.end);
        Some((y,x))
    }
    ...
}
impl<const LEN: usize,T> SliceSplit<[T]> for RangeToInclusive<usize> {
    fn split(self, slice: &[T]) -> Option<(&[T],&[T])> {
        if slice.len() < self.start+1 { return None; }
        let (x,y) = slice.split_at(self.end+1);
        Some((y,x))
    }
    fn split_mut(self, slice: &mut [T]) -> Option<(&mut [T],&mut [t])> {
        if slice.len() < self.start+1 { return None; }
        let (x,y) = slice.split_at_mut(self.end+1);
        Some((y,x))
    }
    ...
}

We now implement SliceSplit for our new const half ranges too, but not ``

impl<const LEN: usize,T> SliceSplit<[T]> for RangeFixed<usize,LEN> {
    fn split(self, slice: &T) -> Option<(&<Self as SliceIndex<T>>::Output,&T)> {
        self.get(slice).map(|x| (x,&[]))
    }
    fn split_mut(self, slice: &mut T) -> Option<(&mut <Self as SliceIndex<T>>::Output,&mut T)> {
        self.get_mut(slice).map(|x| (x,&mut []))
    }
}
impl<const LEN: usize,T> SliceSplit<[T]> for RangeHead<usize,LEN> {
    fn split(self, slice: &T) -> Option<(&<Self as SliceIndex<T>>::Output,&T)> {
        let (x,y) = slice.split_at(LEN);
        Some((&x[self],y))  // self.get(x).map(|x| (x,y))
    }
    fn split_mut(self, slice: &mut T) -> Option<(&mut <Self as SliceIndex<T>>::Output,&mut T)> {
        let (x,y) = slice.split_at(LEN);
        Some((&mut x[self],y))  // self.get_mut(x).map(|x| (x,y))
    }
}
impl<const LEN: usize,T> SliceSplit<[T]> for RangeTail<usize,LEN> {
    fn split(self, slice: &T) -> Option<(&<Self as SliceIndex<T>>::Output,&T)> {
        let (x,y) = slice.split_at(LEN);
        Some((&mut y[self],x))  // self.get(y).map(|y| (y,x))
    }
    fn split_mut(self, slice: &mut T) -> Option<(&mut <Self as SliceIndex<T>>::Output,&mut T)> {
        let (x,y) = slice.split_at(LEN);
        Some((&mut y[self],x))  // self.get)_mut(y).map(|y| (y,x))
    }
}

@burdges
Copy link

burdges commented Aug 11, 2019

I'll clarify, there is currently no way for rustc to infer RangeFixed vs RangeFull or RangeHead RangeTo or RangeTail vs RangeFrom because rustc does not understand sealed traits and thus cannot identify the range type from SliceIndex::Output type. I do think such tricks resemble the magic already used around indexing much more than anything around refinement types., but..

It sounds hard to infer RangeArray<usize,5> { start: i } from i..i+5, and almost impossible to infer RangeTail<usize,5> { } from s.len()-5.., but inferring either RangeHead<usize,5> {} from ..5 or RangeTail<usize,5> { } from -5.. or RangeFixed<usize,5> {} from .. sound less hard.

In all cases however, there is a convoluted aspect to this inference that really does not feel like rust, so..

We should maybe just specify these array slicing types directly, either by writing them our, or with funky notation? i..@, @.., ..@, and @? Or use * or #? Or maybe some [?? ; N]?

@drdozer
Copy link

drdozer commented Nov 19, 2019

In all cases however, there is a convoluted aspect to this inference that really does not feel like rust,

It feels a bit like the issue is related to staging of computations. It's about how much knowledge can be gleaned at compile time, and from that, what type specialisations or materialisations can be made.

Perhaps if the [..] syntax family was guaranteed to resolve to some type defined by trait, it would be possible at const eval time to return an impl type thing that was a specialised trait that had as much static information as possible injected? So [a..b] would return a slice that was of type [_; b-a], and in the case where the thing being sliced had a statically known length, [a..] could first become [a..len] and then on to returning a slice of type [_, len - a].

I'm not a fan of adding extra syntax to bake in placeholders for statically and dynamically known limits. We don't need yet more magic characters cluttering up our source. Not knowing much about the compiler guts, this feels like a const eval + type refinement problem.

@est31
Copy link
Member Author

est31 commented Nov 20, 2020

There is TryInto now that converts slices to references of arrays, as well as owned arrays if the type impl's Copy: https://doc.rust-lang.org/1.47.0/std/primitive.array.html#impl-TryFrom%3C%26%27a%20%5BT%5D%3E

I'll close this, even though the precise feature I requested isn't implemented yet, because the core of the request of easy conversions is covered now. Thanks everyone.

@est31 est31 closed this as completed Nov 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

9 participants