Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split Allocator trait #112

Open
zakarumych opened this issue Apr 5, 2023 · 15 comments
Open

Split Allocator trait #112

zakarumych opened this issue Apr 5, 2023 · 15 comments

Comments

@zakarumych
Copy link

Currently there's only Allocator trait that provides both allocations and deallocations.
And Box, Vec and other types has A: Allocator generic parameter.

However there are allocators with no-op deallocations and thus do not require collections and smart-pointers to keep any allocator state to deallocate.
It would reduce size and improve performance somewhat significantly if Box<T, A> would be able to use ZST A parameter if no state is required for deallocation and allocation is not needed.

I propose the following solution:

  • Split Allocator trait into two - Deallocator and Allocator.
    They can be defined as following.

    unsafe trait Deallocator {
      fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);
    }
    
    unsafe trait Allocator: Deallocator {
      /* all other methods */
    }
  • Define that deallocator deallocator: D created using <D as From<A>>::from(alloc) may deallocate memory allocated by alloc, any of its copy and equivalent allocators.

  • Leave only A: Deallocator bound on collections and smart-pointers and all their impl blocks where allocation is not performed.

  • Implement From<Box<T, A>> for Box<T, D> where D: From<A> and similar impls for other types with allocator type.
    This impl may conflict with others. The alternative is to add a method.

After this is done then allocators with possible no-op deallocation (like bumpalo::Bump or blink_alloc::BlinkAlloc) may define ZST deallocator type that does nothing on deallocation and only provides a lifetime to ensure that allocator is not reset.

On the bumpalo as example

struct Bumped<'a> {
  _marker: PhantomData<&'a Bump>,
}

unsafe impl<'a> Deallocator for Bumped<'a> {
  fn deallocate(&self, _ptr: NonNull<u8>, _layout: Layout) {}
}

impl<'a> From<&'a Bump> for Bumped<'a> {
  fn from(_bump: &'a Bump) -> Self {
    Bumped { _marker: PhantomData }
  }
}

// Usage

fn foo<'a>(bump: &'a Bump) {
  let box: Box<u32, &'a Bump> = Box::new_in(42, bump);
  let box: Box<u32, Bumped<'a>> = box.into();
  
  assert_eq!(size_of_val(&box), size_of::<usize>());
 
  // Following doesn't compile as cloning `Box` requires `A: Allocator`
  // let box2: Box<u32, _> = box.clone();
  // If such ability is required - do not downgrade allocator to deallocator.
}
@zakarumych
Copy link
Author

The main motivation here is performance.
Smart-pointers rarely require allocation after creation and containers are sometimes frozen.
Making them have smaller footprint would reduce register-pressure and improve performance of generated code.

Not only bumpalo-like allocators may benefit from this change.
Some stateful allocators may recover everything they need directly from pointer to memory block.
Or at least require less state to perform deallocation.

@Amanieu
Copy link
Member

Amanieu commented Apr 7, 2023

See #9 for a long discussion precisely this issue.

My position is that:

  • In the case of Bump this isn't actually needed since you can just use Box::leak to get a &'a mut which is tied to the lifetime of the allocator.
  • This doesn't help Vec, HashMap, etc which are much more common use cases for custom allocators.

In the end I don't think the use cases justify the additional API complexity.

@zakarumych
Copy link
Author

Box::<T>::leak will cause <T as Drop>::drop to be not called.
That's kinda bad for anything with useful Drop impl.

And there are allocators that need to perform actual deserialization but restore state from pointer.

I agree that proposed change is not that useful for Vec and HashMap.
When Vec is not longer need allocations it can be converted to Box<[T]> and again, with this change that box could be smaller.

I can't agree that custom allocators are used mostly with Vec and HashMap's but not Boxes. Bump-allocators - maybe

And note that bumpalo provides its own Box without allocator state and people will hesitate to move to std's Box without this change since if they pass those Boxes around they will see how performance degrades.

@Amanieu
Copy link
Member

Amanieu commented Apr 8, 2023

I still think this introduces a huge amount of API complexity. I'd recommend reading the full discussion in #9, it has examples like Box::clone which can only work if Box has a complete allocator. What then happens if you convert a Vec<T, A> to a Box<[T], A>? Is the resulting box clonable? Is a separate API needed to "downgrade" a Box<T, A> to a Box<T, D>? If an API is needed anyways, could we just keep the Allocator` trait the same and use an allocator that panics when trying to allocate?

@zakarumych
Copy link
Author

What then happens if you convert a Vec<T, A> to a Box<[T], A>? Is the resulting box clonable?

Box is clonable if A: Allocator.

just keep the Allocator trait the same and use an allocator that panics when trying to allocate?

Making always panicking Allocator will increase WTF factor a lot.
Collection types do not support conversion of allocator type without deconstruction. And some collection are not
deconstructible.

I don't think Deallocator trait would increase complexity. Unless you know that you need it - just use Allocator.
Authors of collection types may keep A: Allocator bound everywhere as they do right now.

Deallocator trait can be added without even changing Allocator trait like this:

unsafe trait Deallocator {
  unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);
}

impl<A> Deallocator for A
where
  A: Allocator,
{
  #[inline(always)]
  unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
    Allocator::deallocate(self, ptr, layout);
  }
}

Collection types may start using Deallocator where it makes sense.
And allocator types may start support downgrading Allocator to Deallocator.
Without breaking anyone's code

@morrisonlevi
Copy link

morrisonlevi commented Aug 23, 2023

I don't think leaking is acceptable for arenas like Bumpalo, but I'm not sure the premise of the motivation is correct:

The main motivation here is performance.
Smart-pointers rarely require allocation after creation and containers are sometimes frozen.
Making them have smaller footprint would reduce register-pressure and improve performance of generated code.

I think based on the definitions, if the allocator is zero-sized just like Global, then you don't get a wider Box. Is this not true? If it isn't true, that's... kind of crazy that the committee/wg thinks that's okay.

But, I do understand that for some allocators, you need state for the allocation but not for the deallocation, so for these types, splitting them would be ideal so that things which only need dealloc can be smaller. I think that's maybe what they were actually implying, just under specified.

@Ddystopia
Copy link

Hello, splitting allocator is useful for the following use case, while I'm not sure if in the form originally presented.

I want to use allocator api to Box values into static variables. If split allocator and deallocator, allocator will have pointer to the original static, but deallocator will not, because it is redundant - box will provide it on deallocate call

@zakarumych
Copy link
Author

zakarumych commented Aug 1, 2024

for some allocators, you need state for the allocation but not for the deallocation

Exactly. Where "some" is for example bumpalo's allocator. There's no-op deallocation that needs no state.
It's not a leak as memory is reclaimed on Bump::reset call.

For this reason people actually copy Box code as close as possible with no allocator state, only borrow lifetime, with Drop doing only drop of the value and no deallocation. And see significant performance gain in comparison to alloc::boxed::Box<T, &Bump>.

@Ddystopia
Copy link

Maybe instead of making allocator and deallocator traits allow, under specific conditions, allocate and deallocate using different allocators? In case of box it would mean, for example, impl From<Box<T, A1>> for Box<T, A2> where A1: From<A2>.

See #9 for a long discussion precisely this issue.

My position is that:

* In the case of `Bump` this isn't actually needed since you can just use `Box::leak` to get a `&'a mut` which is tied to the lifetime of the allocator.

* This doesn't help `Vec`, `HashMap`, etc which are much more common use cases for custom allocators.

In the end I don't think the use cases justify the additional API complexity.

I see the benefit there: you may construct initial allocator with initial pointer to memory, and then when Vec would want grow or shrink memory, it would pass previous pointer. That way allocator would not need state.

Allowing allocations and deallocations under specific (to be documented) circumstances is a lot less disruptive and "light" change, while covering OP use case for bumpalo and other too

@zakarumych
Copy link
Author

@Ddystopia

  • In the case of Bump this isn't actually needed since you can just use Box::leak to get a &'a mut which is tied to the lifetime of the allocator.

Box would drop the value and leaked &mut T won't. There are bump-allocators that may allocate, return&mut T and then drop the value on reset. But not popular ones.

Getting rid of state that allows allocations, but still having Allocator implemented will allow call to Box::clone, but it'll have to panic.

@CAD97
Copy link

CAD97 commented Aug 3, 2024

Note that there is independently some desire to have a kind of "&move T" type which owns the T but not the memory the T resides in. If this type exists, Box<T, NoopDealloc<'_>> would be just &'_ move T.

Although, unfortunately, while this works for the uniquely-owned Box, there are other container types which could enjoy access to batch reset backing storage, such as obviously Arc, but also any other collection that won't be doing any more reallocation.

As an interesting side note, with the storage model, the duplicated pointer in Box<T, &mut MaybeUninit<T>> isn't an issue, because instead of {ptr: *mut T, alloc: *mut T}, it would be {handle: (), store: *mut T}. Additionally, storage only capable of storing one object at a time is explicitly a supported use case of at least one revision of the storage model.

@Ddystopia
Copy link

Ddystopia commented Aug 4, 2024

@Ddystopia

  • In the case of Bump this isn't actually needed since you can just use Box::leak to get a &'a mut which is tied to the lifetime of the allocator.

Box would drop the value and leaked &mut T won't. There are bump-allocators that may allocate, return&mut T and then drop the value on reset. But not popular ones.

Getting rid of state that allows allocations, but still having Allocator implemented will allow call to Box::clone, but it'll have to panic.

It is up to implementor of the trait. My use case involves storing a single T inside the static, so any second allocation, regardless of new_in of close, should error.

For Vec all reallocs would also be accepted for stateless allocator, as "state" (previous pointer) would be passed by Vec to allocator.

GlobalAllocator is also stateless allocator.

@zakarumych
Copy link
Author

Wouldn't it be better to make Box::clone non-compilable in that case? Or at least is some of those cases?

@Amanieu
Copy link
Member

Amanieu commented Aug 4, 2024

In the case of bumpalo, you can get a &'a mut T directly from the allocator instead of a Box<T, A>. This suffices for almost all cases using bumpalo directly. However I still haven't seen any good use cases where you would want to perform this kind of Box-transmutation in allocator-generic code.

I personally feel that in practice, there aren't enough of these generic uses of Box to justify a significant complexity increase for the allocator trait.

@zakarumych
Copy link
Author

zakarumych commented Aug 4, 2024

You still insist that having Box<T, A> is the same as &mut T in case of bumpalo and similar allocators.

I insist that owning T and dropping it, and not owning T and never dropping it are two very different things.

But that's true that I don't have more use-cases. Only allocators with zero-state deallocators to save memory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants