meta: Expose Bytes vtable #437
I've been researching a bunch of relevant tickets and solutions for the last couple of days. I'm going to attempt to summarize all of the issues/PRs so that we can discuss them in one place and hopefully move forward with a solution.
Introduction
Bytes has become the de facto ref-counted buffer for asynchronous services and service frameworks. Interop with most high-level frameworks in the Tokio ecosystem requires using Bytes. However, Bytes features a limited set of allocation modes, so it is incompatible with many high-performance memory management systems. Recent interest in Why should I use Bytes over an
One might go so far as to say that Bytes is actually two orthogonal sets of functionality under one struct
Level Setting
There have been repeated requests for 2 umbrellas of operations:
Current VTable Implementation
The VTable implementation currently looks like:
struct Vtable {
    /// fn(data, ptr, len) to increment a refcount (which may require restructuring)
    pub clone: unsafe fn(&AtomicPtr<()>, *const u8, usize) -> Bytes,
    /// fn(data, ptr, len) to convert a Bytes to a Vec<u8>
    pub to_vec: unsafe fn(&AtomicPtr<()>, *const u8, usize) -> Vec<u8>,
    /// fn(data, ptr, len) to decrement or deallocate a Bytes instance
    pub drop: unsafe fn(&mut AtomicPtr<()>, *const u8, usize),
}
New Functionality Required
As I will explain below, exposing the vtable itself is not quite sufficient to meet the new requirements because it has no way to:
As we all know, exposing a plain struct as a public interface is generally considered a bad idea: if we ever need to add, remove, or alter fields in the vtable, it will break the API. In addition, there is some buffer lifecycle management code that might be better stored in the VTable (see BytesMut::resize and Bytes::truncate). For this reason, we need some added functionality that is specific to each underlying buffer type (either in the vtable or somewhere nearby).
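To make the role of the three hooks concrete, here is a minimal, std-only sketch of a handle that dispatches clone, to_vec, and drop through a table of function pointers. The names RawVtable, Handle, and STATIC_VTABLE are hypothetical and chosen for illustration; this is not the crate's actual internal code, and only the trivial "static slice" backing is shown.

```rust
use std::ptr;
use std::slice;
use std::sync::atomic::AtomicPtr;

/// Hypothetical, simplified vtable: one function pointer per lifecycle hook.
struct RawVtable {
    clone: unsafe fn(&AtomicPtr<()>, *const u8, usize) -> Handle,
    to_vec: unsafe fn(&AtomicPtr<()>, *const u8, usize) -> Vec<u8>,
    drop: unsafe fn(&mut AtomicPtr<()>, *const u8, usize),
}

/// Hypothetical stand-in for a Bytes-like handle.
pub struct Handle {
    ptr: *const u8,
    len: usize,
    data: AtomicPtr<()>,
    vtable: &'static RawVtable,
}

// Hooks for the static-slice backing: nothing to count, nothing to free.
unsafe fn static_clone(_data: &AtomicPtr<()>, ptr: *const u8, len: usize) -> Handle {
    Handle { ptr, len, data: AtomicPtr::new(ptr::null_mut()), vtable: &STATIC_VTABLE }
}

unsafe fn static_to_vec(_data: &AtomicPtr<()>, ptr: *const u8, len: usize) -> Vec<u8> {
    // The caller guarantees ptr/len describe a live slice.
    unsafe { slice::from_raw_parts(ptr, len) }.to_vec()
}

unsafe fn static_drop(_data: &mut AtomicPtr<()>, _ptr: *const u8, _len: usize) {
    // A &'static [u8] is never deallocated.
}

static STATIC_VTABLE: RawVtable = RawVtable {
    clone: static_clone,
    to_vec: static_to_vec,
    drop: static_drop,
};

impl Handle {
    pub fn from_static(bytes: &'static [u8]) -> Handle {
        Handle {
            ptr: bytes.as_ptr(),
            len: bytes.len(),
            data: AtomicPtr::new(ptr::null_mut()),
            vtable: &STATIC_VTABLE,
        }
    }

    pub fn as_slice(&self) -> &[u8] {
        // Safe here because ptr/len always come from a &'static [u8].
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }

    pub fn to_vec(&self) -> Vec<u8> {
        unsafe { (self.vtable.to_vec)(&self.data, self.ptr, self.len) }
    }
}

impl Clone for Handle {
    fn clone(&self) -> Handle {
        unsafe { (self.vtable.clone)(&self.data, self.ptr, self.len) }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        unsafe { (self.vtable.drop)(&mut self.data, self.ptr, self.len) }
    }
}
```

Because every operation goes through the table, a different backing (for example a ref-counted allocation) would only need its own three functions and its own static table. The flip side is the one raised above: adding a fourth hook changes the struct, which is why making it public as-is is risky.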
Proposed Designs
There are 2 lobes to this:
Item 2 should be rather trivial, and would likely be subject to bike-shedding, so this post will only discuss item 1.
1. VTableBuilder Proposal
This was proposed by @Matthias247 in #287 (comment). That comment mentions explicitly constructing a VTable object, which we may not want. We could alternatively use the VTable builder as an alternate Bytes constructor, so we don't have to make the VTable struct public.
let mybytes = Bytes::with_vtable()
    .clone_fn(my_clone_fn)
    .resize_fn(my_resize_fn)
    .drop_fn(my_drop_fn)
    .from_bytes_fn(MyBufferType::from_bytes_parts)
    .build(my_buffer_ptr); // my_buffer_ptr: *const MyBufferType
The idea would be to offer an API that allows the user to supply
Pros
Cons
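As a rough illustration of the builder idea in proposal 1, the sketch below keeps the hook table private and lets callers override individual hooks, with defaults for the rest. It is heavily simplified: the hooks are safe functions over a Vec<u8> rather than unsafe functions over raw pointers, and HooksBuilder, HookedBuffer, and pad_with_ff are hypothetical names, not part of any proposed API.

```rust
/// Hypothetical hook signatures, simplified to safe functions over a Vec.
type CloneFn = fn(&[u8]) -> Vec<u8>;
type ResizeFn = fn(&mut Vec<u8>, usize);

fn default_clone(bytes: &[u8]) -> Vec<u8> {
    bytes.to_vec()
}

fn default_resize(buf: &mut Vec<u8>, new_len: usize) {
    buf.resize(new_len, 0);
}

/// Example of a user-supplied hook: grow with 0xFF instead of zero.
pub fn pad_with_ff(buf: &mut Vec<u8>, new_len: usize) {
    buf.resize(new_len, 0xFF);
}

/// Fields are private, so hooks can be added later without breaking callers.
#[derive(Clone, Copy)]
pub struct HooksBuilder {
    clone_fn: CloneFn,
    resize_fn: ResizeFn,
}

impl HooksBuilder {
    pub fn new() -> Self {
        HooksBuilder { clone_fn: default_clone, resize_fn: default_resize }
    }

    pub fn clone_fn(mut self, f: CloneFn) -> Self {
        self.clone_fn = f;
        self
    }

    pub fn resize_fn(mut self, f: ResizeFn) -> Self {
        self.resize_fn = f;
        self
    }

    /// Consumes the builder and attaches the hooks to a concrete buffer.
    pub fn build(self, data: Vec<u8>) -> HookedBuffer {
        HookedBuffer { data, hooks: self }
    }
}

pub struct HookedBuffer {
    data: Vec<u8>,
    hooks: HooksBuilder,
}

impl HookedBuffer {
    pub fn as_slice(&self) -> &[u8] {
        &self.data
    }

    pub fn resize(&mut self, new_len: usize) {
        (self.hooks.resize_fn)(&mut self.data, new_len);
    }

    /// Copies the contents through the configured clone hook.
    pub fn duplicate(&self) -> HookedBuffer {
        HookedBuffer { data: (self.hooks.clone_fn)(&self.data), hooks: self.hooks }
    }
}
```

A caller would write something like `HooksBuilder::new().resize_fn(pad_with_ff).build(vec![1, 2])`. Only the builder methods are public surface, so a new hook with a default can be added without a breaking change.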
2. dyn Trait Approach
This was proposed by several people; the most compelling example was by @quark-zju in their implementation of mini-bytes: #359 (comment). This implementation very simply and elegantly allows a Bytes to use the buffer contents of a 3rdPartyBuffer as well as control its lifetime in a refcounted way, with very little hassle on the part of the 3rd party. (Note that the above example doesn't defer the clone/drop to the 3rdPartyBuffer instance, so it can't offer the optimizations that Bytes currently offers. However, that can easily be added.)
The idea is this:
trait ManagedBuffer {
    fn get_slice(&self) -> &[u8];
    fn inc_ref(&self) -> Option<Self>;
    fn dec_ref(&self) -> Option<Self>;
    fn resize(&self, new_len: usize) -> Option<Self>;
...
}
struct Bytes {
ptr: *const u8,
len: usize,
owner: Box<dyn ManagedBuffer>,
}
impl Bytes {
    pub fn from_buffer_manager(bman: impl ManagedBuffer);
}
So you impl ManagedBuffer for your 3rdPartyBuffer, and use it to construct
Others have suggested a similar approach, but to store the object as an
Pros
Cons
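For proposal 2, a minimal sketch of the "keep the owner alive behind a trait object" idea could look like the following. It assumes a cut-down ManagedBuffer trait with a single method and uses an Arc plus offsets instead of a raw pointer so that it stays entirely in safe Rust; SharedBytes and ThirdPartyBuffer are hypothetical names, and none of this is the mini-bytes or bytes implementation.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical trait a third-party buffer implements to lend out its bytes.
pub trait ManagedBuffer: Send + Sync {
    fn as_slice(&self) -> &[u8];
}

/// A cheaply cloneable view. The Arc keeps the owner alive for as long as
/// any view exists, and drops it when the last view goes away.
#[derive(Clone)]
pub struct SharedBytes {
    owner: Arc<dyn ManagedBuffer>,
    start: usize,
    len: usize,
}

impl SharedBytes {
    pub fn from_owner(owner: impl ManagedBuffer + 'static) -> Self {
        let len = owner.as_slice().len();
        SharedBytes { owner: Arc::new(owner), start: 0, len }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.owner.as_slice()[self.start..self.start + self.len]
    }

    /// Returns a sub-view sharing the same owner; no bytes are copied.
    pub fn slice(&self, range: Range<usize>) -> Self {
        assert!(range.start <= range.end && range.end <= self.len);
        SharedBytes {
            owner: Arc::clone(&self.owner),
            start: self.start + range.start,
            len: range.end - range.start,
        }
    }
}

/// Stand-in for a buffer type owned by some other library.
pub struct ThirdPartyBuffer(pub Vec<u8>);

impl ManagedBuffer for ThirdPartyBuffer {
    fn as_slice(&self) -> &[u8] {
        &self.0
    }
}
```

In this sketch clone and drop are handled by the Arc rather than deferred to the owner, which matches the caveat noted above: it is simple, but it cannot offer owner-specific optimizations.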
3. Trait as VTable Builder Approach
This is sort of a hybrid of approaches 1 and 2. It offers more flexibility than both, but at the possible expense of greater complexity. This was implemented in PR #567 by @HyeonuPark. This approach features an expanded VTable, as well as a stateless trait design. It differs from approach 2 because the trait exists to provide functionality to the VTable, as well as to re-construct/downcast the Bytes type back to T. Note that the above PR features a clone method that looks like:
Pros
Cons
4. The In-a-Perfect-World Approach
As mentioned in the intro, Bytes encompasses two somewhat orthogonal sets of functionality: buffer lifecycle management, and buffer read/write operations. This design follows the analogy of the Future and FutureExt traits. The So all buffer lifecycle management, copy on write etc., would be implemented by the impl of The ManagedBuffer design would be similar to proposal 2, except that it would be designed in such a way that the implementation of ManagedBuffer is 100% responsible for itself, e.g. clone, drop, resize, into_parts, from_parts. It would be completely self-contained. In order to maintain compatibility with the existing API, Bytes and BytesMut would become thin wrappers that store a The implementor of ManagedBuffer could be any 3rd party type. So it'd look something like:
pub trait ManagedBuffer {
    fn from_parts() -> Self;
    fn into_parts(this: Self) -> (...);
    fn clone() -> Self;
    fn drop() -> Self;
    fn resize(usize) -> Self;
}
pub trait ByteExt: ManagedBuffer {
fn slice() -> Self {
...
}
fn len() -> usize {
....
}
....
}
impl<T> ByteExt for T
where
    T: ManagedBuffer + ?Sized,
{
}
struct StaticBuffer {
buf: &'static [u8]
}
// this is where the STATIC_VTABLE impl would go.. more or less
impl ManagedBuffer for StaticBuffer {
....
}
... |
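The Future/FutureExt analogy in proposal 4 can be sketched in compilable form as below. This is only an illustration under simplifying assumptions: ManagedBuffer is reduced to two methods with explicit receivers, the extension methods are arbitrary read-only helpers, and HeapBuffer is a hypothetical second implementor added to show that the blanket impl covers any backing type.

```rust
/// Lifecycle side: each implementor is fully responsible for its own storage.
pub trait ManagedBuffer {
    fn as_slice(&self) -> &[u8];
    fn resize(&mut self, new_len: usize);
}

/// Read-side helpers, provided once for every ManagedBuffer.
pub trait ByteExt: ManagedBuffer {
    fn len(&self) -> usize {
        self.as_slice().len()
    }

    fn is_empty(&self) -> bool {
        self.len() == 0
    }

    fn starts_with(&self, prefix: &[u8]) -> bool {
        self.as_slice().starts_with(prefix)
    }
}

// Blanket impl, in the style of FutureExt: implementors never write this.
impl<T: ManagedBuffer + ?Sized> ByteExt for T {}

pub struct StaticBuffer {
    buf: &'static [u8],
}

impl ManagedBuffer for StaticBuffer {
    fn as_slice(&self) -> &[u8] {
        self.buf
    }

    /// A static buffer cannot grow, so this sketch only ever shrinks it.
    fn resize(&mut self, new_len: usize) {
        let buf: &'static [u8] = self.buf;
        self.buf = &buf[..new_len.min(buf.len())];
    }
}

/// Hypothetical heap-backed implementor.
pub struct HeapBuffer(pub Vec<u8>);

impl ManagedBuffer for HeapBuffer {
    fn as_slice(&self) -> &[u8] {
        &self.0
    }

    fn resize(&mut self, new_len: usize) {
        self.0.resize(new_len, 0);
    }
}
```

Because of the `?Sized` bound, the helpers are also available on `dyn ManagedBuffer`, so a thin wrapper type could hold a boxed trait object and still expose the full read API.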
pinging @ipoupaille for comment, since they made a PR (#558) for VTable exposure as well. |
I need this to avoid memory copy between rust and c++ code. |
A couple people are asking for ways to convert rkyv's |
Hi! I'd also like to throw an up-vote in on this issue. I've got some memory mapped files I would like to expose a zero-copy Bytes interface on, and right now the only way I've been able to do that is by using |
YMMV depending on whether your files are disk-backed or SSD-backed, whether you can |
I think I've got a pretty good idea of the risks; in our case we have a cache fronting an object store, so if we miss we're going to hit S3 (or equivalent) anyways, which is much slower than faulting from SSD, which all our machines have for local storage. |
I'd also like to up-vote the idea of exposing vtable! I'll explain my use-case, and my requirements, in the hope that this is relevant to the current discussion about exposing the vtable...
Background on my use-case
I'm working on a set of crates for high-speed I/O using
|
Proposal name | Enables my use-case? | Explanation |
---|---|---|
1. VTableBuilder Proposal | ❌ | As this currently stands, this proposal doesn't appear to be able to store an alignment: usize. We could hardcode an alignment in my_drop_fn but that won't work for my use-case because I won't know the alignment until runtime. So, unfortunately, this proposal doesn't satisfy my requirements, AFAICT. |
2. dyn Trait Approach | ✅ | Looks good! I can store alignment in a custom struct (and my custom struct would impl ManagedBuffer). |
3. Trait as VTable Builder Approach | ❌? | I must admit I haven't fully wrapped my head around this approach yet! But I don't think this would support arbitrary alignment (because BytesImpl::from_bytes_parts doesn't accept an alignment). But I could be wrong! |
4. The In-a-Perfect-World Approach | ✅ | I think this would work for my use-case (for the same reasons as proposal 2 would work). |
Maybe my use-case isn't a good fit for Bytes?
Maybe my use-case is a bad fit for Bytes 🙂. My use-case doesn't need some of the main features that Bytes offers (different implementations of the same API, growing/truncating buffers). And I do need some features that Bytes doesn't yet have (setting the alignment; recycling buffers).
So, please don't feel bad if your response to my comment is "Bytes shouldn't enable this use-case" 😄!
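To illustrate why the alignment has to travel with the buffer, here is a small std-only sketch of an owner type that records its Layout at allocation time and hands the same Layout back to the allocator on drop. AlignedBuffer is a hypothetical name; this is an assumption about how such an owner might look, not code from the crates mentioned above.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;
use std::slice;

/// Hypothetical owner that remembers the runtime alignment it was created with.
pub struct AlignedBuffer {
    ptr: NonNull<u8>,
    layout: Layout,
}

impl AlignedBuffer {
    /// Returns None for a zero length or an invalid size/alignment pair
    /// (for example an alignment that is not a power of two).
    pub fn new(len: usize, align: usize) -> Option<Self> {
        if len == 0 {
            return None;
        }
        let layout = Layout::from_size_align(len, align).ok()?;
        // Safe to call because the layout has a non-zero size.
        let raw = unsafe { alloc_zeroed(layout) };
        let ptr = NonNull::new(raw).unwrap_or_else(|| handle_alloc_error(layout));
        Some(AlignedBuffer { ptr, layout })
    }

    pub fn as_slice(&self) -> &[u8] {
        // The allocation is zero-initialized and layout.size() bytes long.
        unsafe { slice::from_raw_parts(self.ptr.as_ptr(), self.layout.size()) }
    }

    pub fn align(&self) -> usize {
        self.layout.align()
    }
}

impl Drop for AlignedBuffer {
    fn drop(&mut self) {
        // dealloc must receive the same layout that was used to allocate,
        // which is why the alignment cannot be dropped along the way.
        unsafe { dealloc(self.ptr.as_ptr(), self.layout) }
    }
}
```

A drop hook that only receives (data, ptr, len) has nowhere to find the alignment unless it is stored behind the data pointer or in an owner object like this one, which is the gap described in the table above.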
Related
- My PR Implement Bytes::from_raw_parts #684 implemented Bytes::from_raw_parts but failed to store the alignment, and hence was unsound.
- An issue which describes a similar use-case to mine: Please allow to construct BytesMut with custom alignment #600
Footnotes
- See the notes on O_DIRECT at the bottom of the open(2) man page.
I'd also like to express a desire to have the ability to take over control of how memory is managed in the
Our use case
Lately we've been having memory fragmentation issues because these
This should be fairly simple to implement if we could have control of the underlying pointer that |
Hello, I've hit a nasty segfault in our software since we were (without success) working around the limitation of not being able to create
Due to perf constraints we could not de-optimise the code and create allocated copies in memory, so I've bitten the bullet and I'm hoping that you will find #742 acceptable. I've taken a different path to addressing this issue: previous attempts have been very generic, but risked exposing much of
My implementation takes a simpler approach of just keeping an owner object around for the duration that is necessary. I hope you may find this useful. I'm hoping it's in a shape that can be merged in - or close enough with some feedback and edits. Thank you! |
At some point in the future, we will want to expose the vtable backing Bytes. This issue tracks relevant discussion as well as API issues that relate to the vtable.
- Bytes -> BytesMut: 0.5 release prevents going back from Bytes to BytesMut #350
- try_unsplit: Add Bytes::try_unsplit() #287