beef::lean::Cow (#10)
* Benches

* Bench as_ref() only

* Crank up bench profile

* Split the fat pointer

* as_ptr is unnecessary

* Skinny and Fat Cow

* Fat and skinny

* Recursifying 1

* Generic cursed beef!

* Missing Eq derive

* Working on docs

* Fixed PartialEq impls with macros

* Docs, comments, fixing cursed borrows

* More readme

* Add const_fn back

* skinny -> lean

* Cursed -> Lean

* Removed commented code

* Doc comment links

* Test for soundness of beef::lean::Cow

* Minor doc tweaks

* Better illustrate beef::lean::Cow in readme
maciejhirsz authored Mar 17, 2020
1 parent 64b7812 commit c1197ee
Showing 9 changed files with 891 additions and 397 deletions.
4 changes: 4 additions & 0 deletions .travis.yml
@@ -11,6 +11,10 @@ matrix:
- rustup target add thumbv6m-none-eabi
script:
- cargo build --target thumbv6m-none-eabi --verbose
- rust: nightly
name: "nightly all features"
script:
- cargo build --all-features
jobs:
allow_failures:
- rust: nightly
11 changes: 9 additions & 2 deletions Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "beef"
version = "0.2.1"
version = "0.3.0"
authors = ["Maciej Hirsz <hello@maciej.codes>"]
edition = "2018"
description = "More compact Cow"
@@ -14,6 +14,13 @@ categories = ["no-std", "memory-management"]
[features]
default = []

# makes `Cow::borrowed` a const fn
# adds `Cow::const_borrow` as a const fn
# requires nightly: https://github.com/rust-lang/rust/issues/57563
const_fn = []

[profile.bench]
opt-level = 3
debug = false
lto = 'fat'
debug-assertions = false
codegen-units = 1
86 changes: 71 additions & 15 deletions README.md
@@ -4,7 +4,7 @@
[![Crates.io version shield](https://img.shields.io/crates/v/beef.svg)](https://crates.io/crates/beef)
[![Crates.io license shield](https://img.shields.io/crates/l/beef.svg)](https://crates.io/crates/beef)

Alternative implementation of `Cow` that's more compact in memory.
Faster, more compact implementation of `Cow`.

**[Changelog](https://github.com/maciejhirsz/beef/releases) -**
**[Documentation](https://docs.rs/beef/) -**
@@ -14,16 +14,30 @@ Alternative implementation of `Cow` that's more compact in memory.
```rust
use beef::Cow;

let borrowed = Cow::borrowed("Hello");
let owned = Cow::from(String::from("World"));
let borrowed: Cow<str> = Cow::borrowed("Hello");
let owned: Cow<str> = Cow::owned(String::from("World"));

assert_eq!(
format!("{} {}!", borrowed, owned),
"Hello World!",
);
```

There are two versions of `Cow` exposed by this crate:

+ `beef::Cow` is 3 words wide: pointer, length, and capacity. It stores the ownership tag in capacity.
+ `beef::lean::Cow` is 2 words wide, storing length, capacity, and the ownership tag all in a fat pointer.

Both versions are leaner than `std::borrow::Cow`:

```rust
use std::mem::size_of;

// beef::Cow is 3 word sized, while std::borrow::Cow is 4 word sized
assert!(std::mem::size_of::<Cow<str>>() < std::mem::size_of::<std::borrow::Cow<str>>());
const WORD: usize = size_of::<usize>();

assert_eq!(size_of::<std::borrow::Cow<str>>(), 4 * WORD);
assert_eq!(size_of::<beef::Cow<str>>(), 3 * WORD);
assert_eq!(size_of::<beef::lean::Cow<str>>(), 2 * WORD);
```

## How does it work?
@@ -43,21 +57,63 @@ For the most common pairs of values - `&str` and `String`, or `&[u8]` and `Vec<u
means that the entire enum is 4 words wide:

```text
                                             Padding
                                                |
                                                v
          +----------+----------+----------+----------+
Borrowed: | Tag      | Pointer  | Length   | XXXXXXXX |
          +----------+----------+----------+----------+
          +----------+----------+----------+----------+
Owned:    | Tag      | Pointer  | Length   | Capacity |
          +----------+----------+----------+----------+
                                                 Padding
                                                    |
                                                    v
          +-----------+-----------+-----------+-----------+
Borrowed: | Tag       | Pointer   | Length    | XXXXXXXXX |
          +-----------+-----------+-----------+-----------+
          +-----------+-----------+-----------+-----------+
Owned:    | Tag       | Pointer   | Length    | Capacity  |
          +-----------+-----------+-----------+-----------+
```

Instead of being an enum with a tag, `beef::Cow` uses capacity to determine whether the
value it's holding is owned (capacity is greater than 0), or borrowed (capacity is 0).
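
The capacity-as-tag idea can be sketched roughly as follows. This is an illustration only, not the crate's actual implementation: `ThreeWords` and `is_owned` are hypothetical names, and the only detail taken from this commit is that the three-word `Cow` stores its capacity as `Option<NonZeroUsize>`. Because `Option<NonZeroUsize>` has the same size as `usize`, the "tag" costs no extra space.

```rust
use core::mem::size_of;
use core::num::NonZeroUsize;

/// Hypothetical three-word layout: pointer, length, and a capacity
/// that doubles as the ownership tag (`None` means borrowed).
#[allow(dead_code)]
struct ThreeWords {
    ptr: *const u8,
    len: usize,
    capacity: Option<NonZeroUsize>,
}

/// A value counts as owned only when its capacity is non-zero.
fn is_owned(capacity: Option<NonZeroUsize>) -> bool {
    capacity.is_some()
}

fn main() {
    // `Option<NonZeroUsize>` uses the zero value as its `None` niche,
    // so it is exactly one word wide.
    assert_eq!(size_of::<Option<NonZeroUsize>>(), size_of::<usize>());
    assert_eq!(size_of::<ThreeWords>(), 3 * size_of::<usize>());

    // `NonZeroUsize::new(0)` yields `None`, which reads as "borrowed".
    assert!(!is_owned(NonZeroUsize::new(0)));
    assert!(is_owned(NonZeroUsize::new(16)));
}
```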

`beef::lean::Cow` goes even further and packs length and capacity into a single 64-bit word,
which has the further advantage that the whole `Cow` is just a fat pointer.

```text
                 +-----------+-----------+-----------+
beef::Cow        | Pointer   | Length    | Capacity? |
                 +-----------+-----------+-----------+
                 +-----------+-----------+
beef::lean::Cow  | Pointer   | Cap | Len |
                 +-----------+-----------+
```
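
As a rough sketch of how two values can share one word, the snippet below packs a capacity into the upper 32 bits and a length into the lower 32 bits of a `u64`. The even 32/32 split, the bit order, and the helper names `pack`, `unpack`, and `is_borrowed` are assumptions made for illustration; they follow the `Cap | Len` diagram above but are not confirmed details of the crate.

```rust
/// Mask selecting the lower 32 bits, assumed here to hold the length.
const LEN_MASK: u64 = u32::MAX as u64;

/// Pack capacity (upper half) and length (lower half) into one word.
fn pack(len: u32, capacity: u32) -> u64 {
    ((capacity as u64) << 32) | (len as u64)
}

/// Recover `(len, capacity)` from a packed word.
fn unpack(word: u64) -> (u32, u32) {
    ((word & LEN_MASK) as u32, (word >> 32) as u32)
}

/// A zero capacity half marks the value as borrowed.
fn is_borrowed(word: u64) -> bool {
    (word >> 32) == 0
}

fn main() {
    let owned = pack(5, 8);
    assert_eq!(unpack(owned), (5, 8));
    assert!(!is_borrowed(owned));

    let borrowed = pack(5, 0);
    assert_eq!(unpack(borrowed), (5, 0));
    assert!(is_borrowed(borrowed));
}
```

One consequence of a split like this is that length and capacity are each limited to what fits in half a word, which is the trade-off for the smaller footprint.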

Any owned `Vec` or `String` that has 0 capacity is effectively treated as a borrowed
value. Since having no capacity means there is no actual allocation behind the pointer,
this is safe.
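
The zero-capacity case can be checked with the standard library alone. The sketch below uses only `std` behaviour (a freshly created `String` or `Vec` has capacity 0 and has not allocated); the helper `owns_allocation` is a hypothetical name for illustration, not part of the crate.

```rust
/// Returns true when the string has a heap allocation behind it.
fn owns_allocation(s: &String) -> bool {
    s.capacity() > 0
}

fn main() {
    // A new, empty `String` or `Vec` has not allocated anything yet.
    let empty = String::new();
    let no_bytes: Vec<u8> = Vec::new();
    assert_eq!(empty.capacity(), 0);
    assert_eq!(no_bytes.capacity(), 0);
    assert!(!owns_allocation(&empty));

    // A non-empty owned string must have room for its contents.
    let world = String::from("World");
    assert!(world.capacity() >= world.len());
    assert!(owns_allocation(&world));
}
```

Since nothing needs to be freed for the zero-capacity values, treating them as borrowed loses no information.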

## Benchmarks

```
cargo +nightly bench
```

Microbenchmarking obtaining a `&str` reference is rather flaky, and results can vary widely between runs. In general the following seems to hold true:

+ `beef::Cow` and `beef::lean::Cow` are faster than `std::borrow::Cow` at obtaining a reference `&T`. This makes sense since we avoid the enum tag branching.
+ The 3-word `beef::Cow` is faster at creating borrowed variants, but slower at creating owned variants than `std::borrow::Cow`.
+ The 2-word `beef::lean::Cow` is faster at both.

```
running 9 tests
test beef_as_ref ... bench: 57 ns/iter (+/- 15)
test beef_create ... bench: 135 ns/iter (+/- 5)
test beef_create_mixed ... bench: 659 ns/iter (+/- 52)
test lean_beef_as_ref ... bench: 50 ns/iter (+/- 2)
test lean_beef_create ... bench: 77 ns/iter (+/- 3)
test lean_beef_create_mixed ... bench: 594 ns/iter (+/- 52)
test std_as_ref ... bench: 70 ns/iter (+/- 6)
test std_create ... bench: 142 ns/iter (+/- 7)
test std_create_mixed ... bench: 663 ns/iter (+/- 32)
```

## License

This crate is distributed under the terms of both the MIT license
160 changes: 160 additions & 0 deletions benches/bench.rs
@@ -0,0 +1,160 @@
#![feature(test)]

extern crate beef;
extern crate test;

use std::borrow::{Cow as StdCow, ToOwned};
use test::{Bencher, black_box};

const NTH_WORD: usize = 4;
static TEXT: &str = "In less than a half-hour, Joe had distributed ninety-two paper cups of tomato juice containing AUM, the drug that promised to turn neophobes into neophiles. He stood in Pioneer Court, just north of the Michigan Avenue Bridge, at a table from which hung a poster reading FREE TOMATO JUICE. Each person who took a cupful was invited to fill out a short questionnaire and leave it in a box on Joe's table. However, Joe explained, the questionnaire was optional, and anyone who wanted to drink the tomato juice and run was welcome to do so.";

#[bench]
fn beef_create(b: &mut Bencher) {
    use beef::Cow;

    let words: Vec<_> = TEXT.split_whitespace().collect();

    b.iter(|| {
        let cow_words: Vec<Cow<str>> = words.iter().copied().map(Cow::borrowed).collect();

        black_box(cow_words)
    });
}

#[bench]
fn beef_create_mixed(b: &mut Bencher) {
    use beef::Cow;

    let words: Vec<_> = TEXT.split_whitespace().collect();

    b.iter(|| {
        let cow_words: Vec<Cow<str>> = words.iter().copied().map(|word| {
            if word.len() % NTH_WORD == 0 {
                Cow::owned(word.to_owned())
            } else {
                Cow::borrowed(word)
            }
        }).collect();

        black_box(cow_words)
    });
}

#[bench]
fn beef_as_ref(b: &mut Bencher) {
    use beef::Cow;

    let cow_words: Vec<_> = TEXT.split_whitespace().map(|word| {
        if word.len() % NTH_WORD == 0 {
            Cow::owned(word.to_owned())
        } else {
            Cow::borrowed(word)
        }
    }).collect();

    b.iter(|| {
        for word in cow_words.iter() {
            let word: &str = word.as_ref();
            black_box(word);
        }
    });
}

#[bench]
fn lean_beef_create(b: &mut Bencher) {
    use beef::lean::Cow;

    let words: Vec<_> = TEXT.split_whitespace().collect();

    b.iter(|| {
        let cow_words: Vec<Cow<str>> = words.iter().copied().map(Cow::borrowed).collect();

        black_box(cow_words)
    });
}

#[bench]
fn lean_beef_create_mixed(b: &mut Bencher) {
    use beef::lean::Cow;

    let words: Vec<_> = TEXT.split_whitespace().collect();

    b.iter(|| {
        let cow_words: Vec<Cow<str>> = words.iter().copied().map(|word| {
            if word.len() % NTH_WORD == 0 {
                Cow::owned(word.to_owned())
            } else {
                Cow::borrowed(word)
            }
        }).collect();

        black_box(cow_words)
    });
}

#[bench]
fn lean_beef_as_ref(b: &mut Bencher) {
    use beef::lean::Cow;

    let cow_words: Vec<_> = TEXT.split_whitespace().map(|word| {
        if word.len() % NTH_WORD == 0 {
            Cow::owned(word.to_owned())
        } else {
            Cow::borrowed(word)
        }
    }).collect();

    b.iter(|| {
        for word in cow_words.iter() {
            let word: &str = word.as_ref();
            black_box(word);
        }
    });
}

#[bench]
fn std_create(b: &mut Bencher) {
    let words: Vec<_> = TEXT.split_whitespace().collect();

    b.iter(|| {
        let stdcow_words: Vec<StdCow<str>> = words.iter().copied().map(StdCow::Borrowed).collect();

        black_box(stdcow_words)
    });
}

#[bench]
fn std_create_mixed(b: &mut Bencher) {
    let words: Vec<_> = TEXT.split_whitespace().collect();

    b.iter(|| {
        let stdcow_words: Vec<StdCow<str>> = words.iter().copied().map(|word| {
            if word.len() % NTH_WORD == 0 {
                StdCow::Owned(word.to_owned())
            } else {
                StdCow::Borrowed(word)
            }
        }).collect();

        black_box(stdcow_words)
    });
}

#[bench]
fn std_as_ref(b: &mut Bencher) {
    let stdcow_words: Vec<_> = TEXT.split_whitespace().map(|word| {
        if word.len() % NTH_WORD == 0 {
            StdCow::Owned(word.to_owned())
        } else {
            StdCow::Borrowed(word)
        }
    }).collect();

    b.iter(|| {
        for word in stdcow_words.iter() {
            let word: &str = word.as_ref();
            black_box(word);
        }
    });
}
36 changes: 36 additions & 0 deletions src/fat.rs
@@ -0,0 +1,36 @@
use core::num::NonZeroUsize;
use core::ptr::slice_from_raw_parts_mut;
use crate::traits::Capacity;

/// Compact three word `Cow` that puts the ownership tag in capacity.
/// This is a type alias, for documentation see [`beef::generic::Cow`](./generic/struct.Cow.html).
pub type Cow<'a, T> = crate::generic::Cow<'a, T, Option<NonZeroUsize>>;

impl Capacity for Option<NonZeroUsize> {
    type NonZero = NonZeroUsize;

    #[inline]
    fn as_ref<T>(ptr: *const [T]) -> *const [T] {
        ptr
    }

    #[inline]
    fn empty<T>(ptr: *mut T, len: usize) -> (*mut [T], Self) {
        (slice_from_raw_parts_mut(ptr, len), None)
    }

    #[inline]
    fn store<T>(ptr: *mut T, len: usize, capacity: usize) -> (*mut [T], Self) {
        (slice_from_raw_parts_mut(ptr, len), NonZeroUsize::new(capacity))
    }

    #[inline]
    fn unpack(len: usize, capacity: NonZeroUsize) -> (usize, usize) {
        (len, capacity.get())
    }

    #[inline]
    fn maybe(_: usize, capacity: Option<NonZeroUsize>) -> Option<NonZeroUsize> {
        capacity
    }
}