Commit
feat!: Enhanced AsyncStreamCDC
- Support both `tokio` and `futures` via feature flags
- Add documentation and usage examples for `AsyncStreamCDC`
- Add tests that specifically target `AsyncStreamCDC`
- Run both `tokio` and `futures`-based tests in CI
cdata committed Jul 13, 2023
1 parent 453753f commit b8b43f9
Showing 5 changed files with 422 additions and 177 deletions.
12 changes: 8 additions & 4 deletions .github/workflows/test.yml
@@ -2,10 +2,10 @@ on: [push, pull_request]
name: Test
jobs:
build:
name: Build
name: Build
runs-on: ubuntu-latest
steps:
- name: Checkout repo
- name: Checkout repo
uses: actions/checkout@v2

- name: Install rust stable toolchain
@@ -15,10 +15,10 @@ jobs:
toolchain: stable
override: true

- name: Run cargo build
- name: Run cargo build
uses: actions-rs/cargo@v1
with:
command: build
command: build

test:
name: Test
@@ -38,6 +38,10 @@ jobs:
uses: actions-rs/cargo@v1
with:
command: test
- name: Run cargo test (futures)
uses: actions-rs/cargo@v1
with:
command: test
args: --no-default-features --features futures

lints:
name: Lints
14 changes: 13 additions & 1 deletion Cargo.toml
@@ -13,13 +13,25 @@ exclude = [
"test/*",
]

[features]
default = ["tokio"]
tokio = ["dep:tokio", "tokio-stream", "async-stream"]
futures = ["dep:futures"]


[dev-dependencies]
aes = "0.8.2"
byteorder = "1.4.3"
clap = { version = "4.2.1", features = ["cargo"] }
ctr = "0.9.2"
md-5 = "0.10.5"
memmap2 = "0.5.8"
tokio = { version = "1.29.1", features = ["io-util", "rt", "macros"] }
futures-test = { version = "0.3.28" }

[dependencies]
futures = "0.3.28"
futures = { version = "0.3.28", optional = true }
tokio = { version = "1.29.1", features = ["io-util"], optional = true }
tokio-stream = { version = "0.1.14", optional = true }
async-stream = { version = "0.3.5", optional = true }

42 changes: 29 additions & 13 deletions README.md
@@ -4,7 +4,7 @@ This crate contains multiple implementations of the "FastCDC" content defined ch

## Requirements

* [Rust](https://www.rust-lang.org) stable (2018 edition)
- [Rust](https://www.rust-lang.org) stable (2018 edition)

## Building and Testing

@@ -54,6 +54,22 @@ for result in chunker {
}
```

### Async Streaming

The `v2020` module has an async streaming version of FastCDC named `AsyncStreamCDC`. It takes any `AsyncRead` source (both the `tokio` and `futures` flavors are supported via feature flags) and buffers input in a byte vector whose capacity equals the specified maximum chunk size.

```rust
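// `collect` comes from a `StreamExt` trait in scope, e.g.
// `tokio_stream::StreamExt` or `futures::StreamExt`, depending on the feature.
// A byte slice is used as the source because `&[u8]` implements `AsyncRead`,
// whereas `std::fs::File` does not.
let source = std::fs::read("test/fixtures/SekienAkashita.jpg").unwrap();
let mut chunker = fastcdc::v2020::AsyncStreamCDC::new(&source[..], 4096, 16384, 65535);
let stream = chunker.as_stream();
let chunks = stream.collect::<Vec<_>>().await;

for result in chunks {
    let chunk = result.unwrap();
    println!("offset={} length={}", chunk.offset, chunk.length);
}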
```
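
To make the buffering strategy concrete, here is a minimal synchronous sketch that uses only the standard library. It is not the crate's implementation: `Chunk`, `find_cut`, and `chunk_stream` are hypothetical names introduced for illustration, and the rolling hash is a toy stand-in for FastCDC's gear hash. The one idea it borrows from the description above is a byte vector whose capacity equals the maximum chunk size, which is topped up from the reader before each cut.

```rust
use std::io::{self, Read};

/// Offset and length of one chunk within the source (illustrative type).
#[derive(Debug, PartialEq)]
pub struct Chunk {
    pub offset: u64,
    pub length: usize,
}

/// Toy cut-point finder, NOT FastCDC: cuts where a simple rolling hash has
/// its low six bits clear, never before `min_size` and never past `max_size`.
fn find_cut(buf: &[u8], min_size: usize, max_size: usize) -> usize {
    let limit = buf.len().min(max_size);
    if limit <= min_size {
        return limit;
    }
    let mut hash: u32 = 0;
    for (i, &byte) in buf[..limit].iter().enumerate().skip(min_size) {
        hash = (hash << 1).wrapping_add(u32::from(byte).wrapping_mul(0x9E37_79B1));
        if hash & 0x3F == 0 {
            return i + 1;
        }
    }
    limit
}

/// Reads `source` to the end and returns the chunk boundaries, holding at
/// most `max_size` bytes in memory at any time.
pub fn chunk_stream<R: Read>(
    mut source: R,
    min_size: usize,
    max_size: usize,
) -> io::Result<Vec<Chunk>> {
    // The buffer is allocated once and never grows past `max_size`.
    let mut buffer: Vec<u8> = Vec::with_capacity(max_size);
    let mut chunks = Vec::new();
    let mut offset = 0u64;
    let mut eof = false;

    loop {
        // Top up the buffer until it is full or the source is exhausted.
        while !eof && buffer.len() < max_size {
            let old_len = buffer.len();
            buffer.resize(max_size, 0);
            let n = source.read(&mut buffer[old_len..])?;
            buffer.truncate(old_len + n);
            if n == 0 {
                eof = true;
            }
        }
        if buffer.is_empty() {
            break;
        }
        // Emit one chunk, then discard its bytes from the front of the buffer.
        let length = find_cut(&buffer, min_size, max_size);
        chunks.push(Chunk { offset, length });
        offset += length as u64;
        buffer.drain(..length);
    }
    Ok(chunks)
}
```

Calling `chunk_stream(&bytes[..], 64, 1024)` on an in-memory slice returns contiguous chunks that together cover the whole input. An async variant would follow the same fill, cut, and drain loop, awaiting each read instead of blocking on it.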

## Migration from pre-3.0

If you were using a release of this crate from before the 3.0 release, you will need to make a small adjustment to continue using the same implementation as before.
@@ -78,15 +94,15 @@ The original algorithm from 2016 is described in [FastCDC: a Fast and Efficient

## Other Implementations

* [jrobhoward/quickcdc](https://github.com/jrobhoward/quickcdc)
+ Similar but slightly earlier algorithm by some of the same authors?
* [rdedup_cdc at docs.rs](https://docs.rs/crate/rdedup-cdc/0.1.0/source/src/fastcdc.rs)
+ Alternative implementation in Rust.
* [ronomon/deduplication](https://github.com/ronomon/deduplication)
+ C++ and JavaScript implementation of a variation of FastCDC.
* [titusz/fastcdc-py](https://github.com/titusz/fastcdc-py)
+ Pure Python port of FastCDC. Compatible with this implementation.
* [wxiacode/FastCDC-c](https://github.com/wxiacode/FastCDC-c)
+ Canonical algorithm in C with gear table generation and mask values.
* [wxiacode/restic-FastCDC](https://github.com/wxiacode/restic-FastCDC)
+ Alternative implementation in Go with additional mask values.
- [jrobhoward/quickcdc](https://github.com/jrobhoward/quickcdc)
- Similar but slightly earlier algorithm by some of the same authors?
- [rdedup_cdc at docs.rs](https://docs.rs/crate/rdedup-cdc/0.1.0/source/src/fastcdc.rs)
- Alternative implementation in Rust.
- [ronomon/deduplication](https://github.com/ronomon/deduplication)
- C++ and JavaScript implementation of a variation of FastCDC.
- [titusz/fastcdc-py](https://github.com/titusz/fastcdc-py)
- Pure Python port of FastCDC. Compatible with this implementation.
- [wxiacode/FastCDC-c](https://github.com/wxiacode/FastCDC-c)
- Canonical algorithm in C with gear table generation and mask values.
- [wxiacode/restic-FastCDC](https://github.com/wxiacode/restic-FastCDC)
- Alternative implementation in Go with additional mask values.