Exposed deflateSetDictionary and inflateSetDictionary functionality when using zlib #74

Merged on Mar 19, 2018 (7 commits)
1 change: 1 addition & 0 deletions src/ffi.rs
@@ -29,6 +29,7 @@ mod imp {
    pub use self::z::Z_STREAM_END as MZ_STREAM_END;
    pub use self::z::Z_SYNC_FLUSH as MZ_SYNC_FLUSH;
    pub use self::z::Z_STREAM_ERROR as MZ_STREAM_ERROR;
    pub use self::z::Z_NEED_DICT as MZ_NEED_DICT;

    pub const MZ_DEFAULT_WINDOW_BITS: c_int = 15;

130 changes: 128 additions & 2 deletions src/mem.rs
@@ -134,10 +134,26 @@ pub enum FlushDecompress {
    #[doc(hidden)] _Nonexhaustive,
}

/// The inner state for an error when decompressing
#[derive(Debug, Default)]
struct DecompressErrorInner {
    needs_dictionary: Option<u32>,
}

/// Error returned when a decompression object finds that the input stream of
/// bytes was not a valid input stream of bytes.
#[derive(Debug)]
-pub struct DecompressError(());
+pub struct DecompressError(DecompressErrorInner);

impl DecompressError {
    /// Indicates whether decompression failed due to requiring a dictionary.
    ///
    /// The resulting integer is the Adler-32 checksum of the dictionary
    /// required.
    pub fn needs_dictionary(&self) -> Option<u32> {
        self.0.needs_dictionary
    }
}
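
One way a caller might use `needs_dictionary()` is to keep a table of known dictionaries keyed by their Adler-32 checksum and pick the right one when decompression reports that a dictionary is required. The sketch below is self-contained and does not link against flate2: `DecompressErrorModel`, `NextStep`, and `next_step` are hypothetical names introduced only for illustration, with the model struct standing in for the `DecompressError` type added in this diff.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the `DecompressError` type in this diff.
/// It only models the `needs_dictionary` accessor.
#[derive(Debug, Default)]
pub struct DecompressErrorModel {
    pub needs_dictionary: Option<u32>,
}

impl DecompressErrorModel {
    /// Mirrors the accessor added in this diff: the Adler-32 checksum of
    /// the required dictionary, if the failure was a dictionary request.
    pub fn needs_dictionary(&self) -> Option<u32> {
        self.needs_dictionary
    }
}

/// What a caller could do after a failed decompression call (assumed design).
#[derive(Debug, PartialEq)]
pub enum NextStep<'a> {
    /// A matching dictionary is known: supply it and retry.
    SupplyDictionary(&'a [u8]),
    /// A dictionary is required, but none with this checksum is known.
    UnknownDictionary(u32),
    /// The failure had nothing to do with dictionaries.
    Fail,
}

/// Chooses the next step by looking up the requested checksum in `known`,
/// a map from Adler-32 checksum to dictionary bytes.
pub fn next_step<'a>(
    err: &DecompressErrorModel,
    known: &'a HashMap<u32, Vec<u8>>,
) -> NextStep<'a> {
    match err.needs_dictionary() {
        Some(adler) => match known.get(&adler) {
            Some(dictionary) => NextStep::SupplyDictionary(dictionary.as_slice()),
            None => NextStep::UnknownDictionary(adler),
        },
        None => NextStep::Fail,
    }
}
```

With the real API, the `SupplyDictionary` branch would correspond to calling `Decompress::set_dictionary` and then resuming decompression, as the `set_dictionary_with_zlib_header` test in this diff does.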

/// Error returned when a compression object is used incorrectly or otherwise
/// generates an error.
@@ -218,6 +234,21 @@ impl Compress {
        self.inner.total_out
    }

    /// Specifies the compression dictionary to use.
    ///
    /// Returns the Adler-32 checksum of the dictionary.
    #[cfg(feature = "zlib")]
    pub fn set_dictionary(&mut self, dictionary: &[u8]) -> u32 {
        let stream = &mut *self.inner.stream_wrapper;
        let rc = unsafe {
            ffi::deflateSetDictionary(stream, dictionary.as_ptr(), dictionary.len() as ffi::uInt)
        };

        assert_eq!(rc, ffi::MZ_OK);

Review comment (Contributor):
Maybe return a Result, at least as Result<u32, ()>?

Reply (Contributor Author):
As above, I think I was following the old style with assertions. I will make this change this evening.

        stream.adler as u32
    }
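
For reference, the value returned by `Compress::set_dictionary` is the Adler-32 checksum that zlib computes over the dictionary bytes. A minimal pure-Rust sketch of that checksum, following the definition in RFC 1950, is shown below. The function name `adler32` is chosen here for illustration and is not part of this diff; zlib's own implementation is more heavily optimized.

```rust
/// Computes the Adler-32 checksum of `data` as defined in RFC 1950.
///
/// `a` is one plus the sum of all bytes and `b` is the sum of every
/// intermediate value of `a`, both taken modulo 65521.
pub fn adler32(data: &[u8]) -> u32 {
    const MOD_ADLER: u32 = 65_521; // largest prime below 2^16

    let mut a: u32 = 1;
    let mut b: u32 = 0;
    for &byte in data {
        // Reducing on every byte keeps both sums far below u32::MAX.
        a = (a + u32::from(byte)) % MOD_ADLER;
        b = (b + a) % MOD_ADLER;
    }
    // `b` occupies the high 16 bits, `a` the low 16 bits.
    (b << 16) | a
}
```

For the `"hello"` dictionary used in the tests of this diff, this function yields `0x062C0215`, which is the value `needs_dictionary()` would be expected to report for a stream compressed with that dictionary.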

    /// Quickly resets this compressor without having to reallocate anything.
    ///
    /// This is equivalent to dropping this object and then creating a new one.
@@ -376,10 +407,13 @@ impl Decompress {
        self.inner.total_out += (raw.next_out as usize - output.as_ptr() as usize) as u64;

        match rc {
-            ffi::MZ_DATA_ERROR | ffi::MZ_STREAM_ERROR => Err(DecompressError(())),
+            ffi::MZ_DATA_ERROR | ffi::MZ_STREAM_ERROR => Err(DecompressError(Default::default())),
            ffi::MZ_OK => Ok(Status::Ok),
            ffi::MZ_BUF_ERROR => Ok(Status::BufError),
            ffi::MZ_STREAM_END => Ok(Status::StreamEnd),
            ffi::MZ_NEED_DICT => Err(DecompressError(DecompressErrorInner {
                needs_dictionary: Some(raw.adler as u32),
            })),
            c => panic!("unknown return code: {}", c),
        }
    }
@@ -419,6 +453,17 @@ impl Decompress {
        }
    }

    /// Specifies the decompression dictionary to use.
    #[cfg(feature = "zlib")]
    pub fn set_dictionary(&mut self, dictionary: &[u8]) {
        let stream = &mut *self.inner.stream_wrapper;
        let rc = unsafe {
            ffi::inflateSetDictionary(stream, dictionary.as_ptr(), dictionary.len() as ffi::uInt)
        };

        assert_eq!(rc, ffi::MZ_OK);

Review comment (Contributor):
I was thinking it would be better to return Result<u32, DecompressError> from this function, so that an error caused by an incorrect dictionary, or by calling set_dictionary at the wrong time (it has to be called only immediately after Z_NEED_DICT), can still be handled gracefully.

Reply (Contributor Author):
I believe I originally implemented these methods before flate2 returned Result and simply had asserts against MZ_OK.

Returning an actual Result is definitely better. I assume it's safe to presume the Adler-32 value will already be populated from a previous call to inflate, so if inflateSetDictionary succeeds without error I can return that.

    }
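
The Result-returning shape discussed in the review thread could look roughly like the sketch below. It is a self-contained model rather than the change that was eventually merged: `SetDictionaryError` and `set_dictionary_result` are hypothetical names, and the `rc` and `adler` arguments stand in for the return code of `inflateSetDictionary` and the stream's `adler` field. The numeric constants are zlib's documented return codes.

```rust
// zlib return codes, as defined in zlib.h.
pub const Z_OK: i32 = 0;
pub const Z_STREAM_ERROR: i32 = -2;
pub const Z_DATA_ERROR: i32 = -3;

/// Hypothetical error type for a fallible `set_dictionary`.
#[derive(Debug, PartialEq)]
pub enum SetDictionaryError {
    /// zlib reported Z_DATA_ERROR: the dictionary does not match the
    /// checksum the stream asked for.
    WrongDictionary,
    /// zlib reported Z_STREAM_ERROR: invalid argument or inconsistent
    /// stream state, e.g. the call was made at the wrong time.
    InvalidState,
    /// Any other return code, passed through unchanged.
    Unexpected(i32),
}

/// Maps a return code to a `Result` instead of asserting on it.
/// On success the Adler-32 checksum of the dictionary is returned.
pub fn set_dictionary_result(rc: i32, adler: u32) -> Result<u32, SetDictionaryError> {
    match rc {
        Z_OK => Ok(adler),
        Z_DATA_ERROR => Err(SetDictionaryError::WrongDictionary),
        Z_STREAM_ERROR => Err(SetDictionaryError::InvalidState),
        other => Err(SetDictionaryError::Unexpected(other)),
    }
}
```

Compared with `assert_eq!(rc, ffi::MZ_OK)`, this lets a caller that supplied the wrong dictionary recover instead of panicking, which is the concern raised in the review.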

    /// Performs the equivalent of replacing this decompression state with a
    /// freshly allocated copy.
    ///
@@ -513,6 +558,9 @@ mod tests {
    use write;
    use {Compression, Decompress, FlushDecompress};

    #[cfg(feature = "zlib")]
    use {Compress, FlushCompress};

    #[test]
    fn issue51() {
        let data = vec![
@@ -571,4 +619,82 @@ mod tests {
        assert_eq!(decoder.total_out(), string.len() as u64);
        assert!(dst.starts_with(string));
    }

    #[cfg(feature = "zlib")]
    #[test]
    fn set_dictionary_with_zlib_header() {
        let string = "hello, hello!".as_bytes();
        let dictionary = "hello".as_bytes();

        let mut encoded = Vec::with_capacity(1024);

        let mut encoder = Compress::new(Compression::default(), true);

        let dictionary_adler = encoder.set_dictionary(&dictionary);

        encoder
            .compress_vec(string, &mut encoded, FlushCompress::Finish)
            .unwrap();

        assert_eq!(encoder.total_in(), string.len() as u64);
        assert_eq!(encoder.total_out(), encoded.len() as u64);

        let mut decoder = Decompress::new(true);
        let mut decoded = [0; 1024];
        let decompress_error = decoder
            .decompress(&encoded, &mut decoded, FlushDecompress::Finish)
            .expect_err("decompression should fail due to requiring a dictionary");

        let required_adler = decompress_error.needs_dictionary();

        assert_eq!(required_adler, Some(dictionary_adler),
            "the first call to decompress should indicate a dictionary is required along with the required Adler-32 checksum");

        decoder.set_dictionary(&dictionary);

        // Decompress the rest of the input to the remainder of the output buffer
        let total_in = decoder.total_in();
        let total_out = decoder.total_out();

        let decompress_result = decoder.decompress(
            &encoded[total_in as usize..],
            &mut decoded[total_out as usize..],
            FlushDecompress::Finish,
        );
        assert!(decompress_result.is_ok());

        assert_eq!(&decoded[..decoder.total_out() as usize], string);
    }
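
The test above gets a dictionary request from the decompressor because the zlib header records it: per RFC 1950, the FDICT bit (bit 5 of the FLG byte) marks a preset dictionary, and the four bytes after the header hold the dictionary's Adler-32 checksum in big-endian order. A raw deflate stream has no such header, which is why the raw variant of the test sets the dictionary up front. The sketch below parses just that header. `ZlibHeader` and `parse_zlib_header` are names made up for this illustration and are not part of flate2.

```rust
/// The fields of a zlib (RFC 1950) stream header relevant to dictionaries.
#[derive(Debug, PartialEq)]
pub struct ZlibHeader {
    /// Low four bits of the CMF byte; 8 means deflate.
    pub compression_method: u8,
    /// Adler-32 checksum of the preset dictionary, if FDICT is set.
    pub dict_id: Option<u32>,
}

/// Parses the two header bytes and, when FDICT is set, the 4-byte DICTID.
/// Returns `None` if the input is too short or the header check fails.
pub fn parse_zlib_header(input: &[u8]) -> Option<ZlibHeader> {
    let cmf = *input.first()?;
    let flg = *input.get(1)?;

    // CMF and FLG, read as a big-endian 16-bit value, must be a multiple of 31.
    let check = (u16::from(cmf) << 8) | u16::from(flg);
    if check % 31 != 0 {
        return None;
    }

    // Bit 5 of FLG is FDICT: a preset dictionary is required.
    let dict_id = if (flg & 0x20) != 0 {
        let d = input.get(2..6)?;
        Some(u32::from_be_bytes([d[0], d[1], d[2], d[3]]))
    } else {
        None
    };

    Some(ZlibHeader {
        compression_method: cmf & 0x0F,
        dict_id,
    })
}
```

For example, a stream beginning with the bytes `78 BB 06 2C 02 15` declares a preset dictionary whose checksum is `0x062C0215`, while one beginning with `78 9C` declares none.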

    #[cfg(feature = "zlib")]
    #[test]
    fn set_dictionary_raw() {
        let string = "hello, hello!".as_bytes();
        let dictionary = "hello".as_bytes();

        let mut encoded = Vec::with_capacity(1024);

        let mut encoder = Compress::new(Compression::default(), false);

        encoder.set_dictionary(&dictionary);

        encoder
            .compress_vec(string, &mut encoded, FlushCompress::Finish)
            .unwrap();

        assert_eq!(encoder.total_in(), string.len() as u64);
        assert_eq!(encoder.total_out(), encoded.len() as u64);

        let mut decoder = Decompress::new(false);

        decoder.set_dictionary(&dictionary);

        let mut decoded = [0; 1024];
        let decompress_result = decoder.decompress(&encoded, &mut decoded, FlushDecompress::Finish);

        assert!(decompress_result.is_ok());

        assert_eq!(&decoded[..decoder.total_out() as usize], string);
    }

}