Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not error out of g{read,write}_with if the offset is equal to slice len #81

Closed
wants to merge 1 commit into from

Conversation

DCNick3
Copy link
Contributor

@DCNick3 DCNick3 commented Aug 15, 2022

It does not allow reading zero-sized types (like possibly an empty array) from empty slice

…e len

It does not allow reading zero-sized types (like possibly an empty array) from empty slice
@m4b
Copy link
Owner

m4b commented Mar 27, 2023

Is this correct? What bug does it fix exactly, can you give an example?

@DCNick3
Copy link
Contributor Author

DCNick3 commented Mar 27, 2023

So, an example where this breaks can be this:

use scroll::{ctx, Pread, LE};

fn main() -> Result<(), scroll::Error> {
    let empty_slice: &[u8] = b"";

    let read_empty: &[u8] = empty_slice.pread_with::<&[u8]>(0, 0)?;

    Ok(())
}

Here, I try to read an empty slice (by passing 0 as ctx) from an empty slice. This should be valid, as the empty slice just has size of zero, but it fails due to the check referenced and returns BadOffset(0).

This problem happens with any types that can be validly read from an empty slice, like, for example, empty arrays.

@RaitoBezarius
Copy link
Contributor

Can confirm it solves a bug I encountered when you try to write exactly the same amount of bytes you have left in the buffer you are passing.
Otherwise you need to allocate an extra byte.

@m4b
Copy link
Owner

m4b commented Nov 20, 2023

Ok I looked at this some more, and few things; it struck me as odd an off by one error like this didn't get more reports, so i played around with both cases.

So in the case of a 0-sized array, it is obvious, as others pointed out, the behavior is potentially problematic.

So first thing is changing to the < slightly alters our semantics of preads, e.g.

    let bytes = [1u8; 1];
    bytes.pread::<&[u8]>(1).unwrap()

before this change, like 0 sized array, this is considered a bad offset and the unwrap will panic, because the offset >= the len (which is generally correct logic when indexing). However, if you view pread as "slicing" the array, which internally it does, the semantics with a > only check will return a empty array.

This is because, e.g.:

let bytes = [1u8; 1];
let slice = &bytes[bytes.len()..];

returns an empty array, whereas anything greater than the len panics. To be honest this strikes me as a bit odd now, I'd never really thought about it, but it probably is more intuitive/easier on various forms of logic if you adopt this behavior, I haven't ever really thought about it though, as I said. For example, why doesn't any slice greater or equal to the len return an empty slice? why just the slice len returns an empty, but otherwise panics. Similarly, if i index into the array with bytes.len() i panic, but that makes a bit more sense since I'm expecting a value out.

Anyway, so the crux of the issue for pread then is whether preading is more like slicing or more like extracting a value out (e.g, indexing).

I don't have a super strong preference on this, but it is somewhat of a big semantic change. Since i've had two reports now that slicing an empty array behaves oddly, that is definitely a point in its favor.

On the other hand, all other aspects of the scroll api are more like extracting a value (one could argue the offset is an index of sorts, just more low level to what the stride of the thing in bytes is in the array). And for anything larger than a byte, the subsequent try_from_ctx on that type will fail with bad offset, since it doesn't have enough room to extract out further data. 🤷

Note there is one failure when I test your change, the "pread_str_weird" test which would need to be fixed (it's a similar issue, preading an empty byte array was an error before, and now it's not, and it's an empty string instead). I'd also like to see a test added for reading 0 sized bytes as well as reading the len of byte array for slices.

To be honest I am a little nervous as this changes pretty fundamentally the behavior of reads in certain circumstances, in a breaking manner, but which is basically invisible to users as the internal semantics changes ( returning an empty slice/string versus an error in certain circumstances).

In that sense, it would be good if we could document this empty string/empty array when preading on the len boundary somewhere appropriate (i'm not sure where at the moment?)

On the other hand it does seem to make it more natural and better aligned with those slice types... so I'm probably ok with this change.

Sorry for the long wall of text :) And thanks for everyones patience here :)

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to see tests added and ideally some documentation about the behavior when preading slices and strs at len boundaries

@DCNick3
Copy link
Contributor Author

DCNick3 commented Nov 20, 2023

Really glad to see some movement here! Unfortunately, I won't have time to work on this PR right now, so I'll close it. I guess a separate PR with the tests and documentation updates can be opened

@DCNick3 DCNick3 closed this Nov 20, 2023
@RaitoBezarius
Copy link
Contributor

I will submit this again addressing the feedback as we need it :-).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants