Deserializing a flattened struct containing a ByteBuf fails to deserialize with invalid UTF-8. #855

Open
scrblue opened this issue Jan 25, 2022 · 3 comments


scrblue commented Jan 25, 2022

According to the comment on deserialize_bytes, deserializing a non-UTF-8 string into a ByteBuf is expected to succeed. With flattened structs, however, deserialize_map is called in place of deserialize_struct (see serde-rs/serde#1529), so members are deserialized through deserialize_any, bypassing deserialize_bytes. deserialize_any assumes that a value surrounded by quotation marks is a valid UTF-8 string and returns an error otherwise.

An example follows:

use serde::{Deserialize, Serialize};

fn main() {
    let a_success: A = serde_json::from_slice(b"{\"b\": {\"buf\": \"\xe5\x00\xe5\"}}").unwrap();
    println!("A Success: {:?}", a_success);

    let a_flat_success: AFlat = serde_json::from_slice(b"{\"buf\": \"abc\"}").unwrap();
    println!("A Flat Success: {:?}", a_flat_success);
    
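    // Per the description above, this unwrap panics: the flattened field goes
    // through deserialize_any, which does not accept the non-UTF-8 payload.
    let a_flat_failure: AFlat = serde_json::from_slice(b"{\"buf\": \"\xe5\x00\xe5\"}").unwrap();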
}

#[derive(Debug, Deserialize, Serialize)]
struct A {
    b: B,
}

#[derive(Debug, Deserialize, Serialize)]
struct AFlat {
    #[serde(flatten)]
    b: B,
}

#[derive(Debug, Deserialize, Serialize)]
struct B {
    buf: serde_bytes::ByteBuf,
}
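
To make the difference between the two paths concrete, here is a minimal std-only model of the behavior described above. It is not serde_json's actual code: `Visited`, `model_deserialize_bytes`, and `model_deserialize_any` are hypothetical names invented for this sketch, and the only assumption is that the `deserialize_any` path validates quoted values as UTF-8 while the `deserialize_bytes` path does not.

```rust
use std::str;

/// Hypothetical stand-in for what a byte-collecting visitor might be handed.
#[derive(Debug, PartialEq)]
pub enum Visited {
    Str(String),
    Bytes(Vec<u8>),
}

/// Models the `deserialize_bytes` path: the quoted payload is passed on as
/// raw bytes with no UTF-8 check, so any byte sequence is accepted.
pub fn model_deserialize_bytes(payload: &[u8]) -> Result<Visited, String> {
    Ok(Visited::Bytes(payload.to_vec()))
}

/// Models the `deserialize_any` path as described in this issue: a quoted
/// value is assumed to be a string, so invalid UTF-8 becomes an error.
pub fn model_deserialize_any(payload: &[u8]) -> Result<Visited, String> {
    match str::from_utf8(payload) {
        Ok(s) => Ok(Visited::Str(s.to_owned())),
        Err(e) => Err(format!("invalid UTF-8 in string: {}", e)),
    }
}
```

In this model, `b"\xe5\x00\xe5"` passes through `model_deserialize_bytes` unchanged but is rejected by `model_deserialize_any`, which mirrors the split between `A` and `AFlat` in the example.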

scrblue changed the title from "Deserializing a flattened struct containing a ByteBuf fails to deserialize with UTF-8." to "Deserializing a flattened struct containing a ByteBuf fails to deserialize with invalid UTF-8." on Jan 25, 2022

scrblue commented Feb 1, 2022

As reported in the issue where I just mentioned this one, deserialize_any is also used for all enums with tag directives, so this deviation from the expected behavior shows up in more than one place.

The fix that can be made here is simply to make deserialize_any behave consistently with the other deserialize_... functions. My PR to this crate will fix the case of deserializing into a ByteBuf, but I'm not sure whether other areas are affected, as no other eccentricities have appeared in my use of serde and serde_json.
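
One possible reading of that suggestion, sketched with std only: instead of failing, a `deserialize_any`-style path could hand over raw bytes when the quoted value is not valid UTF-8, so a byte-buffer target still succeeds. This is an assumption about the direction of the fix, not the contents of the PR; `AnyValue`, `lenient_any`, and `into_byte_buf` are hypothetical names used only for illustration.

```rust
/// Hypothetical result of a lenient `deserialize_any`-style dispatch.
#[derive(Debug, PartialEq)]
pub enum AnyValue {
    Text(String),
    Raw(Vec<u8>),
}

/// Yields text when the payload is valid UTF-8 and falls back to raw bytes
/// otherwise, instead of returning an error.
pub fn lenient_any(payload: &[u8]) -> AnyValue {
    match std::str::from_utf8(payload) {
        Ok(s) => AnyValue::Text(s.to_owned()),
        Err(_) => AnyValue::Raw(payload.to_vec()),
    }
}

/// Models a byte-buffer target that accepts both forms, the way a visitor
/// implementing both a string callback and a bytes callback could.
pub fn into_byte_buf(value: AnyValue) -> Vec<u8> {
    match value {
        AnyValue::Text(s) => s.into_bytes(),
        AnyValue::Raw(bytes) => bytes,
    }
}
```

With this fallback, `into_byte_buf(lenient_any(b"\xe5\x00\xe5"))` yields the original three bytes, and valid strings such as `"abc"` behave as before. Targets that only accept strings would still need to reject the raw-bytes case.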

Fixing this in Serde itself would likely require more fundamental changes, and I don't have the expertise to write that PR.


jonasbb commented Feb 1, 2022

Since the issue mentions flattening and deserialize_any, the most likely culprit here is serde-rs/serde#1183.

@lucacasonato

Also relevant is #890
