Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error deserializing the unit type from an empty sequence #2340

Open
nicolaiunrein opened this issue Dec 14, 2022 · 4 comments
Open

Error deserializing the unit type from an empty sequence #2340

nicolaiunrein opened this issue Dec 14, 2022 · 4 comments

Comments

@nicolaiunrein
Copy link

nicolaiunrein commented Dec 14, 2022

I stumbled across the following inconsistency in serde and wondered if this is a bug or not:

    let _: (i32,) = serde_json::from_value(json!([42])).unwrap(); // works fine
    let _: () = serde_json::from_value(json!([])).unwrap(); // panics

AFAIK the unit type is can be thought of a tuple of length 0 and there is no other way to express that in rust. But while deserialization of tuples with elements works fine, deserializing an empty sequence into a unit type fails. I know that I could work around this in various ways but this behaviour does not seem to be correct.

I found out about this behaviour while writing a macro where I deserialize function arguments into tuples.

Any thoughts?

@nicolaiunrein nicolaiunrein changed the title Deserialize empty tuple aka unit type from an empty sequence Error deserializing the unit type from an empty sequence Dec 14, 2022
@Mingun
Copy link
Contributor

Mingun commented Aug 11, 2023

Actually ability to deserialize () from empty sequence was removed in #839 (and ability to deserialize unit structs from empty sequences removed in #857).

However, there is still some inconsistency. Internal ContentDeserializer allows deserialization of () from empty maps / sequences:

serde/serde/src/private/de.rs

Lines 1297 to 1345 in 05a5b7e

fn deserialize_unit<V>(self, visitor: V) -> Result<V::Value, Self::Error>
where
V: Visitor<'de>,
{
match self.content {
Content::Unit => visitor.visit_unit(),
// Allow deserializing newtype variant containing unit.
//
// #[derive(Deserialize)]
// #[serde(tag = "result")]
// enum Response<T> {
// Success(T),
// }
//
// We want {"result":"Success"} to deserialize into Response<()>.
Content::Map(ref v) if v.is_empty() => visitor.visit_unit(),
_ => Err(self.invalid_type(&visitor)),
}
}
fn deserialize_unit_struct<V>(
self,
_name: &'static str,
visitor: V,
) -> Result<V::Value, Self::Error>
where
V: Visitor<'de>,
{
match self.content {
// As a special case, allow deserializing untagged newtype
// variant containing unit struct.
//
// #[derive(Deserialize)]
// struct Info;
//
// #[derive(Deserialize)]
// #[serde(tag = "topic")]
// enum Message {
// Info(Info),
// }
//
// We want {"topic":"Info"} to deserialize even though
// ordinarily unit structs do not deserialize from empty map/seq.
Content::Map(ref v) if v.is_empty() => visitor.visit_unit(),
Content::Seq(ref v) if v.is_empty() => visitor.visit_unit(),
_ => self.deserialize_any(visitor),
}
}

This is required, because this deserializer is used for internally tagged enums:

#[derive(Deserialize)]
#[serde(tag = "tag")]
enum InternallyTagged {
  Unit,
}

where InternallyTagged::Unit variant would be represented as ["Unit"] in JSON (for example). Because this is a sequence, it is deserialized using visit_seq and when deserializer consumes tag the sequence for the value becomes empty. In order to be able to read () (which is the representation of unit variants for internally tagged enums) the deserializer implements such conversions.

However, ContentDeserializer used not only for internally tagged enums, but for flattening types too, and using it for that creates inconsistency, because you are able to deserialize () from the empty sequence when it in flattened structure, but not able to do this when it in the ordinary type.

@nicolaiunrein
Copy link
Author

If I unterstand you correctly, you argue that () should never be deserialized from an empty sequence because it would be a type conversion? I actually think the other way around. In my regard it should be possible because it is just an empty tuple which has some special meaning in rust (e.g. it is the implicit return type). The inconsistency I see is that if a tuple of length n >= 1 can be deserialized/serialized from/to a sequence it should be possible for tuples of length 0 as well. Of course serialization is more tricky because() can also be thought of as null in other languages and it makes sense that the default serialization of () in e.g. JSON is null. I think in essence the problem is that () has more than one semantic meaning. The best solution (the solution with the highest level of correctness) I could think of at the moment would be an attribute like #[serde(serialize_as_tuple)] and allow deserialization of empty sequences to (). If that is not feasible maybe #[serde(as_tuple)] could be a good compromise.

@Mingun
Copy link
Contributor

Mingun commented Aug 14, 2023

If I unterstand you correctly, you argue that () should never be deserialized from an empty sequence because it would be a type conversion?

I would not say what I argue, but this is the current behavior which was implemented in this way with such motivation. I actually agree with you, that it would be feasible to have #[serde(as_tuple)] or similar

@nicolaiunrein
Copy link
Author

I will draft a PR if I find time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants