Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix failing double JsonCreators in jackson 2.12.0 #2978

Merged

Conversation

carterkozak
Copy link
Contributor

The addition of Big numeric types, BigInteger and BigDecimal,
prevents some tokens from being handled by existing double
deserializers.

This path occurs when data is buffered into a TokenBuffer while
deserializing polymorphic types if the type name is not the first
field received. It's possible that we can add data to the TokenBuffer
as double rather than BigDecimal to reduce memory overhead,
however that's always going to be tricky due to how floating point
works. The opt-in flag to represent all decimal numbers as
BigDecimal would still break this case, so we may want to implement
both.

@carterkozak carterkozak force-pushed the ckozak/fix_double_creator_regression branch from 747258b to 9b61a42 Compare December 10, 2020 18:26
a2q("{'double': 2.0,'type':'double'}"),
new TypeReference<UnionExample>() {});
assertEquals(expected, actual);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test passes on 2.11 and fails on 2.12 with:

Cannot construct instance of com.fasterxml.jackson.databind.jsontype.TestDoubleJsonCreator$AliasDouble (although at least one Creator exists): no BigDecimal/double/Double-argument constructor/factory method to deserialize from Number value (2.0)

com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot construct instance of `com.fasterxml.jackson.databind.jsontype.TestDoubleJsonCreator$AliasDouble` (although at least one Creator exists): no BigDecimal/double/Double-argument constructor/factory method to deserialize from Number value (2.0)
 at [Source: (String)"{"double": 2.0,"type":"double"}"; line: 1, column: 23] (through reference chain: com.fasterxml.jackson.databind.jsontype.TestDoubleJsonCreator$UnionExample$DoubleWrapper["double"])

	at com.fasterxml.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:63)
	at com.fasterxml.jackson.databind.DeserializationContext.reportInputMismatch(DeserializationContext.java:1588)
	at com.fasterxml.jackson.databind.DeserializationContext.handleMissingInstantiator(DeserializationContext.java:1213)
	at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromBigDecimal(ValueInstantiator.java:351)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromBigDecimal(StdValueInstantiator.java:463)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromDouble(BeanDeserializerBase.java:1518)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:211)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:197)
	at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:542)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:565)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:449)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1390)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:362)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:230)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:197)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:135)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:105)
	at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:263)
	at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:74)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1383)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:362)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:195)
	at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322)
	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4593)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3548)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3531)
	at com.fasterxml.jackson.databind.jsontype.TestDoubleJsonCreator.testDeserializationTypeFieldLast(TestDoubleJsonCreator.java:216)
...

a2q("{'type':'double','double': 2.0}"),
new TypeReference<UnionExample>() {});
assertEquals(expected, actual);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this test passes on both versions.

@cowtowncoder
Copy link
Member

Quick comment: looking through the PR, I found a small bug with #2215 implementation (plus the fact that added tests did not actually exercise the changed code :-( ). I checked in a fix so you may want to merge from 2.12 (might require manual merge due to indentation).

I wish changes here weren't needed since ideally we wouldn't add range checks and implicit coercions, but I'll have a look at tests to see what is happening. Buffering (TokenBuffer) always complicates things.

@carterkozak carterkozak force-pushed the ckozak/fix_double_creator_regression branch from 9b61a42 to 8ebe80b Compare December 10, 2020 21:44
@carterkozak
Copy link
Contributor Author

I've rebased and re-pushed (it went cleanly) and I've verified the fix you pushed to 2.12 didn't impact my failing test. I don't have a reproducer for the long case, so it's possible that it's not necessary -- in the buffering case ideally we'd use the smallest possible type to represent a number. I added it to be defensive because the double case made it through all our test coverage since we didn't think to test polymorphic containers around each other test case, and was pretty gnarly to figure out the cause/reproducer.

I'm not sure there's any path away from supporting BigDecmial to double coercion because without knowing type information beforehand, we buffer the most specific/accurate type. Whether the value is a BigDecimal or double may depend on the type-info type field.

@cowtowncoder
Copy link
Member

On buffering: buffer should ideally represent EXACT input that was received, wrt type, without changes: so if integral number exposed as NUMBER_LONG was seen, that should remain as-is, even if it'd fit as NUMBER_INT. This to minimize any changes due to buffering itself: original parser is free to choose minimal type, of course, which is what json-parser does, but that binary parsers typically won't. There are potential complications for this discrepancy too, but buffering should be minimally invasive.

As I said, I'll have so see what is going on: polymorphic handling is tricky, and BigDecimal's lack of "exceptional values" (NaN, infinities) further complicates things.

@carterkozak
Copy link
Contributor Author

I've created a somewhat alternative proposal #2982 which illustrates the problem on the TokenBuffer side. I think the numertype description of floating point values is problematic, as loss of precision is generally expected because we humans (and json) use base-ten decimal representation, where the computer internally does not.

@carterkozak carterkozak force-pushed the ckozak/fix_double_creator_regression branch from 8ebe80b to b47997b Compare December 13, 2020 05:05
@cowtowncoder
Copy link
Member

Thanks for the other PR. As I mentioned in the notes, I don't think that change is something I'd want to do -- ideally TokenBuffer really should not change the behavior in any way, and automatic loss in buffering case vs no loss in non-buffering case is something that has been problematic in the past. While there is the setting to force use of BigDecimal, I am not sure that is enough.
PR did point out discrepancy, however, in how JSON-backed JsonParser indicates "real" floating-point type -- getNumberType() will claim NumberType.DOUBLE (due to legacy reasons), whereas getting exact value will use BigDecimal. For most binary formats this is not the case (as they indicate actual type at format level).

On plus side at least I understand better the role of buffering wrt. need to accept BigDecimal/double somewhat interchangeably. I'll have a look at this PR and probably add couple of suggestions on how to get this ready for merge.

@carterkozak
Copy link
Contributor Author

Thanks :-)

It might be interesting to validate how buffering BigDecimal values rather than the input impacts memory overhead, for example JSON .5 is two bytes of data where the java double .5D is eight bytes, and BigDecimal of .5 is ~32 bytes. Not worth sacrificing correctness for, but might be worth considering the implementation specific buffered values.

@cowtowncoder
Copy link
Member

Ah. You had removed the long case already -- that would have been my only suggestion. Integers are not similarly prone to this issue since their minimal type is accurately detected, so there should not be similar loss of range.
Of course there may be other issues but it is probably better to wait for specific ones to work around and solve ones we have now wrt double-vs-BigDecimal.

The addition of Big numeric types, BigInteger and BigDecimal,
prevents some tokens from being handled by existing `double`
deserializers.

This path occurs when data is buffered into a TokenBuffer while
deserializing polymorphic types if the type name is not the first
field received.
@carterkozak carterkozak force-pushed the ckozak/fix_double_creator_regression branch from b47997b to dd30aa0 Compare December 13, 2020 17:59
@carterkozak
Copy link
Contributor Author

I've updated the conversion method name based on your suggestion.

@cowtowncoder
Copy link
Member

ah. Thanks!

@cowtowncoder
Copy link
Member

Thanks :-)

It might be interesting to validate how buffering BigDecimal values rather than the input impacts memory overhead, for example JSON .5 is two bytes of data where the java double .5D is eight bytes, and BigDecimal of .5 is ~32 bytes. Not worth sacrificing correctness for, but might be worth considering the implementation specific buffered values.

Right. My main concern really is that of correctness and consistency, but here's what I am thinking of implementation... will have to wait until 2.13 at least but:

  1. TokenBuffer really needs format-specific overrides for some things: for JSON it'd be fp number issue; for XML there is metadata (namespace of element/attribute; attribute-vs-element); for YAML may need to retain anchor ids
  2. A straight-forward way to allow possible format-specific buffer subtype would be via SerializerProvider / DeserializationContext, to have factory methods similar to what TokenBuffer itself has in 3.0 databind
  3. With format-specific buffering subtypes, JSON-based JsonParser could then do textual number validation normally (to ensure it is valid JSON number), but store textual representation as separate buffer-cell type (but only for JSON!), along with decoder to use

For (3), may need to extract parsing code so that jackson-core has the container (FloatingPointValue?) so code need not be duplicated, buffer just delegates necessary calls for access.

@cowtowncoder cowtowncoder merged commit cd2afd6 into FasterXML:2.12 Dec 13, 2020
cowtowncoder added a commit that referenced this pull request Dec 15, 2020
@cowtowncoder cowtowncoder modified the milestones: 1.9.13, 2.12.1 Dec 16, 2020
@cowtowncoder cowtowncoder changed the title Fix failing double JsonCreators in jackson 2.12 Fix failing double JsonCreators in jackson 2.12.0 Dec 16, 2020
cowtowncoder added a commit that referenced this pull request Dec 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants