Fix failing `double` JsonCreators in jackson 2.12.0 #2978

carterkozak · 2020-12-10T18:20:41Z

The addition of Big numeric types, BigInteger and BigDecimal,
prevents some tokens from being handled by existing double
deserializers.

This path occurs when data is buffered into a TokenBuffer while
deserializing polymorphic types if the type name is not the first
field received. It's possible that we can add data to the TokenBuffer
as double rather than BigDecimal to reduce memory overhead,
however that's always going to be tricky due to how floating point
works. The opt-in flag to represent all decimal numbers as
BigDecimal would still break this case, so we may want to implement
both.

carterkozak · 2020-12-10T18:51:50Z

src/test/java/com/fasterxml/jackson/databind/jsontype/TestDoubleJsonCreator.java

+                a2q("{'double': 2.0,'type':'double'}"),
+                new TypeReference<UnionExample>() {});
+        assertEquals(expected, actual);
+    }


This test passes on 2.11 and fails on 2.12 with:

Cannot construct instance of com.fasterxml.jackson.databind.jsontype.TestDoubleJsonCreator$AliasDouble (although at least one Creator exists): no BigDecimal/double/Double-argument constructor/factory method to deserialize from Number value (2.0)

com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot construct instance of `com.fasterxml.jackson.databind.jsontype.TestDoubleJsonCreator$AliasDouble` (although at least one Creator exists): no BigDecimal/double/Double-argument constructor/factory method to deserialize from Number value (2.0) at [Source: (String)"{"double": 2.0,"type":"double"}"; line: 1, column: 23] (through reference chain: com.fasterxml.jackson.databind.jsontype.TestDoubleJsonCreator$UnionExample$DoubleWrapper["double"]) at com.fasterxml.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:63) at com.fasterxml.jackson.databind.DeserializationContext.reportInputMismatch(DeserializationContext.java:1588) at com.fasterxml.jackson.databind.DeserializationContext.handleMissingInstantiator(DeserializationContext.java:1213) at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromBigDecimal(ValueInstantiator.java:351) at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromBigDecimal(StdValueInstantiator.java:463) at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromDouble(BeanDeserializerBase.java:1518) at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:211) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:197) at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:542) at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:565) at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:449) at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1390) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:362) at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:230) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:197) at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:135) at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:105) at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:263) at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:74) at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1383) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:362) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:195) at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322) at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4593) at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3548) at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3531) at com.fasterxml.jackson.databind.jsontype.TestDoubleJsonCreator.testDeserializationTypeFieldLast(TestDoubleJsonCreator.java:216) ...

carterkozak · 2020-12-10T18:52:07Z

src/test/java/com/fasterxml/jackson/databind/jsontype/TestDoubleJsonCreator.java

+                a2q("{'type':'double','double': 2.0}"),
+                new TypeReference<UnionExample>() {});
+        assertEquals(expected, actual);
+    }


While this test passes on both versions.

cowtowncoder · 2020-12-10T21:23:50Z

Quick comment: looking through the PR, I found a small bug with #2215 implementation (plus the fact that added tests did not actually exercise the changed code :-( ). I checked in a fix so you may want to merge from 2.12 (might require manual merge due to indentation).

I wish changes here weren't needed since ideally we wouldn't add range checks and implicit coercions, but I'll have a look at tests to see what is happening. Buffering (TokenBuffer) always complicates things.

carterkozak · 2020-12-10T21:50:38Z

I've rebased and re-pushed (it went cleanly) and I've verified the fix you pushed to 2.12 didn't impact my failing test. I don't have a reproducer for the long case, so it's possible that it's not necessary -- in the buffering case ideally we'd use the smallest possible type to represent a number. I added it to be defensive because the double case made it through all our test coverage since we didn't think to test polymorphic containers around each other test case, and was pretty gnarly to figure out the cause/reproducer.

I'm not sure there's any path away from supporting BigDecmial to double coercion because without knowing type information beforehand, we buffer the most specific/accurate type. Whether the value is a BigDecimal or double may depend on the type-info type field.

cowtowncoder · 2020-12-10T22:31:45Z

On buffering: buffer should ideally represent EXACT input that was received, wrt type, without changes: so if integral number exposed as NUMBER_LONG was seen, that should remain as-is, even if it'd fit as NUMBER_INT. This to minimize any changes due to buffering itself: original parser is free to choose minimal type, of course, which is what json-parser does, but that binary parsers typically won't. There are potential complications for this discrepancy too, but buffering should be minimally invasive.

As I said, I'll have so see what is going on: polymorphic handling is tricky, and BigDecimal's lack of "exceptional values" (NaN, infinities) further complicates things.

carterkozak · 2020-12-12T22:54:12Z

I've created a somewhat alternative proposal #2982 which illustrates the problem on the TokenBuffer side. I think the numertype description of floating point values is problematic, as loss of precision is generally expected because we humans (and json) use base-ten decimal representation, where the computer internally does not.

cowtowncoder · 2020-12-13T17:52:09Z

Thanks for the other PR. As I mentioned in the notes, I don't think that change is something I'd want to do -- ideally TokenBuffer really should not change the behavior in any way, and automatic loss in buffering case vs no loss in non-buffering case is something that has been problematic in the past. While there is the setting to force use of BigDecimal, I am not sure that is enough.
PR did point out discrepancy, however, in how JSON-backed JsonParser indicates "real" floating-point type -- getNumberType() will claim NumberType.DOUBLE (due to legacy reasons), whereas getting exact value will use BigDecimal. For most binary formats this is not the case (as they indicate actual type at format level).

On plus side at least I understand better the role of buffering wrt. need to accept BigDecimal/double somewhat interchangeably. I'll have a look at this PR and probably add couple of suggestions on how to get this ready for merge.

src/main/java/com/fasterxml/jackson/databind/deser/std/StdValueInstantiator.java

carterkozak · 2020-12-13T17:55:58Z

Thanks :-)

It might be interesting to validate how buffering BigDecimal values rather than the input impacts memory overhead, for example JSON .5 is two bytes of data where the java double .5D is eight bytes, and BigDecimal of .5 is ~32 bytes. Not worth sacrificing correctness for, but might be worth considering the implementation specific buffered values.

cowtowncoder · 2020-12-13T17:57:28Z

Ah. You had removed the long case already -- that would have been my only suggestion. Integers are not similarly prone to this issue since their minimal type is accurately detected, so there should not be similar loss of range.
Of course there may be other issues but it is probably better to wait for specific ones to work around and solve ones we have now wrt double-vs-BigDecimal.

The addition of Big numeric types, BigInteger and BigDecimal, prevents some tokens from being handled by existing `double` deserializers. This path occurs when data is buffered into a TokenBuffer while deserializing polymorphic types if the type name is not the first field received.

carterkozak · 2020-12-13T17:59:29Z

I've updated the conversion method name based on your suggestion.

cowtowncoder · 2020-12-13T17:59:49Z

ah. Thanks!

cowtowncoder · 2020-12-13T18:06:17Z

Thanks :-)

It might be interesting to validate how buffering BigDecimal values rather than the input impacts memory overhead, for example JSON .5 is two bytes of data where the java double .5D is eight bytes, and BigDecimal of .5 is ~32 bytes. Not worth sacrificing correctness for, but might be worth considering the implementation specific buffered values.

Right. My main concern really is that of correctness and consistency, but here's what I am thinking of implementation... will have to wait until 2.13 at least but:

TokenBuffer really needs format-specific overrides for some things: for JSON it'd be fp number issue; for XML there is metadata (namespace of element/attribute; attribute-vs-element); for YAML may need to retain anchor ids
A straight-forward way to allow possible format-specific buffer subtype would be via SerializerProvider / DeserializationContext, to have factory methods similar to what TokenBuffer itself has in 3.0 databind
With format-specific buffering subtypes, JSON-based JsonParser could then do textual number validation normally (to ensure it is valid JSON number), but store textual representation as separate buffer-cell type (but only for JSON!), along with decoder to use

For (3), may need to extract parsing code so that jackson-core has the container (FloatingPointValue?) so code need not be duplicated, buffer just delegates necessary calls for access.

carterkozak force-pushed the ckozak/fix_double_creator_regression branch from 747258b to 9b61a42 Compare December 10, 2020 18:26

carterkozak commented Dec 10, 2020

View reviewed changes

cowtowncoder added a commit that referenced this pull request Dec 10, 2020

Minor fix for #2215 implementation, related to #2978

866b2b4

carterkozak force-pushed the ckozak/fix_double_creator_regression branch from 9b61a42 to 8ebe80b Compare December 10, 2020 21:44

carterkozak mentioned this pull request Dec 12, 2020

Fix TokenBuffer._copyBufferValue floating point values #2982

Closed

carterkozak force-pushed the ckozak/fix_double_creator_regression branch from 8ebe80b to b47997b Compare December 13, 2020 05:05

cowtowncoder reviewed Dec 13, 2020

View reviewed changes

src/main/java/com/fasterxml/jackson/databind/deser/std/StdValueInstantiator.java Outdated Show resolved Hide resolved

carterkozak force-pushed the ckozak/fix_double_creator_regression branch from b47997b to dd30aa0 Compare December 13, 2020 17:59

cowtowncoder merged commit cd2afd6 into FasterXML:2.12 Dec 13, 2020

cowtowncoder added a commit that referenced this pull request Dec 15, 2020

Minor tweaking wrt #2978

98777f6

cowtowncoder modified the milestones: 1.9.13, 2.12.1 Dec 16, 2020

cowtowncoder changed the title ~~Fix failing double JsonCreators in jackson 2.12~~ Fix failing double JsonCreators in jackson 2.12.0 Dec 16, 2020

cowtowncoder added a commit that referenced this pull request Dec 16, 2020

Update release notes wrt #2978

ec4819b

rshkv mentioned this pull request Jan 28, 2021

Upgrade Jackson to 2.12.1 palantir/spark#725

Merged

patrickmann mentioned this pull request Jun 28, 2022

Update Jackson to version 2.13 Graylog2/graylog2-server#12961

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix failing `double` JsonCreators in jackson 2.12.0 #2978

Fix failing `double` JsonCreators in jackson 2.12.0 #2978

carterkozak commented Dec 10, 2020

carterkozak Dec 10, 2020

carterkozak Dec 10, 2020

cowtowncoder commented Dec 10, 2020

carterkozak commented Dec 10, 2020

cowtowncoder commented Dec 10, 2020

carterkozak commented Dec 12, 2020

cowtowncoder commented Dec 13, 2020

carterkozak commented Dec 13, 2020

cowtowncoder commented Dec 13, 2020

carterkozak commented Dec 13, 2020

cowtowncoder commented Dec 13, 2020

cowtowncoder commented Dec 13, 2020

Fix failing double JsonCreators in jackson 2.12.0 #2978

Fix failing double JsonCreators in jackson 2.12.0 #2978

Conversation

carterkozak commented Dec 10, 2020

carterkozak Dec 10, 2020

Choose a reason for hiding this comment

carterkozak Dec 10, 2020

Choose a reason for hiding this comment

cowtowncoder commented Dec 10, 2020

carterkozak commented Dec 10, 2020

cowtowncoder commented Dec 10, 2020

carterkozak commented Dec 12, 2020

cowtowncoder commented Dec 13, 2020

carterkozak commented Dec 13, 2020

cowtowncoder commented Dec 13, 2020

carterkozak commented Dec 13, 2020

cowtowncoder commented Dec 13, 2020

cowtowncoder commented Dec 13, 2020

Fix failing `double` JsonCreators in jackson 2.12.0 #2978

Fix failing `double` JsonCreators in jackson 2.12.0 #2978