Recommendation on supporting int64/long values #427
Well, JSON Schema is not the real problem, actually. JSON Schema has a type `integer` with no inherent limit on range or precision. The real problem comes from interoperability, and this mostly from the way JavaScript is defined; see for example the June 2018 revision of the JavaScript language specification, whose Number type is based on the double-precision 64-bit IEEE 754-2008 format. So factually only 53-bit integer Numbers can be precisely represented in any compliant JavaScript implementation.

IMHO the discussion in JSON Schema issue 361 provides the best approach to this problem: using a string type with a format specification, for example extending the JSON Schema Validation specification with new sections as follows:

**6.2.X. precision**

The value of "precision" MUST be a number, representing an inclusive upper limit on the number of places for the fraction part of a numeric value. If the instance is a number or a string with format "decimal", then this keyword validates only if the number of fraction digits of the instance's numeric value is less than or exactly equal to "precision".

**7.3.X. Decimals**

These attributes apply to string instances: "decimal" and "integer". The intent of these formats is that the string carries a numeric value, and the validation keywords of Schema Validation section 6.2 apply to the numeric value of the string. Note: technically an "integer" is a "decimal" with a "precision" of 0.

Likewise the JSON Schema meta schema would need to be extended:
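A minimal sketch of what such a meta-schema extension might look like, assuming the `precision` keyword proposed above (the exact shape here is an assumption, not an agreed design):

```ts
// Illustrative only: layering the proposed "precision" keyword on top of
// the standard draft-07 meta schema.
const extendedMetaSchema = {
  $schema: "http://json-schema.org/draft-07/schema#",
  allOf: [{ $ref: "http://json-schema.org/draft-07/schema#" }],
  properties: {
    // Inclusive upper limit on the number of fraction digits of a
    // (possibly string-encoded) numeric value, per section 6.2.X above.
    precision: { type: "number", minimum: 0 },
  },
};
```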
We could add this to our meta schemas already today as an extension to the standard JSON meta schema.
@fmeschbe - What about just copying the approach of Google and instead defining int64 values (or long) as `type: string`, introducing a new format value of "int64"? We would update our existing data validators to know how to interpret this new format value. Whatever we decide, we need to update the table here ASAP so users can define the fields correctly.
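A sketch of what updating a validator could look like, using ajv as an example; the "int64" format name and the range check are assumptions for illustration, not anything the spec defines today:

```ts
import Ajv from "ajv";

// Teach the validator a custom "int64" format: a string of decimal digits
// whose value fits in a signed 64-bit integer.
const ajv = new Ajv();
ajv.addFormat("int64", {
  type: "string",
  validate: (value: string) =>
    /^-?\d+$/.test(value) &&
    BigInt(value) >= -(2n ** 63n) &&
    BigInt(value) <= 2n ** 63n - 1n,
});

const validate = ajv.compile({ type: "string", format: "int64" });
console.log(validate("9223372036854775807")); // true  (int64 max)
console.log(validate("9223372036854775808")); // false (one past the max)
```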
Technically speaking, the value range of int64 compared to int54 is only 1000 times larger (9E18 as compared to 9E15). I don't think we buy ourselves much with a dedicated `int64` format. Compared to that, with a `decimal` format plus `precision` we get a general mechanism that is not tied to one particular width. So it would be easy to just define a subschema for int64 with appropriate `minimum` and `maximum` constraints. All in all, I am not sure the story will end with `int64`.
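For illustration, a hypothetical int64 subschema built that way, assuming the proposed `decimal` format and `precision` keyword (neither is a standard JSON Schema keyword today):

```ts
// Hypothetical: int64 as a string-encoded decimal with zero fraction
// digits, pinned to the signed 64-bit range. The bounds travel as strings
// because the limits themselves exceed JavaScript's safe integer range.
const int64Schema = {
  type: "string",
  format: "decimal",
  precision: 0,
  minimum: "-9223372036854775808",
  maximum: "9223372036854775807",
};
```

That the bounds themselves have to be written as strings is exactly why the proposal has the section 6.2 keywords apply to the numeric value of the string.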
I'm a bit confused here... The JSON Schema spec, as referenced, is very clear that NUMBER/Integer is arbitrary precision - which means that there is no need to specify if it is a "short" or a "long" integer. Clients will handle it as they are able to (though we should, of course, be sure to update our docs to make that clear to clients). Using strings for numbers is just bad/wrong (esp. for security concerns) and we shouldn't be supporting that, except as a last resort. I am going through this same issue with the AEM Forms team and their desire to add "big decimal" support to PDF forms.
Hi @hiteshs, do we have a specific case where a 64-bit integer is required and a 53-bit integer is not sufficient? As you and others have noted, there simply isn't an interoperable way to represent the full range of a 64-bit integer in Javascript/JSON. Anything we define is going to be limited to processors that understand our proprietary extensions and have the capability to operate on the extended range. So we need to consider the implications. A proprietary 64-bit integer format is only going to be useful in cases where integer values are used, 53 bits is too narrow, but the values never exceed 64 bits. (This is also assuming a signed value... is there a need for an unsigned value?) I'm not sure what cases this applies to. We have discussed support for BigDecimal numbers (which @lrosenthol mentions). Do we need both a 64-bit integer and a BigDecimal? Or if we supported BigDecimal, would that cover the cases that we think require a 64-bit integer? I definitely think we should outline a concrete use case before introducing a new, proprietary data type. This would help answer all of these questions.
I don't have a specific use-case where the 53-bit space is insufficient. My main concern is that existing users already use long/bigint/int64 in various systems (Hadoop, relational DBs). Using a decimal or big decimal also has potential performance overheads (both compute and memory). Do we plan to make the transition for existing users of long/int64 easier? An explicit int64 type also makes life simpler for application developers: having only decimal puts a high burden on applications to become smarter about which primitive types to use for the necessary performance optimizations.
@hiteshs, I believe that we already have an approach: in XDM the equivalent of a long/int64 in other systems is an "integer" type. The limitation is that this type is only 53 bits wide. Users of int64 in other systems should use "integer" in XDM, being aware of the narrower range. The only reason this approach would not work is if there are cases where values exceed the 53-bit space. But we haven't identified any such cases. As has been described in this thread, JSON simply doesn't support a full 64-bit integer. Which means the only way we can define one is to create an encoding into a type that JSON does support. This will necessarily put a burden on applications and tools to handle this proprietary extension. So we shouldn't do it without some careful consideration, and it is difficult to do that without a use case.
@kstreeter The limitation of 53 bits is only in specific implementations (e.g. JavaScript), not systemic to XDM or JSON. JSON itself is format- and implementation-agnostic (as noted in its spec).
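A quick way to see that this limit is a property of JavaScript's Number (IEEE 754 double) rather than of JSON text itself; the field name is arbitrary:

```ts
// 2^53 + 1 is a perfectly valid JSON integer, but it is not representable
// as a JavaScript Number, so JSON.parse silently rounds it.
const wire = '{"id": 9007199254740993}';

const parsed = JSON.parse(wire);
console.log(parsed.id);                       // 9007199254740992 (rounded)
console.log(Number.isSafeInteger(parsed.id)); // false

// Carried as a JSON string, the same value survives intact and can be
// widened to BigInt by consumers that need the full 64-bit range.
const asString = JSON.parse('{"id": "9007199254740993"}');
console.log(BigInt(asString.id));             // 9007199254740993n
```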
Technically @lrosenthol is right that ideally there is no limit on the size and precision of `number` in JSON. I agree with you @lrosenthol that using strings for numbers is not ideal. So we are left with three options for this issue. I have a preference for the flexible approach which, of course, gives power and thus responsibility.
@fmeschbe you are reading the wrong spec. The core JSON spec is RFC 7159 and the relevant section is 6. There it very clearly says:

> This specification allows implementations to set limits on the range and precision of numbers accepted. [...] Note that when such software is used, numbers that are integers and are in the range [-(2**53)+1, (2**53)-1] are interoperable in the sense that implementations will agree exactly on their numeric values.

Of course, it also takes the same point you and others have taken - that 2^53 is a best practice.
All that said, regardless of anything else, we need to do the documentation update.
> As far as what else to do, I would be willing to consider json-schema-org/json-schema-spec#3 if we picked a very specific standard (eg. BigDecimal from Java) that it meant. Leaving it vague is no better than the problem that got us here in the first place.

I think my proposal for `precision` above goes in that direction. We could certainly go for better wording. But if we'd go for a very specific standard such as Java's BigDecimal, we would tie the definition to a single language's semantics.
Several discussions in the json-schema-org repository cover why a json-schema integer cannot be used to represent int64 or long numbers.
Is there a recommendation on how to define a schema that needs to support integers covering the full range of int64/uint64?
The approach mentioned in the json-schema repo discussions has been to use strings and enhance the format support for such integer/decimal types. A reference approach taken by Google (https://developers.google.com/discovery/v1/type-format) follows a similar line; however, the format types used are not yet part of the standard.
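For reference, the Google convention carries 64-bit integers as strings tagged with a format value. A rough sketch of a field defined that way (the property name is purely illustrative, and "int64" as a format value is Google's convention, not part of the JSON Schema standard):

```ts
// Illustrative field definition following the Google Discovery
// type/format convention: 64-bit integers travel as strings.
const eventSchema = {
  type: "object",
  properties: {
    // Hypothetical property name, used only for this example.
    timestampMicros: { type: "string", format: "int64" },
  },
};
```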
Until a new draft of the spec is published to introduce these new formats, it would be good to have a recommendation in place.