Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsigned integers should be printed as hexadecimal #46

Closed
Roger-luo opened this issue Dec 13, 2022 · 5 comments · Fixed by JuliaLang/julia#47903
Closed

Unsigned integers should be printed as hexadecimal #46

Roger-luo opened this issue Dec 13, 2022 · 5 comments · Fixed by JuliaLang/julia#47903

Comments

@Roger-luo
Copy link
Contributor

Currently, everything is converted to Int64, but this has a fundamental incompatibility with unsigned integers, and this is not consistent with the original TOML specification, e.g

the following TOML file should be valid according to the specification, but the Julia parser fails to parse or print

hex1 = 0xc53fc6704b16fdc2
@StefanKarpinski
Copy link
Member

One interesting question here is whether we should preserve type differences stemming from how integer values are formatted. The TOML spec doesn't say much about what types should be used to represent values, merely that

Arbitrary 64-bit signed integers (from −2^63 to 2^63−1) should be accepted and handled losslessly. If an integer cannot be represented losslessly, an error must be thrown.

That means that while we should parse hex literals (and binary and octal), we could still represent the values using Int64. However, we don't have to. We could parse unsigned literals to unsigned integer types determined by the number of digits of the literal like Julia itself does. If we do that, then we further have the option to print them out using the corresponding hex format. Slightly tangential, but we can also support parsing very large integer literals as larger types like Int128, UInt128 and BigInt.

@KristofferC
Copy link
Member

the following TOML file should be valid according to the specification, but the Julia parser fails to parse or print

As Stefan showed, 64-bit integers should be handled losslessly (which the current parser does) so I would say it is within spec.

There's been some discussion about this:
toml-lang/toml#688
https://twitter.com/hsivonen/status/1461684220519141384?lang=da

@StefanKarpinski
Copy link
Member

Since it's within the bounds of the spec, I'd certainly be in favor of using larger types to faithfully represent integer values larger than the range of Int64. There are two options that seems viable to me:

  1. Use Int128 and BigInt to represent large values no matter the syntax. Always print integers using decimal syntax.
  2. Parse large decimal values using Int128 and BigInt; parse binary, octal and hex values as we do in Julia, choosing an unsigned type based on the number of digits in the literal. Print similarly to Julia show: use decimal syntax for signed integers and hex syntax for unsigned integers, with the number of digits determined by type.

Option 1 is simpler, but I feel like option 2 makes TOML more useful for Julia usage where that type information can be useful. Changing the type used to represent hex/octal/binary literals is technically a breaking change, but I doubt it would cause much trouble.

@Roger-luo
Copy link
Contributor Author

I second @StefanKarpinski 's option 2 - if we don't do it, the current way is to write them as strings and parse to a Julia type anyway, I'd rather let it be handled by the parser directly.

FYI. I bring this issue up mainly because it fails to serialize the random number generator seed in UInt64 because it cannot always be converted to Int64, which initially I was expecting the TOML parser could handle.

@StefanKarpinski
Copy link
Member

I definitely feel that for Julia being able to store integer values of arbitrary size in TOML files is a good idea. Of course, other TOML implementations might not be able to read them, but that's ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants