Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider a right decimals approach for encoding decimal numbers #149

Closed
robert-zaremba opened this issue Nov 5, 2020 · 15 comments
Closed

Comments

@robert-zaremba
Copy link
Collaborator

robert-zaremba commented Nov 5, 2020

Summary

For blockchain use-case we need to do arithmetic operation which is both deterministic and support big numbers. Moreover, since numbers are used very often we should have an efficient (storage and cpu wise) serialization.

Context

In Cosmos-SDK we are using sdk.Int which is a wrapper around Go big.Int. Big integers are pretty efficient for arithmetic operations. Moreover sdk.Int supports efficient serialization using Int.Bytes()

The problem with sdk.Int is that it's not very friendly to encode Decimal numbers. In the real world we like to think with divisible units, eg: 1.2 USD rather than 120 US Cents

Related Work

We started with an experiment in x/ecocredits to use apd. Currently it's used only in that module.

Considerations

With respect to cosmos/cosmos-sdk#7113, let's consider and be consistent how we want to represent decimal values across Regen Ledger and SDK.

  1. Add Bytes serialization for apd
  2. Don't use decimal library and use sdk.Int and establish a standard in the SDK community about Decimal numbers representation.

Goal: we should keep in mind, that we don't want to end up with 15 standards of coin / credit / token amount representation.

References:

@aaronc
Copy link
Member

aaronc commented Nov 5, 2020

I would just note that I'm not sure binary representation is a problem. The SDK has decided to use strings instead. We should discuss there. If we do use binary, I suggest decimal128

@aaronc
Copy link
Member

aaronc commented Nov 5, 2020

However decimal128 is always 16 bytes. Not so space efficient. An efficient representation would be varints for mantissa and exponent. But again I'm not really sold on needing binary over strings.

@robert-zaremba
Copy link
Collaborator Author

robert-zaremba commented Nov 5, 2020

In ecocredtis we are serializing numbers using a decimal string. We can keep adp and use big.Int.Bytes() for serialization in kvstore.

In x/bank we are using a binary serialization:

bz := k.cdc.MustMarshalBinaryBare(&balance)
accountStore.Set([]byte(balance.Denom), bz)

@aaronc
Copy link
Member

aaronc commented Nov 5, 2020

Which is a proto wrapper around a string not binary. Pretty unnecessary imho to wrap it.

There are reasons we use strings in the SDK. Most notably because it's a standardized representation.

@aaronc
Copy link
Member

aaronc commented Nov 5, 2020

How and why would we use bigint serialization for decimals? Do you mean mantissa and exponent. Wouldn't a varint be better than big median anyway?

@robert-zaremba
Copy link
Collaborator Author

robert-zaremba commented Nov 5, 2020

Mantissa only, because the exponent is int32. We can use varint as well, but I think its an overhead. The serialization is super simple:

<mentisa><exponent>
  • mentisa is a low endian int32
  • exponent big.Int.Bytes

Deserializing is simple:

  • read first 4 bytes and set it to int32
  • pass the rest (data[4:]) to Int.SetBytes

The Advantage of this is that we can push this feature to the upstream (apd).

@aaronc
Copy link
Member

aaronc commented Nov 5, 2020

Seems like varint will save a few bytes right?

@robert-zaremba
Copy link
Collaborator Author

robert-zaremba commented Nov 5, 2020

I don't think so. Because we need to encode the mentisa anyway and varint has the bytes length prefix encoded in a sequence of bits (using unary encoding)

@robert-zaremba
Copy link
Collaborator Author

If we will like to optimize a storage, we just encode big.Int and read decimals from other place (eg: BatchInfo).
Simpler optimization is to use only one byte for int32 (exponent) - I can't think about any use case where we will need more than 32 decimal places.

@robert-zaremba
Copy link
Collaborator Author

and apd is limiting exponent as well: apd.MaxExponent = 100000

@aaronc
Copy link
Member

aaronc commented Nov 5, 2020

Okay well these are details we can optimize later. I first would want to see if we can 1) get buy-in on the SDK side to adopt GDA (cosmos/cosmos-sdk#7773) and 2) whether they're open to a binary format. I personally don't believe we should just invent our own decimal standard which is what was done with sdk.Dec. Do you agree GDA (of some precision) is where we should be starting? Wonder what your thoughts are on Dev's comments in that thread.

@aaronc
Copy link
Member

aaronc commented Nov 5, 2020

IMHO even int16 should be plenty for exponent no?

@robert-zaremba
Copy link
Collaborator Author

To be honest, I believe that 16 decimal places (int8) is more than enough. This is only for storing. For intermediate operations we could have more precision.

@robert-zaremba
Copy link
Collaborator Author

  • unless we want to support negative numbers, then 8 decimals might be not enough.

@ruhatch
Copy link
Contributor

ruhatch commented Sep 1, 2021

@clevinson can we close this as I think it's meaningfully tracked elsewhere?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants