-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Adding Decimal to Std library #17699
Conversation
ToDoDocs
Targets
String in/out
Types
Ops
Collection Ops
Methods
Rounding
Tests
|
According to this, new modules should go to |
@@ -0,0 +1,32 @@ | |||
import std/[decimal, strutils] | |||
|
|||
template assertEquals*(actual, expected: untyped): untyped = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reuse unittest.check
I submitted some feedback, note that I didn't review the code thoroughly, as it's expected to change. |
lib/pure/decimal.nim
Outdated
@@ -0,0 +1,1040 @@ | |||
##[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a performance comparison against established decimal128 libraries, let's followup on this point in timotheecour#690
lib/pure/decimal.nim
Outdated
case state: | ||
of psPre: | ||
if ch == '-': | ||
state = psLeadingMinus | ||
elif ch == '0': | ||
state = psLeadingZeroes | ||
elif DIGITS.contains(ch): | ||
state = psIntCoeff | ||
elif ch == '.': # yes, we are allowing numbers like ".123" even though that is bad form | ||
state = psDecimalPoint | ||
of psLeadingMinus: | ||
if ch == '0': | ||
state = psLeadingZeroes | ||
elif DIGITS.contains(ch): | ||
state = psIntCoeff | ||
elif ch == '.': # yes, we are allowing numbers like "-.123" even though that is bad form | ||
state = psDecimalPoint | ||
else: | ||
state = psDone # anything else is not legit | ||
of psLeadingZeroes: | ||
if ch == '0': | ||
discard | ||
elif DIGITS.contains(ch): | ||
state = psIntCoeff | ||
elif ch == '.': | ||
state = psDecimalPoint | ||
elif ch == 'e': | ||
state = psSignForExp | ||
else: | ||
state = psDone | ||
of psIntCoeff: | ||
if DIGITS.contains(ch): | ||
discard | ||
elif IGNORED_CHARS.contains(ch): | ||
discard | ||
elif ch == '.': | ||
state = psDecimalPoint | ||
elif ch == 'e': | ||
state = psSignForExp | ||
else: | ||
state = psDone | ||
of psDecimalPoint: | ||
if DIGITS.contains(ch): | ||
state = psFracCoeff | ||
else: | ||
state = psDone | ||
of psFracCoeff: | ||
if DIGITS.contains(ch): | ||
discard | ||
elif IGNORED_CHARS.contains(ch): | ||
discard | ||
elif ch == 'e': | ||
state = psSignForExp | ||
else: | ||
state = psDone | ||
of psSignForExp: | ||
if DIGITS.contains(ch): | ||
state = psExp | ||
elif (ch == '-') or (ch == '+'): | ||
discard | ||
else: | ||
state = psDone | ||
of psExp: | ||
if DIGITS.contains(ch): | ||
discard | ||
else: | ||
state = psDone | ||
of psDone: | ||
discard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
case state: | |
of psPre: | |
if ch == '-': | |
state = psLeadingMinus | |
elif ch == '0': | |
state = psLeadingZeroes | |
elif DIGITS.contains(ch): | |
state = psIntCoeff | |
elif ch == '.': # yes, we are allowing numbers like ".123" even though that is bad form | |
state = psDecimalPoint | |
of psLeadingMinus: | |
if ch == '0': | |
state = psLeadingZeroes | |
elif DIGITS.contains(ch): | |
state = psIntCoeff | |
elif ch == '.': # yes, we are allowing numbers like "-.123" even though that is bad form | |
state = psDecimalPoint | |
else: | |
state = psDone # anything else is not legit | |
of psLeadingZeroes: | |
if ch == '0': | |
discard | |
elif DIGITS.contains(ch): | |
state = psIntCoeff | |
elif ch == '.': | |
state = psDecimalPoint | |
elif ch == 'e': | |
state = psSignForExp | |
else: | |
state = psDone | |
of psIntCoeff: | |
if DIGITS.contains(ch): | |
discard | |
elif IGNORED_CHARS.contains(ch): | |
discard | |
elif ch == '.': | |
state = psDecimalPoint | |
elif ch == 'e': | |
state = psSignForExp | |
else: | |
state = psDone | |
of psDecimalPoint: | |
if DIGITS.contains(ch): | |
state = psFracCoeff | |
else: | |
state = psDone | |
of psFracCoeff: | |
if DIGITS.contains(ch): | |
discard | |
elif IGNORED_CHARS.contains(ch): | |
discard | |
elif ch == 'e': | |
state = psSignForExp | |
else: | |
state = psDone | |
of psSignForExp: | |
if DIGITS.contains(ch): | |
state = psExp | |
elif (ch == '-') or (ch == '+'): | |
discard | |
else: | |
state = psDone | |
of psExp: | |
if DIGITS.contains(ch): | |
discard | |
else: | |
state = psDone | |
of psDone: | |
discard | |
state = case state | |
of psPre: | |
case ch | |
of '-': psLeadingMinus | |
of '0': psLeadingZeroes | |
of '1'..'9': psIntCoeff | |
of '.': psDecimalPoint | |
else: psDone | |
of psLeadingMinus: | |
case ch | |
of '0': psLeadingZeroes | |
of '1'..'9': psIntCoeff | |
of '.': psDecimalPoint | |
else: psDone # anything else is not legit | |
of psLeadingZeroes: | |
case ch | |
of '0': psLeadingZeroes | |
of '1'..'9': psIntCoeff | |
of '.': psDecimalPoint | |
of 'e': psSignForExp | |
else: psDone | |
of psIntCoeff: | |
case ch | |
of '0'..'9','_',',': psIntCoeff | |
of '.': psDecimalPoint | |
of 'e': psSignForExp | |
else: psDone | |
of psDecimalPoint: | |
case ch | |
of '0'..'9': psFracCoeff | |
else: psDone | |
of psFracCoeff: | |
case ch | |
of '0'..'9','_',',': psFracCoeff | |
of 'e': psSignForExp | |
else: psDone | |
of psSignForExp: | |
case ch | |
of '0'..'9': psExp | |
of '-','+': psSignForExp | |
else: psDone | |
of psExp: | |
case ch | |
of '0'..'9': psExp | |
else: psDone | |
of psDone: psDone |
just a bit terser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but don't indent of
should this go under stdx/decimals instead of std/decimals ? /cc @Araq |
|
ping on benchmarks, are there any benchmarks against some established high performance C libraries? |
There is a well-used standard C library at https://www.bytereef.org/mpdecimal/ and is highly tweaked for performance. It is the library used by Python behind the scenes. I don't know of any performance benchmarks; but one could certainly be made if there are none. I seriously doubt the initial version of Decimal will be as fast as Among other things, the initial version will not take advantage of 128bit math ops available on some CPU platforms. But the c lib can be conditionally compiled to take advantage of them. |
why can't we do like python and other languages and provide a wrapper against mpdecimal instead of re-implementing a decimal library with likely worse performance (by a unknown factor) and less features; it uses a permissive license (BSD) and has a C API so should be no problem to wrap (along with vmops for VM use). Performance and features matters more than whether pure nim code was used for most users, and the efforts spent on playing catch up in terms of performance or API feature set could be spent elsewhere; it's potentially a massive effort. |
I can see the benefit of that. If that is what is wanted by the core team, I could merge this code over to my existing pure nimble library. Then a new project using the mpdecimal could be started. Are you thinking of us bringing in the c source of mpdecimal (definitely allowed by the license) into the Nim repo? Or, requiring an external dependency? |
what I suggest is something very similar to #18008:
whether it uses The main work is the writing the nim wrapper with an idiomatic nim API (it's still much easier than re-implementing mpdecimal from scratch, obviously), and that's what I'd like to review. The good news is this can be done gradually, starting with the most obvious procs first, but at least it'll work from day 1, with the same performance as mpdecimal. I think std/jsbigints is a good example of the kind of API std/decimals could shoot for. |
Please no. Adding C/C++ dependencies everywhere would make it very difficult to have backends such as wasm, llvm etc. |
you can compile complex C++ libraries like opencv to wasm; if a nim backend can't handle a C dependency, it has bigger problems.
how often did you trace into libpcre when using std/re, std/nre? or any of the other library wrappers nim relies on for that matter. Even if mpdecimal ends up having a bug you run into, what makes you think a nim port with comparable feature set/performance would have less bugs than a library that's been developed for over a decade, is used extensively including by python in its decimal module? This is pure NIH. I don't want a port with sub-par performance/feature just for the purpose of using pure nim code, when there are established alternatives that give state of the art performance and usable from day 1 with a wrapper. Don't underestimate the development effort that would go into writing and maintaining a port of something like mpdecimal. |
|
Feel free to ping me on Discord if you want help. |
Thanks, @Varriount ! I might do that when I get time tomorrow. Sadly, I'll be unable to work on it further till then. In the short term, I've pulled in short pass with c2nim here: https://github.com/JohnAD/mpdecimal-nim It is far from reaching the point of compilation. Among other things, c2nim's parser did not like seeing unsigned long longs (ULL) so I did a lot of pre-emptive commenting. |
Yeah, right. Every programming language community in the world reinventing the wheel because they do not use json/xml/yaml/etc parsers written in C. Oh wait, according to the internet, writing programming languages in itself is NIH. I understand that writing anything from scratch is going to be time consuming, but it has plenty of benefits (because of which every programming language community "reinvents the wheel" instead of just wrapping everything over). Just dismissing the benefits and labelling it NIH is ignorance and nothing else. In any case, I do not have any use of Decimal, so I don't have any problems with wrapping a C library. But yeah, if your solution to everything is to wrap over a C library because "NIH", I would much rather move over to a different programming language, maybe C itself, because hey, it has tested libraries for everything and has no downsides whatsoever... |
Most C code is full of manual RAII and error handling, you think you need to replicate the hard work of "thousands of lines of C code", but you don't, it's usually full of artificial complexity, just like you would expect from a "portable assembler" language. |
Ok. More like 40 source files in the end. And, while I'm sure it's fast, it is crazy complicated:
So, a c2nim started conversion is possible. But it is not a weekend project by any stretch. |
I get the impression it resorts to inline ASM so often because:
|
There is another possible project that we can pull C code from. And is much more straightforward and clean: the BSON library for the MongoDb database supports 128bit decimal internally. Essentially, we would be taking out just the decimal portions. http://mongoc.org/libbson/current/creating.html It is not likely be as fast at The downside would be that not all the bells and whistles of a decimal library are written because the code really is for the database client (and server). But the core would be solid. Just for kicks, I I see that the file is covered by Apache 2.0 license. Not sure if that is a problem. |
we should measure performance with a benchmark, not guess; this should be done earlier rather than later to avoid any surprises after spending significant effort |
Finished up a crude benchmark: https://github.com/JohnAD/decimal-benchmark The Nim versions are not nearly as fast at the C versions, but neither Nim version has been optimized to any real extent yet. I'm actually surprised that the libbson library ran so fast. Unlike the mpdecimal monster, it does not resort to assembly tweaks and other unreadable measures. Unfortunately, libbson does not support math ops as it pretty much assumes that any higher-level (non-c) language that uses it will already have a built-in decimal library (with math); so libbson is mostly about the interface between BSON and the other language. BTW, I've taken on another customer that will have me over-busy again. But I hope to begin work on decimal again in a month or so. |
This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions. |
This is VERY early. Just the basic ability to store the IEEE 128-bit decimal format to/from binary; and the first passes at converting to/from strings are supported.
I'm starting the PR right now as the core is written. However, I have much to do still before this really needs review.