-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
30% optimization of DateTime.GetDate()
/.Year/.Month/.Day/.DayOfYear
by 'Euclidean affine functions'
#72712
30% optimization of DateTime.GetDate()
/.Year/.Month/.Day/.DayOfYear
by 'Euclidean affine functions'
#72712
Conversation
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
Tagging subscribers to this area: @dotnet/area-system-runtime Issue DetailsThis optimization based on article It avoids loops and minimizes branching in calendar calculations. Benchmarks:
It is difficult to explain values of magic constants of the algorithm in comments.
|
(uint y400, uint r1) = Math.DivRem(((uint)(UTicks / TicksPer6Hours) | 3U) + 1224, DaysPer400Years); | ||
ulong u2 = (ulong)Math.BigMul(2939745, (int)r1 | 3); | ||
ushort daySinceMarch1 = (ushort)((uint)u2 / 11758980); | ||
var n3 = 2141 * daySinceMarch1 + 197913; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... Some additional comments, especially around the specific magic numbers, would be greatly appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These magic number have not explainable meaning. They just give right result.
To understand them it is neccessary to follow chain of propositions in the article.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you should at least be noting the source inside the code. If there is any sort of naming for those constants in the source, you should probably be pulling them out into named constants as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did in the function comment https://github.com/dotnet/runtime/pull/72712/files#diff-5afffb30d8c00fa852b7df6e5bce6255be93acc1281d20b308ab57e8b70f567bR1356
Authors of the algorithm don't use any constant naming in the article and their reference implementation https://github.com/cassioneri/calendar/blob/master/calendar.hpp#L285
Do we have test coverage here sufficient to give high confidence? If you can think of any other useful test cases this PR is a good time to add them. |
Maybe @lemire is interested in this trick. |
I tested it for every possible date (range 1/1/0001 .. 12/31/9999). Results are identical to current implementation. I see the existing DateTime tests have good enough coverage: |
@danmoseley I knew of the article, but I am happy to see this work come to dotnet. |
I've compared the paper to the code changes here and I can reproduce the calculations, so the code makes sense to me. (Although, I see I added some more random tests locally and they all pass, which should not be a surprise since you say you've tried all possible valid dates yourself.
Would it make for clearer code, albeit with a bit more duplication, to manually inline the appropriate parts inside each of these? Then it will be self evident that the unrelated branches are discarded. Re my question about perf tests, I guess this covers it indirectly. Did you run this test before/after? But, it would probably be good to add benchmarks in this file for Year, Month, Day, DayOfYear individually, especially if you end up copy/pasting instead of inlining as suggested above. (If you add benchmarks, that would of course be a PR into dotnet/performance that we'd merge after code flows, but you could run them locally and report here.) |
@tannergooding in case he is interested in the math.. |
I notice we are missing tests for DayOfYear that use randomized inputs. Year, Month, Day are indirectly covered by the ToString() tests with randomized inputs
Those ToString() tests indirectly test Year, Month, Day getters. |
@richlander when we're using an algorithm from a paper like this, ie., there's no particular software license, I assume credit in a comment is fine, and no entry in THIRD-PARTY-NOTICES.TXT is expected? |
…ead of Common GetDatePart() function
Inlined them |
Locally I have a source file with both: original & new implementations. The first benchmarking result were before applying Added individual benchmarks for the properties. Results arу in the PR description. The most significant gain (50%) is for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - I didn't recheck the math in detail, if the tests pass.
As mentioned, DayOfYear coverage is relatively poor, while the others have good indirect coverage, it would be nice to add some tests with randomized data.
Thanks for the nice improvement!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on latest changes
@richlander is going to check on licensing here. Marking do not merge meantime. The change is otherwise fine. |
Right. I sent mail to the authors of the paper + @SergeiPavlov. |
DateTime.GetDate()
/.GetDatePart()
by 'Euclidean affine functions'DateTime.GetDate()
/.Year/.Month/.Day/.DayOfYear
by 'Euclidean affine functions'
Lorenz and I are glad to see that our algorithm might go into .NET. We wish to thank, especially, @SergeiPavlov for implementing and creating the PR. On the licensing question sent by email, Lorenz and I have discussed the matter and we agreed with your second bullet point which I quote:
It is clear that Sergei did implement the algorithm as explained in the paper. In terms of implementation, we consider his C# code substantially different from the GPL-licensed C++ code found in my repository. For the reasons above, as far as Lorenz and I are concerned, we do not believe that GPL applies to Sergei's PR. As I said above, we would like to see the PR merged into the .NET's main branch for the benefit of .NET users. We would appreciate (but not require) if the release notes could mention our work. |
Appreciation to @cassioneri for the licensing. I will follow up with an addition to the third party notices, no need to reset CI. AOT and tvOS timed out, but previously passed, and vanishingly likely to be relevant here. |
We will add in release notes and a blog post (including the release blog post). Thanks much! |
Ignoring the In pure algebra: |
+1 |
Hm, this implication does not work for integer division in general case.
|
Certainly there are truncation errors for integers where significance is lost, that's why I was talking about it in algebraic form :(
... and so on... but doing that puts you at risk of integer overflow if |
The proof for integer division: let X = YA+B; X/Y = A = ZQ+R then: X =
X/(YZ) = Q = X/Y/Z Q.E.D. |
again |
FWIW:
|
Perf improvements dotnet/perf-autofiling-issues#6998 Nice! Improvements on Ubuntu-arm64 dotnet/perf-autofiling-issues#7150 |
Also improvements on |
Opened an issue for this. Thanks everyone for explaining the math :) |
This optimization based on article
Cassio Neri, Lorenz Schneider - Euclidean Affine Functions and Applications to Calendar Algorithms - 2021
It avoids loops and minimizes branching in calendar calculations.
Benchmarks:
It is difficult to explain values of magic constants of the algorithm in comments.
They are inferred algebraically.
GetDatePart()
function and inline its functionality directly intoYear, Month, Day, DayOfYear
properties