-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust catch scoring to match stable score V2 #26405
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
bastoo0
reviewed
Jan 5, 2024
!diffcalc |
peppy
approved these changes
Jan 9, 2024
bdach
added a commit
to bdach/osu
that referenced
this pull request
Jan 10, 2024
…`difficultyPeppyStars` rounding Fixes issue that occurs on *about* 245 beatmaps and was first described by me on discord: https://discord.com/channels/188630481301012481/188630652340404224/1154367700378865715 and then rediscovered again during work on ppy#26405: https://gist.github.com/bdach/414d5289f65b0399fa8f9732245a4f7c#venenog-on-ultmate-end-by-blacky-overdose-631 It so happens that in stable, due to .NET Framework internals, float math would be performed using x87 registers and opcodes. .NET (Core) however uses SSE instructions on 32- and 64-bit words. x87 registers are _80 bits_ wide. Which is notably wider than _both_ float and double. Therefore, on a significant number of beatmaps, the rounding would not produce correct values. See following gist for corroboration of the above: https://gist.github.com/bdach/dcde58d5a3607b0408faa3aa2b67bf10 Thus, to crudely - but, seemingly accurately, after checking across all ranked maps - emulate this, use `decimal`, which is slow, but has bigger precision than `double`. Doing this requires some fooling of the compiler / runtime (see second inline comment in new method). To corroborate that this is required, you can try the following code snippet: Console.WriteLine(string.Join(' ', BitConverter.GetBytes(1.3f).Select(x => x.ToString("X2")))); Console.WriteLine(string.Join(' ', BitConverter.GetBytes(1.3).Select(x => x.ToString("X2")))); Console.WriteLine(); decimal d1 = (decimal)1.3f; decimal d2 = (decimal)1.3; decimal d3 = (decimal)(double)1.3f; Console.WriteLine(string.Join(' ', decimal.GetBits(d1).SelectMany(BitConverter.GetBytes).Select(x => x.ToString("X2")))); Console.WriteLine(string.Join(' ', decimal.GetBits(d2).SelectMany(BitConverter.GetBytes).Select(x => x.ToString("X2")))); Console.WriteLine(string.Join(' ', decimal.GetBits(d3).SelectMany(BitConverter.GetBytes).Select(x => x.ToString("X2")))); which will print 66 66 A6 3F CD CC CC CC CC CC F4 3F 0D 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 0D 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 8C 5D 89 FB 3B 76 00 00 00 00 00 00 00 00 0E 00 Note that despite `d1` being converted from a less-precise floating- -point value than `d2`, it still is represented 100% accurately as a decimal number. After applying this change, recomputation of legacy scoring attributes for *all* rulesets will be required.
bdach
added a commit
to bdach/osu
that referenced
this pull request
Jan 10, 2024
…`difficultyPeppyStars` rounding Fixes issue that occurs on *about* 246 beatmaps and was first described by me on discord: https://discord.com/channels/188630481301012481/188630652340404224/1154367700378865715 and then rediscovered again during work on ppy#26405: https://gist.github.com/bdach/414d5289f65b0399fa8f9732245a4f7c#venenog-on-ultmate-end-by-blacky-overdose-631 It so happens that in stable, due to .NET Framework internals, float math would be performed using x87 registers and opcodes. .NET (Core) however uses SSE instructions on 32- and 64-bit words. x87 registers are _80 bits_ wide. Which is notably wider than _both_ float and double. Therefore, on a significant number of beatmaps, the rounding would not produce correct values due to insufficient precision. See following gist for corroboration of the above: https://gist.github.com/bdach/dcde58d5a3607b0408faa3aa2b67bf10 Thus, to crudely - but, seemingly accurately, after checking across all ranked maps - emulate this, use `decimal`, which is slow, but has bigger precision than `double`. The single known exception beatmap in whose case this results in an incorrect result is https://osu.ppy.sh/beatmapsets/1156087#osu/2625853 which is considered an "acceptable casualty" of sorts. Doing this requires some fooling of the compiler / runtime (see second inline comment in new method). To corroborate that this is required, you can try the following code snippet: Console.WriteLine(string.Join(' ', BitConverter.GetBytes(1.3f).Select(x => x.ToString("X2")))); Console.WriteLine(string.Join(' ', BitConverter.GetBytes(1.3).Select(x => x.ToString("X2")))); Console.WriteLine(); decimal d1 = (decimal)1.3f; decimal d2 = (decimal)1.3; decimal d3 = (decimal)(double)1.3f; Console.WriteLine(string.Join(' ', decimal.GetBits(d1).SelectMany(BitConverter.GetBytes).Select(x => x.ToString("X2")))); Console.WriteLine(string.Join(' ', decimal.GetBits(d2).SelectMany(BitConverter.GetBytes).Select(x => x.ToString("X2")))); Console.WriteLine(string.Join(' ', decimal.GetBits(d3).SelectMany(BitConverter.GetBytes).Select(x => x.ToString("X2")))); which will print 66 66 A6 3F CD CC CC CC CC CC F4 3F 0D 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 0D 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 8C 5D 89 FB 3B 76 00 00 00 00 00 00 00 00 0E 00 Note that despite `d1` being converted from a less-precise floating- -point value than `d2`, it still is represented 100% accurately as a decimal number. After applying this change, recomputation of legacy scoring attributes for *all* rulesets will be required.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This pull request changes the catch scoring algorithm to - as far as I can tell - match stable score V2 exactly.
Key changes to the scoring algorithm
All non-bonus objects now are equally weighted for accuracy
Compare stable.
This is achieved by assigning 300 base value to
Great
,LargeTickHit
, andSmallTickHit
inGetBaseScoreForResult()
.Notably, there are a couple of usages that sort by
GetBaseScoreForResult()
. The first I honestly can't really bring myself to care about too much (responsible for converting ancient lazer scores to new algorithm, which I find disposable and irrelevant at this time, but should be trivial to fix if we do care), and the second filters byIsBasic()
, so is not broken by this in any capacity.Combo / "accuracy" apportioning is now beatmap-dependent
Compare stable.
The "accuracy" proportion of score is given by
And yes, it is intentional that the denominator does not include large droplets. Compare stable.
If the denominator of the expression above is 0, the accuracy portion is considered non-existent, which also matches stable.
The combo portion is 1 million minus the above.
Only fruits and large droplets contribute to combo portion
Compare stable.
Notably, for combo portion, normal fruits are weighted 300 and large droplets are weighted 100. Compare stable.
Only small droplets contribute to "accuracy" portion
Compare stable.
Bananas now give 200 bonus
Compare stable.
A notable edge case difference here is that on beatmaps that do not have any fruits, stable score V2 would just refuse to give any points whatsoever, even if bananas were caught. This matters for beatmaps that only consist of banana showers, such as this one. lazer will instead grant 1M score for free and then tack on points from bananas caught onto that.
All of the above was experimentally and organoleptically examined, cross-checked, and compared against stable. Comparisons were generally not easy, because replay playback consistency is not guaranteed, but I believe the implementation to now be matching 1:1.
Score conversion
To match the modified scoring algorithm, the conversion code from score V1 was also adjusted to match.
I started off from a very simple version of it, but on the resulting spreadsheet I found it to have very wrong results for some beatmaps. Further investigation showed that due to differences in rate of growth between score V1 and score V2, unique to catch, the previous method of estimation of combo portion was not suitable in specific circumstances.
To spare the reader's eyes, the derivation of the algorithm implemented in this PR is explained in this gist. Enter at your own risk.
The resulting spreadsheet showing the impact of the modified score conversion can be found here. The analysis of selected outliers on that spreadsheet can be found in this gist.
All replays used in testing above are here: replays.zip
Notably, some score conversions were and continue to be visibly broken, and are not fixed due to overarching concerns that affect all rulesets, not only catch. I may attempt to fix them separately, but that might require more time.
Implications on infrastructure
Legacy scoring attributes need to be adjusted for catch ruleset
The specific change that forces this is the change of bonus given by bananas. Because
BonusScoreRatio
is a ratio of bonus given by standardised scoring to ratio of bonus given by stable score V1 bonus, changing the number of points given by each banana obviously changes the ratio itself.The simplest and safest way to resolve this would be to do a full run
osu-difficulty-calculator
over catch beatmaps and converts. However, a riskier but quicker option would be to do this manually, leveraging the fact that the change toBonusScoreRatio
is simple - because bonus score per banana rose (200 / 50) = fourfold, it may be enough to just execute something along the lines of (WARNING, UNTESTED, CHECK FIVE TIMES BEFORE EXECUTING):Stable scores need to be reimported
Because this pull changes how both accuracy and total score are computed when importing stable scores, a full reimport of stable scores for catch is required.
cc @Firuqwq @bastoo0 @SecreOsu @Kimitakari @Nirfu as potentially interested in cross-checking / testing this