Calculating mod multipliers based on community survey (n = 68) #26999

MaklovitzLazer · 2024-02-03T16:37:14Z

MaklovitzLazer
Feb 3, 2024

Abstract / tldr:

Methodology and results:

Using the raw data from the "What % accuracy should X scores tie NM SS scores?" survey conducted in the osu! community (https://docs.google.com/forms/d/1ccBSyGq9tN_phmJIFetGOxdfBlHtB2RHNtyMLFJuR1A), I calculated mod multipliers for HD, HR, DT and FL so that they match the mean expectation of the 68 voters. I got the survey's raw data from Elijah (sevenend7 on Discord).

I calculated the mean results of "What accuracy should X scores tie NM SS scores?"
Fitted bell curves (assuming a normal distribution) to them
Used the lazer score formula from the ppy GitHub repository

osu/osu.Game/Rulesets/Scoring/ScoreProcessor.cs

Line 380 in ef2e230

protected virtual double ComputeTotalScore(double comboProgress, double accuracyProgress, double bonusPortion)

to evaluate the mod multipliers, so that an FC with the given mod at the cutoff accuracy (from the survey) gives as much score as an NM SS, which is 1 000 000 (assuming the bonus portion is 0 and an FC with no slider ends dropped). The resulting formula is
$m = \frac{2}{a + a^5}$
a - mean accuracy from the survey
m - corresponding mod multiplier
Calculated mod multipliers for HD, HR, DT and FL and for the most popular mod combinations.

Here are the exactly fitted values, in case someone wants to use them for their own research:
HD | 1.0815889623755
HR | 1.14705534707846
FL | 1.22892198047254
DT | 1.26891860957178
Used the score formula to calculate FC score as a function of accuracy for the most popular mod multipliers
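The formula above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the FC scaling quoted in this post (score proportional to $m \cdot \frac{a + a^5}{2}$, bonus portion 0); the function names are hypothetical and not part of the osu! codebase. The inverse is found by bisection because $a + a^5$ has no convenient closed-form inverse.

```python
def multiplier_from_accuracy(a: float) -> float:
    """Multiplier m such that an FC at accuracy a (0 < a <= 1) ties an NM SS.

    Assumes FC score ~ m * (a + a**5) / 2, as quoted in the post.
    """
    if not 0.0 < a <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    return 2.0 / (a + a ** 5)


def accuracy_from_multiplier(m: float, tol: float = 1e-12) -> float:
    """Invert m = 2 / (a + a**5) by bisection (requires m >= 1)."""
    if m < 1.0:
        raise ValueError("multiplier must be >= 1")
    target = 2.0 / m          # we need a + a**5 == target
    lo, hi = 0.0, 1.0         # a + a**5 is increasing on [0, 1]
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid + mid ** 5 < target:
            lo = mid          # accuracy too low, move up
        else:
            hi = mid
    return (lo + hi) / 2.0


# Fitted values copied from the list above.
FITTED = {
    "HD": 1.0815889623755,
    "HR": 1.14705534707846,
    "FL": 1.22892198047254,
    "DT": 1.26891860957178,
}

for mod, m in FITTED.items():
    a = accuracy_from_multiplier(m)
    print(f"{mod}: multiplier {m:.4f} <-> tie accuracy {a * 100:.2f}%")
```

Running the loop recovers, for each fitted multiplier, the accuracy at which an FC with that mod would tie an NM SS under the stated assumptions.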

Conclusions:

Due to the new accuracy scaling
stable score ~ acc (if FC)
lazer score ~ $\frac{acc^5 + acc}{2}$ (if FC)
people find ALL mod multipliers underrated, which causes weird-looking leaderboards (for example, an HDDT SS is worth as much score as a 98.0% HDHRDT FC) and a lot of frustration in the leaderboard-playing community. My take on this topic can be found here:
https://twitter.com/Maklovitz_osu/status/1752436258465259745
This analysis showed that we can easily estimate the best mod multipliers from what players think about how an FC with a mod compares to a nomod SS. An upgraded version of this experiment would require asking thousands of active osu! community members (maybe even weighting votes by the voter's pp rank and ranked score / number of leaderboard scores), which would be hard without an official survey from peppy. High diversity of answers and independence of the voters are the key factors in benefiting from the intelligence of the crowd.

Thanks to Elijah (SevenEnd7) for inspiring me to do this analysis. His tweet that started all this:
https://twitter.com/SevenEnd7/status/1753668111998546200

WitherFlower's spreadsheet, which works almost the same as my calculations:
You can make a copy to play with different values: https://docs.google.com/spreadsheets/d/1iv7ptvppa9n-cBWrUlSDIlKZgjM0kf9LYMAMPpnSfow/edit?usp=sharing

More discussion about this topic can be found under this tweet:
https://twitter.com/Maklovitz_osu/status/1753743401990697415
Note that the results included in my tweet are slightly incorrect, which I explain in the comment:
https://twitter.com/Maklovitz_osu/status/1753755502880715089

MaklovitzLazer · 2024-02-03T17:16:46Z

MaklovitzLazer
Feb 3, 2024
Author

Additional thoughts: In the survey we shouldn't ask users
What would be the perfect mod multipliers
but
With what accuracy should scores with mod X tie NM SS scores,
because the lazer score formula is not as intuitive as the stable one. If a mod multiplier is 1.12x, then on stable you can estimate that with this mod you need an FC with x/1.12 accuracy to match an NM FC with x accuracy. With lazer score this way of thinking is simply wrong, and the best* mod multipliers will always look too big to most users, who still think about them in the stable way.

*the best = maximizing the user's experience that scores on the leaderboards are ordered correctly
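A small sketch of why the lazer multipliers "look too big": for the same tie accuracy, it compares the multiplier implied by stable-style thinking (FC score proportional to multiplier × acc) with the one implied by the lazer FC scaling quoted in this thread (multiplier × (acc + acc^5) / 2). The function names are hypothetical and the accuracies are arbitrary illustration points, not survey results.

```python
def stable_style_multiplier(tie_accuracy: float) -> float:
    """Multiplier implied by stable-style thinking: m * acc == 1."""
    return 1.0 / tie_accuracy


def lazer_multiplier(tie_accuracy: float) -> float:
    """Multiplier implied by the lazer FC scaling: m * (acc + acc**5) / 2 == 1."""
    return 2.0 / (tie_accuracy + tie_accuracy ** 5)


# Arbitrary tie accuracies, chosen only to show the trend.
for acc in (0.99, 0.97, 0.95, 0.90):
    print(
        f"tie at {acc * 100:.0f}%: "
        f"stable-style {stable_style_multiplier(acc):.3f}x, "
        f"lazer {lazer_multiplier(acc):.3f}x"
    )
```

For any tie accuracy below 100%, the lazer value is the larger of the two, which is why asking for accuracies rather than multipliers avoids anchoring voters on stable-sized numbers.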

0 replies

Purplegaze · 2024-02-04T07:35:25Z

Purplegaze
Feb 4, 2024

I feel like interactions between common mod combinations should definitely be accounted for too when surveying a question like "what accuracy should be required to snipe an SS".

In stable, ~94.6% HDDT is needed to snipe an HDHR SS. Do people really agree with raising this to ~96.6% in lazer? (as well as the comparatively high value of 99.3% for DT vs HDHR) What about other contentious mod combinations, such as FL being worth less than HDHR in this proposal?

If further surveying is done, this is definitely something to consider.
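The cross-mod figures quoted above can be reproduced with a rough sketch like the following. It assumes that multipliers of combined mods simply multiply, that stable FC score scales linearly with accuracy, and that lazer FC score scales with (acc + acc^5) / 2, as stated in the first post. The stable table (HD 1.06, HR 1.06, DT 1.12) is not given in this thread and is included here as an assumption; all function names are hypothetical.

```python
# Fitted lazer multipliers quoted in the first post of this thread.
PROPOSED = {"HD": 1.0815889623755, "HR": 1.14705534707846, "DT": 1.26891860957178}
# scorev1 (stable) multipliers; not stated in this thread, included as an assumption.
STABLE = {"HD": 1.06, "HR": 1.06, "DT": 1.12}


def combo_multiplier(mods, table):
    """Assumption: the multipliers of combined mods are multiplied together."""
    result = 1.0
    for mod in mods:
        result *= table[mod]
    return result


def stable_snipe_accuracy(challenger, holder, table=STABLE):
    """Accuracy at which a challenger FC ties the holder's SS, linear model."""
    return combo_multiplier(holder, table) / combo_multiplier(challenger, table)


def lazer_snipe_accuracy(challenger, holder, table=PROPOSED, tol=1e-12):
    """Same question under the lazer model, solved by bisection."""
    target = 2.0 * combo_multiplier(holder, table) / combo_multiplier(challenger, table)
    if target > 2.0:
        raise ValueError("challenger cannot tie the holder's SS even with 100%")
    lo, hi = 0.0, 1.0  # acc + acc**5 is increasing on [0, 1]
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid + mid ** 5 < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(f"stable HDDT vs HDHR SS: {stable_snipe_accuracy(['HD', 'DT'], ['HD', 'HR']):.1%}")
print(f"lazer  HDDT vs HDHR SS: {lazer_snipe_accuracy(['HD', 'DT'], ['HD', 'HR']):.1%}")
print(f"lazer  DT   vs HDHR SS: {lazer_snipe_accuracy(['DT'], ['HD', 'HR']):.1%}")
```

Swapping in a different multiplier table makes it easy to check how any proposed set of values changes these mod-versus-mod cutoffs.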

0 replies

Livium129 · 2024-02-05T04:19:55Z

Livium129
Feb 5, 2024

Really like this change overall, but had a little bit of an issue with the HDHR-FL weighting. Generally, FL and DTFL scores are much rarer than HDHR and 3mod scores, and giving a larger boost to HDHR disincentivizes playing an already rare mod combo for leaderboards outside of Easy and Normal difficulties. This problem exists on scorev1 as well, and it's also annoying there. So, I reweighted HD and HR to be slightly below FL when combined, giving FL players an ~0.5% boost over HDHR players. My new acc chart, and new multipliers, attached below.

This change also had the side effect of giving HDHRFL a 1% disadvantage compared to DTFL and 3mod. HDHRFL's equal weighting to 3mod and DTFL has been a constant annoyance for both DT playerbases, since the lack of rate increase makes it much easier than the other two mod combos in basically all circumstances.

These are the new multipliers that I used for HD and HR. The HR boost is exactly twice HD's, which is just an arbitrary choice on my part. Feel free to change around if necessary.

0 replies

OwenCMYK · 2024-02-05T07:22:27Z

OwenCMYK
Feb 5, 2024

I think it would be best to instead gather this data from real-world data on how many players pass/FC a beatmap with different mods. And even then, I personally think mods should be slightly underscored on average compared to their difficulty, because the difficulty that each mod adds depends heavily on the way the map is designed and the skillset a player has. And I think it's better for mods to be "not worth the effort" than it is for unmodded plays to be "not worth the effort", because no matter what you make the multiplier, as long as it's above 1, HR will always be worth it to somebody. Even more so with HD. So I feel it's better to err on the side of caution.

0 replies

ashandelle · 2024-09-29T07:37:08Z

ashandelle
Sep 29, 2024

I know that this discussion hasn't been active in a while, but it seems important and I thought it would be fun to set up a new poll, since 68 is too few people. I made a new Google Form that includes the new mods from lazer; most of them are optional, though.

Link to form: https://docs.google.com/forms/d/e/1FAIpQLSfrDzo-cdaZDXfNRYY4oEfXiA5N3s-m7Oam_lybzzAF677cIA/viewform?usp=sf_link
Link to sheet: https://docs.google.com/spreadsheets/d/1UOKY_L6E7FtY7tqqD6UWqX3jO-fUF19OANe9qmgKC5E/edit?usp=sharing

I'm not very good at making sheets so if you want editing access send a request.

3 replies

MaklovitzLazer Sep 29, 2024
Author

Very nice survey, but I ran into a problem: I wasn't able to enter "100" (indicating that a mod should have a 1.00 multiplier).

Also, I don't know if the DT/HT 0.6x, 0.7x, ..., 2.0x data from user input will be useful; instead I'd rather use the users' DT 1.5x and HT 0.75x data and interpolate all other results. For example, if the mean for DT 1.5x is 92.5% => 1.25 mod multiplier, then 1.1x = 1.05, 1.2x = 1.10, ..., 2.0x = 1.50
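The interpolation in that example amounts to a straight line through (1.0x, 1.00) and (1.5x, 1.25). A minimal sketch, with a hypothetical function name and the 1.25 anchor taken from the example above rather than from real survey data; it only covers the DT side (rates above 1.0x), since the HT side would need its own anchor.

```python
def dt_rate_multiplier(rate: float,
                       anchor_rate: float = 1.5,
                       anchor_multiplier: float = 1.25) -> float:
    """Linearly interpolate/extrapolate a multiplier for a custom DT rate.

    The line passes through (1.0, 1.0) and (anchor_rate, anchor_multiplier).
    """
    if rate < 1.0:
        raise ValueError("this sketch only covers rates >= 1.0 (DT side)")
    slope = (anchor_multiplier - 1.0) / (anchor_rate - 1.0)
    return 1.0 + slope * (rate - 1.0)


for rate in (1.1, 1.2, 1.5, 2.0):
    print(f"{rate:.1f}x -> {dt_rate_multiplier(rate):.2f}")
```

With the default anchor this gives 1.05 at 1.1x, 1.10 at 1.2x and 1.50 at 2.0x, matching the numbers in the example.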

ashandelle Sep 29, 2024

It should be possible to enter 100 now. As for the extra data points we can always ignore them if they don't line up nicely but personally I think that 3 points is too few, I made them optional too so I don't think it will be a huge problem.

WitherFlower Sep 29, 2024

Thanks for sharing such a thorough form! Will try to share it with as many people as possible.

It would be nice to make the histograms have more columns. I can help with that if needed, you can contact me on discord using the same username as here.

ashandelle · 2024-09-29T10:22:55Z

ashandelle
Sep 29, 2024

For difficulty-reducing mods, is the multiplier just the inverse, $\frac{a + a^5}{2}$?

1 reply

MaklovitzLazer Sep 29, 2024
Author

For difficulty-reducing mods you calculate the mod multiplier the same way as for difficulty-increasing ones and then take 1/x of it, e.g. 1.25 -> 0.8
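Written out, and assuming $a$ is the survey accuracy fed into the same formula as for difficulty-increasing mods (the thread does not spell out how that survey question would be phrased), the reciprocal step gives:

$$m_{\text{reduce}} = \frac{1}{m} = \frac{1}{\frac{2}{a + a^5}} = \frac{a + a^5}{2}, \qquad \text{e.g. } m = 1.25 \Rightarrow m_{\text{reduce}} = \frac{1}{1.25} = 0.8$$

Under that reading, the two descriptions (taking the inverse of the formula, or computing $m$ first and then taking $1/m$) give the same number.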

Purplegaze · 2024-09-29T13:30:47Z

Purplegaze
Sep 29, 2024

@ashandelle To clarify since I think my feedback response was misinterpreted:
I think an important thing to also ask people about is when mods snipe other mods, not necessarily NM. This may result in different multipliers compared to the ones surveyed with NM comparisons, so it would be interesting to see those differences.

For example, historically it's much more common for a map to have an HDHR SS (or high acc score) as number 1 before it gets its first HDDT FC, so the question of "at what acc does an HDDT FC snipe an HDHR SS" (stable: ~94.6%) (or, perhaps more interestingly, DT only vs. HDHR) is a lot more relevant than DT sniping NM.

3 replies

ashandelle Sep 29, 2024

Ok

ashandelle Sep 29, 2024

I'm not sure how to calculate the current accuracy required

ashandelle Sep 29, 2024

It's added to the form, but I'm not making the sheet for it right now

ashandelle · 2024-10-02T00:10:58Z

ashandelle
Oct 2, 2024

I've made it so that everyone is an editor but I've restricted it so that the only thing that can be edited is the mods that are listed on the leaderboard.

0 replies

Givikap120 · 2024-10-04T18:12:01Z

Givikap120
Oct 4, 2024

The DT multiplier should be 1.3x; this is very close to the value derived from stable
And I doubt people think that these values should be lowered
Did you give the surveyed people information about what the values are in scorev1?

11 replies

ashandelle Oct 4, 2024

Not length, hit objects which is proportional

ashandelle Oct 4, 2024

The nonlinearity is added by the combo multiplier

Givikap120 Oct 4, 2024

The nonlinearity is added by the combo multiplier

combo multiplier is multiplied by acc too
look into formula again

oh, you mean that mod multiplier is not exact, not the acc
I'm sorry then

ashandelle Oct 4, 2024

The problem is that the weight of a hit object scales with its distance into the map: objects at the end affect accuracy more, and so longer maps have different required mod accuracies.

Givikap120 Oct 4, 2024

The problem is that the weight of a hit object scales with its distance into the map: objects at the end affect accuracy more, and so longer maps have different required mod accuracies.

Okay, the lowest I've been able to get is a 1.105 multiplier (on the Normal diff of a 20-second map), which is 90.5% accuracy, still a full 1.5% lower than the poll suggests
The real multiplier on an actual map will be around 1.118
