-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce weariness instability (includes tests); clarified #47653
Reduce weariness instability (includes tests); clarified #47653
Conversation
Start on adding tests for unrealistic fluctuations in weary level; see CleverRaven#46384 (and some cases in CleverRaven#46941) for example problems. The initial tests look for problems with the weary_recovery task of digging for 8 hours then waiting for 8 hours; weary level should not go down in the first 8 hours, and should not go up in the second 8 hours. (cherry picked from commit bdd942b)
In some conditions, namely continuous exercise at the same level, a decrease in weariness level is unrealistic. Check for this. (cherry picked from commit 49fc687)
Weary levels keep fluctuating unrealistically, probably because weary.intake (and weary.tracker?) are changing in large jumps at times. (cherry picked from commit db03ee5)
This adjusts expected test values to the (quite consistent) ones for the altered weary intake/tracker. It also does the weary.tracker adjustment in a less-perfectionistic (but more-functional) way. (cherry picked from commit 028b0f0)
This is an extension of a previous modification to try to smooth out the weary.intake reduction (smaller changes but more frequent), and both increases that and does similar for weary.tracker. The weary.intake changes are leading up to - probably after 0.F - an exponential moving average being used instead, so that characters are not, essentially hypoglycemic. (cherry picked from commit ab4ba67)
As far as I can tell, the cause of the weary.tracker going up during resting periods is that rest does still expend calories (bmr), and it happens every 5 minutes - while weary.tracker was only reduced every 30 (current) or 15 (this branch) minutes. This commit makes weary.tracker reduction occur every 5 minutes - every time try_reduce_weariness() is called. (cherry picked from commit 5d9dcb2)
Some of the test times were being altered by the every-30-minute (awake) weary.tracker reductions. Alter to match new ones, also taking into account local testing (including in scrambled order). While with some scrambled tests am seeing inconsistencies between 8-hour and 12-hour digging, 8-hour without fluctuations indicated 3->4 should not be 470 minutes, but no more than 465 - which it already was for 12-hour, weirdly enough (oops by me earlier?). (cherry picked from commit c68507b)
It is going to require more work - most likely adjusting weary.intake to an exponential moving average - to get rid of these two failed tests. (Note that these were added tests in the first place; the failure - weariness fluctuation - in question was happening before but simply wasn't detected.)
This clears the avatar prior to the (potential) debug output so what is actually the case inside do_activity can be seen.
@BrettDong: Thanks! Mechanics: Character/Player is another appropriate label. |
The Travis error was due to a fields problem (see #47218 for one way this might be decreased). |
Merge branch 'master' of https://github.com/CleverRaven/Cataclysm-DDA into weariness_fluctuations_1
@anothersimulacrum: Is it a feature, or a bugfix? I can admittedly see arguments for it being both... |
You're making mechanical changes, so that was my reasoning. |
If you'd like to have more time to review over it first, feel free to do the Feature Freeze; as I said, I can see arguments for both - and I'd feel horrible if something somehow did go wrong with it for a stable release (OTOH, players are unlikely to be terribly happy with their weariness levels jumping around...). |
This (near-duplicate of have_weary_decrease()) is not in use, so removing to be in accord with CDDA guidelines.
Anticipating a request to remove unused code. |
The LGTM failure was due to "There is not enough space on the disk"... sigh. (See #44598 for an earlier episode of this.) |
…aven/Cataclysm-DDA into weariness_fluctuations_1
…thub.com/CleverRaven/Cataclysm-DDA into weariness_fluctuations_1
The expected values for the 8-hour and 12-hour digging tasks, for the portion before the 8-hour mark, were different. (Not sure who did that - quite possibly me!) Fixing to approximate more-common values from these two plus the 8-hour digging/8-hour rest tests, looking at ones with only the weary tests run.
@anothersimulacrum: I have just noticed that the expected timing numbers differ for the first 8 hours of the 8-hour and 12-hour digging tasks. I regret not having spotted this at the time of the metabolism de-randomization PR. I've just corrected the numbers for this PR's alterations (those I have lots of data for, thanks to #47791 - in particular, data for the weary tests done without any others). EDIT: Now doing for 12 and 24-hour, and double-checking some of the 24-hour using new data. (The failure for the last one was with a sunburn test, which does not appear to be related.) |
I realized that I should also align the 12 and 24-hour digging expected values. (cherry picked from commit e445990)
The Travis failure was the grenade lethality test... 🙄 |
https://github.com/CleverRaven/Cataclysm-DDA into weariness_fluctuations_1
GCC 7 General Build Matrix failure - fields test... sigh 🙄 . |
@I-am-Erk: Could you take a look at this one? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it, especially that it's decreasing the lag between starting to rest and actually recovering.
Once my reviews are resolved, I think this is good.
Co-authored-by: anothersimulacrum <anothersimulacrum@gmail.com>
Co-authored-by: anothersimulacrum <anothersimulacrum@gmail.com>
…' of https://github.com/CleverRaven/Cataclysm-DDA into weariness_fluctuations_1
Travis failure is due to a fields test:
I've seen this one previously (enabled by wapcaplet's very helpful PR) - |
Since #45316 has been committed, I'll need to do some conflict-fixing - probably tomorrow. EDIT: Done. |
…leverRaven/Cataclysm-DDA into activity_per_turn_weariness_fluctuations
I forgot to include <cmath> for std::ceil, std::floor, and std::pow when I moved things over to activity_tracker.cpp (character.cpp had it already... as well as lots of others, of course!).
Unlike some of the other tests, for which one can figure out things like consistency from test to test (8-hour vs 12-hour vs 24-hour) or simply the weariness level not going down when it should be going up, this one is currently close to impossible to figure out what SHOULD result from it. It is more a check on whether the numbers are a lot different than they should be. (There would be some justification in changing the margins for this one to 10.) Thus, small adjustments in it are sometimes needed; this is an example, with the value used being from tests of multiple examples of it (see CleverRaven#47653).
A rewritten version of the above commit description:
|
Summary
Bugfixes "Alter weary.tracker, weary.intake changes to be smoother; add tests for illogical fluctuations in weary level"
Purpose of change
This PR is to help avoid situations with unexpected fluctuations of weary levels, such as - during the same activity - the weary level reversing direction. (For instance, going from 2 to 3 during rest.)
These changes were mostly worked out in #47273. They reduce the current jumps in weary.tracker (and partially in weary.intake); these jumps, as demonstrated in test results (see #47273 and #47590), are partially responsible for weary level fluctuations. (Note that the test results for #47590 were without any modifications to
try_reduce_weariness()
.)It also adds tests for these fluctuations, although two are commented out since this PR does not completely fix the problem. (That will require more significant alterations in weary.intake - probably going to an exponential moving average (see #46906 for a bit on this) - which will take only a bit more work but require significantly more testing.) (See also #46384 and #48009; this PR is not a full fix for these, but should reduce the frequency of problems.)
Describe the solution
Decrement weary.tracker on a more frequent basis but - to compensate - reduce it by a smaller amount. (This will be every 5 minutes, which is how often
try_reduce_weariness()
is called.)Also do this - decrementing more frequently but by a smaller amount - to weary.intake. Unlike with weary.tracker, this has not been done as much as it theoretically could be. This process involves a multiplication that is likely to result in more significant roundoff errors if done in much smaller amounts than the current PR. (By "more significant" is meant frequently having no reduction in weary.intake. The purpose behind the exact multiplier in use is also rather unclear.)There is some possibility of such roundoff errors happening with this PR's adjustments, but rather less so than if one tries multiplying by the 1/12th power of something at or above 0.95. (Note in this regard #47791's most recent results - specifically, the consistency across architectures and compilers, provided the testing seed is the same. Given that #47791 incorporates this PR's changes, this suggests that roundoff is not a significant problem with this PR.)
Describe alternatives you've considered
As noted above, this PR could go all the way to changing weary.intake completely. It could also try reducing weary.intake jumps further, at the risk of significant roundoff error.
TheDone.have_weary_increase()
test function is not currently used; it can be deleted if desired.One could also stabilize the weary level by implementing hysteresis - as in, increase weary level at, say, 125% of the current threshold, but only decrease it at, say, 75%. This would be rather complicated due to the frequent threshold changes.
Testing
Test functions have been added to activity_scheduling_helper.cpp/h, and used in weary_test.cpp, to check for illogical fluctuations. (As noted above, these have been commented out in two instances since this PR is not a complete fix.) For other testing, see #47273.
Testing just the weary tests in a different order (with the 24-hour tests first, for instance) results in errors from both the new tests and existing tests. However, the same thing - with just the existing tests, of course - happens with the current master.
Additional context
Ping: @anothersimulacrum, @I-am-Erk?