[CR] Catch weariness fluctuations #47273
Give more detailed information on weariness (tracker and intake), to try to figure out why it keeps going up and down during tests. Since the "healthy" stored calories are at bmi 25, put calories to healthy minus debug_nutrition, to prevent going over. (cherry picked from commit b897540)
The caloric subtraction (minus calories for debug_nutrition) is causing errors in other tests, and it is also desirable to make sure it isn't doing anything to the weariness tests themselves (weary intake). With the new information (weary tracker and intake), the summarize transition output is linewrapping; trying to prevent. (cherry picked from commit e967767)
Start on adding tests for unrealistic fluctuations in weary level; see CleverRaven#46384 (and some cases in CleverRaven#46941) for example problems. The initial tests look for problems with the weary_recovery task of digging for 8 hours then waiting for 8 hours; weary level should not go down in the first 8 hours, and should not go up in the second 8 hours.
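The check described above (no decrease while digging, no increase while waiting) can be sketched as a small standalone helper. This is only an illustration of the idea, not the code added in this PR: `allowed_trend` and `first_fluctuation` are hypothetical names, and the weary levels are assumed to have already been sampled into a plain vector of ints. It needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper, not from the PR: the direction in which the weary
// level is allowed to move during one phase of the task.
enum class allowed_trend { non_decreasing, non_increasing };

// Returns the index of the first sample that moves against the allowed
// trend, or std::nullopt if the whole series is consistent with it.
std::optional<std::size_t> first_fluctuation( const std::vector<int> &levels,
        allowed_trend trend )
{
    for( std::size_t i = 1; i < levels.size(); ++i ) {
        const bool went_down = levels[i] < levels[i - 1];
        const bool went_up = levels[i] > levels[i - 1];
        if( ( trend == allowed_trend::non_decreasing && went_down ) ||
            ( trend == allowed_trend::non_increasing && went_up ) ) {
            return i;
        }
    }
    return std::nullopt;
}
```

A test would then sample the level during the digging phase and require `first_fluctuation( samples, allowed_trend::non_decreasing )` to be empty, and do the same with `non_increasing` for the waiting phase. Returning the index rather than a bool makes it easier to report when the fluctuation happened.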
The failure for the General Build Matrix (GCC 7, Ubuntu, Tiles, CMake) is indeed from one of the added tests:
Analysis (of both this and the local one in the PR):
Appveyor result is consistent:
A local run with weary_recovery after both weary_24h_tasks and weary_assorted_tasks gives an almost identical result (with regard to weary_recovery) to the Appveyor run:
This indicates that there is an interaction going on that influences whether the 1->0 fluctuation occurs, which may be in either weary_24h_tasks or weary_assorted_tasks (will try to check tonight but may be tomorrow or later). weary_24h_tasks, which ran first, also had errors (and 1->0 fluctuations):
Same result as with Appveyor on Travis (Test/first build only):
Local, weary_assorted_tasks then weary_recovery:
So as per the Appveyor results (2->3 fluctuation only, involving both Weary_24h_tests then weary_recovery:
So:
Local weary_recovery followed by weary_24h_tests:
@anothersimulacrum (or anyone else doing tagging): C++, Code: Tests, Mechanics: Character/Player?
In some conditions, namely continuous exercise at the same level, a decrease in weariness level is unrealistic. Check for this.
Heavy tasks, while a logical section, do have the problem of repeating the earlier task's information, making it harder to tell which task triggered the message. Also make debug_weary_info() more informative using additional clear_avatar().
Forgot to copy over identifier declaration.
Weary levels keep fluctuating unrealistically, probably because weary.intake (and weary.tracker?) are changing in large jumps at times.
This adjusts expected test values to the (quite consistent) ones for the altered weary intake/tracker. It also does the weary.tracker adjustment in a less-perfectionistic (but more-functional) way.
I'm going to temporarily allow failures in two tests failing on 8/12-hour digging, to check on other-OS/compiler/etc w/regard to the other testing. (I'm not going to simply comment them out, so as to get more of a look at what's going on.) For at least the General Build Matrix, this should also speed things up a bit in the future via caching.
Two of the tests are failing consistently due to fluctuations. Ultimately, these fluctuations need eliminating, but it would be nice to get some information on whether anything else is going on.
This is an extension of a previous modification to try to smooth out the weary.intake reduction (smaller changes but more frequent); it both increases that and does similar for weary.tracker. The weary.intake changes are leading up to - probably after 0.F - an exponential moving average being used instead, so that characters are not, essentially, hypoglycemic.
Doing some further smoothing, this time of both weary.intake and weary.tracker. The weary.intake parts will hopefully later be superseded by making it into an exponential moving average (unless someone wants a hypoglycemic character, perhaps?). Again allowing some failures in order to see rest of test results.
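For reference, an exponential moving average of the kind mentioned for weary.intake could look roughly like the following. This is a generic sketch, not the planned implementation: the class name `intake_average`, the use of `double`, and the smoothing factor `alpha` are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical smoothing helper, not from the PR. alpha must be in (0, 1];
// a larger alpha makes the average react faster to new samples.
class intake_average
{
    public:
        explicit intake_average( double alpha ) : alpha_( alpha ) {
            assert( alpha > 0.0 && alpha <= 1.0 );
        }

        // Fold one new intake sample into the running average and return it.
        double update( double sample ) {
            if( !seeded_ ) {
                // The first sample seeds the average directly.
                value_ = sample;
                seeded_ = true;
            } else {
                value_ += alpha_ * ( sample - value_ );
            }
            return value_;
        }

        double value() const {
            return value_;
        }

    private:
        double alpha_;
        double value_ = 0.0;
        bool seeded_ = false;
};
```

With `alpha` at 0.5, a sample of 100 followed by two samples of 0 gives 100, then 50, then 25: the average decays gradually instead of dropping in one large jump, which is the behavior the smoothing is after.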
Huh. Seeing EDIT: Done; see #47590.
This monitors, with weary level transitions, the low_activity_ticks and tick_counter, to see if this can help figure out why weary.tracker is increasing while resting. (cherry picked from commit 4c4b9cd)
As far as I can tell, the cause of the weary.tracker going up during resting periods is that rest does still expend calories (bmr), and it happens every 5 minutes - while weary.tracker was only reduced every 30 (current) or 15 (this branch) minutes. This commit makes weary.tracker reduction occur every 5 minutes - every time try_reduce_weariness() is called.
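The timing mismatch described above can be shown with a toy model. Everything here is an assumption made for illustration: the function name, the integer tracker, and the per-tick amounts are hypothetical, and the real try_reduce_weariness() logic is not reproduced. One tick stands for the 5-minute interval at which resting calories (bmr) are added.

```cpp
#include <algorithm>
#include <cassert>

// Toy model, not game code. Counts the 5-minute ticks of a rest period at
// which the tracker ends higher than it was at the end of the previous tick.
int count_rises_during_rest( int start_tracker, int bmr_per_tick,
                             int reduction_per_tick, int reduce_every_ticks,
                             int total_ticks )
{
    assert( reduce_every_ticks > 0 );
    int tracker = start_tracker;
    int rises = 0;
    for( int tick = 1; tick <= total_ticks; ++tick ) {
        const int before = tracker;
        // Resting still expends calories on every tick.
        tracker += bmr_per_tick;
        if( tick % reduce_every_ticks == 0 ) {
            // Apply the reduction for the whole interval in one lump,
            // never letting the tracker go below zero.
            tracker = std::max( 0, tracker - reduction_per_tick * reduce_every_ticks );
        }
        if( tracker > before ) {
            ++rises;
        }
    }
    return rises;
}
```

In this model, reducing every 6 ticks (30 minutes) leaves the tracker rising on 5 of every 6 ticks even though the net trend is downward, while reducing on every tick removes the rises whenever the per-tick reduction is at least the per-tick bmr.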
After I get some sleep and do other things, I will put the timing numbers from the weary tests together, likely with some local runs, and change the test minutes to fit the more gradual changes. The other change I plan to make next is to have the
Some of the test times were being altered by the every-30-minute (awake) weary.tracker reductions. Alter to match new ones, also taking into account local testing (including in scrambled order). While with some scrambled tests am seeing inconsistencies between 8-hour and 12-hour digging, 8-hour without fluctuations indicated 3->4 should not be 470 minutes, but no more than 465 - which it already was for 12-hour, weirdly enough (oops by me earlier?).
The local tests mentioned in the commit message:
I suppose this PR is abandoned, since last commit was a year ago? I'm not sure if it should stay in PR tracker, especially with this many unresolved conflicts.
Closing as abandoned.
Summary
Infrastructure "Catch weariness fluctuations during testing"
Purpose of change
As seen in #46384 and elsewhere (e.g., some test failures in #46941), there are some unintuitive changes in weariness level:
Further information on this, and any differences between platforms, is desirable, particularly since most of the information is from before #46906.
Describe the solution
This PR adds one set of tests (more will likely be added) to catch these fluctuations. It also uses some from #46473 to extract more detailed information. (I have not submitted the latter as a separate, non-draft PR partially because they will hopefully be superseded by an added function in #45316.)
Expansion of the available tests to include ones for, simply, "did weariness level go down ever" and similar is planned for the 8, 12, and 24-hour digging tasks.
For the future, figuring out exactly what it is reasonable (intuitive) to expect from the 1 day vehicle work weary_recovery task is a question of interest.
Additional context
Example test result from running cata_test --rng-seed 'time' weary_recovery locally:
BTW, the duplicate messages re "holy SPAM of debugging" and "you start walking" are from clear_avatar being called both right before and right after invocation of debug_weary_info - see #46941.