Make test suite more random and add add_noise_to_params, move to Github Actions, etc.. #320
Conversation
Codecov Report

```
@@           Coverage Diff           @@
##           master     #320   +/-   ##
=======================================
  Coverage   81.67%   81.67%
=======================================
  Files          38       38
  Lines        2467     2467
=======================================
  Hits         2015     2015
  Misses        452      452
```

Continue to review the full report at Codecov.
```diff
@@ -35,6 +31,46 @@ def validate_options(o):
     )


 def validate_params(params, optim_paras):
```
I don't get the gain of this function. Am I missing something, or is this style guide stuff?
Previously we sorted the shocks within respy. Unfortunately, if we optimize with estimagic, estimagic assumes that the shocks are already in the correct format for the constraint before params even reaches respy. Now, we throw an error if the shock matrix is not correctly sorted.
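A rough sketch of what such a check can look like. The label scheme (`sd_*` for diagonal entries, `corr_*_*` for lower-triangular entries) matches the labels discussed in this review, but the function names and error message are illustrative, not respy's exact implementation:

```python
def expected_shock_labels(choices):
    # Build labels in lower-triangular order: "sd_<c>" on the diagonal,
    # "corr_<row>_<col>" below it.
    labels = []
    for i, c_1 in enumerate(choices):
        for c_2 in choices[: i + 1]:
            labels.append(f"sd_{c_1}" if c_1 == c_2 else f"corr_{c_1}_{c_2}")
    return labels


def validate_shock_order(index_labels, choices):
    # Raise instead of silently re-sorting, so that estimagic's assumptions
    # about the constraint format cannot be violated unnoticed.
    expected = expected_shock_labels(choices)
    if list(index_labels) != expected:
        raise ValueError(f"Shock parameters must be ordered as {expected}.")
```

The point of raising rather than re-sorting is that the ordering contract is now enforced at the boundary instead of being papered over inside respy.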
```python
corrs_flat = []
for i, c_1 in enumerate(choices):
    for c_2 in choices[: i + 1]:
```
Isn't there an easier way to get all combinations or permutations of two elements with itertools or a similar package?
Due to the different labels for diagonal elements and lower triangular elements, I would say no :). Using itertools would probably involve generating more than necessary and deleting some elements. I found this to be cleaner because labels are generated in the same order as required by the index.
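A quick comparison illustrates the point: `itertools.combinations_with_replacement` walks the upper triangle of the pair grid, not the lower-triangular, row-by-row order the index requires, and it would still need a branch to distinguish diagonal from off-diagonal labels.

```python
from itertools import combinations_with_replacement

choices = ["a", "b", "c"]

# The nested loop from the diff: lower triangle, row by row.
loop_pairs = [
    (c_1, c_2) for i, c_1 in enumerate(choices) for c_2 in choices[: i + 1]
]

# itertools yields the upper triangle instead, in a different order.
itertools_pairs = list(combinations_with_replacement(choices, 2))

print(loop_pairs)
# [('a', 'a'), ('b', 'a'), ('b', 'b'), ('c', 'a'), ('c', 'b'), ('c', 'c')]
print(itertools_pairs)
# [('a', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'b'), ('b', 'c'), ('c', 'c')]
```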
```python
    >>> np.round(np.bincount(choices), decimals=-3) / n_samples
    array([0.4, 0. , 0.6])

    """
    if isinstance(choices, int):
```
Why should we have so many cases/choices at such a low level? Wouldn't it be better to make the container of choices uniform at some point?
Although the function is low-level, it is actually a vectorized replacement for `numpy.random.choice`. It is only used in two different places, but it needs to support ints and dict keys as choices; tuples and lists are accepted as well because that does not change anything. One could remove the `np.ndarray` check and the conversion to scalars at the end and defer those steps to where the function is applied, but keeping it this way reduces code and mental effort in other places.
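A minimal sketch of such a vectorized sampler, using inverse-transform sampling on row-wise cumulative probabilities. The name, signature, and exact container handling are assumptions for illustration, not respy's actual code:

```python
import numpy as np


def vectorized_choice(choices, probs, seed=None):
    # Normalize the container: an int n means range(n), a dict contributes
    # its keys, and tuples/lists/arrays are used as-is.
    if isinstance(choices, int):
        choices = np.arange(choices)
    elif isinstance(choices, dict):
        choices = np.array(list(choices))
    else:
        choices = np.asarray(choices)

    probs = np.asarray(probs)
    rng = np.random.default_rng(seed)

    # One uniform draw per row; the chosen index is the first position
    # whose cumulative probability exceeds the draw.
    u = rng.uniform(size=(probs.shape[0], 1))
    indices = (u > probs.cumsum(axis=1)).sum(axis=1)
    return choices[indices]
```

This replaces a Python-level loop over `np.random.choice` calls with one pass of array operations, which is the payoff of keeping the container normalization inside the function.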
Current behavior
Solution / Implementation
Random tests
We use pytest-randomly, which generates a timestamp-based seed for every test session; the random number generator is reset to this seed before every test. With this seed alone we get one random model per session, but a different model for every session. Small win. For tests parametrized with `"model_or_seed"`, I rerun the test five times with the seeds `[seed + i for i in range(5)]`, where `seed` is the seed generated by pytest-randomly. Thus, we test multiple random models per session and different models across sessions, which gives us better coverage than ever before. Now it only depends on how diverse the random models are and how good the tests are.
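The rerun scheme can be sketched as a parametrized test. The fixture name `model_or_seed` comes from this PR; the helper name, the placeholder session seed, and the test body are illustrative assumptions:

```python
import pytest

# In the real suite the session seed comes from pytest-randomly
# (printed as --randomly-seed=<value>); here it is a placeholder.
SESSION_SEED = 0


def seeds_for_session(session_seed, n=5):
    # Five consecutive seeds -> five random models per test session.
    return [session_seed + i for i in range(n)]


@pytest.mark.parametrize("model_or_seed", seeds_for_session(SESSION_SEED))
def test_simulate_random_model(model_or_seed):
    # Build a random model from the seed and run the actual checks here.
    assert isinstance(model_or_seed, int)
```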
The tests are still reproducible because every session prints `--randomly-seed=<value>` at the start, which can be passed as a pytest argument to re-run the same tests.

Others
Added `add_noise_to_params()` to vary the parameters of an existing model.
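A hedged sketch of what such a helper might do, assuming a respy-style params DataFrame with a `"value"` column; the signature, the Gaussian noise model, and the defaults are assumptions, not the PR's actual implementation:

```python
import numpy as np
import pandas as pd


def add_noise_to_params(params, scale=0.05, seed=None):
    # Perturb every parameter value with additive Gaussian noise so that
    # tests start from a different, but nearby, model.
    rng = np.random.default_rng(seed)
    noisy = params.copy()
    noisy["value"] = noisy["value"] + rng.normal(0, scale, size=len(noisy))
    return noisy
```

Returning a perturbed copy instead of mutating in place keeps the original model usable as a reference within the same test.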