-
-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regression testing #316
Regression testing #316
Conversation
run_examples/run_ogusa_serial.py
Outdated
kwargs={'output_base':output_base, 'baseline_dir':BASELINE_DIR, | ||
'test':True, 'time_path':True, 'baseline':False, | ||
kwargs={'output_base':output_base, 'baseline_dir':baseline_dir, | ||
'test':False, 'time_path':True, 'baseline':False, | ||
'analytical_mtrs':False, 'age_specific':True, | ||
'user_params':user_params,'guid':'', 'reform':reform , | ||
'run_micro':False, 'small_open': False, 'budget_balance':False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hdoupe One setting affecting your run times is the kwarg run_micro:False
. run_micro
is a boolean that is true if the tax functions are to be estimated. This takes about 2 hours on my MacBook Pro (thought it's embarrassingly parallelizable). You will need this to be true unless you have pickle files with the saved parameters. You can look at the parameters.py
script to see how those pickle files are reference in the cast the tax function parameters for the reforms are pre-estimated. They do use the guid
, so you'll probably want to set that to be specific to each reform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jdebacker. I'll look into that.
Should small_open
be set to True
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's set it to false for these runs. I think that's a more interesting case of the model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And keep budget_balance=False
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sounds good. Thanks
@jdebacker One more question before I run the reforms again. Where should |
@hdoupe Good question. I think it needs to be in the directory that you have the I don't know Tax-Calc we'll enough - is there anyway we can run this such that it gets puf.csv from the Tax-Calc directory? i.e., so we don't have to have this file in more than one place. BTW, now the that you are doing things in parallel, you might rename that script to something else like |
@jdebacker Ok I'll give that at shot.
That would work if I used my local TaxCalc repo but I'm not sure if we can do that with conda install. It looks like
Ok that makes sense. Now I see that you have Thanks |
Commit |
@jdebacker I have finally run all of the reforms. It took some time to figure out how to set the path for the tax function parameter files. I had to devote a lot of my time to the webapp the last few weeks. Going forward, I'll get to spend more time on this project. Hopefully, the turn around time on these issues should decrease. Some of the reforms had some errors but I edited them just to get the model to run. Can you take a look at the edited reforms to make sure that's what you want? Some of the reforms are slight variations of other reforms. I plotted the percent change for the similar reforms together and the unrelated reforms by themselves. The plots are in |
@hdoupe This is great - thanks. I'll take a look and get back you to about the results. |
@hdoupe @jdebacker . These 10 reforms look good. I @hdoupe and I went over them last Friday, and reforms 0, 1, 2, 6, 7, 8, and 9 checked out. I just finished checking 3, 4, and 5 today, and the results check out. In verifying the validity of these
|
…gression testing to regression
I think this PR is close to being ready to go, but there are a few things that need to be done first. First, I made a few changes that really need to be added in separate PRs.
Second, I'm not sure where the tolerance should be set for each parameter. By default, I set the relative tolerance to 0.001. I think it will take a few modifications to the code and subsequent tests to get the tolerances dialed in. However, if anyone has any recommendations for where they should be set now, then I'm willing to change them. @jdebacker @rickecon @PeterDSteinberg what are your thoughts on this? |
@hdoupe I like both your additions (1) and (2). Nice work! Re the tolerance. By relative, do you mean that 0.001 is looks for a 0.1% difference, and if larger, flags this for an issue? If so, I think that's a reasonable starting point. As you said, we may need to adjust this. |
@jdebacker Thanks, I'll try to get those up today. By relative tolerance I mean 0.1%. I just used |
@jdebacker @rickecon do you think the regression data should be compressed? Compressed, all of the regression data is 93 MB. Also, what are the next steps for this PR? I think the tolerances will need some adjusting, but it's hard to do that without having results to compare to the regression results. |
@hdoupe Sure, let's compress. I don't see the downside. Shall we go ahead and merge this? We can make any adjustments to the tolerance in another PR. |
@jdebacker Ok, sounds good. The most recent commit compresses the files. This should be good to go. |
thanks @hdoupe ! Merging. |
Regression testing
Addresses PRs #309, #232, and webapp-public issue #539.
The initial commits just get the ball rolling on the regression testing plan. I ran the 10 reforms that @jdebacker listed in this comment. I used the
Pool
class from themultiprocessing
module and was able to run the reforms in parallel. Using my macbook with 4 cores, it took about 12 hours to run all of these reforms. I used a modified version ofogusa/run_examples/run_ogusa_serial.py
in order to do this. The outputs are 20 directories, a reform and baseline directory for each reform.TODO:
puf.csv
orcps.csv
)I have some concerns with the initial commits. It only took 12 hours to run all 10 reforms and I expected it to take much longer. So, I probably did not set the parameters correctly. Further, I cannot tell whether the
puf.csv
file or test data was used since I cannot find a place in OG-USA wherepuf.csv
is explicitly passed totaxcalc.Records
.@jdebacker @rickecon @MattHJensen @PeterDSteinberg