
Regression testing #316

Merged (19 commits) on Nov 27, 2017
Conversation

@hdoupe (Contributor) commented Sep 11, 2017

Addresses PRs #309 and #232, and webapp-public issue #539.

The initial commits just get the ball rolling on the regression testing plan. I ran the 10 reforms that @jdebacker listed in this comment. I used the Pool class from the multiprocessing module to run the reforms in parallel; on my MacBook with 4 cores, it took about 12 hours to run all of them. I used a modified version of ogusa/run_examples/run_ogusa_serial.py to do this. The outputs are 20 directories: a reform and a baseline directory for each reform.
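For context, here is a minimal sketch of the parallel setup; `run_reform` is a hypothetical stand-in for the per-reform logic in the modified script:

```python
# Minimal sketch of the parallel driver; run_reform is a hypothetical
# stand-in for the per-reform logic in the modified run_ogusa_serial.py,
# which builds the reform, sets up output dirs, and calls the OG-USA runner.
from multiprocessing import Pool

def run_reform(reform_id):
    print('running reform', reform_id)  # placeholder for the real work
    return reform_id

if __name__ == '__main__':
    reform_ids = list(range(10))     # the 10 reforms from the linked comment
    with Pool(processes=4) as pool:  # 4 cores on my MacBook
        pool.map(run_reform, reform_ids)
```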

TODO:

  • Verify that the reforms are created correctly (correct parameters, using puf.csv or cps.csv)
  • Perform sensitivity analysis to make sure that results are not changing dramatically in response to small changes in the reforms
  • Choose tolerance for differences between future results and saved results

I have some concerns with the initial commits. It took only 12 hours to run all 10 reforms, and I expected it to take much longer, so I probably did not set the parameters correctly. Further, I cannot tell whether the puf.csv file or test data was used, since I cannot find a place in OG-USA where puf.csv is explicitly passed to taxcalc.Records.

@jdebacker @rickecon @MattHJensen @PeterDSteinberg

The review thread below is attached to this kwargs snippet, shown here as a diff of the two versions visible in the review (truncated in the review view):

```diff
-kwargs = {'output_base': output_base, 'baseline_dir': BASELINE_DIR,
-          'test': True, 'time_path': True, 'baseline': False,
+kwargs = {'output_base': output_base, 'baseline_dir': baseline_dir,
+          'test': False, 'time_path': True, 'baseline': False,
           'analytical_mtrs': False, 'age_specific': True,
           'user_params': user_params, 'guid': '', 'reform': reform,
           'run_micro': False, 'small_open': False, 'budget_balance': False,
```
@jdebacker (Member) commented on the diff:

@hdoupe One setting affecting your run times is the kwarg `run_micro: False`. `run_micro` is a boolean that is true if the tax functions are to be estimated. Estimation takes about 2 hours on my MacBook Pro (though it's embarrassingly parallelizable). You will need this to be true unless you have pickle files with the saved parameters. You can look at the parameters.py script to see how those pickle files are referenced in the case that the tax function parameters for the reforms are pre-estimated. They do use the guid, so you'll probably want to set that to be specific to each reform.
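For instance, a sketch (not the exact script) of encoding the reform index in the guid; the reform dicts and surrounding kwargs are placeholders:

```python
# Sketch: give each run a distinct guid so the pickled tax-function
# parameters (looked up via parameters.py) are keyed per reform. The reform
# dicts and the kwargs shown are placeholders, not the exact script.
reforms = [{'_II_rt_7': [0.35]}, {'_CG_rt': [0.25]}]  # illustrative reforms

for i, reform in enumerate(reforms):
    kwargs = {
        'guid': 'reform_{}'.format(i),  # unique per reform
        'reform': reform,
        'run_micro': True,              # estimate tax functions (~2 hours each)
        'small_open': False,
        'budget_balance': False,
    }
    print(kwargs['guid'])
```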

@hdoupe (Contributor, Author) replied:

Thanks @jdebacker. I'll look into that.

Should small_open be set to True?

@jdebacker (Member) replied:

Let's set it to False for these runs. I think that's a more interesting case of the model.

@jdebacker (Member) replied:

And keep budget_balance=False.

@hdoupe (Contributor, Author) replied:

OK, sounds good. Thanks.

@hdoupe (Contributor, Author) commented Sep 13, 2017

@jdebacker One more question before I run the reforms again: where should puf.csv be in the OG-USA directory? I'm having trouble figuring out where that path is set.

@jdebacker (Member) replied:

@hdoupe Good question. I think it needs to be in the same directory as the run_ogusa_serial.py script.

I don't know Tax-Calculator well enough: is there any way we can run this such that it gets puf.csv from the Tax-Calculator directory, i.e., so we don't have to keep this file in more than one place?

BTW, now that you are doing things in parallel, you might rename that script to something like run_ogusa.py or ogusa_regression_parallel.py.

@hdoupe (Contributor, Author) commented Sep 13, 2017

@jdebacker OK, I'll give that a shot.

> I don't know Tax-Calculator well enough: is there any way we can run this such that it gets puf.csv from the Tax-Calculator directory, i.e., so we don't have to keep this file in more than one place?

That would work if I used my local Tax-Calculator repo, but I'm not sure we can do that with a conda install. It looks like get_micro_data.get_calculator takes a keyword argument, data. I'll try to figure out what's getting passed to that.
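Something like this sketch is what I have in mind (only the data keyword is confirmed; the import path and other argument names are guesses):

```python
# Hedged sketch: load the PUF once and pass the DataFrame through the
# `data` keyword, instead of relying on puf.csv sitting in the current
# working directory. Only `data` is confirmed here; the import path and
# the other argument names are guesses at get_calculator's signature.
import pandas as pd
from ogusa.get_micro_data import get_calculator

puf = pd.read_csv('puf.csv')  # pandas also reads 'puf.csv.gz' transparently
calc = get_calculator(baseline=True, calculator_start_year=2017, data=puf)
```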

> BTW, now that you are doing things in parallel, you might rename that script to something like run_ogusa.py or ogusa_regression_parallel.py.

OK, that makes sense. Now I see that you have run_many_ogusa_parallel.py, which is pretty similar to the script in this PR. I'll merge those.

Thanks

@hdoupe (Contributor, Author) commented Sep 19, 2017

Commit "Estimate tax functions, explicitly set puf data" incorporates the suggestions from @jdebacker. I'm running the reforms now. Hopefully, I'll have results in a day or two.

@hdoupe (Contributor, Author) commented Sep 29, 2017

@jdebacker I have finally run all of the reforms. It took some time to figure out how to set the path for the tax function parameter files, and I had to devote a lot of my time to the webapp over the last few weeks. Going forward, I'll be able to spend more time on this project, so the turnaround time on these issues should decrease.

Some of the reforms had errors, but I edited them just to get the model to run. Can you take a look at the edited reforms to make sure they are what you want?

Some of the reforms are slight variations of other reforms. I plotted the percent change for the similar reforms together and the unrelated reforms by themselves. The plots are in run_examples/regression_plots and the file that generated them is run_examples/regression_plots.py. Most of the results look sensible, in that the reforms that were similar had similar-looking plots and the differences between them made sense. Can you take a look at these results and let me know what you think?
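Roughly, the comparison behind those plots amounts to this (a simplified sketch; the dict-of-arrays layout and the variable shown are illustrative, not the exact regression_plots.py):

```python
# Simplified sketch of the percent-change comparison behind the plots;
# the dict-of-arrays layout and the 'K' key are placeholders, not the
# exact on-disk format of the saved regression output.
import numpy as np
import matplotlib.pyplot as plt

base = {'K': np.linspace(10.0, 12.0, 40)}           # placeholder baseline path
reform = {'K': np.linspace(10.0, 12.0, 40) * 1.02}  # placeholder reform path

pct = 100 * (reform['K'] - base['K']) / base['K']   # percent change by period
plt.plot(pct)
plt.xlabel('period along transition path')
plt.ylabel('% change in aggregate capital K')
plt.savefig('pct_change_K.png')
```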

@rickecon @MattHJensen

@jdebacker (Member) replied:

@hdoupe This is great, thanks. I'll take a look and get back to you about the results.

@rickecon (Member) commented Oct 3, 2017

@hdoupe @jdebacker: These 10 reforms look good. @hdoupe and I went over them last Friday, and reforms 0, 1, 2, 6, 7, 8, and 9 checked out. I just finished checking 3, 4, and 5 today, and those results check out as well.

In verifying the validity of these OG-USA runs, as well as others in the past, a common pattern of steps emerges.

  1. Start with the steady-state to make sure those results make sense. They usually do, but they also include the effects of the budget closure rules.
  2. Look at the differences in time paths of the aggregate variables to see if their results are intuitive. Often, the steady-state is intuitive while the transition path is more difficult to explain.
  3. Test whether the reform has the expected effect on the tax functions. Create a difference in ETRs, MTRx's (labor income), and MTRy's (capital income) from the baseline to the reform and make sure that the change fits the expectation (see the sketch after this list).
  4. Check the baseline and reform time paths of aggregate labor and the underlying difference in the distribution of labor supply over the time path.
  5. Check the baseline and reform time paths of the aggregate capital stock and the underlying difference in the distribution of savings over the time path. This is where difficulty often arises: there are demographic effects, and investment is the change over time in the aggregate capital stock, so, for example, the difference in the aggregate capital stock can be negative while the difference in aggregate investment is positive.
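A minimal sketch of the kind of check in step 3, with placeholder arrays standing in for the estimated tax-function values:

```python
# Sketch of step 3: difference the reform and baseline tax rates and check
# that the sign matches the reform's intent (e.g., a rate cut should lower
# ETRs). The arrays are placeholders for the estimated tax-function values.
import numpy as np

etr_base = np.array([0.10, 0.15, 0.22])    # placeholder baseline ETRs
etr_reform = np.array([0.09, 0.13, 0.20])  # placeholder reform ETRs

diff = etr_reform - etr_base
assert (diff <= 0).all(), 'expected the reform to lower ETRs everywhere'
```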

hdoupe mentioned this pull request on Oct 27, 2017
@hdoupe (Contributor, Author) commented Nov 1, 2017

I think this PR is close to being ready to go, but there are a few things that need to be done first.

First, I made a few changes that really should be split out into separate PRs.

  1. I added a data keyword argument to execute.runner, txfunc.get_tax_func_estimate, txfunc.tax_func_estimate, and get_micro_data.get_data. This allowed me to load a DataFrame and pass it to get_micro_data.get_calculator. This isn't strictly necessary, since Tax-Calculator looks for puf.csv in the current working directory, but I think it's a useful capability because it lets you specify puf.csv.gz or potentially use the CPS file. It should not be an invasive change, since we just need to add a data keyword to the functions listed above (see the sketch after this list).

  2. I ran into some problems when specifying the directory where the results were sent. For example, you can only specify the guid in parameters.get_parameters, which is where the tax function variables are read. I wanted to put all of the output files in one directory with a name of my choosing, such as REG_REFORM_2. I made some changes in txfunc.get_tax_func_estimate and parameters.get_parameters that allow this. I think these changes make it more explicit where the output is stored, which makes the model easier to use.
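A hedged sketch of how (1) and (2) fit together: only the data keyword is new in this PR, the other kwargs echo the snippet earlier in the thread, and the import path, directory names, and reform dict are illustrative:

```python
# Hedged sketch combining (1) and (2): pass a DataFrame through the new
# `data` keyword and point all output at one named directory. The import
# path, paths, and reform dict are assumptions for illustration.
import pandas as pd
from ogusa.scripts.execute import runner  # exact import path is an assumption

puf = pd.read_csv('puf.csv.gz')  # or the CPS file, now that data is explicit
runner(output_base='REG_REFORM_2',           # all output in one chosen place
       baseline_dir='REG_REFORM_2/baseline',
       test=False, time_path=True, baseline=False,
       analytical_mtrs=False, age_specific=True,
       user_params={}, guid='reg_reform_2',  # guid keys the tax-function pickles
       reform={'_II_rt_7': [0.35]},          # placeholder reform dict
       run_micro=True, small_open=False, budget_balance=False,
       data=puf)                             # the new keyword from this PR
```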

Second, I'm not sure where the tolerance should be set for each parameter. By default, I set the relative tolerance to 0.001. I think it will take a few modifications to the code and subsequent tests to get the tolerances dialed in, but if anyone has recommendations for where they should be set now, I'm willing to change them.

@jdebacker @rickecon @PeterDSteinberg what are your thoughts on this?

@jdebacker (Member) replied:

@hdoupe I like both your additions (1) and (2). Nice work!

Re the tolerance: by relative, do you mean that 0.001 looks for a 0.1% difference and, if larger, flags it as an issue? If so, I think that's a reasonable starting point. As you said, we may need to adjust this.

@hdoupe (Contributor, Author) commented Nov 2, 2017

@jdebacker Thanks, I'll try to get those up today.

By relative tolerance I mean 0.1%; I just used the rtol argument of numpy.allclose.
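Concretely, the check amounts to something like this (the arrays are placeholders for regression output):

```python
# What the check amounts to: flag any saved/new pair whose relative
# difference exceeds 0.1%. The arrays are placeholders for regression output.
import numpy as np

saved = np.array([1.000, 2.000, 3.000])
new = np.array([1.0005, 2.001, 3.004])      # last entry differs by ~0.13%
print(np.allclose(new, saved, rtol=0.001))  # False -> flagged for review
```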

@hdoupe (Contributor, Author) commented Nov 14, 2017

@jdebacker @rickecon do you think the regression data should be compressed? Compressed, all of the regression data is 93 MB.
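For the compression itself, something as simple as gzipping each saved file would do; in this sketch the directory name and file pattern are placeholders:

```python
# Sketch: gzip every saved regression pickle; 'regression_data' and the
# .pkl pattern are placeholders for the actual layout of the saved results.
import gzip
import shutil
from pathlib import Path

for path in Path('regression_data').rglob('*.pkl'):
    with open(path, 'rb') as src, gzip.open(str(path) + '.gz', 'wb') as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()  # remove the uncompressed original
```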

Also, what are the next steps for this PR? I think the tolerances will need some adjusting, but it's hard to do that without having results to compare to the regression results.

@jdebacker (Member) replied:

@hdoupe Sure, let's compress. I don't see the downside.

Shall we go ahead and merge this? We can make any adjustments to the tolerance in another PR.

@hdoupe (Contributor, Author) commented Nov 27, 2017

@jdebacker OK, sounds good. The most recent commit compresses the files. This should be good to go.

@jdebacker (Member) replied:

Thanks @hdoupe! Merging.

jdebacker merged commit c3d8bfe into PSLmodels:master on Nov 27, 2017
keiirizawa pushed a commit to keiirizawa/OG-USA that referenced this pull request Mar 20, 2019