What data work needs to be done for next (0.20.2) release of Tax-Calculator? #235

Closed
martinholmer opened this issue Jun 26, 2018 · 3 comments

martinholmer commented Jun 26, 2018

I'm raising this issue in order to facilitate a discussion about what should be on our short-term to-do list.

@andersonfrailey and I agree that we will stage recent and pending data enhancements, so that 0.20.2 incorporates all the enhancements except the switch to the new CBO projections, which is pending in PR #180. The new CBO projections will then be incorporated in Tax-Calculator release 0.20.3. This two-stage approach will show clearly how much results change because of the data fixes and how much they change because of the projection update.

So, the question for discussion here is what enhancements need to be done for 0.20.2?

Here are some candidate enhancements for discussion:

  1. Finish Makefile development and documentation.
    Progress on this point in #237 (Eliminate redundant puf_stage1/Stage_I_factors_transpose.csv file), #238 (Update taxdata/README.md info and csvcopy.sh script), and #240 (Add re-zip logic to taxdata Makefile).

  2. Reduce precision of stage2 targets (in the CSV files) so that the LP calculations are not triggered by tiny changes in the targets (as discussed in PR #233, "Makefile revisions: add puf-files target and add cps-files target").
    Completed in #236 (Standardize float precision in various CSV files).

  3. Merge PR #185 (Update Medicare and Medicaid values in cps.csv.gz file), which converts health insurance amounts to actuarial values of the insurance.

  4. Revise tests/test_benefit.py in light of actuarial values of health insurance amounts.

  5. Figure out why there is one PUF record that has zero weight in every year (see code in tests/test_weights.py) and (presumably?) eliminate this filing unit from the PUF files.
    Investigated in #239 (PUF record with zero weight), but the filing unit is not being eliminated from the PUF files.

  6. Revise cps_stage4/extrapolation.py so that it runs to completion and produces sensible output.
    How best to resolve this is being discussed in #232 (Benefits extrapolation script does not work).
    Completed in PR #242 (Revise cps_stage4/extrapolation.py script).

  7. Complete the work in tests/test_extrapolation.py so that we don't have to skip these tests.

Are there other items that should be added to the list?
Are there items that you think should not be on the list?
Which items do you volunteer to be responsible for?

On this last question, I'm happy to continue working on the Makefile, reduce the target precision, and revise the benefit test after PR #185 is merged. However, I can't make much more progress on the Makefile until the cps_stage4/extrapolation.py script is working.

@andersonfrailey @hdoupe @MattHJensen

@andersonfrailey

@martinholmer, I think you've listed everything that should be done before the taxcalc 0.20.2 release.

The IRS recently released 2015 SOI data (we currently use 2014 data in our targeting), but that update can wait until we also implement the CBO updates.

I can look into the PUF record with zero weight.
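
For reference, a check for such a record could look like the minimal sketch below. It is not the code in tests/test_weights.py; the `weights` table, its `WT2011`-style column names, and its values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical weights table: one row per filing unit, one column per year.
# Column names and values are illustrative assumptions, not real PUF weights.
weights = pd.DataFrame(
    {
        "WT2011": [1500, 0, 2300],
        "WT2012": [1520, 0, 0],
        "WT2013": [1540, 0, 2350],
    }
)

# True only for rows whose weight is zero in every year column.
zero_every_year = (weights == 0).all(axis=1)

# Row labels of the filing units that never receive any weight.
zero_rows = weights.index[zero_every_year].tolist()
print(zero_rows)
```

A row that is zero in only some years (the third row here) is not flagged, which matches the "zero weight in every year" condition described in item 5.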

@hdoupe can you take a look at the benefit extrapolation scripts?

@martinholmer with regard to point 2, since PR #233 has already been merged, hasn't that already been staged to be included in the release?


martinholmer commented Jun 28, 2018

@andersonfrailey asked:

with regard to point 2 ["Reduce precision of stage2 targets (in the CSV files) so that the LP calculations are not triggered by tiny changes in the targets (as discussed in PR #233)"], since PR #233 has already been merged, hasn't that already been staged to be included in the release?

PR #233 did not change the precision of the target amounts. Point 2 would add something like this format statement to the to_csv calls that write the two target CSV files:

xxx.to_csv("<target_file_name>", float_format="%.0f")

Does it make sense to round the target amounts to whole numbers (without any fractions)?
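
To make the effect concrete, here is a minimal sketch of how `float_format="%.0f"` changes what pandas writes. The target names, amounts, and the `to_csv_text` helper are hypothetical and are not taken from the stage2 scripts.

```python
import io
import pandas as pd

# Hypothetical stage2 targets; names and amounts are made up for illustration.
targets = pd.DataFrame(
    {"target": ["wages", "interest"], "2015": [1234567.49, 98765.51]}
)

# A second run whose amounts differ only by tiny floating-point noise.
noisy = targets.copy()
noisy["2015"] = noisy["2015"] + 1e-7


def to_csv_text(df, float_format=None):
    """Return the CSV text that DataFrame.to_csv would write (hypothetical helper)."""
    buf = io.StringIO()
    df.to_csv(buf, index=False, float_format=float_format)
    return buf.getvalue()


# At full precision the two runs produce different CSV text ...
full_same = to_csv_text(targets) == to_csv_text(noisy)
# ... but with whole-number formatting the written text is identical.
rounded_same = to_csv_text(targets, "%.0f") == to_csv_text(noisy, "%.0f")
print(full_same, rounded_same)
```

If the written targets file is identical from run to run, a Makefile rule that depends on its contents has no reason to rerun the LP calculations. Whether whole-number rounding loses too much information depends on the magnitude of the targets.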

@hdoupe @MattHJensen

@martinholmer

With the merge of #258, the data preparation for the Tax-Calculator 0.20.2 release has been completed.
