-
-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update data #2599
Update data #2599
Conversation
@andersonfrailey Thanks for this PR! I haven't run into these errors before, but they both look related to the data, so I'd start looking through the objects in these functions, to identify where things take on unexpected shapes/values. |
Figured out the test failures. The benefits test failed because benefits growth factors were missing from The policy test failures were a rounding issue. Here's the test currently: np.testing.assert_equal(
(pol2.EITC_c[1] / pol2.EITC_c[0] - 1).round(4),
pol0.inflation_rates(year=2022) + (-0.001),
) And the actual values for each element in that comparison: In [7]: pol0.inflation_rates(year=2022) + (-0.001)
Out[7]: 0.019799999999999998
In [9]: (pol2.EITC_c[1] / pol2.EITC_c[0] - 1).round(4)
Out[9]: array([0.0198, 0.0198, 0.0198, 0.0198]) To fix the bug, I just need to round np.testing.assert_equal(
(pol2.EITC_c[1] / pol2.EITC_c[0] - 1).round(4),
(pol0.inflation_rates(year=2022) + (-0.001)).round(4),
) |
Codecov Report
@@ Coverage Diff @@
## master #2599 +/- ##
=======================================
Coverage 98.46% 98.46%
=======================================
Files 14 14
Lines 2611 2611
=======================================
Hits 2571 2571
Misses 40 40
Flags with carried forward coverage won't be shown. Click here to find out more.
|
@andersonfrailey I wanted to check in on the status of this PR -- we are waiting on this to be updated to TaxData 0.3.1, correct? |
@jdebacker, the last commit updated it. It's ready to go now |
@andersonfrailey What is the reason for making CPS-specific variables such as line and sequence numbers available for |
@jdebacker, adding those gives users the option to find the individual CPS records that have been matched to the PUF in the raw CPS file. To date, most users have wanted the identifiers to link tax units from taxdata_cps to the raw CPS, but with the TaxData refactor adding identifiers to the PUF was easy so I figured it'd be better to just have those available in the PUF as well. |
@andersonfrailey Thanks for the previous reply. I now understand and I think that makes sense to add those variables to the PUF. To be sure -- is the TaxData documentation clear in noting that these are statistically matched records and not direct matches of the same individual in the two datasets? For the other variables now available in the PUF, such as |
Yes, we have documentation that explains that these are two different datasets created in two different ways.
I don't offhand, but I'll get those together and post them. |
Latest commit removes benefit variables from the PUF. |
@andersonfrailey Thanks for the updates to this PR. It looks good to me and I'll plan to merge tomorrow after a final review unless there are suggestions otherwise. |
Thank you for the contribution, @andersonfrailey. Merging. |
This PR updates all of the data in Tax-Calc to be up to date with TaxData release 0.3.0. It also extends the projections out to 2031. I had to make a few changes to the tests to get (almost) all of them passing:
AINTS
) fore00300
is now > 1 in 2015 (a result of updating the SOI estimates) so the relative value test had to be flippedThere are still a couple errors failing that I don't know how to fix. The first is in
test_benefits.py
:And the second is in
testpolicy.py
:I think this last one is just an issue with array sizes.
Let me know if there are any questions or tips on fixing the failing tests.
cc @MattHJensen @jdebacker