Add puf_stage3 #58
…tors to be used in TaxCalc
…lating the next year's
# Conflicts: # Stage I/Stage_I_factors.csv # Stage II/Stage_II_targets.csv # Stage II/Stage_I_factors.csv
Why are the values in the |
@martinholmer the precision is there so that the aggregate total for interest income doesn't change with the adjustments. I tried rounding off the factors to varying degrees, but in each case total interest income changed.
@andersonfrailey said:
You need to be more forthcoming: how much rounding caused exactly how much change in aggregate interest income? Show us the quantitative results of the rounding experiments you carried out.
@martinholmer here are the levels of precision and, for each, the difference between interest income with and without the adjustment for the years 2014-2026, as well as the average. If the goal is to stay under 50MB, up to 10 decimal places will work, and the difference will be pretty small. Keep in mind these are factors for just one variable. If more are added over time, the file size will continue to increase unless we find a method of comparing TaxCalc distributions to SOI distributions that doesn't rely on AGI, so that assigning each record a factor before running TaxCalc becomes unnecessary.
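The trade-off between factor precision and drift in the aggregate can be illustrated with a minimal Python sketch. It uses synthetic data only: the values, weights, and factors are randomly generated placeholders, and the helper names `aggregate` and `rounding_drift` are hypothetical, not taken from the stage 3 script. It is not a reproduction of the rounding experiments described above.

```python
import random


def aggregate(values, weights, factors):
    """Weighted total of a variable after scaling each record by its factor."""
    return sum(v * w * f for v, w, f in zip(values, weights, factors))


def rounding_drift(values, weights, factors, decimals):
    """Change in the weighted total caused by rounding every factor."""
    rounded = [round(f, decimals) for f in factors]
    return aggregate(values, weights, rounded) - aggregate(values, weights, factors)


# Synthetic stand-ins (assumptions, not PUF data).
rng = random.Random(0)
n = 10_000
values = [rng.uniform(0, 5_000) for _ in range(n)]   # per-record interest income
weights = [rng.uniform(50, 500) for _ in range(n)]   # per-record sampling weights
factors = [rng.uniform(0.5, 1.5) for _ in range(n)]  # per-record adjustment factors

for decimals in (2, 4, 6, 10):
    drift = rounding_drift(values, weights, factors, decimals)
    print(f"{decimals:>2} decimals: drift in aggregate = {drift:,.6f}")
```

Each rounded factor is off by at most half a unit in the last kept decimal place, so the drift in the weighted total is bounded by that amount times the unadjusted weighted total. This is why more decimal places shrink the drift without ever making it exactly zero.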
@martinholmer @MattHJensen, what are y'all's thoughts on how precise to make the factors given the information above? Like |
@andersonfrailey asked about issues related to taxdata pull request #58. Anderson, I'd like to schedule a phone call so that we can discuss a variety of issues related to #58.
…tage3 # Conflicts: # Stage I/Stage I.py # Stage II/CLP_solver_16Years.py # puf_data/finalprep.py
Ignore most recent commits. Didn't notice the merge conflicts. Will commit again after fixing.
I did some more work on stage 3 to make it fit the redesigned TaxData repo. This changes the directory and file names to fit the new convention we're going with in TaxData, adds the files stage 3 needs to work, and adds documentation. I've also updated Files in stage 3 directory:
@martinholmer and I talked about adding documentation for each step in the data process to its specific directory rather than putting it all in one high-level docs directory. I didn't do that in this PR, to fit with our current documentation, but if we decide to go that route the stage 3 documentation can be moved. The results of adding stage 3 can be seen in this notebook. I'll open a corresponding PR in the TaxCalc repo after this has been reviewed and any problems addressed. cc: @MattHJensen
@andersonfrailey, Thanks for the revisions to taxdata pull request #58, which adds a Stage 3 to the puf taxdata preparations. This pull request looks good on substance, but needs a few cosmetic changes that will reduce the chance that others will be confused by Stage 3 logic. Here is what I have in mind: (1) We need a unique noun to name what is produced by Stage 3. As you know, Stage 1 produces grow factors in the (2) The
This file structure makes the bin numbering explicit and is better suited to adding extra years and extra ratio types (like DIV). It is better suited because adding new ratio types or years involves simply adding rows without changing any existing rows. Does this change in the structure of the |
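The point about adding rows without changing existing ones can be sketched in Python. The layout below is hypothetical: the column names `ratio_type`, `agi_bin`, `year`, and `ratio`, the helper names, and every number are placeholders chosen for illustration, not the actual contents or structure of the file under discussion.

```python
import csv
import io

# Hypothetical row-oriented layout: one row per (ratio type, AGI bin, year).
# All values are made-up placeholders.
RATIOS_CSV = """ratio_type,agi_bin,year,ratio
INT,0,2014,1.02
INT,1,2014,0.98
INT,0,2015,1.01
INT,1,2015,0.99
"""


def load_ratios(text):
    """Parse the rows into a {(ratio_type, agi_bin, year): ratio} lookup."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        (row["ratio_type"], int(row["agi_bin"]), int(row["year"])): float(row["ratio"])
        for row in reader
    }


def adjust(amount, ratios, ratio_type, agi_bin, year):
    """Scale an amount by its ratio; combinations with no row are left unchanged."""
    return amount * ratios.get((ratio_type, agi_bin, year), 1.0)


ratios = load_ratios(RATIOS_CSV)

# Adding a new ratio type (DIV) only appends rows; existing rows are untouched.
extended = load_ratios(RATIOS_CSV + "DIV,0,2014,1.05\nDIV,1,2014,0.97\n")

print(adjust(100.0, ratios, "INT", 0, 2014))
print(adjust(100.0, extended, "DIV", 1, 2014))
```

Because each row carries its own bin number, year, and type, a reader of the file does not need to infer the bin from row position, and new years or types never shift existing entries.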
@martinholmer changing from factors to ratios is good with me. With regards to the restructure to |
@andersonfrailey said:
OK, go ahead and make the changes when you have time.
@andersonfrailey said:
OK, these are important considerations. Let's talk about this. What is the one line of code in Tax-Calculator that makes it so easy to apply the adjustment ratios when the |
@martinholmer, right now I can just use |
@andersonfrailey said:
OK, please do it that way: read in the |
@martinholmer taken care of in latest commit.
@andersonfrailey, Thanks for all your work on taxdata pull request #58.
This PR adds the necessary files to add a "stage 3" step to the extrapolation/blowup process, as discussed in TaxCalc issue #1110.
The methodology of this has been discussed in the previously mentioned PR and can be found in `Stage III.md`, so I won't go over it again here, but I'm happy to answer any questions and open to any suggestions that would improve it.
The biggest aspect of this PR is adding the Stage III directory, which includes the stage 3 script, targets used to determine income distributions, stage 1 factors, and weights file.
I've also adjusted the stage 1 and stage 2 scripts so they will write a copy of `Stage_I_factors.csv` and `WEIGHTS.csv` to the Stage III directory.
After talking with @MattHJensen, interest income will be the only variable adjusted at first, but the stage 3 script is written so that adding additional variables in the future won't be difficult.
I'm running the final tests to implement this into TaxCalc and will open a corresponding PR when that is finished.
@Amy-Xu @martinholmer @codykallen