Adding an extra step to extrapolation/records blowup #1110
Comments
@andersonfrailey, I think this is a valuable addition to Tax-Calculator, and it is worth the additional runtime. I think this adjustment should (at first) be applied to the most inaccurate distributions: taxable interest, itemized deductions, and perhaps pass-through income. I would prefer to target SOI distributions, but I am open to alternatives. |
Update on this issue: I have worked out an initial approach to implementing this in both TaxData and TaxCalc. Here is an overview of the steps I've taken and some initial observations.
ProcessTaxData:
TaxCalc:
I assign each record its own adjustment factor, rather than creating a file with a few factors and adding logic to TaxCalc to determine which one to apply. The factor is based on AGI, so assigning it per record means TaxCalc keeps the ability to advance to a future year without having to calculate AGI for each year in between. There are two obvious drawbacks to this:
On the other hand, because all the logic for determining which factor is associated with which record is handled outside of TaxCalc, the additional runtime from this step is minimal. As with the first issue, I'm open to suggestions on other ways to implement this.
Observations from initial testing
Total interest income does decrease by a few decimal points:
Overall, the distribution looks significantly more like the one in the SOI data than it did previously. Total income tax liability also increases somewhat, while total AGI actually drops a little. These results and a few other observations can be seen here. This notebook uses the weights and blowup factors currently used by TaxCalc, not those uploaded in PR #1105. |
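The per-record assignment described in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the bin edges, the target shares, and the names agi_bin and record_adjustment_factors are all hypothetical, and none of this is the code actually used in TaxData.

```python
from bisect import bisect_right

# Hypothetical AGI bin lower bounds and target shares of total taxable
# interest per bin. These are illustrative numbers, not SOI values.
AGI_BIN_EDGES = [0.0, 50_000.0, 200_000.0]  # [0, 50k), [50k, 200k), [200k, inf)
TARGET_SHARES = [0.2, 0.3, 0.5]


def agi_bin(agi):
    """Return the AGI bin index; AGI below the first edge falls in bin 0."""
    return max(bisect_right(AGI_BIN_EDGES, agi) - 1, 0)


def record_adjustment_factors(agi, interest, weights):
    """Return one adjustment factor per record so that the weighted interest
    total in each AGI bin matches that bin's target share of the total."""
    bins = [agi_bin(a) for a in agi]
    bin_totals = [0.0] * len(TARGET_SHARES)
    for b, x, w in zip(bins, interest, weights):
        bin_totals[b] += x * w
    grand_total = sum(bin_totals)
    # Bins with no weighted interest get a neutral factor of 1.0.
    bin_factors = [
        share * grand_total / total if total > 0 else 1.0
        for share, total in zip(TARGET_SHARES, bin_totals)
    ]
    # Every record carries the factor of the bin its AGI falls in.
    return [bin_factors[b] for b in bins]
```

Because the factors are computed once, outside the calculator, the model only has to read and multiply. When the target shares sum to one and every bin has some interest income, the weighted total is left unchanged by construction.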
👍 |
@andersonfrailey proposed in issue #1110:
Several questions: (a) How many variables (including interest income used in your example) do you envision adjusting? Which ones? (b) Would all those adjustment factors go into the same new file, so that there was only one new file in the taxcalc package? (c) Why not put the new code that applies these adjustments at the end of the |
@martinholmer asked:
I want to keep this focused on variables where the distribution is significantly off. Interest income is the only one I have set in stone, but @codykallen has told me the distribution of itemized deductions also needs improvement, so I'm planning on looking into the individual components that go into that calculation to see if any improvements can be made. I'm open to adding any variables that contributors deem necessary.
Yes. All the factors would go into the same file.
The new code could be added to the end of the _blowup function. The only reason I created a new function was to make a clear distinction that these are two different steps, which matters more as the list of adjusted variables grows. Other than that, there's no need for a new function. |
@andersonfrailey, Thanks for the prompt and clear answers to my questions about issue #1110. |
@andersonfrailey, does the stage 3 influence how close we are to any stage 2 targets? |
@MattHJensen, can you clarify your question a bit? Are you asking if stage 3 changes aggregate totals of any of the variables targeted in stage 2? If so, it does not. All aggregate totals remain the same (with the exception of the slight change in the targeted variables as noted above). |
Looking at each of the components in itemized deductions, it seems the one that is the most unlike its SOI distribution is non-cash contributions (see here). The difference is particularly noticeable for those with AGI above $10M. The other deduction items seem relatively close to their actual distribution so I'm not sure how beneficial adding an adjustment would be. The notebook does not have comparisons for the medical and casualty or theft loss deductions because the SOI data did not provide totals for each AGI bin for disclosure purposes. cc: @codykallen |
@andersonfrailey, In a notebook you reference in #1110 you read in a file called |
@martinholmer, that data can be found here, in the section "All Returns: Sources of Income, Adjustments Deductions and Exemptions, and Tax Items." If you download the 2014 file under that section, taxable interest can be found in column I of the Excel file. |
@martinholmer asked
|
@codykallen and @andersonfrailey, Thanks for the pointers to the 2014 IRS-SOI data. |
To account for changes in the distribution of wages, CBO has an extra step in their blowup process to adjust wages to match their targeted distribution. The process goes something like this:
My idea is to essentially add a stage 3 to our extrapolation process that finds these adjustment factors beforehand, to be read and applied in records.py. In addition to having the current blowup factor applied, each record would be multiplied by an adjustment factor. This would add some runtime to TaxCalc, but if the adjustment factors are computed beforehand like the blowup factors are, I believe it would be minimal, and given how skewed some of our distributions are (see interest income here), the time tradeoff could be worth it.
There are a couple of questions I can think of right off the bat that would need addressing. First, how far off does our distribution need to be to merit adding this additional step? Second, what would the target distribution be for the various sources of income? In the notebook linked above I compared against SOI data, which is broken down by level of AGI, but there could be better options I am not familiar with.
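As a minimal sketch of what applying both factors to each record might look like, assuming the adjustment factors have already been computed and read in with one factor per record. The function name blowup_then_adjust and the numbers are hypothetical; this is not the existing code in records.py.

```python
def blowup_then_adjust(values, blowup_factor, adjustment_factors):
    """Apply the uniform blowup factor, then each record's own
    precomputed adjustment factor (the proposed stage 3 step)."""
    if len(values) != len(adjustment_factors):
        raise ValueError("need exactly one adjustment factor per record")
    return [v * blowup_factor * k for v, k in zip(values, adjustment_factors)]


# Illustrative inputs only: two records, one blowup factor for the variable,
# and one adjustment factor per record.
interest = [100.0, 200.0]
adjusted_interest = blowup_then_adjust(interest, 1.02, [0.5, 2.0])
```

The extra cost at run time is a single elementwise multiplication per adjusted variable, which is why the added runtime should be small as long as the factors are produced ahead of time.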
I'd love some feedback on the general idea along with any ideas for implementation you may have.
@martinholmer @feenberg @MattHJensen @codykallen @Amy-Xu @GoFroggyRun