Impute elective DC pension contributions in PUF data #279
@andersonfrailey, After commit a500802, the
Can you generate the same |
@martinholmer I was able to create a PUF with the same MD5 as you after your latest commit. |
@andersonfrailey said:
Thanks for checking, @andersonfrailey. That's good to know. |
@martinholmer, just reviewed the details of the PR and it looks good. Could you go into a little more detail about the hand calibration behind the |
@andersonfrailey said:
Are the changes in commit 706830c sufficient? |
@andersonfrailey said:
I also thought that a few days ago, but now I'm not so sure. Here is my concern. For the old PUF data, the PR #279 attempts to fix this problem by adding pension contribution logic as the last step in the final preparation of the
The numbers on these rows seem to come from the |
@martinholmer asked:
and
The answer to your first question is yes. I believe the answer to your second question is also yes. I don't have firm confirmation of that at this time, but it's logical that the IRS would report wages in their estimates as they appear on the 1040 (net of contributions). With that in mind, I'd say this PR is correct in calculating the contributions after we've created |
@andersonfrailey gave detailed responses to my questions in taxdata PR #279 and then concluded:
Thanks for your response. |
This pull request does what the title says. PUF e00200 earnings variables are net of defined-contribution (DC) pension contributions, while payroll taxes are calculated on gross earnings. As a result, payroll tax liability has been underestimated when using the PUF data, although income tax liability uses the correct earnings concept. (The e00200 variables in CPS data are gross earnings, which means that payroll tax liability is correct but income tax liability is overestimated. This pull request does nothing to fix that CPS data problem, although if the imputation procedure used here is well received it could be applied to the CPS data.) All this and other closely related topics were discussed at length in Tax-Calculator issue 1549 (opened on 2017-Sep-14) and before that in Tax-Calculator issue 1156 (opened on 2017-Jan-25).
The amount of the pension contributions in 2011, which is the amount by which PUF earnings used in payroll tax calculations are underestimated, is about $220 billion according to recently published IRS W-2 data tabulations. So, we are talking about imputing a non-trivial amount of "missing" earnings in the PUF data. The same IRS tabulations show almost 47 million individuals (not filing units) making DC pension contributions in 2011. This implies a mean (positive) pension contribution of about $4,700 per person.
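As a quick check on the arithmetic above, the mean contribution can be recomputed from the two rounded aggregates quoted in the text. The variable names are illustrative only, and the inputs are the approximate figures given here, not the exact IRS tabulation values.

```python
# Rounded aggregates quoted in the PR description (2011, IRS W-2 tabulations).
total_contributions = 220e9   # about $220 billion in DC pension contributions
contributors = 47e6           # almost 47 million individuals (not filing units)

# Mean contribution among those with a positive contribution.
mean_contribution = total_contributions / contributors

print(f"mean positive contribution: ${mean_contribution:,.0f}")
```

With these rounded inputs the mean comes out a little under $4,700, consistent with the "about $4,700" figure.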
The details of the imputation procedure are discussed in the docstring at the top of the new puf_data/impute_pencon.py file. The basic idea is to use the W-2 data to compute the probability of a positive pension contribution for each age-wage cell and the pension contribution rate (as a fraction of wages) for each age-wage cell.
@MattHJensen @feenberg @andersonfrailey @hdoupe @Amy-Xu @donboyd5
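The basic idea described above can be sketched roughly as follows. This is not the code in puf_data/impute_pencon.py: the cell boundaries, probabilities, and rates are made-up placeholders rather than values from the W-2 tabulations, the function and variable names are hypothetical, and applying the rate to the reported (net) wage is an assumption of this sketch.

```python
import random

# HYPOTHETICAL age-wage cells: (age_group, wage_group) -> (prob, rate).
# prob = probability of a positive contribution in the cell
# rate = contribution as a fraction of wages, given a positive contribution
# These numbers are placeholders, NOT the IRS W-2 tabulation values.
CELLS = {
    ("under_40", "low_wage"): (0.10, 0.03),
    ("under_40", "high_wage"): (0.50, 0.05),
    ("40_plus", "low_wage"): (0.15, 0.04),
    ("40_plus", "high_wage"): (0.60, 0.07),
}


def age_group(age):
    """Assign an age to a (hypothetical) age group."""
    return "under_40" if age < 40 else "40_plus"


def wage_group(wage):
    """Assign a wage amount to a (hypothetical) wage group."""
    return "low_wage" if wage < 50_000 else "high_wage"


def impute_pencon(age, wage, rng):
    """Return an imputed DC pension contribution for one earner."""
    if wage <= 0:
        return 0.0  # no wages, so no elective contribution
    prob, rate = CELLS[(age_group(age), wage_group(wage))]
    # Draw whether this earner contributes, then apply the cell's rate.
    if rng.random() < prob:
        return rate * wage
    return 0.0


rng = random.Random(12345)  # fixed seed so the imputation is reproducible
records = [(30, 25_000.0), (55, 120_000.0), (45, 0.0)]  # (age, net wage)
pencon = [impute_pencon(age, wage, rng) for age, wage in records]
# Gross earnings for payroll tax purposes = net wage + imputed contribution.
gross = [wage + p for (_, wage), p in zip(records, pencon)]
```

A fixed seed is used so that repeated runs produce the same imputed values, which matters when the output file is compared across machines, as with the MD5 check discussed earlier in this thread.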