Targets for CPS imputation of taxable pension income, e01700 #159

Closed
martinholmer opened this issue Mar 1, 2018 · 7 comments

Comments

@martinholmer
Contributor

The following code appears in the cps_data/finalprep.py file:

    # Split pentions and annuities using PUF ratio
    data['e01700'] = data['e01500'] * 0.1656

I don't see how the above code can be correct. In the 2011 IRS-SOI PUF, taxable pension income (e01700) as a fraction of total pension income (e01500) is far higher than 16 percent. Here is what the aggregate totals look like in the IRS-SOI documentation:

[Screenshot: IRS-SOI aggregate totals for total and taxable pension income]

My calculator says the fraction is nearly 64 percent.

Where did the 16 percent fraction come from?

@andersonfrailey
Collaborator

I took a look in the raw CPS file at the relationship between E01500 and E01700 and got a couple of results that could be helpful.

Among tax units with a positive value for E01500, roughly 48% have fully taxable pension (E01500 == E01700), about 18% have no taxable pension (E01700 == 0) and the remainder are somewhere in between. So it doesn't appear that there is one rule we could follow to fix our issues with the non-filer pension values in the CPS.
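
A tabulation of this kind could be sketched as follows. This is a minimal sketch and not the code behind the figures above: the function name `pension_shares` and the toy arrays are invented for illustration, and the optional weights argument is an assumption about how a weighted version might look.

```python
import numpy as np


def pension_shares(e01500, e01700, weights=None):
    """Shares of units with e01500 > 0 that are fully, not, or partly taxable."""
    e01500 = np.asarray(e01500, dtype=float)
    e01700 = np.asarray(e01700, dtype=float)
    # Raw counts by default; pass sampling weights for weighted shares.
    if weights is None:
        w = np.ones_like(e01500)
    else:
        w = np.asarray(weights, dtype=float)
    pos = e01500 > 0
    total = w[pos].sum()
    all_taxable = w[pos & (e01700 == e01500)].sum() / total
    none_taxable = w[pos & (e01700 == 0)].sum() / total
    return all_taxable, none_taxable, 1.0 - all_taxable - none_taxable


# Toy data, invented purely to exercise the function.
toy_e01500 = [100.0, 200.0, 300.0, 400.0, 0.0]
toy_e01700 = [100.0, 0.0, 150.0, 400.0, 0.0]
toy_weights = [1.0, 1.0, 2.0, 4.0, 5.0]

raw = pension_shares(toy_e01500, toy_e01700)
weighted = pension_shares(toy_e01500, toy_e01700, weights=toy_weights)
```

Raw and weighted shares can differ noticeably when the weights vary across units, so it matters which one a target is stated in.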

@martinholmer
Contributor Author

@andersonfrailey said:

Among tax units with a positive value for E01500, roughly 48% have fully taxable pension (E01500 == E01700), about 18% have no taxable pension (E01700 == 0) and the remainder are somewhere in between. So it doesn't appear that there is one rule we could follow to fix our issues with the non-filer pension values in the CPS.

Thanks for the helpful tabulation.

Are the 48% and 18% raw counts or are you using the filing unit's weight, s006?

Are the 48% higher income units?

What about this single (compound) rule for those with positive e01500?

  1. assign e01700 to be equal to e01500 with probability Prob1

  2. if not e01700==e01500, then assign e01700 to be equal to 0 with probability Prob2

  3. If neither assignment (in 1 or 2), then assign e01700 to be equal to Frac * e01500.

With three control parameters (Prob1, Prob2 and Frac) it would seem as if we could hit three targets (the fraction with e01700==e01500, the fraction with e01700==0, and the aggregate dollar ratio of taxable to total, which is roughly 64 percent in 2011 PUF).

Does this make sense?
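
One way to see why three parameters can hit three targets is to back the parameters out of the targets directly. The sketch below does that under an assumption the thread does not state: the category a unit is assigned to is independent of the size of its e01500, so unit shares and dollar shares coincide. The function name `solve_rule_parameters` is hypothetical, and the inputs are only the illustrative figures quoted above.

```python
def solve_rule_parameters(share_all, share_none, dollar_ratio):
    """Back out Prob1, Prob2 and Frac from the three targets.

    Assumes assignment is independent of the pension amount, so the
    aggregate dollar ratio is share_all * 1 + share_none * 0 + share_some * Frac.
    """
    prob1 = share_all
    # Step 2 only applies to units not assigned in step 1, so Prob2 is conditional.
    prob2 = share_none / (1.0 - share_all)
    share_some = 1.0 - share_all - share_none
    frac = (dollar_ratio - share_all) / share_some
    return prob1, prob2, frac


# Illustrative inputs only: the 48% and 18% shares and the 64 percent ratio.
prob1, prob2, frac = solve_rule_parameters(0.48, 0.18, 0.64)
```

With these inputs the sketch gives Prob1 = 0.48, Prob2 of about 0.35 and Frac of about 0.47. If larger pensions were more (or less) likely to be fully taxable, the independence assumption would fail and Frac would need to be calibrated differently.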

@andersonfrailey
Collaborator

Are the 48% and 18% raw counts or are you using the filing unit's weight, s006?

These were raw counts.

Are the 48% higher income units?

I plotted out E01700 / E01500 by AGI:
[Screenshot: plot of E01700 / E01500 by AGI]

There doesn't seem to be much of a relationship between AGI and what portion of a unit's pensions are taxable.

What about this single (compound) rule for those with positive e01500?

  1. assign e01700 to be equal to e01500 with probability Prob1
  2. if not e01700==e01500, then assign e01700 to be equal to 0 with probability Prob2
  3. If neither assignment (in 1 or 2), then assign e01700 to be equal to Frac * e01500.

With three control parameters (Prob1, Prob2 and Frac) it would seem as if we could hit three targets (the fraction with e01700==e01500, the fraction with e01700==0, and the aggregate dollar ratio of taxable to total, which is roughly 64 percent in 2011 PUF).

Would these assignments be random? Or did you have another method for assignment in mind?

@martinholmer
Contributor Author

@andersonfrailey asked:

What about this single (compound) rule for those with positive e01500?

  1. assign e01700 to be equal to e01500 with probability Prob1
  2. if not e01700==e01500, then assign e01700 to be equal to 0 with probability Prob2
  3. If neither assignment (in 1 or 2), then assign e01700 to be equal to Frac * e01500.

With three control parameters (Prob1, Prob2 and Frac) it would seem as if we could hit three targets (the fraction with e01700==e01500, the fraction with e01700==0, and the aggregate dollar ratio of taxable to total, which is roughly 64 percent in 2011 PUF).

Would these assignments be random? Or did you have another method for assignment in mind?

Yes. Set the random number seed and then draw a uniformly distributed random number for each funit.
Then use Prob1 and Prob2 to assign each funit to one of the three categories (probably using numpy where statements). Does that make sense?
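
The approach described above might look roughly like the sketch below. It is not the implementation that ended up in the repository: the function name `impute_e01700`, the default seed, and the example values are all hypothetical, and Prob2 is read here as conditional on the unit not having been assigned in step 1, which is one possible reading of the rule.

```python
import numpy as np


def impute_e01700(e01500, prob1, prob2, frac, seed=0):
    """Assign taxable pension income with one uniform draw per filing unit."""
    e01500 = np.asarray(e01500, dtype=float)
    # Fixed seed so repeated runs give the same imputation.
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=e01500.shape[0])
    # Step 1 takes draws below prob1; step 2 takes a prob2 share of the rest.
    cut_zero = prob1 + (1.0 - prob1) * prob2
    return np.where(u < prob1, e01500,
                    np.where(u < cut_zero, 0.0, frac * e01500))


# Hypothetical pension amounts and parameter values, for illustration only.
example = impute_e01700([1000.0, 2500.0, 0.0, 400.0],
                        prob1=0.5, prob2=0.3, frac=0.6)
```

Using a single draw per unit and nested `np.where` calls keeps the three categories mutually exclusive. Units with zero e01500 get zero e01700 in every branch, so they need no special handling.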

@andersonfrailey
Collaborator

Makes sense to me. I'll start working on getting this implemented.

@martinholmer
Contributor Author

@andersonfrailey, here are the three pension targets for the imputation of e01700 in the cps.csv.gz file.
First the SQL tabulation program and then (as a comment in that program) the tabulation results:

    select "#m PUF funits", round(sum(s006)*1e-6,3)
       from dump
       where filer = 1;

    select "#m PUF funits with pension>0", round(sum(s006)*1e-6,3)
       from dump
       where filer = 1
         and e01500 > 0;

    select "#m PUF funits with pension>0; all taxable", round(sum(s006)*1e-6,3)
       from dump
       where filer = 1
         and e01500 > 0 and e01700 = e01500;

    select "#m PUF funits with pension>0; no taxable", round(sum(s006)*1e-6,3)
       from dump
       where filer = 1
         and e01500 > 0 and e01700 = 0;

    select "#m PUF funits with pension>0; some taxable; frac",
           round(sum(s006)*1e-6,3),
           round(sum(e01700*s006)/sum(e01500*s006),3)
       from dump
       where filer = 1
         and e01500 > 0 and e01700 != e01500 and e01700 != 0;

    /*
    $ tc puf.csv 2015 --sqldb
    $ cat targets-pension.sql | sqlite3 puf-15-#-#-#.db
    #m PUF funits|148.64
    #m PUF funits with pension>0|30.924
    #m PUF funits with pension>0; all taxable|18.915              ==> prob1 = 0.612
    #m PUF funits with pension>0; no taxable|2.267                ==> prob2 = 0.073
    #m PUF funits with pension>0; some taxable; frac|9.742|0.577  ==> frac = 0.577
    */
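
The arithmetic behind the reported prob1 and prob2 can be checked from the weighted counts in the results. The short sketch below only re-derives those numbers; the variable names are made up, and the last line is an added observation rather than something stated in the thread: the reported prob2 is a share of all pension units, whereas applying Prob2 only to units not assigned in step 1 of the rule would call for a conditional probability.

```python
# Weighted counts (millions of filing units) copied from the tabulation results.
with_pension = 30.924
all_taxable = 18.915
no_taxable = 2.267
some_taxable = 9.742

# The three categories account for every unit with pension income.
category_total = all_taxable + no_taxable + some_taxable

# Shares of all units with e01500 > 0, as reported in the results.
prob1 = round(all_taxable / with_pension, 3)
prob2 = round(no_taxable / with_pension, 3)

# Equivalent probability if Prob2 is applied only to units left over after step 1.
prob2_conditional = round(no_taxable / (with_pension - all_taxable), 3)
```

Which of the two prob2 values is appropriate depends on whether the implementation draws the zero-taxable category from all pension units or only from those not already set to fully taxable.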

@martinholmer changed the title from "Incorrect e01700 values in cps.csv file" to "Targets for CPS imputation of taxable pension income, e01700" on Mar 6, 2018
@martinholmer
Contributor Author

The results reported in this issue have been incorporated in pull request #165.
