Questions about extrapolated benefits #241

Closed
martinholmer opened this issue Jul 5, 2018 · 4 comments

@martinholmer

Given that (1) the cps_stage4/extrapolation.py script on the taxdata master branch does not work, (2) the taxcalc/cps_benefits.csv.gz file on the Tax-Calculator master branch was apparently generated with unknown weights, and (3) the test_extrapolation.py tests were never finished, I decided it would be prudent to compare the actual benefit dollar amounts and benefit recipient counts generated by Tax-Calculator with the dollar and count targets implied by the cps_stage4/growth_rates.csv data (which is used by the cps_stage4/extrapolation.py script).

I had difficulty doing this because when I looked in the C-TAM and taxdata repositories, I could not find the benefit dollar and recipient count targets for each benefit program by year. Did I miss that crucial information?

Because I couldn't find that information, I developed a script that generates two targets for each program and year: one for aggregate dollar benefits and the other for the total number of participants or recipients. That script, bentarget.py, starts from the actual benefit dollars and recipient counts that Tax-Calculator produces for 2014 and inflates those base-year amounts to the comparison year using the appropriate growth rate from the taxdata/cps_stage4/growth_rates.csv file. A second script, benactual.py, uses Tax-Calculator to tabulate the aggregate dollar benefits and the total number of filing unit recipients for the comparison year. The two scripts write the constructed targets and the actual Tax-Calculator results for both the base year (2014) and the comparison year (either 2015 or 2027) to files that have the same format. Finally, the target and actual results files are compared with the Unix diff utility. I will post these two scripts in a subsequent comment in this issue.
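To make the target construction concrete, here is a minimal sketch of the arithmetic, assuming (as bentarget.py does) that each growth_rates.csv entry is a cumulative growth rate relative to the 2014 base. The function names are hypothetical, and the TANF figures are copied from the results tables in this comment; the growth rate is backed out from those figures, not read from the CSV file.

```python
def inflate(base, growth):
    """Return the comparison-year target implied by a 2014 base amount."""
    return base * (1.0 + growth)


def implied_growth(base, target):
    """Back out the cumulative growth rate that maps base to target."""
    return target / base - 1.0


# TANF aggregate benefits in billions of dollars (from the tables below)
tanf_base_2014 = 30.891
tanf_target_2015 = 27.213

g = implied_growth(tanf_base_2014, tanf_target_2015)
print('implied TANF benefit growth 2014-2015: {:.1%}'.format(g))

# Applying the backed-out rate to the base reproduces the target.
roundtrip = inflate(tanf_base_2014, g)
```

Under that assumption, the TANF dollar target for 2015 corresponds to a decline of roughly 12 percent from the 2014 base.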

Here I show the results of this exercise and then make some observations and ask some questions.
In all the results tables shown below, the third column is aggregate benefits in billions of dollars, the fourth column is the number of recipients in millions of filing units, and the fifth column is the ratio of the two, which is the average benefit expressed in thousands of dollars per year.

2014 RESULTS

2014 ssi	  54.040   6.829     7.9
2014 mcare	 575.245  38.534    14.9
2014 mcaid	 368.102  27.905    13.2
2014 snap	  82.909  28.516     2.9
2014 wic	   3.602   4.817     0.7
2014 tanf	  30.891   3.388     9.1
2014 vet	 146.602   4.901    29.9
2014 housing	  32.604   4.626     7.0

2015 RESULTS with < denoting target results and > actual results

$ python benactual.py > benactual.res ;
  python bentarget.py > bentarget.res ;
  diff bentarget.res benactual.res
9,16c9,16
< 2015 ssi	  54.710   6.815     8.0
< 2015 mcare	 601.534  39.382    15.3
< 2015 mcaid	 412.237  30.006    13.7
< 2015 snap	  82.486  27.968     2.9
< 2015 wic	   3.631   4.681     0.8
< 2015 tanf	  27.213   2.394    11.4
< 2015 vet	 152.056   4.856    31.3
< 2015 housing	  33.540   4.635     7.2
---
> 2015 ssi	  54.762   6.836     8.0
> 2015 mcare	 602.434  39.613    15.2
> 2015 mcaid	 412.749  29.701    13.9
> 2015 snap	  82.556  27.969     3.0
> 2015 wic	   3.635   4.680     0.8
> 2015 tanf	   7.720   2.280     3.4
> 2015 vet	 152.295   4.909    31.0
> 2015 housing	  33.571   4.630     7.3

2027 RESULTS with < denoting target results and > actual results

$ python benactual.py > benactual.res ;
  python bentarget.py > bentarget.res ;
  diff bentarget.res benactual.res
9,16c9,16
< 2027 ssi	  73.667   7.123    10.3
< 2027 mcare	1336.409  53.477    25.0
< 2027 mcaid	 391.808  35.339    11.1
< 2027 snap	  78.871  27.022     2.9
< 2027 wic	   3.613   4.620     0.8
< 2027 tanf	  27.363   2.930     9.3
< 2027 vet	 168.622   4.634    36.4
< 2027 housing	  46.461   4.214    11.0
---
> 2027 ssi	  73.738   7.185    10.3
> 2027 mcare	1338.423  54.801    24.4
> 2027 mcaid	 392.290  35.621    11.0
> 2027 snap	  78.939  25.386     3.1
> 2027 wic	   3.617   4.539     0.8
> 2027 tanf	   7.763   2.313     3.4
> 2027 vet	 168.887   4.654    36.3
> 2027 housing	  46.505   4.206    11.1

OBSERVATIONS
The actual Tax-Calculator-generated results are reasonably close to the pseudo targets in 2015 and 2027 for all benefit programs except one: TANF. There are enormous differences between the TANF targets and actuals in both comparison years. For example, in 2015 the actual dollar benefit total is $7.720 billion while the pseudo target is $27.213 billion (a target that is itself lower than the 2014 base amount of $30.891 billion). The 2015 actual is 72% lower than the target for that year. The same holds in 2027: the actual benefit total of $7.763 billion is 72% lower than the target of $27.363 billion.
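The size of each gap can be checked with a few lines of Python. This is only a sketch: the dollar amounts are copied from the 2015 diff output above, while the helper name and the 1 percent threshold are hypothetical choices made for illustration.

```python
# 2015 aggregate benefits in billions of dollars, copied from the diff output
target_2015 = {'ssi': 54.710, 'mcare': 601.534, 'mcaid': 412.237,
               'snap': 82.486, 'wic': 3.631, 'tanf': 27.213,
               'vet': 152.056, 'housing': 33.540}
actual_2015 = {'ssi': 54.762, 'mcare': 602.434, 'mcaid': 412.749,
               'snap': 82.556, 'wic': 3.635, 'tanf': 7.720,
               'vet': 152.295, 'housing': 33.571}


def pct_gap(actual, target):
    """Percentage by which actual differs from target (negative if below)."""
    return 100.0 * (actual - target) / target


gaps = {name: pct_gap(actual_2015[name], target_2015[name])
        for name in target_2015}

THRESHOLD = 1.0  # hypothetical tolerance, in percent
flagged = [name for name, gap in gaps.items() if abs(gap) > THRESHOLD]

for name, gap in gaps.items():
    print('{:8s}{:8.1f}%'.format(name, gap))
print('outside tolerance:', flagged)
```

With these inputs, only TANF falls outside the tolerance on aggregate dollars; every other program is within a fraction of a percent of its pseudo target.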

QUESTIONS

  1. Why are only the TANF actual and target benefits very far apart?

  2. Why are the pseudo targets (computed as described above) for SNAP, to take a random example, so different from the targets mentioned in the C-TAM repository? The SNAP/README.md file contains this:

Target data for the imputation came from official SNAP data. In Fiscal Year 2014, an average of 22.74 million households claimed about $70 billion in benefits.

But in calendar year 2014, the Tax-Calculator-generated SNAP benefit total is $82.909 billion, which is 18% above the $70 billion target mentioned in the C-TAM repository.
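For reference, the SNAP comparison can be laid out as follows. This is a rough sketch using only the figures quoted above; the variable names are hypothetical, and the two sources are not strictly comparable because C-TAM reports fiscal-year households while Tax-Calculator reports calendar-year filing units.

```python
# C-TAM SNAP/README.md figures for fiscal year 2014
ctam_dollars = 70.0    # billions of dollars ("about $70 billion")
ctam_units = 22.74     # millions of households

# Tax-Calculator results for calendar year 2014 (from the 2014 table)
tc_dollars = 82.909    # billions of dollars
tc_units = 28.516      # millions of filing units

dollar_gap = 100.0 * (tc_dollars / ctam_dollars - 1.0)
unit_gap = 100.0 * (tc_units / ctam_units - 1.0)

# Average benefit in thousands of dollars per recipient unit
ctam_avg = ctam_dollars / ctam_units
tc_avg = tc_dollars / tc_units

print('dollar gap: {:.1f}%'.format(dollar_gap))
print('unit gap:   {:.1f}%'.format(unit_gap))
print('average benefit: C-TAM {:.1f}, Tax-Calculator {:.1f}'.format(
    ctam_avg, tc_avg))
```

On these numbers the recipient count is further above the C-TAM figure (about 25 percent) than the dollar total is (about 18 percent), so the implied average benefit is somewhat lower in the Tax-Calculator data, subject to the unit differences noted above.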

@Amy-Xu @andersonfrailey @hdoupe @MattHJensen

@martinholmer

benactual.py

from __future__ import print_function
from taxcalc import *

year = 2015

benefits = ['ssi', 'mcare', 'mcaid', 'snap', 'wic', 'tanf', 'vet', 'housing']


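def results(calc):
    # s006 is the filing-unit sampling weight
    wght = calc.array('s006')
    for bname in benefits:
        ben = calc.array('{}_ben'.format(bname))
        # weighted aggregate benefits in billions of dollars
        benamt = (ben * wght).sum() * 1e-9
        # weighted number of recipient filing units in millions
        benrec = wght[ben > 0].sum() * 1e-6
        # average benefit in thousands of dollars per recipient unit
        benavg = benamt / benrec
        res = '{} {}\t{:8.3f}{:8.3f}{:8.1f}'.format(calc.current_year, bname,
                                                    benamt, benrec, benavg)
        print(res)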


calc = Calculator(policy=Policy(),
                  records=Records.cps_constructor(),
                  verbose=False)
assert calc.current_year == 2014
calc.calc_all()
results(calc)
calc.advance_to_year(year)
calc.calc_all()
results(calc)

bentarget.py

from __future__ import print_function
# import pandas explicitly; this script uses only pd, nothing else from taxcalc
import pandas as pd

year = 2015

benefits = ['ssi', 'mcare', 'mcaid', 'snap', 'wic', 'tanf', 'vet', 'housing']


def results(yr, sval, bgr):
    # inflate each 2014 base amount by the growth rate for year yr
    for bname in benefits:
        benstart = sval.loc[bname, 'benamt']
        benfactor = 1.0 + bgr['{}_benefit_growth'.format(bname)][yr]
        benamt = benstart * benfactor
        recstart = sval.loc[bname, 'benrec']
        recfactor = 1.0 + bgr['{}_participation_growth'.format(bname)][yr]
        benrec = recstart * recfactor
        benavg = benamt / benrec
        res = '{} {}\t{:8.3f}{:8.3f}{:8.1f}'.format(yr, bname,
                                                    benamt, benrec, benavg)
        print(res)


benres_col_names = ['year', 'name', 'benamt', 'benrec', 'benavg']
benact = pd.read_table('benactual.res', delim_whitespace=True,
                       header=None, names=benres_col_names)
pd.options.mode.chained_assignment = None
sval = benact[benact['year'] == 2014]
sval.index = sval['name']
sval.drop(['year', 'name', 'benavg'], axis='columns', inplace=True)

bgr = pd.read_csv('../taxdata/cps_stage4/growth_rates.csv', index_col=0)

results(2014, sval, bgr)
results(year, sval, bgr)

@MaxGhenis

Re: TANF, could PSLmodels/C-TAM#65 be relevant? Maybe some change didn't propagate?

@martinholmer

@MaxGhenis said regarding TANF differences reported in taxdata issue #241:

Re: TANF, could PSLmodels/C-TAM#65 be relevant? Maybe some change didn't propagate?

@Amy-Xu, what is the answer to @MaxGhenis' question?

@martinholmer

Closing this issue about the extrapolated benefits because the extrapolation.py script has been revised in pull request #242 and a test of the extrapolated benefits has been added in the test_benefits.py file.

There are still substantive issues with the extrapolated benefits, but these issues will be discussed in one or more new GitHub issues.
