Why is expanded income negative for the bottom decile? #143
@martinholmer, this is actually a quite common problem. One of the biggest problems with distributional analysis is income mismeasurement at the lowest income levels, in which substantial positive income is offset by comparable losses being written off. One approach used in the literature is to simply drop the bottom 5% of the income distribution, and then do the distributional analysis. Personally, I prefer to deal with such problems by dropping any individuals with negative expanded income or with tax liability greater than their expanded income. Based on a quick tabulation for 2017, there are 5066 filing units with negative expanded income, corresponding to 1.53 million tax filers. Of these 5066 units with negative expanded income, 1943 have a capital loss ( |
@codykallen, Thanks for your helpful comment in taxdata issue #143. |
I pulled some numbers based on 2014 CPS data in this notebook. CPS lacks
3.78% of all tax units, or 37.8% of the bottom decile, has zero expanded income.
0.14% of all tax units, or 1.4% of the bottom decile. Their aggregate expanded income is -$23.8B, an average of -$106k per tax unit. Here's what the distribution looks like:
41% (0.06% of 0.14% total) of this group has negative
Also: 0.23% of tax units have negative
63.6% of the bottom decile has positive expanded income. This group has an aggregate expanded income of $21.0B (short of the -$23.8B from the negative EI group), an average of $2,100 per tax unit. Based on the small set of tax units creating this issue, I'm tempted to discard them for my own analysis. |
@MaxGhenis, thanks for your CPS tabulations of negative expanded income in taxdata issue #143. The new CPS data file introduced in Tax-Calculator release 0.16.0 adds the consumption value of benefits provided by several benefit programs into expanded income. As a result, aggregate expanded income in the bottom expanded-income decile is now positive. This means that the misleading (but accurate) decile graph you describe in Tax-Calculator issue 1806 no longer appears when using CPS data. However, because the new PUF data file introduced in Tax-Calculator release 0.16.0 does not include these benefits, aggregate expanded income in the bottom decile is still negative when using PUF data. To resolve this problem, we have decided not to drop records from the tabulation, but rather to do what other tax analysis groups do: offer the option of a quintile graph. The new graph is exactly the same as the decile graph except that the percentage change in after-tax expanded income is shown for quintiles. (The new graph also shows the percentage change for subgroups of the top quintile: the 80-90 percentile group, the 90-95, the 95-99, and the top one percent group.) Because the bottom quintile has positive aggregate expanded income, the misleading (but accurate) graph is never generated. This new quintile graph is available (along with the decile graph) as standard output when using the Thanks again for your help on this. |
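To illustrate why the decile graph can be "misleading (but accurate)", here is a minimal sketch in plain Python. It is not Tax-Calculator's implementation: the function name `pct_change_by_group`, its parameters, and the toy numbers are all made up for illustration. It ranks units by income, groups them by cumulative weight share, and computes the percentage change in aggregate after-tax income per group.

```python
def pct_change_by_group(rank_income, base_aftertax, reform_aftertax, weights, edges):
    """Percentage change in aggregate after-tax income per income group.

    Hypothetical helper, not part of Tax-Calculator. `edges` holds the upper
    cumulative-weight-share bound of each group, e.g. [0.2, 0.4, 0.6, 0.8, 1.0].
    A unit is assigned whole to the group its cumulative share falls in.
    """
    units = sorted(zip(rank_income, base_aftertax, reform_aftertax, weights),
                   key=lambda u: u[0])
    total_weight = sum(u[3] for u in units)
    agg_base = [0.0] * len(edges)
    agg_reform = [0.0] * len(edges)
    running = 0.0
    g = 0
    for _, base, reform, w in units:
        running += w
        share = running / total_weight
        # Advance to the group whose upper bound covers this cumulative share.
        while g < len(edges) - 1 and share > edges[g] + 1e-9:
            g += 1
        agg_base[g] += base * w
        agg_reform[g] += reform * w
    # The change is undefined when the baseline aggregate is exactly zero.
    return [100.0 * (r - b) / b if b != 0 else float("nan")
            for b, r in zip(agg_base, agg_reform)]


# Toy data: ten equally weighted units, and a reform that gives everyone +5.
base = [-10.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0, 110.0]
reform = [x + 5.0 for x in base]
weights = [1.0] * 10

decile_edges = [k / 10 for k in range(1, 11)]
quintile_edges = [k / 5 for k in range(1, 6)]
deciles = pct_change_by_group(base, base, reform, weights, decile_edges)
quintiles = pct_change_by_group(base, base, reform, weights, quintile_edges)
```

In this toy case every unit gains, yet the bottom decile's percentage change comes out negative because its baseline aggregate is negative, while the bottom quintile's baseline aggregate is positive and the sign is as expected. Passing edges such as `[0.2, 0.4, 0.6, 0.8, 0.9, 0.95, 0.99, 1.0]` would split the top quintile into the subgroups mentioned above.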
@martinholmer thanks for the quintile graph function! I updated the notebook to use taxcalc 0.16.0 with benefits (which I'm extremely excited about!!) and look into the bottom quintile. Here's how the results shifted (I didn't calculate quintiles in the original analysis):
Given these negatives still affect the bottom decile by 28%, it seems like this may at least warrant a caveat in TaxBrain or something. Is there a source or justification for the decision of other tax analysis groups to include them? I couldn't find anything from TPC, TF, or CBO with a quick search. Certain use cases will also justify omitting or zeroing out negatives, for example calculating the Gini coefficient. Users could spin up their own code, but it might be worth some additional taxcalc code to standardize this at some point. |
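For the Gini use case, a rough sketch of what standardized handling could look like is below. This is not existing taxcalc code: `weighted_gini`, `gini_with_negative_treatment`, and the `treatment` options are hypothetical names, and the Gini is computed with a simple trapezoid approximation of the weighted Lorenz curve.

```python
def weighted_gini(incomes, weights):
    """Weighted Gini coefficient via the trapezoid rule on the Lorenz curve."""
    units = sorted(zip(incomes, weights), key=lambda u: u[0])
    total_weight = sum(w for _, w in units)
    total_income = sum(x * w for x, w in units)
    if total_weight <= 0 or total_income == 0:
        raise ValueError("need positive total weight and nonzero total income")
    area = 0.0          # twice the area under the Lorenz curve
    prev_lorenz = 0.0
    cum_income = 0.0
    for x, w in units:
        cum_income += x * w
        lorenz = cum_income / total_income
        area += (w / total_weight) * (lorenz + prev_lorenz)
        prev_lorenz = lorenz
    return 1.0 - area


def gini_with_negative_treatment(incomes, weights, treatment="zero"):
    """Hypothetical wrapper: 'keep', 'drop', or 'zero' negative incomes first."""
    pairs = list(zip(incomes, weights))
    if treatment == "drop":
        pairs = [(x, w) for x, w in pairs if x >= 0]
    elif treatment == "zero":
        pairs = [(max(x, 0.0), w) for x, w in pairs]
    elif treatment != "keep":
        raise ValueError("treatment must be 'keep', 'drop', or 'zero'")
    return weighted_gini([x for x, _ in pairs], [w for _, w in pairs])


# Toy data with one negative-income unit (illustrative numbers only).
toy_incomes = [-4.0, 1.0, 1.0, 4.0]
toy_weights = [1.0, 1.0, 1.0, 1.0]
gini_keep = gini_with_negative_treatment(toy_incomes, toy_weights, "keep")
gini_zero = gini_with_negative_treatment(toy_incomes, toy_weights, "zero")
gini_drop = gini_with_negative_treatment(toy_incomes, toy_weights, "drop")
```

With negatives kept, the Lorenz curve dips below zero and the coefficient can exceed 1, which is one reason to zero out or drop them for this statistic. The two treatments give different answers, so whichever is chosen should be stated alongside the result.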
@MaxGhenis mentioned:
Tax Policy Center omits those with negative income from distributional analyses but includes them in totals. From the footnotes to their distributional tables:
JCT excludes such taxpayers as well. From the footnotes to their distributional tables:
Tax Foundation's approach is not so clear in their publications, but a footnote from a 2009 TF working paper says,
CBO does this too. According to their report, "The Distribution of Household Income and Federal Taxes, 2013" (published August 2016):
|
Please continue this discussion in Tax-Calculator issue 1888, where @MaxGhenis's and @codykallen's 2/16/18 comments are reproduced. |
A Tax-Calculator user recently discovered that in the expanded-income-decile distribution table the bottom decile has negative aggregate expanded income, and that this is true for both the IRS puf.csv and the cps.csv input data. The user pointed out how this plays havoc with the expanded-income-decile difference table and graph. See the full discussion in Tax-Calculator issue 1806.
It's clear that the discussion in 1806 is short of facts. We, or at least I, don't have a clear understanding of why aggregate expanded income is negative for the bottom decile. Below are a few questions that we need answers to in order to know how to resolve the problem in 1806.
I imagine there are a number of poor people in the bottom decile who show up as having zero expanded income because neither input file currently includes benefits (such as SSI, food stamps, etc.). What fraction (using sample weights) of the bottom decile has zero expanded income?
I imagine (based largely on some @codykallen comments) that there are a number of people in the bottom decile who are not poor in the conventional sense, but have negative expanded income because they report large business losses. What fraction of the bottom decile has negative expanded income? Which input variables cause the negative expanded income for those people? What would this subgroup's aggregate expanded income be if one or more of the negative income variables were set to zero?
I imagine that there are people in the bottom decile who have small positive values for expanded income. What fraction of the bottom decile has positive expanded income? What is the aggregate expanded income of this subgroup?
Thanks for doing these tabulations and helping us figure out the best way to resolve issue 1806.
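The tabulations asked for above could be sketched roughly as follows. This is plain Python over two parallel lists, expanded income and sample weight per tax unit; the function name `bottom_decile_breakdown` and the toy data are assumptions for illustration, not Tax-Calculator code, and it does not address which input variables cause the negatives.

```python
def bottom_decile_breakdown(incomes, weights):
    """Split the weighted bottom decile into negative, zero, and positive groups.

    Hypothetical helper. Units are ranked by income and included whole while
    the cumulative weight stays within 10% of the total weight.
    """
    units = sorted(zip(incomes, weights), key=lambda u: u[0])
    total_weight = sum(w for _, w in units)
    cutoff = 0.10 * total_weight
    decile = []
    running = 0.0
    for inc, w in units:
        running += w
        if running > cutoff + 1e-9 * total_weight:
            break
        decile.append((inc, w))
    if not decile:
        raise ValueError("lowest-income unit alone exceeds 10% of total weight")
    decile_weight = sum(w for _, w in decile)
    tests = {
        "negative": lambda x: x < 0,
        "zero": lambda x: x == 0,
        "positive": lambda x: x > 0,
    }
    result = {}
    for label, test in tests.items():
        members = [(inc, w) for inc, w in decile if test(inc)]
        result[label] = {
            # Weighted fraction of the bottom decile in this group.
            "share_of_decile": sum(w for _, w in members) / decile_weight,
            # Weighted aggregate income of this group.
            "aggregate_income": sum(inc * w for inc, w in members),
        }
    return result


# Toy data: 40 equally weighted units, so the bottom decile holds 4 of them.
toy_incomes = [-100.0, 0.0, 0.0, 5.0] + [10.0 * (i + 1) for i in range(36)]
toy_weights = [1.0] * 40
breakdown = bottom_decile_breakdown(toy_incomes, toy_weights)
```

Rerunning the same function after setting a suspected negative income component to zero in the input would give a rough answer to the "what if" part of the second question.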