Why is expanded income negative for the bottom decile? #143

Closed
martinholmer opened this issue Jan 4, 2018 · 7 comments

@martinholmer
Contributor

A Tax-Calculator user recently discovered that in the expanded-income-decile distribution table the bottom decile has negative aggregate expanded income, and that this is true for both the IRS puf.csv and the cps.csv input data. The user pointed out how this plays havoc with the expanded-income-decile difference table and graph. See the full discussion in Tax-Calculator issue 1806.

It's clear that the discussion in 1806 is short of facts. We, or at least I, don't have a clear understanding about why aggregate expanded income is negative for the bottom decile. Below are a few questions that we need answers to in order to know how to resolve the problem in 1806.

  1. I imagine there are a number of poor people in the bottom decile who show up as having zero expanded income because neither input file currently includes benefits (such as SSI, food stamps, etc.). What fraction (using sample weights) of the bottom decile has zero expanded income?

  2. I imagine (based largely on some @codykallen comments) that there are a number of people in the bottom decile who are not poor in the conventional sense, but have negative expanded income because they report large business losses. What fraction of the bottom decile has negative expanded income? Which input variables cause the negative expanded income for those people? What would this subgroup's aggregate expanded income be if one or more of the negative income variables were set to zero?

  3. I imagine that there are people in the bottom decile who have small positive values for expanded income. What fraction of the bottom decile has positive expanded income? What is the aggregate expanded income of this subgroup?
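The three tabulations requested above could be sketched roughly as follows. This is only an illustration on made-up records, not the code used for any of the results in this thread: the function name is hypothetical, each record is a plain `(expanded_income, weight)` pair, and the weight stands in for a sampling weight such as `s006`. Splitting a record that straddles the decile boundary is a simplification of this sketch.

```python
def bottom_decile_breakdown(records):
    """Summarize the bottom weighted decile by sign of expanded income.

    records: iterable of (expanded_income, weight) pairs (hypothetical layout).
    Returns (shares, aggregates), each keyed by negative/zero/positive.
    """
    ordered = sorted(records, key=lambda r: r[0])
    cutoff = sum(w for _, w in ordered) / 10.0  # total weight in one decile
    weight = {"negative": 0.0, "zero": 0.0, "positive": 0.0}
    income = {"negative": 0.0, "zero": 0.0, "positive": 0.0}
    cumulative = 0.0
    for value, w in ordered:
        if cumulative >= cutoff:
            break
        # Simplification: a record straddling the decile boundary
        # contributes only the part of its weight that still fits.
        used = min(w, cutoff - cumulative)
        key = "negative" if value < 0 else "zero" if value == 0 else "positive"
        weight[key] += used
        income[key] += value * used
        cumulative += used
    shares = {k: v / cutoff for k, v in weight.items()}
    return shares, income


# Made-up sample: total weight 100, so the bottom decile holds weight 10.
sample = [(-1000.0, 1.0), (0.0, 4.0), (500.0, 5.0), (10000.0, 90.0)]
shares, aggregates = bottom_decile_breakdown(sample)
```

The `shares` dictionary answers the "what fraction" parts of the questions, and `aggregates` gives each subgroup's weighted expanded income.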

Thanks for doing these tabulations and helping us figure out the best way to resolve issue 1806.

@codykallen

@martinholmer, this is actually quite a common problem. One of the biggest problems with distributional analysis is income mismeasurement at the lowest income levels, where substantial positive income is offset by comparable losses being written off.

One approach used in the literature is to simply drop the bottom 5% of the income distribution, and then do the distributional analysis. Personally, I prefer to deal with such problems by dropping any individuals with negative expanded income or with tax liability greater than their expanded income.
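As a rough sketch of the two screening rules just described, assuming each filing unit is a dictionary with hypothetical keys `expanded_income`, `tax_liability`, and `weight` (none of these names or functions come from Tax-Calculator itself):

```python
def drop_bottom_share(units, share=0.05):
    """Drop units that lie wholly within the bottom `share` of weighted units."""
    ordered = sorted(units, key=lambda u: u["expanded_income"])
    cutoff = share * sum(u["weight"] for u in ordered)
    kept, cumulative = [], 0.0
    for unit in ordered:
        cumulative += unit["weight"]
        # A unit that straddles the cutoff is kept in this sketch.
        if cumulative > cutoff:
            kept.append(unit)
    return kept


def drop_implausible(units):
    """Drop units with negative income or with tax exceeding income."""
    return [
        u for u in units
        if u["expanded_income"] >= 0
        and u["tax_liability"] <= u["expanded_income"]
    ]


# Made-up units for illustration only.
example_units = [
    {"expanded_income": -10.0, "tax_liability": 0.0, "weight": 4.0},
    {"expanded_income": 50.0, "tax_liability": 60.0, "weight": 46.0},
    {"expanded_income": 100.0, "tax_liability": 10.0, "weight": 50.0},
]
after_trim = drop_bottom_share(example_units)
after_screen = drop_implausible(example_units)
```

The first rule removes a fixed share of the population regardless of why income is low; the second targets only the records whose income and tax values look inconsistent.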

Based on a quick tabulation for 2017, there are 5066 filing units with negative expanded income, corresponding to 1.53 million tax filers. Of these 5066 units with negative expanded income, 1943 have a capital loss (c23650 < 0), 1591 have a Schedule C (sole proprietorship) loss (e00900 < 0), and 3507 have a Schedule E (partnership, S corporation, and passive business) loss (e02000 < 0).
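A tabulation of this kind might look like the sketch below. The variable names `c23650`, `e00900`, and `e02000` are the ones quoted above; the dictionary layout, the `weight` key, and the function name are assumptions made for illustration. Because a unit can report more than one kind of loss, the per-variable counts can add up to more than the number of units, which is consistent with 1943 + 1591 + 3507 exceeding 5066.

```python
LOSS_VARIABLES = ("c23650", "e00900", "e02000")


def tally_loss_sources(units):
    """Count negative-expanded-income units and their loss variables."""
    negative = [u for u in units if u["expanded_income"] < 0]
    summary = {
        "records": len(negative),                         # unweighted count
        "weighted": sum(u["weight"] for u in negative),   # population estimate
    }
    for name in LOSS_VARIABLES:
        # Units with several losses are counted once per loss variable.
        summary[name] = sum(1 for u in negative if u[name] < 0)
    return summary


# Made-up units for illustration only.
loss_example = [
    {"expanded_income": -500.0, "weight": 300.0,
     "c23650": -100.0, "e00900": 0.0, "e02000": -900.0},
    {"expanded_income": -50.0, "weight": 200.0,
     "c23650": 0.0, "e00900": -80.0, "e02000": 0.0},
    {"expanded_income": 900.0, "weight": 400.0,
     "c23650": -10.0, "e00900": 0.0, "e02000": 0.0},
]
loss_summary = tally_loss_sources(loss_example)
```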

@martinholmer
Contributor Author

@codykallen, Thanks for your helpful comment in taxdata issue #143.
But your tabulations answer only some of the questions we need answers to in order to know how best to resolve Tax-Calculator issue 1806.

@MaxGhenis
Contributor

MaxGhenis commented Jan 15, 2018

I pulled some numbers based on 2014 CPS data in this notebook. CPS lacks c23650 (capital gains/losses) and e02000 (Schedule E) so I couldn't replicate @codykallen's full analysis, but here are some takeaways with respect to @martinholmer's questions.

What fraction (using sample weights) of the bottom decile has zero expanded income?

3.78% of all tax units, or 37.8% of the bottom decile, has zero expanded income.

What fraction of the bottom decile has negative expanded income?

0.14% of all tax units, or 1.4% of the bottom decile. Their aggregate expanded income is -$23.8B, an average of -$106k per tax unit. Here's what the distribution looks like:

[Figure: distribution of expanded income among tax units with negative expanded income]

Which input variables cause the negative expanded income for those people?

41% (0.06% of 0.14% total) of this group has negative e00900 (Schedule C). However, 19% of this group has positive e00900, and overall this is a small share of losses (see below). If anyone has ideas on other factors available in CPS I can look into those too.

What would this subgroup's aggregate expanded income be if one or more of the negative income variables were set to zero?

e00900 totals -$36M for tax units with negative e00900 and negative expanded income, 0.15% of the -$23.8B total loss, so wouldn't change much.

Also: 0.23% of tax units have negative e00900 and positive expanded income.
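The "set a negative variable to zero" experiment can be sketched as follows. It rests on a strong assumption, namely that the variable enters expanded income additively and dollar for dollar, so removing a loss simply raises expanded income by the size of the loss. The dictionary keys and the function name are hypothetical.

```python
def aggregate_with_losses_zeroed(units, variable="e00900"):
    """Return (baseline, adjusted) weighted aggregates for negative-income units.

    Assumes `variable` is an additive component of expanded income.
    """
    baseline = 0.0
    adjusted = 0.0
    for u in units:
        if u["expanded_income"] >= 0:
            continue  # only the negative-expanded-income subgroup is tabulated
        baseline += u["expanded_income"] * u["weight"]
        loss = min(u[variable], 0.0)  # zero when the variable is not a loss
        adjusted += (u["expanded_income"] - loss) * u["weight"]
    return baseline, adjusted


# Made-up units for illustration only.
zeroing_example = [
    {"expanded_income": -100.0, "e00900": -30.0, "weight": 2.0},
    {"expanded_income": -50.0, "e00900": 10.0, "weight": 1.0},
    {"expanded_income": 400.0, "e00900": -5.0, "weight": 3.0},
]
baseline_total, adjusted_total = aggregate_with_losses_zeroed(zeroing_example)
```

Applied to the figures quoted above, removing -$36M of Schedule C losses from a -$23.8B aggregate would move the total by only about 0.15%.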

What fraction of the bottom decile has positive expanded income? What is the aggregate expanded income of this subgroup?

63.6% of the bottom decile has positive expanded income. This group has an aggregate expanded income of $21.0B (short of the -$23.8B from the negative EI group), an average of $2,100 per tax unit.

Based on the small set of tax units creating this issue, I'm tempted to discard them for my own analysis.

@martinholmer
Contributor Author

@MaxGhenis, thanks for your CPS tabulations of negative expanded income in taxdata issue #143.
Your results underline the role of large business losses in causing the bottom decile's aggregate expanded income to be negative.

The new CPS data file introduced in Tax-Calculator release 0.16.0 adds the consumption value of benefits provided by several benefit programs into expanded income. As a result, aggregate expanded income in the bottom expanded-income decile is now positive. This means that the misleading (but accurate) decile graph you describe in Tax-Calculator issue 1806 no longer appears when using CPS data.

However, because the new PUF data file introduced in Tax-Calculator release 0.16.0 does not include these benefits, aggregate expanded income in the bottom decile is still negative when using PUF data.

To resolve this problem, we have decided not to drop records from the tabulation, but rather do what other tax analysis groups do: offer the option of a quintile graph. The new graph is exactly the same as the decile graph except the percentage change in after-tax expanded income is shown for quintiles. (The new graph also shows the percentage change for subgroups of the top quintile, including the 80-90 percentile group, the 90-95, the 95-99, and the top one percent group.) Because the bottom quintile has positive aggregate expanded income, the misleading (but accurate) graph is never generated.
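The calculation behind such a graph can be sketched roughly as below. This is not the Tax-Calculator implementation: the function name, the tuple layout, and the choice to assign each unit to a group by the midpoint of its cumulative-weight range are all assumptions made for illustration.

```python
import bisect

EDGES = [20, 40, 60, 80, 90, 95, 99]  # upper percentile bound of each group
LABELS = ["0-20", "20-40", "40-60", "60-80",
          "80-90", "90-95", "95-99", "99-100"]


def pct_change_by_group(units):
    """Percentage change in aggregate after-tax income by income group.

    units: (expanded_income, weight, aftertax_base, aftertax_reform) tuples.
    """
    ordered = sorted(units, key=lambda u: u[0])
    total_weight = sum(u[1] for u in ordered)
    base = dict.fromkeys(LABELS, 0.0)
    reform = dict.fromkeys(LABELS, 0.0)
    cumulative = 0.0
    for _, w, a_base, a_reform in ordered:
        # Percentile position of the middle of this unit's weight range.
        midpoint = 100.0 * (cumulative + w / 2.0) / total_weight
        label = LABELS[bisect.bisect_right(EDGES, midpoint)]
        base[label] += a_base * w
        reform[label] += a_reform * w
        cumulative += w
    return {
        label: (100.0 * (reform[label] - base[label]) / base[label]
                if base[label] != 0 else None)
        for label in LABELS
    }


# Made-up data: 100 equally weighted units, each gaining 2 percent.
graph_example = [(i * 100.0, 1.0, i * 100.0, i * 102.0) for i in range(100)]
changes = pct_change_by_group(graph_example)
```

The denominator is the group's baseline aggregate, so a group whose aggregate is negative would show a percentage change with the opposite sign of the dollar change. Using quintiles makes that case much less likely for the bottom group.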

This new quintile graph is available (along with the decile graph) as standard output when using the tc --graphs option. It is also easy to generate from a Python script in the same way the decile graph is generated in the basic recipe in the Tax-Calculator Cookbook, except that the new quintile_graph method is called instead of the decile_graph Calculator method. This new method will be available when Tax-Calculator pull request 1880 is merged into the master branch. The quintile graph will be available via the tc tool when the Tax-Calculator release after 0.16.0 becomes available.

Thanks again for your help on this.

@MaxGhenis
Contributor

@martinholmer thanks for the quintile graph function! I updated the notebook to use taxcalc 0.16.0 with benefits (which I'm extremely excited about!!) and look into the bottom quintile. Here's how the results shifted (I didn't calculate quintiles in the original analysis):

| Metric | Without benefits (tc 0.15) | With benefits (tc 0.16) |
| --- | --- | --- |
| Share of bottom decile with negative after-tax income | 1.4% | 1.1% |
| Share of bottom decile with zero after-tax income | 37.8% | 0.75% |
| After-tax income of tax units with negative after-tax income | -$23.8B | -$21.8B |
| After-tax income of bottom decile with positive after-tax income | $21.0B | $78.4B |
| After-tax income of full bottom decile | -$2.8B | $56.6B |
| Negative tax units' reduction to bottom decile's after-tax income, relative to omitting | -113% | -28% |
| After-tax income of bottom quintile with positive after-tax income | - | $334B |
| After-tax income of full bottom quintile | - | $312B |
| Change in bottom quintile's after-tax income from negative tax units, relative to omitting | - | -6.5% |

Given these negatives still affect the bottom decile by 28%, it seems like this may at least warrant a caveat in TaxBrain or something. Is there a source or justification for the decision of other tax analysis groups to include them? I couldn't find anything from TPC, TF, or CBO with a quick search.
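For reference, the "relative to omitting" percentages can be reproduced from the dollar figures in the table, reading each one as the negative units' aggregate divided by the aggregate of the units with positive income. That reading is an inference from the numbers, and the function name is made up for this sketch.

```python
def reduction_relative_to_omitting(negative_total, positive_total):
    """Percent by which negative units lower a group's aggregate income."""
    return 100.0 * negative_total / positive_total


# Dollar figures in billions, taken from the table.
decile_without_benefits = reduction_relative_to_omitting(-23.8, 21.0)
decile_with_benefits = reduction_relative_to_omitting(-21.8, 78.4)
quintile_with_benefits = reduction_relative_to_omitting(-21.8, 334.0)
```

Rounded, these come to about -113%, -28%, and -6.5%, matching the table.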

Certain use cases will also justify omitting or zeroing out negatives, for example calculating the Gini coefficient. Users could spin up their own code, but it might be worth some additional taxcalc code to standardize this at some point.
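As one illustration of what such standardized code might do, here is a minimal weighted Gini sketch with an explicit switch for how negative incomes are handled. The function name and its `negatives` argument are hypothetical and not part of taxcalc; the Gini itself is computed from the Lorenz curve with the trapezoid rule.

```python
def weighted_gini(units, negatives="omit"):
    """Weighted Gini coefficient of (income, weight) pairs.

    negatives="omit" drops units with negative income;
    negatives="zero" keeps them but treats their income as zero.
    """
    if negatives == "omit":
        cleaned = [(x, w) for x, w in units if x >= 0]
    elif negatives == "zero":
        cleaned = [(max(x, 0.0), w) for x, w in units]
    else:
        raise ValueError("negatives must be 'omit' or 'zero'")
    cleaned.sort(key=lambda u: u[0])
    total_weight = sum(w for _, w in cleaned)
    total_income = sum(x * w for x, w in cleaned)
    if total_weight <= 0 or total_income <= 0:
        raise ValueError("need positive total weight and total income")
    gini = 1.0
    lorenz_prev = 0.0
    running_income = 0.0
    for x, w in cleaned:
        running_income += x * w
        lorenz = running_income / total_income
        # Subtract twice the trapezoid area under this Lorenz segment.
        gini -= (w / total_weight) * (lorenz_prev + lorenz)
        lorenz_prev = lorenz
    return gini


# Made-up data: one unit with a loss, one with nothing, one with income.
gini_example = [(-5.0, 1.0), (0.0, 1.0), (1.0, 1.0)]
gini_omit = weighted_gini(gini_example, negatives="omit")
gini_zero = weighted_gini(gini_example, negatives="zero")
```

The two treatments give different answers on the same data, which is part of the case for settling on one convention.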

@codykallen

@MaxGhenis mentioned:

I couldn't find anything from TPC, TF, or CBO with a quick search.

Tax Policy Center omits those with negative income from distributional analyses but includes them in totals. From the footnotes to their distributional tables:

Tax units with negative adjusted gross income are excluded from their respective income class but are included in the totals.

JCT excludes such taxpayers as well. From the footnotes to their distributional tables:

Individuals who are dependents of other taxpayers and taxpayers with negative income are excluded from the analysis.

Tax Foundation's approach is not so clear in their publications, but a footnote from a 2009 TF working paper says,

Negative income famillies excluded from bottom quintile but included in totals.

CBO does this too. According to their report, "The Distribution of Household Income and Federal Taxes, 2013" (published August 2016):

If a household has negative income (that is, if its business or investment losses are larger than its other income), it is excluded from the lowest income group but included in totals.
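The convention these footnotes describe, excluding negative-income units from the income classes while still counting them in the totals, might be sketched as follows. The function name, the tuple layout, and the midpoint rule for assigning units to groups are assumptions of this sketch and are not taken from any of the organizations quoted.

```python
def table_excluding_negatives(units, n_groups=5):
    """Aggregate an amount by income group, keeping negatives out of groups.

    units: (income, weight, amount) tuples, where amount is whatever is
    being tabulated per unit (for example a tax change).
    """
    nonneg = sorted((u for u in units if u[0] >= 0), key=lambda u: u[0])
    total_weight = sum(w for _, w, _ in nonneg)
    groups = [0.0] * n_groups
    cumulative = 0.0
    for _, w, amount in nonneg:
        # Rank only the non-negative units when forming the groups.
        midpoint = (cumulative + w / 2.0) / total_weight
        index = min(int(midpoint * n_groups), n_groups - 1)
        groups[index] += amount * w
        cumulative += w
    negative = sum(amount * w for income, w, amount in units if income < 0)
    # The total row still includes the negative-income units.
    return {"groups": groups, "negative": negative,
            "total": sum(groups) + negative}


# Made-up units for illustration only.
table_example = [(-100.0, 1.0, 5.0)] + [
    (income, 1.0, 1.0) for income in (10.0, 20.0, 30.0, 40.0, 50.0)
]
table = table_excluding_negatives(table_example)
```

With this layout the group rows no longer sum to the total row, and the gap is exactly the amount attributed to the negative-income units.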

@martinholmer
Contributor Author

Please continue this discussion in Tax-Calculator issue 1888, where @MaxGhenis's and @codykallen's 2/16/18 comments are reproduced.
