Add TAXSIM-35 Validation: To replace PR #2453 #2619

Merged: 77 commits, May 5, 2023

jdebacker
Member

@jdebacker jdebacker commented Sep 6, 2021

This PR works off the contributions from @chusloj in PR #2453.

My goals are the following (over and above what was proposed in PR #2453):

  • Have PR from a branch controlled by an active contributor
  • Add test datasets for 2017
  • Verify that the input datasets produced by taxsim_input.py for the years 2017 and 2018 return the same results with TAXSIM-27 and TAXSIM-32 (or explain differences if not)
    • They do not. We find that a small number of records, mostly with capital losses, do not match exactly in the two calculators.
    • We have not yet been able to explain this.
    • Input files for 2017 produce the same results with TAXSIM-27 and 32.
    • Input file "a" for 2018 produces the same results with TAXSIM-27 and 32.
    • Input files "b" and "c" for 2018 produce different results for a small number of records with TAXSIM-27 and 32.
  • Verify that for years 2017 and 2018, the input datasets from the TAXSIM-27 validation (in /taxcalc/validation/taxsim27), produce the same output from TAXSIM-27 and TAXSIM-32 (or explain differences if not)
    • We'll call the exercise above sufficient. This is essentially the same check.
  • Verify that the differences between taxcalc and TAXSIM-32 are the same as any differences between taxcalc and TAXSIM-27 (checked into the repo as /taxcalc/validation/taxsim27/{assumption set}{year}.taxdiffs-expect) using the input datasets in /taxcalc/validation/taxsim27
  • Verify that any differences between taxcalc and TAXSIM-32 for the input datasets produced by taxsim_input.py for the years 2017 and 2018 can be explained by the same reasons as any non-zero expected differences in /taxcalc/validation/taxsim27/{assumption set}{year}.taxdiffs-expect
  • Ensure that differences between taxcalc and TAXSIM-32 for the input datasets produced by taxsim_input.py for the year 2019 are zero or can be explained (from the TAXSIM-27 validation, it appears that for the a and b input datasets there are few differences, all less than $1, but the c datasets may produce some larger differences; see, e.g., c18.taxdiffs-expect)
  • Add utilities to more easily perform these validation exercises:
    • Output saved uses taxcalc variable names (rather than the v names from TAXSIM) to ease interpretation of output.
    • Descriptive tables are produced comparing input variables from observations that do not match and those that do (e.g., so one can easily see differences such as "those that don't match have pass-through income while all those that match do not").
    • CSV files with TAXSIM and taxcalc intermediate variables (with TAXSIM names mapped to taxcalc variable names for ease of comparison) are saved for samples of observations that do not match (e.g., to help identify where in the determination of the income tax amount the calculations begin to differ).
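
To make the utilities listed above concrete, here is a minimal sketch of the three steps they describe: renaming TAXSIM outputs to taxcalc names, splitting records into matching and non-matching sets, and tabulating input variables for each set. It is not the code in this PR. The function names, the tolerance, the mapping entries, and all numbers are assumptions made for illustration, and the mapping should be checked against the TAXSIM documentation before use. It uses only the standard library so that it runs anywhere.

```python
from statistics import mean

# Hypothetical, partial mapping from TAXSIM output names to taxcalc names.
# Illustrative only: verify each pair against the TAXSIM documentation.
TAXSIM_TO_TAXCALC = {"fiitax": "iitax", "v10": "c00100", "v18": "c04800"}


def rename_taxsim(record, mapping=TAXSIM_TO_TAXCALC):
    """Return a copy of a TAXSIM output record keyed by taxcalc names."""
    return {mapping.get(name, name): value for name, value in record.items()}


def split_by_match(taxsim_out, taxcalc_out, var="iitax", tol=0.01):
    """Split record IDs into (matching, non-matching) on one output variable."""
    match, nomatch = [], []
    for recid, ts_rec in taxsim_out.items():
        diff = abs(ts_rec[var] - taxcalc_out[recid][var])
        (match if diff <= tol else nomatch).append(recid)
    return match, nomatch


def describe(inputs, recids, variables):
    """Mean and share of nonzero values of each input variable over recids."""
    table = {}
    for var in variables:
        values = [inputs[r][var] for r in recids]
        n = len(values)
        table[var] = {
            "mean": mean(values) if n else 0.0,
            "share_nonzero": sum(v != 0 for v in values) / n if n else 0.0,
        }
    return table


# Made-up example data: three records keyed by record ID.
inputs = {
    1: {"e00200": 50000.0, "e26270": 0.0},
    2: {"e00200": 40000.0, "e26270": 0.0},
    3: {"e00200": 30000.0, "e26270": 12000.0},  # has pass-through income
}
taxsim_raw = {1: {"fiitax": 4000.0}, 2: {"fiitax": 3000.0}, 3: {"fiitax": 2650.0}}
taxcalc_out = {1: {"iitax": 4000.0}, 2: {"iitax": 3000.004}, 3: {"iitax": 2500.0}}

taxsim_out = {recid: rename_taxsim(rec) for recid, rec in taxsim_raw.items()}
match, nomatch = split_by_match(taxsim_out, taxcalc_out)
match_table = describe(inputs, match, ["e00200", "e26270"])
nomatch_table = describe(inputs, nomatch, ["e00200", "e26270"])
```

In this toy data, the only non-matching record is also the only one with nonzero `e26270`, which is the kind of pattern the descriptive tables are meant to surface.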

Other suggestions welcome.

cc @bodiyang @MattHJensen

@jdebacker jdebacker changed the title [WIP] Add TAXSIM-32 Validation: To replace PR #2453 [WIP] Add TAXSIM-35 Validation: To replace PR #2453 Mar 16, 2023
@jdebacker
Member Author

Now using TAXSIM-35 for the validation...

@jdebacker jdebacker changed the title [WIP] Add TAXSIM-35 Validation: To replace PR #2453 Add TAXSIM-35 Validation: To replace PR #2453 May 3, 2023
@jdebacker jdebacker marked this pull request as ready for review May 3, 2023 23:16
@jdebacker jdebacker requested a review from MattHJensen May 3, 2023 23:16
@jdebacker
Member Author

I believe this PR is ready. It adds utilities for testing against TAXSIM 35.

It does not update the expected files for 2017-2019 and does not add them for 2020 and 2021. This should be done after a couple of issues are resolved, which the utilities included in this PR are helpful in identifying:

  1. Some differences between how Tax-Calculator and TAXSIM calculate the Recovery Rebate Credit in 2020 (it looks like a difference in how the phaseout is calculated).
  2. Some differences between how Tax-Calculator and TAXSIM calculate (or report) the child tax credit in 2021. Here, it looks like in many cases both yield the same tax liability, but report different amounts for the child tax credit (perhaps due to a difference in reporting the uncapped amount?).

I think work on those two issues should be done in subsequent PRs.

@feenberg
Contributor

feenberg commented May 4, 2023 via email

@MattHJensen
Contributor

Some differences between how Tax-Calculator and TAXSIM calculate (or report) the child tax credit in 2021. Here, it looks like in many cases both yield the same tax liability, but report different amounts for the child tax credit (perhaps due to a difference in reporting the uncapped amount?).

This might be due to TAXSIM35 adding the odc amount to the CTC amount. (HT Martin, who noted it here: #2658 (comment).)
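
One way to test this hypothesis would be to check, among records where the reported CTC differs, how often the TAXSIM amount equals the taxcalc CTC plus the taxcalc ODC. The sketch below assumes each record is a dict with the hypothetical keys `taxsim_ctc`, `taxcalc_ctc`, and `taxcalc_odc`. The function name, the tolerance, and the example values are all made up for illustration and are not taken from either calculator's output.

```python
def share_explained_by_odc(records, tol=0.01):
    """Among records whose reported CTC differs between the two calculators,
    return the share where TAXSIM's CTC equals taxcalc's CTC plus ODC.

    Returns None when no record differs.
    """
    differing = [
        r for r in records
        if abs(r["taxsim_ctc"] - r["taxcalc_ctc"]) > tol
    ]
    if not differing:
        return None
    explained = sum(
        abs(r["taxsim_ctc"] - (r["taxcalc_ctc"] + r["taxcalc_odc"])) <= tol
        for r in differing
    )
    return explained / len(differing)


# Made-up records (hypothetical key names and values).
records = [
    {"taxsim_ctc": 3500.0, "taxcalc_ctc": 3000.0, "taxcalc_odc": 500.0},  # gap equals ODC
    {"taxsim_ctc": 3000.0, "taxcalc_ctc": 3000.0, "taxcalc_odc": 0.0},    # no gap
    {"taxsim_ctc": 3600.0, "taxcalc_ctc": 3000.0, "taxcalc_odc": 500.0},  # gap not explained
]
share = share_explained_by_odc(records)
```

A share close to 1 on real output would support the ODC explanation. A lower share would suggest that something else, such as the capped versus uncapped amount, also contributes.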

@MattHJensen
Contributor

@jdebacker I support merging this and then tracking down additional differences in another PR if that's your preference. Looks really great.

@jdebacker jdebacker merged commit c5a6271 into PSLmodels:master May 5, 2023
@feenberg
Contributor

feenberg commented May 5, 2023 via email

@jdebacker jdebacker mentioned this pull request May 5, 2023

5 participants