Use sampling seed to standardize records subsample in test_pufcsv.py #869

Merged
merged 1 commit into from
Aug 22, 2016

Conversation

martinholmer
Collaborator

This pull request eliminates randomness in the subsample selected by the sampling test added in pull request #844 (as fixed in #864) by using a fixed sampling seed. It also does two other things. First, it moves the comparison of the combined tax liabilities generated by the subsample and the full sample into the test_agg() function in order to reduce test execution time. Second, it uses a sampling random-number seed that produces a relatively small difference under current-law policy between the subsample and full-sample combined tax liabilities. The table below shows, for each of nine tested sampling random-number seeds, the maximum relative difference (among the ten years of results from 2013 to 2022) between the subsample combined tax liability and the full-sample combined tax liability:

SEED  MAX RELATIVE DIFFERENCE (%)
---------------------------------
  10  -1.66
  20  -2.65
  30  +3.18
  40  -5.90
  50  +2.20
  60  -2.77
  70  +3.81
  80  +0.61
  90  -2.41
---------------------------------

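These results show that the choice of sampling seed matters: the maximum relative difference ranges in absolute value from 0.61% (seed 80) to 5.90% (seed 40), so picking the "right" sampling seed will make a big difference in user satisfaction.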
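
For readers who want to reproduce this kind of seed comparison, here is a minimal sketch of the idea. It is not the code in test_pufcsv.py, which is not shown in this thread: it uses Python's standard `random` module as a stand-in, and the function names, the 2% sampling fraction, and the synthetic (liability, weight) data are all hypothetical. It also assumes that "maximum relative difference" means the signed yearly difference with the largest absolute value.

```python
import random


def seeded_subsample(records, frac, seed):
    """Draw a reproducible subsample and rescale weights by 1/frac.

    records is a list of (liability, weight) pairs (hypothetical layout).
    """
    rng = random.Random(seed)  # private generator: same seed, same draw
    size = int(round(len(records) * frac))
    chosen = rng.sample(records, size)
    # Scale weights up so the subsample represents the full population.
    return [(liability, weight / frac) for liability, weight in chosen]


def weighted_total(records):
    """Weighted sum of liabilities."""
    return sum(liability * weight for liability, weight in records)


def relative_difference_pct(sub_total, full_total):
    """Signed percentage difference of the subsample total from the full total."""
    return 100.0 * (sub_total - full_total) / full_total


def max_relative_difference_pct(yearly_diffs):
    """Return the signed yearly difference with the largest absolute value."""
    return max(yearly_diffs, key=abs)


# Synthetic stand-in data, not real tax records.
data_rng = random.Random(12345)
full = [
    (data_rng.lognormvariate(8.0, 1.5), data_rng.uniform(50.0, 500.0))
    for _ in range(10000)
]
full_total = weighted_total(full)

# The same seed always selects the same subsample.
assert seeded_subsample(full, 0.02, seed=80) == seeded_subsample(full, 0.02, seed=80)

# Different seeds give different subsamples and different relative differences.
for seed in (10, 20, 30):
    sub_total = weighted_total(seeded_subsample(full, 0.02, seed))
    print(seed, round(relative_difference_pct(sub_total, full_total), 2))
```

With a fixed seed the printed difference for each seed is the same on every run, which is what lets a test compare the subsample result against a stored expectation.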

@MattHJensen @feenberg @talumbau @Amy-Xu @GoFroggyRun @zrisher @codykallen

@martinholmer martinholmer changed the title from "Use sampling seed to standardize Records subsample in test_pufcsv.py" to "Use sampling seed to standardize records subsample in test_pufcsv.py" Aug 19, 2016
@codecov-io

codecov-io commented Aug 19, 2016

Current coverage is 98.12% (diff: 100%)

Merging #869 into master will increase coverage by <.01%

@@             master       #869   diff @@
==========================================
  Files            13         13          
  Lines          1816       1818     +2   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           1782       1784     +2   
  Misses           34         34          
  Partials          0          0          

Powered by Codecov. Last update 9772cd1...430e82f

@martinholmer martinholmer merged commit 51f0ccc into PSLmodels:master Aug 22, 2016
@martinholmer martinholmer deleted the sample0 branch August 22, 2016 14:50