Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ParamTools #2401

Merged
merged 68 commits into from
May 7, 2020
Merged

Use ParamTools #2401

merged 68 commits into from
May 7, 2020

Conversation

hdoupe
Copy link
Collaborator

@hdoupe hdoupe commented Mar 2, 2020

This PR shows how ParamTools could be used to handle Tax-Calculator's parameters. I used a script to convert the configuration files from the Tax-Calculator format to the ParamTools format, but besides this difference, there are only a couple public-facing API changes:

  • The inflation_rates() and growth_rates() methods now return dictionaries where the keys are years and the values are the rates. It was easier for me to work with this format since most operations require you to get a value for a certain year at a time. This can easily be reverted.

  • The parameter values that are accessible as instance attributes (e.g. pol.II_em) are not casted down into a lower dimension. So, if the year is 2017, then pol.II_em returns [4050.0] instead of 4050.0. ParamTools does this to keep the results consistent for when you have pol.set_year(2017) or pol.set_year([2017, 2018]). The Calculator.policy_param method automatically casts the value into a lower dimension to preserve compatibility:

    In [2]: pol = Policy()                                                                                                                                                                        
    
    In [3]: pol.set_year(2017)                                                                                                                                                                    
    
    In [4]: pol.II_em                                                                                                                                                                             
    Out[4]: array([4050.])
    
    In [5]: pol.set_year([2017, 2018])                                                                                                                                                            
    
    In [6]: pol.II_em                                                                                                                                                                             
    Out[6]: array([4050.,    0.])
    
    In [7]: from taxcalc import Policy, Calculator, Records                                                                                                                                       
    
    In [8]: calc = Calculator(policy=pol, records=Records.cps_constructor())                                                                                                                      
    
    In [9]: calc.policy_param("II_em")                                                                                                                                                            
    Out[9]: 4050.0

There may be some other API changes that I have missed. I'll be going over the code to make sure that I have not missed anything else. If anyone is interested, I'd really appreciate them running their downstream projects or notebooks with the changes in this branch to make sure I haven't broken anything.

There are few notable places where I maintained consistency:

  • The _update method still takes reforms that are in the Tax-Calculator format. The reforms are reformatted and passed to the ParamTools adjust method. All classes that inherit from taxcalc.Parameters will have this behavior. That's why none of the reforms in this repo needed to be reformatted in this PR.

  • Getting the parameter value using a leading underscore still returns its values for all years:

    In [10]: pol._II_em                                                                                                                                                                           
    Out[10]: 
    array([3900.  , 3950.  , 4000.  , 4050.  , 4050.  ,    0.  ,    0.  ,
              0.  ,    0.  ,    0.  ,    0.  ,    0.  ,    0.  , 4901.  ,
           5004.41, 5108.5 , 5215.78])

    This is done with a custom attribute getter which checks if the attribute meets this pattern and then does the necessary operations to get the values for all years.

  • Other attributes and methods like start_year, current_year, and set_year were either copied over to the new Parameters class or re-written to wrap the ParamTools API to maintain consistency.

I opened this PR as a draft because I'm not sure if the Tax-Calculator project is interested in adopting ParamTools. After all, Tax-Calculator has a perfectly good parameters processing backend that is fast, reliable, and well-tested. This exercise was very helpful for ParamTools. It prompted me to improve ParamTools's performance, make it more extendable, add better error messages, add warnings, and the list goes on. I even spotted a small bug in Tax-Calculator (#2381) while I was re-implementing the inflation indexing algorithm with ParamTools. I also included detailed documentation on the indexing algorithm in the new adjust method, which could be helpful to have in the Tax-Calculator docs. So, I think this exercise was helpful for Tax-Calculator, too. I'm excited to hear people's feedback on this PR.

…'s a Policy parameter

- Updates test_decorators.py for change in Policy parameter value
  dimensionality
@hdoupe
Copy link
Collaborator Author

hdoupe commented Mar 2, 2020

The test errors look like pycodestyle errors. I'll push some fixes soon.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Mar 4, 2020

I just pushed a whole bunch of codestyle fixes. I've been spoiled by my use of tools that format code for me. One thing that might be helpful for TC is to add a pre-commit hook that runs the pycodestyle command when you create a git commit (for example). If the code style test fails, then it won't let you make the commit.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Mar 4, 2020

@MattHJensen asked yesterday on the TC community call for a TODO list for reviewers:

  • Make sure all of my test changes are sound.
    • I'll leave a comment on GitHub on places where I'd especially appreciate advice/review.
  • If you have any notebooks, workflows, analyses, etc. where you use TC, please run them and make sure your code isn't broken.
  • Check the recipes to make sure those are still good-to-go. (I'm happy to help)
  • Maybe @MattHJensen or @Peter-Metz would be interested in stepping through the adjust method with me to make sure it's clear and readable.
  • Yes/No on the 2 API changes I mentioned in the initial comment for this PR.
    • inflation_rates() / growth_rates() return dicts
    • Instance attributes on taxcalc.Parameters are not casted down to lower dimension.
  • Feedback on print_warnings vs ignore_warnings (e.g. Use ParamTools #2401 (comment))
  • Redefined parameters. (e.g. Use ParamTools #2401 (comment))

My TODO list is:

  • Go back over code to make sure I'm not missing API changes.
    • Bring back initialize method.
  • Comment on any test changes.
  • Add equivalent version of tests that I found helpful in the early stages of developing this PR.
  • Upstream the select_lt function into ParamTools.
  • Remove warnings in ParamTools that are shown when adjusting the value of a parameter for a year other than current year.
  • Take another pass through adjust code to see if I can make it clearer. (I went a little crazy with list and dict comprehension here.)
  • Make updates to parameter documentation code (i.e. code that creates HTML files for the parameters).

@MattHJensen @Peter-Metz Let me know if I'm missing anything here.

@codecov
Copy link

codecov bot commented Mar 4, 2020

Codecov Report

Merging #2401 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2401   +/-   ##
=======================================
  Coverage   99.92%   99.92%           
=======================================
  Files          13       13           
  Lines        2530     2530           
=======================================
  Hits         2528     2528           
  Misses          2        2           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a892c37...a892c37. Read the comment docs.

@Peter-Metz
Copy link
Contributor

@hdoupe I reviewed the changes to Tax-Calc tests and they look good to me! (I didn't check the tests that are now handled by ParamTools)

@MattHJensen

@hdoupe
Copy link
Collaborator Author

hdoupe commented Apr 27, 2020

Thanks @Peter-Metz. Per @MattHJensen's suggestion I'm going to add a ParamTools version of the array tests.

I'm going to have time in about an hour to try to knock out these tasks per your feedback:

  • Fix key order in JSON files
  • Add back print_warnings functionality
  • Add array tests

Let me know if I'm missing anything.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Apr 27, 2020

@MattHJensen thanks for your feedback on how warnings should work in Tax-Calculator. I'm going to implement your suggestion in this PR.

@Peter-Metz
Copy link
Contributor

@hdoupe the only other thing i can think of is fixing the user guide. Let me know if I can help with any of these tasks.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Apr 27, 2020

@hdoupe the only other thing i can think of is fixing the user guide. Let me know if I can help with any of these tasks.

Do you have time to look at the user guide? I modified the make_uguide.py file so that it runs, but I haven't looked at the output closely yet.

@Peter-Metz
Copy link
Contributor

@hdoupe Yep, I'll take a look tomorrow.

@jdebacker
Copy link
Member

I installed TC from this branch and successfully ran all tests of the latest version of Cost of Capital Calculator with it.

@rickecon or I will try with OG-USA soon.

cc @hdoupe @MattHJensen @Peter-Metz

@hdoupe
Copy link
Collaborator Author

hdoupe commented Apr 30, 2020

Thanks @jdebacker! This is much appreciated!

@hdoupe
Copy link
Collaborator Author

hdoupe commented May 1, 2020

Thanks @Peter-Metz. Per @MattHJensen's suggestion I'm going to add a ParamTools version of the array tests.

I'm going to have time in about an hour to try to knock out these tasks per your feedback:

* [x]  Fix key order in JSON files

* [x]  Add back `print_warnings` functionality

* [x]  Add array tests

Let me know if I'm missing anything.

These tasks are complete. @MattHJensen @Peter-Metz let me know how things look.

Here's the script I used to fix the key order:

import json

if __name__ == "__main__":
    to_convert = [
        "taxcalc/policy_current_law.json",
        "taxcalc/growdiff.json",
        "taxcalc/consumption.json",
    ]

    keys = [
        "title",
        "description",
        "notes",
        "section_1",
        "section_2",
        "section_3",
        "start_year",
        "indexable",
        "indexed",
        "type",
        "value",
        "validators",
        "compatible_data",
    ]

    for path in to_convert:
        print('converting', path)
        with open(path, "r") as f:
            defaults = json.loads(f.read())

        sorted_defaults = {}
        if "schema" in defaults:
            sorted_defaults["schema"] = defaults["schema"]

        for param, data in defaults.items():
            if param == "schema":
                continue
            assert not (data.keys() - set(keys)), f"{param} has extra keys: {data.keys()}"
            sorted_defaults[param] = {}
            for key in keys:
                if key in data:
                    sorted_defaults[param][key] = data[key]

        with open(path, "w") as f:
            f.write(json.dumps(sorted_defaults, indent=4))

@jdebacker
Copy link
Member

@hdoupe I ran through OG-USA with TC built from this branch after commit f2c1dbd and has no issues.

cc @MattHJensen @Peter-Metz

@hdoupe
Copy link
Collaborator Author

hdoupe commented May 3, 2020

Glad to hear it @jdebacker. Thanks again for testing out these changes.

@hdoupe
Copy link
Collaborator Author

hdoupe commented May 4, 2020

@Peter-Metz @MattHJensen #2401 is ready for another round of review. I think we should be pretty close to getting this merged.

@Peter-Metz are you still good with re-working the handling of the CPI_offset parameter in a follow up PR?

@Peter-Metz
Copy link
Contributor

Peter-Metz commented May 4, 2020

@hdoupe @MattHJensen changes look good to me and print_warnings functionality works as desired!

Hank - yeah, I have the CPI_offset fix pretty much teed up as soon as this PR is merged.

@hdoupe
Copy link
Collaborator Author

hdoupe commented May 4, 2020

@Peter-Metz sounds great. I'm excited to see your CPI_offset work.

@MattHJensen
Copy link
Contributor

@hdoupe, thanks very much for this PR and for so generously accommodating Peter and my requests through the review process. I am merging this PR now and expect it to appear in a new release within the month.

@MattHJensen
Copy link
Contributor

@Peter-Metz, thanks for all of your help with the review!

@MattHJensen MattHJensen merged commit 72f5de9 into PSLmodels:master May 7, 2020
@hdoupe
Copy link
Collaborator Author

hdoupe commented May 7, 2020

thanks very much for this PR and for so generously accommodating Peter and my requests through the review process

It was my pleasure. @MattHJensen and @Peter-Metz thanks for the thorough review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants