Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add aggregate.py tests #315

Merged
merged 10 commits into from
Sep 5, 2017
Merged

Add aggregate.py tests #315

merged 10 commits into from
Sep 5, 2017

Conversation

hdoupe
Copy link
Contributor

@hdoupe hdoupe commented Aug 16, 2017

This PR adds tests for the remaining functions in aggregate.py.

@jdebacker I'm having some trouble generating data with the right dimensions for the functionrevenue. What are the expected dimensions for w , e and n?

@jdebacker
Copy link
Member

@hdoupe thanks for opening this PR.

Re dimensions:

  • w is a scalar or vector of varying length (up to length T+S)
  • e is S by J
  • n is S by J or T by S by J (where the first dimension will actually varying in length up to T+S - same as length of w)

@hdoupe
Copy link
Contributor Author

hdoupe commented Aug 31, 2017

For the function aggregates.revenue, I'm having trouble getting the case where I.ndim == 3 and etr_params.ndim == 3. @jdebacker do you have any idea what the dimensions should be to make this case work?

Also, I'm not sure if I can get the case where eter_params.ndim == 4. It seems like for every TPI case (which I think means I.ndim == 3) in the function revenue, etr_params is sliced by j. Since etr_params is at max 4 dimensions (at least I think this is the case), the highest dimension it can be when sent to tax.tau_income is three.

Compared to the other functions in aggregates.py, the revenue function is pretty complicated. So I saved the results from the runs on the test data and then the future results will be compared to the results that I saved this morning. @jdebacker and @rickecon does this sound sensible?

@jdebacker
Copy link
Member

@hdoupe The docstring for aggregates.revenue() doesn't see to be quite correct. etr_params is going to be TxSxJxM, where M = number of parameters in the ETR function. When slicing, we are always trying to keep that trailing dimension as the number of parameters. Note that The ETR function does not vary over J, but is tiled along the way to make array operation easier.

I is never more than three dimensions. It is TxSxJ.

Also, I'm not sure if I can get the case where eter_params.ndim == 4. It seems like for every TPI case (which I think means I.ndim == 3) in the function revenue, etr_params is sliced by j. Since etr_params is at max 4 dimensions (at least I think this is the case), the highest dimension it can be when sent to tax.tau_income is three.

I think what you are saying is right. etr_params can enter aggregates.revenue() as a 4D object, but after the slicing on lines 251-264, it is never more than 3D when passed to tax.tau_income(). But I'm not sure of the issue it raises -- other than the fact that we probably want to refactor the code to do some better array broadcasting in tax.py so that we don't need to loop over j in places like aggregates.revenue() so that thinks are cleaner and we avoid confusing slicing as much as possible.

@hdoupe
Copy link
Contributor Author

hdoupe commented Aug 31, 2017

@jdebacker Thanks for the quick reply.

I think what you are saying is right. etr_params can enter aggregates.revenue() as a 4D object, but after the slicing on lines 251-264, it is never more than 3D when passed to tax.tau_income(). But I'm not sure of the issue it raises

Ah right, I was thinking that we didn't need the if statement for etr_params.ndim == 4. But I looked again and saw that tax.tau_income is called from other places in the code where etr_params might be 4 dimensions.

@hdoupe
Copy link
Contributor Author

hdoupe commented Sep 1, 2017

I think this should be ready to merge if you guys are OK with the tests that I've added.

When I call aggregates.revenue using TPI parameters, it returns an an array of length T. I saved the output to a csv file and plan to compare that with future results. However, I could reduce the size of T to 30 or 40 and keep the arrays in the test_aggregates.py file. What do you guys think?

@jdebacker @rickecon

@jdebacker
Copy link
Member

jdebacker commented Sep 1, 2017

@hdoupe I prefer to keep things in the test_aggregates.py file. If the function works with T=30, I don't see why we'd expect any different behavior with T being larger.

@hdoupe
Copy link
Contributor Author

hdoupe commented Sep 1, 2017

@jdebacker Ok that makes sense. The most recent commit uses T = 30 and S = 20.

results
"""
T = 30
s, j = 20, 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hdoupe Might use capital letters for S and J (lower case in the code refers to an item in a range from 0 to S (or J), while S/J give the max number in that dimension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sounds good to me.


b = -0.1 + (7 * np.random.rand(T * s * j).reshape(T, s, j))
omega = 0.5 * np.random.rand(T * s).reshape(T, s, 1)
lambdas = 0.4 + (0.2 * np.random.rand(j).reshape(1, 1, j))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hdoupe Maybe doesn't matter for the calculations, but omega.sum() and lambdas.sum() should both be 1 in the model.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I can normalize omega and lambdas so that they sum to 1. How does that sound?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do normalize omega, which dimensions should sum to 1?

@jdebacker
Copy link
Member

jdebacker commented Sep 1, 2017 via email

@hdoupe
Copy link
Contributor Author

hdoupe commented Sep 5, 2017

@jdebacker I made the suggested changes. Do you think this is ready to merge?

@jdebacker
Copy link
Member

@hdoupe Yes - thanks.

@jdebacker jdebacker merged commit fca99e0 into PSLmodels:master Sep 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants