Update Medicare and Medicaid values in cps.csv.gz file #185
""" | ||
# replace medicare and medicaid |
This code should run faster than lines 348-354:
medicare_cols = 'MCARE_VAL' + pd.Series((np.arange(16) + 1).astype(str))
medicaid_cols = 'MCAID_VAL' + pd.Series((np.arange(16) + 1).astype(str))
count_medicare = data[medicare_cols].astype(bool).sum(axis=1)
count_medicaid = data[medicaid_cols].astype(bool).sum(axis=1)
See https://drive.google.com/file/d/1Fw8rcvcERKs9llMf6dfOVAPuqwuqUUah for an example.
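A minimal sketch of the comparison on toy data. Only the MCARE_VAL1 to MCARE_VAL16 column names come from this thread; the frame, its values, and the variable names are made up for illustration. It also shows one caveat: `astype(bool)` counts every nonzero entry, so it agrees with the `> 0` test in the diff only if no value is negative.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the CPS frame: 4 records, 16 person-level Medicare columns.
rng = np.random.default_rng(0)
n_people = 16
medicare_cols = ['MCARE_VAL{}'.format(i) for i in range(1, n_people + 1)]
values = rng.choice([0, 0, 0, 9500], size=(4, n_people))
data = pd.DataFrame(values, columns=medicare_cols)

# Loop version, as in the diff: add 1 for every person with a positive value.
count_loop = np.zeros(len(data), dtype=np.int64)
for col in medicare_cols:
    count_loop += np.where(data[col] > 0, 1, 0)

# Vectorized version: one comparison over the whole block, then a row sum.
count_vec = (data[medicare_cols] > 0).sum(axis=1)

# astype(bool) treats any nonzero value as True, so this matches the
# versions above only when the columns hold no negative values.
count_bool = data[medicare_cols].astype(bool).sum(axis=1)
```

On real data it may be worth checking that the two vectorized counts agree before swapping one for the other.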
Thanks for the tip!
cps_data/finalprep.py
Outdated
medicaid_var = 'MCAID_VAL{}'.format(i)
count_medicare += np.where(data[medicare_var] > 0, 1, 0)
count_medicaid += np.where(data[medicaid_var] > 0, 1, 0)
new_medicare = count_medicare * 12000
Are 12000 and 6000 necessary here? I think they just get divided away by the scale* columns. My understanding is that the new benefit values equal the total current benefit values divided by the number of recipients. If so, you can remove the new_medica* columns and just use the counts.
If I'm missing something and they are needed, WDYT about making them constants with a brief explanation of why they're those values?
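The cancellation described above can be checked numerically. This is a sketch under an assumption: that the scale factor is the target total divided by the weighted sum of the provisional values. The function name, weights, and target total are hypothetical and not taken from finalprep.py.

```python
import numpy as np

def scaled_benefit(count, weight, placeholder, target_total):
    """Provisional benefit = count * placeholder, rescaled to hit target_total."""
    provisional = count * placeholder
    # Assumed form of the scale step: one factor for the whole file.
    scale = target_total / (provisional * weight).sum()
    return provisional * scale

count = np.array([0, 1, 2, 1])                  # recipients per filing unit
weight = np.array([100.0, 200.0, 150.0, 50.0])  # hypothetical sampling weights
target = 6.0e6                                  # hypothetical program total

with_12000 = scaled_benefit(count, weight, 12000, target)
with_1 = scaled_benefit(count, weight, 1, target)

# The placeholder cancels: both runs assign the same benefit to every unit.
same_result = np.allclose(with_12000, with_1)
```

The cancellation holds only when a single placeholder is used for the whole program. If the placeholder differs by income group, the ratios between groups survive the rescaling, which is the caveat raised in the reply.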
I think Anderson is implementing my proposal, which I described in the C-TAM issue. The 12k and 6k are sort of the insurance value based on imputed benefits from MEPS, even though more precisely they should be
I see where you're coming from. That would avoid the scaler step. But one caveat is that for Medicare we still want to differentiate benefits by income group.
The income upper bound refers to individual-level WAS. In our case, maybe we could say that if either the primary or secondary earner has WAS higher than $900k, we just give them the lower value, $8776. Do we even have people with income that high in the CPS tax unit dataset? If we have a good number of them, it's probably worth the time to use this table. If not, I think it's easier to go with what Max suggests. The result should be the same.
We do have people with WAS higher than $900K. I'll switch the variables to constants. It would probably be worth differentiating between those higher earners, because there's a pretty significant difference in insurance value.
Isn't $900K the upper bound for the quintile? So since it's the top quintile, it's just the maximum observed income? If so, since $0 is the upper bound for the 4th quintile, should anyone with positive income be assigned the $8,776 value? Probably some with $0 too, but that'd require some randomness; alternatively, those with positive incomes could be set to something lower than $8,776 such that the average when adjusting for the share with $0 is $8,776.
In fact I'm wondering if the quintile table in general would be simpler and more usable for this purpose if it only has two splits, for both Medicare and Medicaid: $0 income and >$0 income.
Yep! I posted the table for insurance value in my in-line comment. Being a lazy person, if I were doing this, I would define the constant at the top, and add a link to the general C-TAM documentation.
Why exactly is the "insurance value" of Medicaid and Medicare being calculated by income group rather than for the whole population?
@martinholmer This is what Dan suggested to me. @feenberg Dan, do you mind explaining in detail here? Thanks!
@martinholmer asked:
And @Amy-Xu responded:
The research I'm familiar with computes a single actuarial value of medical benefits for the whole recipient population (not for subgroups of the recipient population). But maybe I'm unaware of other approaches. Can you provide links to research papers that compute different actuarial values for different subgroups of the recipient population?
I can imagine that the justification for different imputed insurance values by income is that higher income people have better health. But that hardly takes account of the value to the recipient of the insurance, which can be low for low-income households who have more pressing needs. I believe the government generally values the insurance at the amount of health care spending the family would otherwise have made. This seems too low. Finkelstein has a paper on this - we could borrow her numbers: https://economics.stanford.edu/sites/default/files/valueofmedicaid_nov17_2016.pdf but it doesn't provide different estimates by income. The numbers are probably more defensible than the alternatives.
Dan’s comment doesn’t make sense to me. We need more discussion next week.
On May 3, 2018, at 9:46 AM, Daniel Feenberg ***@***.***> wrote:
I can imagine that the justification for different imputed insurance values by income is that higher income people have better health. But that hardly takes account of the value to the recipient of the insurance, which can be low for low-income households who have more pressing needs. I believe the government generally values the insurance at the amount of health care spending the family would otherwise have made. This seems too low. Finkelstein has a paper on this - we could borrow her numbers:https://economics.stanford.edu/sites/default/files/valueofmedicaid_nov17_2016.pdf but it doesn't provide different estimates by income. The numbers are probably more defensible than the alternatives.
The difference in Medicare benefit values between those with income and those without seems significant enough to be worth modeling (30%+). Wouldn't supplemental insurance explain a lot of this?
On Thu, 3 May 2018, Max Ghenis wrote:
The difference in Medicare benefit values between those with income and
those without seems significant enough to be worth modeling (30%+). Wouldn't
supplemental insurance explain a lot of this?
Medicare is a transfer from the healthy to the sick - that seems right and
it seems sensible that a measure of progressivity should include that
effect. What doesn't seem sensible to me is to add the transfer to income.
Sickness didn't make the person richer, why should we move them to a
higher income category? Do we even do that?
Note that I am seeming to suggest that we should use insurance value for
categorizing families by income, but add insurance benefits paid to
pre-benefit/post tax income when calculating the tax change and the degree
of progressivity. There is something to be said for this approach, but
perhaps it is not mainstream and should be reserved for a later advocate.
Daniel Feenberg
I agree with this, but I think it differs from the case of a Medicare recipient deriving less insurance value from Medicare because they have supplemental insurance. In this case it doesn't seem as out there to say the Medicare recipient without supplemental insurance is made "more richer" by Medicare than the one with supplemental insurance. That said, unless there's data on this, it's probably hard to untangle from health differences. ACA subsidies aren't included in taxdata, right? Would any of these choices affect that if they were?
@andersonfrailey seems like assigning every enrollee a simple average is the best option at the moment, according to issue PSLmodels/C-TAM#71. Could you revise the code when you get a chance?
@Amy-Xu, yep I'll get to work on that.
@Amy-Xu, can you confirm the direction we decided to go in with this PR?
@andersonfrailey Yep, we want to assign one uniform value to every beneficiary, and the value is calculated from total cost over total beneficiaries.
@Amy-Xu got it. So just to be explicit, we're going to sum up the entire value of Medicare and Medicaid in the file, divide it by the total number of recipients, and assign that value to every beneficiary, correct?
@andersonfrailey I think so. The imputed sum should be equal to the administrative total for the benefits.
@Amy-Xu can you review my latest commit?
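The agreed approach can be sketched as follows. This is an illustration, not the code in the commit: the `weight` and `mcare_count` column names and all values are hypothetical, and the use of sampling weights in both sums is an assumption. Only `mcare_ben` appears elsewhere in this PR.

```python
import pandas as pd

# Toy filing units: sampling weight, number of Medicare recipients,
# and the benefit currently recorded for the unit.
data = pd.DataFrame({
    'weight': [100.0, 200.0, 150.0],
    'mcare_count': [1, 2, 0],
    'mcare_ben': [9000.0, 26000.0, 0.0],
})

# Total cost over total beneficiaries (both weighted, by assumption).
total_value = (data['mcare_ben'] * data['weight']).sum()
total_recipients = (data['mcare_count'] * data['weight']).sum()
value_per_recipient = total_value / total_recipients

# Every beneficiary gets the same value; a unit's benefit is value * count.
data['mcare_ben_new'] = data['mcare_count'] * value_per_recipient

# The imputed weighted sum should reproduce the original total.
new_total = (data['mcare_ben_new'] * data['weight']).sum()
```

With these toy numbers the per-recipient value is 12200, and the weighted total is unchanged by the reassignment.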
tests/cps_agg_expected.txt
Outdated
mcare_ben 1700697749
mcaid_ben 904042846
mcare_ben 1778073024
mcaid_ben 888211102
Are these two the total amounts of benefits for medicare and medicaid respectively?
These are the unweighted totals, yes. The weighted totals didn't change.
Got you.
@Amy-Xu are you comfortable with the latest changes in this PR? Can I go ahead and merge it at the end of the day?
I think the code looks good to me, but it might be more helpful if the aggregates displayed are the weighted totals, because the changes will be more sensible and we can compare them with the administrative totals. That said, it could be an improvement later on.
Thanks for taking a look, @Amy-Xu. We might add a test later that shows weighted aggregates. With regard to this PR, though, the aggregate totals have stayed the same.
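A possible shape for such a check, as a rough sketch. The helper name, the `weight` column, and the toy values are hypothetical; the point is only that redistributing a benefit across units can move the unweighted sum while leaving the weighted sum untouched.

```python
import pandas as pd

def aggregates(df, var, weight_col='weight'):
    """Return (unweighted sum, weighted sum) of one benefit column."""
    unweighted = df[var].sum()
    weighted = (df[var] * df[weight_col]).sum()
    return unweighted, weighted

weights = [100.0, 200.0, 150.0]
before = pd.DataFrame({'weight': weights, 'mcare_ben': [9000.0, 26000.0, 0.0]})
after = pd.DataFrame({'weight': weights, 'mcare_ben': [12200.0, 24400.0, 0.0]})

unweighted_before, weighted_before = aggregates(before, 'mcare_ben')
unweighted_after, weighted_after = aggregates(after, 'mcare_ben')
```

Here the unweighted totals differ between the two frames, but both weighted totals are identical, which is the figure that can be compared with an administrative total.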
@andersonfrailey, Why does the test-calculated unweighted sum of
Once #261 and #271 are merged (on Friday?), does it make sense to remove merge conflicts, and then double-check code changes, and then merge #185?
@martinholmer, the unweighted sum of
Once the two PRs you mentioned are merged, I believe it does make sense to remove the merge conflicts here and merge.
@andersonfrailey said:
OK. Thanks for the explanation.
@andersonfrailey concluded:
OK. Do you think the merger of #185 can happen tomorrow (Friday)? When would you like me to merge #271, so that you have time to work on #185?
@martinholmer, if you merge #271 tomorrow morning, I'll have time to work on #185 right after lunch and, barring any unforeseen obstacles, have it ready to go in the afternoon. I personally plan on having #261 ready to go by 10 or so tomorrow morning.
@andersonfrailey, I'll merge #271 and #272 no later than around noon, so that they will both be (along with #261) in the master branch after you get back from lunch.
@andersonfrailey, I know you have asked @Amy-Xu several times during the development of PR #185 to check over the code changes and confirm that she thinks the code changes are doing the actuarial value calculation for Medicare and Medicaid correctly. But I am much less familiar with the CPS data file and the code in the
If that is not correct, then we may need to think about the code changes in PR #185 in more detail. If that turns out to be true, then we should not rush to merge #185 today (Friday, August 10th).
The same sort of situation arises in Medicaid, where it is common for only the kids in a family filing unit to be Medicaid (CHIP) recipients. So, for example in a family of five (husband, wife, and three kids covered by Medicaid CHIP), the Medicaid benefit for the filing unit could be just three (not five) times the actuarial value of Medicaid per Medicaid recipient.
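The family-of-five example amounts to multiplying the per-recipient value by the number of recipients in the unit, not by the unit's size. A small sketch of that rule, where the function name and the 5000 per-recipient value are hypothetical:

```python
def unit_benefit(recipient_flags, value_per_recipient):
    """Benefit for one filing unit: value times the number of recipients."""
    n_recipients = sum(1 for flag in recipient_flags if flag)
    return n_recipients * value_per_recipient

VALUE_PER_RECIPIENT = 5000  # hypothetical actuarial value per Medicaid recipient

# Husband, wife, and three kids; only the kids are Medicaid (CHIP) recipients.
family_of_five = [False, False, True, True, True]
family_benefit = unit_benefit(family_of_five, VALUE_PER_RECIPIENT)

# The same rule for couples: both covered versus only one covered.
couple_both = unit_benefit([True, True], VALUE_PER_RECIPIENT)
couple_one = unit_benefit([True, False], VALUE_PER_RECIPIENT)
```

This relies on the person-level recipient flags being assigned correctly upstream, which is the condition stated in the reply that follows.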
@martinholmer, assuming that both of those couples were assigned/not assigned correctly by C-TAM (two recipients in the first couple, one in the second), your example is correct. The new code in
@andersonfrailey explained:
Thanks for the clarification. So, it seems as if pull request #185 is good to go (after merge conflicts are eliminated) this afternoon.
Latest commits bring this up to date. Should be ready to go now @martinholmer.
@andersonfrailey, Thanks for the up-to-date version of PR #185 and for fixing my mistake in the new
This PR addresses C-TAM issue #68 and implements a change to the value of Medicare and Medicaid in the CPS file based on the discussion in the aforementioned issue.
@Amy-Xu, is what I've done here what you were suggesting?
I've labeled it "WIP" because I will need to go back and update stage 3 of the CPS file creation process once I've gotten the thumbs up from @Amy-Xu.
cc @MaxGhenis