As commented in #124, it seems that when create_fake_icd() is called inside map(), it doesn't resample the date, so it generates only ICD-10 codes or only ICD-8 codes. Currently, it generates only ICD-10 codes, which is fine for development.
It's hard to see how we can generate two tables (lpr_adm and lpr_diag) independently such that the diagnosis codes in one are coherent with the dates in the other. The current implementation of create_fake_icd() can take the date of a record and adjust the ICD version to match the real data. However, since the date variable d_inddto is generated in lpr_adm and the diagnosis code c_diag is generated in lpr_diag, the date information isn't available inside lpr_diag to guide diagnosis code generation.
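To make the date dependence concrete, here is a minimal sketch in Python (not the project's own code) of picking the ICD version from the record date. The function name `fake_icd_for_date`, the code pools, and the exact 1994-01-01 cutoff are assumptions for illustration; the cutoff is inferred from the "after 1994" remark in this issue.

```python
import random
from datetime import date

# Assumption: ICD-10 applies from 1994 onward, ICD-8 before that.
ICD10_START = date(1994, 1, 1)

# Hypothetical placeholder pools, not real register contents.
ICD8_POOL = ["25000", "25001", "41009"]
ICD10_POOL = ["DE10", "DE11", "DI21"]


def fake_icd_for_date(record_date, rng):
    """Return a fake code whose ICD version matches the record date."""
    pool = ICD10_POOL if record_date >= ICD10_START else ICD8_POOL
    return rng.choice(pool)


rng = random.Random(1)
dates = [date(1985, 6, 1), date(1993, 12, 31), date(1994, 1, 1), date(2010, 3, 15)]

# Call the generator once per record so every date is evaluated separately,
# instead of reusing a single date for the whole column.
codes = [fake_icd_for_date(d, rng) for d in dates]
```

This only works when the date is at hand for each diagnosis row, which is exactly what is missing when lpr_diag is generated on its own.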
I think the easiest solution is to use a random sample of dates in lpr_adm and a random sample of ICD-8 and ICD-10 codes in lpr_diag, and accept that some ICD-8 codes in lpr_diag will be joined to dates after 1994 in lpr_adm, by which point ICD-8 had been phased out in the real data.
Alternatively, we would have to join the two tables after generation, resample dates or diagnosis codes, and then split the merged table before saving the two components in register_data.
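A rough sketch of that join, resample, split alternative, again in Python rather than the project's own code. The join key `recnum`, the function name `make_coherent`, the code pools, and the 1994-01-01 cutoff are all assumptions for illustration. Here only the diagnosis codes are resampled, so lpr_adm comes back unchanged.

```python
import random
from datetime import date

ICD10_START = date(1994, 1, 1)  # assumed cutoff between ICD-8 and ICD-10

# Hypothetical placeholder pools, not real register contents.
ICD8_POOL = ["25000", "25001", "41009"]
ICD10_POOL = ["DE10", "DE11", "DI21"]


def make_coherent(lpr_adm, lpr_diag, rng):
    """Join on the key, resample incoherent codes, return both tables."""
    # Join step: look up the admission date for each diagnosis row.
    date_by_key = {row["recnum"]: row["d_inddto"] for row in lpr_adm}
    fixed_diag = []
    for row in lpr_diag:
        admitted = date_by_key[row["recnum"]]
        pool = ICD10_POOL if admitted >= ICD10_START else ICD8_POOL
        # Resample step: keep coherent codes, redraw the rest.
        code = row["c_diag"] if row["c_diag"] in pool else rng.choice(pool)
        # Split step: write back only the lpr_diag columns.
        fixed_diag.append({"recnum": row["recnum"], "c_diag": code})
    return lpr_adm, fixed_diag


lpr_adm = [
    {"recnum": 1, "d_inddto": date(1988, 5, 2)},
    {"recnum": 2, "d_inddto": date(2001, 9, 9)},
]
lpr_diag = [
    {"recnum": 1, "c_diag": "DE10"},   # ICD-10 code on a 1988 admission
    {"recnum": 1, "c_diag": "25000"},  # already coherent
    {"recnum": 2, "c_diag": "41009"},  # ICD-8 code on a 2001 admission
]

adm_out, diag_out = make_coherent(lpr_adm, lpr_diag, random.Random(1))
```

The cost compared with independent sampling is the extra pass over the joined data and the need for a shared key between the two generated tables.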