Several instances have head_age = 999 #27

Open
prrathi opened this issue Feb 24, 2021 · 3 comments

Comments

@prrathi
Contributor

prrathi commented Feb 24, 2021

From the data info, head_age = 999 marks head ages that are unknown. If I'm grouping the instances by head age and income, should the instances with head_age = 999 be dropped?

@jdebacker
Member

@prrathi If you are using a pandas groupby, it will group by all age values, including this indicator for a missing value. Whether you should drop these values might depend on what you are trying to do, but for most OG-USA calibrations of age-specific values, I'd drop them. Also, given the few observations above age 80, it may be best to compute age-specific values for ages 20-80; we can then interpolate values for ages above 80.
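A minimal sketch of the point about groupby, assuming a DataFrame with a `head_age` column and a hypothetical `value` column standing in for whatever is being calibrated. The toy data is made up; only the `head_age = 999` indicator comes from this thread.

```python
import pandas as pd

# Hypothetical toy data; the real dataset and its other column names are not shown here.
df = pd.DataFrame({
    "head_age": [25, 25, 40, 999, 80, 85, 999],
    "value": [1.0, 3.0, 5.0, 100.0, 7.0, 9.0, 200.0],
})

MISSING_AGE = 999  # indicator for an unknown head age

# A plain groupby treats 999 as just another age and gives it its own group.
naive = df.groupby("head_age")["value"].mean()

# Dropping the indicator rows first keeps them out of the age-specific means.
known = df[df["head_age"] != MISSING_AGE]
by_age = known.groupby("head_age")["value"].mean()

print(naive)
print(by_age)
```

Filtering before grouping, rather than dropping the 999 row from the result afterwards, also keeps those rows out of any later aggregate computed from `known`.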

@prrathi
Contributor Author

prrathi commented Feb 24, 2021

Should the observations above age 80 be combined with age 80 or kept completely separate? Same question for the one observation with head age 18 and the 36 observations with head age 19.

@jdebacker
Member

@prrathi For consistency at this point, let's not include those over 80 or under 20.
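One way to apply that restriction, sketched under the same assumptions as before (a `head_age` column, a hypothetical `value` column, made-up toy data). `Series.between` is inclusive on both ends by default, so ages 20 and 80 themselves are kept, and the 999 indicator falls outside the range and is dropped by the same filter.

```python
import pandas as pd

# Hypothetical toy data covering the edge cases discussed: under 20, over 80, and 999.
df = pd.DataFrame({
    "head_age": [18, 19, 20, 20, 50, 80, 81, 999],
    "value": [1.0, 2.0, 3.0, 5.0, 6.0, 7.0, 8.0, 9.0],
})

MIN_AGE, MAX_AGE = 20, 80  # bounds taken from the thread

# Keep only rows with 20 <= head_age <= 80 (both endpoints included).
in_range = df[df["head_age"].between(MIN_AGE, MAX_AGE)]

# Age-specific means over the retained ages only.
by_age = in_range.groupby("head_age")["value"].mean()

print(by_age)
```

Values for ages above 80 would then have to come from a separate interpolation step, which is not shown because the thread does not specify the method.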
