Add some data for tests, and add_weights function #9
  stop("Cannot find the defined strata column in the provided sample frame.")
if (!strata_column_dataset %in% names(.dataset))
  stop("Cannot find the defined strata column in the provided dataset.")
Perhaps we can add:
- An error message if any strata from the dataset are missing from the sample frame.
- A warning message if any strata from the sample frame are missing from the dataset. I am suggesting a warning because it is sometimes normal to have population data for the whole country, including areas where we have access issues.
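A rough sketch of what those two checks could look like, in base R. The helper name `check_strata()` and the `sample_data` argument are hypothetical and only for illustration; the PR's actual function may name and structure these differently.

```r
# Hypothetical helper, not part of the PR: compares the strata found in the
# dataset with the strata listed in the sample frame.
check_strata <- function(.dataset, sample_data,
                         strata_column_dataset, strata_column_sample) {
  dataset_strata <- unique(.dataset[[strata_column_dataset]])
  sample_strata <- unique(sample_data[[strata_column_sample]])

  # Strata observed in the dataset but absent from the sample frame have no
  # population figure, so no weight can be computed for them: stop.
  missing_in_sample <- setdiff(dataset_strata, sample_strata)
  if (length(missing_in_sample) > 0) {
    stop("Strata in the dataset not found in the sample frame: ",
         paste(missing_in_sample, collapse = ", "))
  }

  # Strata listed in the sample frame but never surveyed may be legitimate
  # (for example, areas with access issues): warn only.
  missing_in_dataset <- setdiff(sample_strata, dataset_strata)
  if (length(missing_in_dataset) > 0) {
    warning("Strata in the sample frame not found in the dataset: ",
            paste(missing_in_dataset, collapse = ", "))
  }

  invisible(TRUE)
}
```

Calling this at the top of `add_weights()`, right after the column-name checks, would surface both situations before any weight is computed.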
done
- If a stratum from the sample frame is missing in the dataset, the weights df will return only NA.
- If you add na.rm = T, the weights will include the population of the missing strata, hence they will not be correct.
- I am not really sure what Mehedi wanted to achieve here. We can change the warning to an error until he comes back.
library(magrittr)  # provides %>%

# Missing strata in dataset (stratum "F" is only in the sample frame): does not run
set.seed(2133)
my_data <- data.frame(aa = runif(100),
                      strata = sample(LETTERS[1:5], size = 100, replace = TRUE))
my_sf <- data.frame(strata = LETTERS[1:6],
                    pop = c(10000, 10000, 20000, 30000, 5000, 5000))
add_weights(my_data,
            my_sf,
            strata_column_dataset = "strata",
            strata_column_sample = "strata",
            population_column = "pop") %>%
  dplyr::summarise(sum(weight))

# Missing strata in sampling frame (stratum "F" is only in the dataset): ok
set.seed(2133)
my_data <- data.frame(aa = runif(100),
                      strata = sample(LETTERS[1:6], size = 100, replace = TRUE))
my_sf <- data.frame(strata = LETTERS[1:5],
                    pop = c(10000, 10000, 20000, 30000, 5000))
add_weights(my_data,
            my_sf,
            strata_column_dataset = "strata",
            strata_column_sample = "strata",
            population_column = "pop") %>%
  dplyr::summarise(sum(weight))
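To illustrate the first two bullets, here is a minimal base R sketch of how the NA could arise. It is not the PR's implementation: `sketch_weights()` is a hypothetical stand-in, and the formula `(pop / sum(pop)) / (n / sum(n))` is an assumption about how the weights are computed.

```r
# Hypothetical stand-in for the weight computation, NOT the code in this PR.
# Assumed formula: weight = (pop / sum(pop)) / (n / sum(n)).
sketch_weights <- function(counts, sample_frame, na.rm = FALSE) {
  # Left join: every stratum of the sample frame is kept, so a stratum with
  # no interviews ends up with n = NA.
  w <- merge(sample_frame, counts, by = "strata", all.x = TRUE)
  w$weight <- (w$pop / sum(w$pop)) / (w$n / sum(w$n, na.rm = na.rm))
  w
}

# Interviews per stratum; stratum "F" was never surveyed.
counts <- data.frame(strata = LETTERS[1:5], n = rep(20, 5))
my_sf <- data.frame(strata = LETTERS[1:6],
                    pop = c(10000, 10000, 20000, 30000, 5000, 5000))

# Without na.rm, sum(n) is NA, so every weight becomes NA.
sketch_weights(counts, my_sf)$weight

# With na.rm = TRUE the weights are numbers, but sum(pop) still counts the
# population of "F", so the weights over the 100 interviews sum to roughly
# 93.75 instead of 100.
w <- sketch_weights(counts, my_sf, na.rm = TRUE)
sum(w$weight * w$n, na.rm = TRUE)
```

Under these assumptions, the only ways to get consistent weights are to stop with an error or to drop the unsurveyed strata from the sample frame before the population total is taken.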
For the third point, I think what Mehedi meant is the case where the sampling frame was drawn up, but data collection then happened in fewer locations than planned for field reasons. As I understand it, this requires changing the whole sample frame document to match the collected data, since simply filtering the uncollected strata out of the sample frame will not resolve the issue.
I have changed it to an error for the time being, as mentioned, but we can look at it again.