To be checked when Mehedi is back. If strata are missing from the dataset, weights will be calculated over the total population, which is incorrect. What was the idea behind having a warning rather than an error? Currently, we set it as an error.
R/add_weights.R
stop("Cannot find the defined strata column in the provided sample frame.")
if(!strata_column_dataset %in% names(.dataset))
stop("Cannot find the defined strata column in the provided dataset.")
Member
@mhkhan27 mhkhan27 4 days ago
Perhaps we can add:
Error message if any stratum from the dataset is not available in the sample frame
Warning message if any stratum from the sample frame is not available in the dataset [I am suggesting a warning because it is sometimes normal to have population data for the whole country, including areas where we have access issues]
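For illustration, the two checks proposed above could be sketched roughly as follows. This is only a sketch in base R: `check_strata()` and its argument names are hypothetical and are not taken from R/add_weights.R.

```r
# Hypothetical helper (not part of the package): compares strata in both directions.
check_strata <- function(dataset_strata, sample_frame_strata) {
  # Strata observed in the data but absent from the sample frame: hard error,
  # because those rows could not be given a population figure.
  missing_in_frame <- setdiff(unique(dataset_strata), sample_frame_strata)
  if (length(missing_in_frame) > 0) {
    stop("Strata in the dataset but not in the sample frame: ",
         paste(missing_in_frame, collapse = ", "))
  }

  # Strata listed in the sample frame but never collected: warning only,
  # following the suggestion in the comment above.
  missing_in_data <- setdiff(unique(sample_frame_strata), dataset_strata)
  if (length(missing_in_data) > 0) {
    warning("Strata in the sample frame but not in the dataset: ",
            paste(missing_in_data, collapse = ", "))
  }

  invisible(TRUE)
}
```

Whether the second check should stay a warning or become an error is exactly the open question in this thread; the sketch only shows where each condition would be raised.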
Member
Author
@AbrahamAz AbrahamAz 4 days ago
done
Member
@yannsay-impact yannsay-impact 4 days ago
If a stratum is missing from the dataset, the weights df will return only NA.
If you add na.rm = T, the weights will include the population of the missing stratum, hence they will not be correct.
I am not really sure what Mehedi wanted to achieve here. We can change the warning to an error until he comes back.
# Missing strata in dataset: does not run
library(magrittr)  # provides %>%
set.seed(2133)
my_data <- data.frame(aa = runif(100),
                      strata = sample(LETTERS[1:5], size = 100, replace = TRUE))
# The sample frame lists stratum F, which never appears in my_data
my_sf <- data.frame(strata = LETTERS[1:6],
                    pop = c(10000, 10000, 20000, 30000, 5000, 5000))
add_weights(my_data,
            my_sf,
            strata_column_dataset = "strata",
            strata_column_sample = "strata",
            population_column = "pop") %>%
  dplyr::summarise(sum(weight))
# Missing strata in sampling frame: ok
library(magrittr)  # provides %>%
set.seed(2133)
# The dataset contains stratum F, which is not listed in my_sf
my_data <- data.frame(aa = runif(100),
                      strata = sample(LETTERS[1:6], size = 100, replace = TRUE))
my_sf <- data.frame(strata = LETTERS[1:5],
                    pop = c(10000, 10000, 20000, 30000, 5000))
add_weights(my_data,
            my_sf,
            strata_column_dataset = "strata",
            strata_column_sample = "strata",
            population_column = "pop") %>%
  dplyr::summarise(sum(weight))
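To make the NA and na.rm = T points above concrete, here is a small base R sketch. It assumes the weight is computed as (stratum population / total population) / (stratum sample size / total sample size). That formula is an assumption for illustration and is not confirmed by this thread, and the per-stratum counts are made up.

```r
# Assumed formula (not confirmed by the thread):
# weight = (stratum pop / total pop) / (stratum n / total n)
sf <- data.frame(strata = LETTERS[1:6],
                 pop = c(10000, 10000, 20000, 30000, 5000, 5000))
# Made-up sample sizes: stratum F was never collected
counts <- data.frame(strata = LETTERS[1:5], n = c(20, 20, 20, 20, 20))

# Keep every sample-frame row; F gets n = NA
w <- merge(sf, counts, by = "strata", all.x = TRUE)

# Without na.rm, sum(w$n) is NA, so every weight becomes NA
w$weight_na <- (w$pop / sum(w$pop)) / (w$n / sum(w$n))

# With na.rm = TRUE the weights are defined, but sum(w$pop) still
# includes the population of the uncollected stratum F
w$weight_rm <- (w$pop / sum(w$pop)) / (w$n / sum(w$n, na.rm = TRUE))

# Total weight over all respondents; the sample size here is 100
total_weight <- sum(w$weight_rm * w$n, na.rm = TRUE)
total_weight
```

Under these assumptions the total comes to 93.75 rather than 100, because the 5000 people in stratum F still count towards the population denominator.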
Member
Author
@AbrahamAz AbrahamAz 3 days ago
For the third point, I think what Mehedi meant is that the sampling frame was prepared, then data collection happened in fewer locations than planned for field reasons. As I understand it, this then requires changing the whole sample frame document to match the collected data, since filtering the uncollected strata out of the sample frame will not resolve the issue.
I changed it to an error for the time being, as mentioned, but we can look at it again.