total_vaccinations and daily_vaccinations for Saudi Arabia do not match #333
Comments
You're missing the last step of the process, which is the 7-day rolling average! :)
|
So essentially, from the two intervals, you estimate daily vaccinations by distributing the total equally over the days in that particular interval. Since some intervals are less than 7 days long, you use multiple intervals when calculating the moving averages, which results in the cumulative total of the daily estimates being greater than the actual total recorded, as in the above case. From my understanding of a 7-day moving average, for this case you will have 4 data points for the rolling averages which, from your estimates, starts from |
That's because our 7-day window allows partial data. So the earliest result will be based on 1 point, the second one will be the average of 2 points, etc. And from the 7th one onwards, it's the average of the last 7 points. |
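The partial-window behaviour described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name `rollingMeanPartial` and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```javascript
// Rolling mean that accepts partial windows: position i averages the values
// from max(0, i - windowSize + 1) through i, so early positions use fewer points.
function rollingMeanPartial(values, windowSize = 7) {
  return values.map((_, i) => {
    const start = Math.max(0, i - windowSize + 1);
    const slice = values.slice(start, i + 1);
    const sum = slice.reduce((acc, v) => acc + v, 0);
    return sum / slice.length;
  });
}

// Hypothetical input, not real vaccination data.
const exampleValues = [10, 20, 30, 40, 50, 60, 70, 80];
console.log(rollingMeanPartial(exampleValues));
// [10, 15, 20, 25, 30, 35, 40, 50]
```

The first result uses 1 point, the second 2 points, and so on; only from the 7th position onwards does each result average a full 7 values.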
Pardon my ignorance. If I understood your explanation correctly, you calculate a 1-day, 2-day, ..., n-day, ..., 7-day average at each n-th day, and then continue with 7-day moving averages from the 7th data point onwards. If my interpretation is correct, don't you think the above figures are incorrect? I would expect the first 4 figures to be:
const estimatedDailyVaccination = (178337 - 137862) / 4; // 10118.75
const smoothedEstimatedDailyVaccination1 = 10118.75; // For day 1
const smoothedEstimatedDailyVaccination2 = (10118.75 + 10118.75) / 2; // For day 2
// And so on |
No worries, it'll give me a good opportunity to explain this process fully and redirect people here if the same questions arise later. Here's the step-by-step process, based on our current data for Saudi Arabia:
Here's the complete spreadsheet, where I've left all the formulas in the cells, so you can check how I arrived at each number: saudi_arabia_example.xlsx |
Thanks for the spreadsheet formulas. I was actually using the same method, except that I was starting from the first estimated value when calculating the rolling averages, whereas you are starting from the lower observed value instead. |
Sorry to trouble you. Is there a function that can do this?
|
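As a rough answer to the question above, the interpolation step discussed in this thread could be sketched as follows. This is an assumption-laden illustration rather than the maintainers' implementation: the function name `estimateDailyFromTotals` and the `{ date, total }` input shape are hypothetical, and the sketch simply spreads the increase between two consecutive reported totals evenly over the days in between.

```javascript
// Spread the increase between consecutive reported totals evenly over the
// days in between (linear interpolation followed by daily differences).
// `reports` is a hypothetical input shape: [{ date: "YYYY-MM-DD", total: number }],
// assumed to be sorted by date.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function estimateDailyFromTotals(reports) {
  const daily = [];
  for (let i = 1; i < reports.length; i++) {
    const prev = reports[i - 1];
    const curr = reports[i];
    // Date-only ISO strings are parsed as UTC, so the difference is whole days.
    const days = Math.round((Date.parse(curr.date) - Date.parse(prev.date)) / MS_PER_DAY);
    const perDay = (curr.total - prev.total) / days;
    for (let d = 1; d <= days; d++) {
      const date = new Date(Date.parse(prev.date) + d * MS_PER_DAY).toISOString().slice(0, 10);
      daily.push({ date, estimate: perDay });
    }
  }
  return daily;
}

// The two totals quoted in this issue for Saudi Arabia.
const saudiInterval = estimateDailyFromTotals([
  { date: "2021-01-07", total: 137862 },
  { date: "2021-01-11", total: 178337 },
]);
console.log(saudiInterval); // four entries, each with estimate 10118.75
```

These raw estimates would then be smoothed with a 7-day mean that accepts partial windows, as described earlier in the thread. The window for a given day can reach back into earlier intervals, which is why the published values differ from 10118.75.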
In one of the issues here, which I have failed to locate, I was made to understand that daily_vaccinations values are estimated from total_vaccinations values using interpolation for countries which do not report daily vaccination figures. Below is an extract of vaccination figures for Saudi Arabia.
My understanding is that since 2021-01-08, 2021-01-09 and 2021-01-10 do not have total_vaccinations values, the daily_vaccinations are estimated using the reported total_vaccinations figures for 2021-01-07 and 2021-01-11, which are 137862 and 178337 respectively. If the intermediate values are interpolated using the two values, then the cumulative sum of the estimates shouldn't exceed the second value from which they were estimated. But in this case, 137862 + 23990 + 19366 + 17055 + 15667 equals 213940, which is much greater than 178337. What am I missing here, @edomt?
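The arithmetic behind the question can be checked directly. The sketch below uses only the figures quoted in the issue text; the variable names are illustrative, and the four daily values are assumed to cover 2021-01-08 through 2021-01-11.

```javascript
// Figures quoted in the issue text.
const totalStart = 137862; // total_vaccinations on 2021-01-07
const totalEnd = 178337; // total_vaccinations on 2021-01-11
// daily_vaccinations values, assumed to be for 2021-01-08 to 2021-01-11.
const publishedDaily = [23990, 19366, 17055, 15667];

const observedIncrease = totalEnd - totalStart;
const sumOfDaily = publishedDaily.reduce((acc, v) => acc + v, 0);

console.log(observedIncrease); // 40475
console.log(sumOfDaily); // 76078
console.log(totalStart + sumOfDaily); // 213940
```

The published daily values sum to 76078, whereas the reported totals rose by only 40475 over the same interval, which is the discrepancy the question points at.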