This data set collates a growing number of critical indicators for assessment, monitoring and forecasting of the global COVID-19 situation. The data set is maintained by Starschema, an international data services consultancy.
On March 10, 2023, the Johns Hopkins Coronavirus Resource Center ceased collecting and reporting global COVID-19 data. Please use WHO, CDC or ECDC data sources for updated case counts.
A range of data sets have been published that are useful for monitoring and understanding the spread of COVID-19. Our efforts are intended to collate, curate and unify the most valuable data sources for enterprises, individuals and public health experts to assess the situation and make data-driven decisions. This single source easily blends with other data sources so you can analyze the movement of the SARS-CoV-2 pandemic over time, in any context.
Currently, the following data sets are included:
Name | Source | Table name |
---|---|---|
US COVID-19 testing and mortality | The COVID Tracking Project | CT_US_COVID_TESTS |
Global demographic data | The World Bank | DATABANK_DEMOGRAPHICS |
Global mobility data | Google | GOOG_GLOBAL_MOBILITY_REPORT |
ACAPS public health restriction data | ACAPS via HDX | HDX_ACAPS |
Global data on healthcare providers | OpenStreetMap, via Healthsites.io | HS_BULK_DATA |
Travel restrictions by airline | World Food Programme via HDX | HUM_RESTRICTIONS_AIRLINE |
Travel restrictions by country | World Food Programme via HDX | HUM_RESTRICTIONS_COUNTRY |
Forecasts from IHME | IHME | IHME_COVID_19 |
Global case counts | JHU CSSE | JHU_COVID_19 |
US healthcare capacity by state, 2018 | The Henry J. Kaiser Family Foundation | KFF_HCP_CAPACITY |
ICU beds by county, US | The Henry J. Kaiser Family Foundation | KFF_US_ICU_BEDS |
US policy actions by state | The Henry J. Kaiser Family Foundation | KFF_US_POLICY_ACTIONS |
US actions to mitigate spread, by state | The Henry J. Kaiser Family Foundation | KFF_US_STATE_MITIGATIONS |
Table metadata | | METADATA |
Detailed data on New York City | NYC DOHMH | NYC_HEALTH_TESTS |
US case and mortality counts, by county | The New York Times | NYT_US_COVID19 |
Italy case statistics, summary | Protezione Civile | PCM_DPS_COVID19 |
Italy case statistics, detailed | Protezione Civile | PCM_DPS_COVID19_DETAILS |
Detailed case counts and mortality by districts (Kreise), Germany | Robert Koch Institut | RKI_GER_COVID19_DASHBOARD |
Detailed case counts by province, sex and age band, Belgium | Sciensano | SCS_BE_DETAILED_PROVINCE_CASE_COUNTS |
Detailed hospitalisations by type of hospital care, Belgium | Sciensano | SCS_BE_DETAILED_HOSPITALISATIONS |
Detailed mortality by region, sex and age band, Belgium | Sciensano | SCS_BE_DETAILED_MORTALITY |
Number of tests performed by day, Belgium | Sciensano | SCS_BE_DETAILED_TESTS |
WHO situation reports | World Health Organization | WHO_SITUATION_REPORTS |
By convention, we unify geographies to ISO 3166-1 alpha-2 country identifiers and ISO 3166-2 subdivision identifiers. We use `pycountry`'s country definitions and mappings.
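As a rough illustration of this convention, the sketch below maps free-form country names or codes to ISO 3166-1 alpha-2 identifiers with `pycountry`. The helper name `to_alpha2` and the small fallback table are assumptions made for this example; they are not taken from the project's notebooks, which remain the authoritative record of the transformations actually applied.

```python
from typing import Optional

try:
    import pycountry  # third-party package: pip install pycountry
except ImportError:
    # Keeps the sketch runnable when pycountry is not installed.
    pycountry = None

# Illustrative subset, used only when pycountry is unavailable.
_FALLBACK = {
    "italy": "IT", "ita": "IT",
    "germany": "DE", "deu": "DE",
    "belgium": "BE", "bel": "BE",
}


def to_alpha2(name_or_code: str) -> Optional[str]:
    """Return the ISO 3166-1 alpha-2 code for a country name or code.

    Returns None when the input cannot be resolved.
    """
    key = name_or_code.strip()
    if pycountry is not None:
        try:
            # lookup() matches names and alpha-2/alpha-3 codes, case-insensitively.
            return pycountry.countries.lookup(key).alpha_2
        except LookupError:
            return None
    return _FALLBACK.get(key.lower())


if __name__ == "__main__":
    for raw in ["Italy", "DEU", "Belgium", "Atlantis"]:
        print(raw, "->", to_alpha2(raw))
```

Returning `None` for unresolved inputs, rather than raising, lets a caller collect unmatched geographies and review them separately.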
Raw data is available through several channels.
The COVID-19 data set is available on Snowflake Data Exchange. This data set is continuously refreshed.
You can use the `METADATA` table for metadata about each table, on a column level. Where the column is not specified, information pertains to the entire table.
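The table-level versus column-level rule can be sketched as follows. The field names (`table`, `column`, `comment`), the `EXAMPLE_COLUMN` entry and the `describe` helper are hypothetical placeholders chosen for this example; check the actual `METADATA` table for its real column names before adapting it.

```python
from typing import Dict, List, Optional

# Hypothetical rows shaped like METADATA entries; the real schema may differ.
ROWS: List[Dict[str, Optional[str]]] = [
    # No column specified: the comment describes the whole table.
    {"table": "JHU_COVID_19", "column": None, "comment": "Global case counts"},
    # Column specified: the comment describes that column only.
    {"table": "JHU_COVID_19", "column": "EXAMPLE_COLUMN", "comment": "Placeholder column note"},
    {"table": "IHME_COVID_19", "column": None, "comment": "Forecasts from IHME"},
]


def describe(rows: List[Dict[str, Optional[str]]],
             table: str,
             column: Optional[str] = None) -> List[Optional[str]]:
    """Return comments for one column, or table-level comments if column is None."""
    return [
        r["comment"]
        for r in rows
        if r["table"] == table and r["column"] == column
    ]


if __name__ == "__main__":
    print(describe(ROWS, "JHU_COVID_19"))                    # table-level entries
    print(describe(ROWS, "JHU_COVID_19", "EXAMPLE_COLUMN"))  # column-level entries
```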
Raw CSV files are available on AWS S3:
A Tableau Web Data Connector (WDC) is available for integrating the COVID-19 data set into your Tableau dashboards and analytical applications. It currently supports the JHU CSSE data set and the Italian case counts released by the Dipartimento della Protezione Civile. Coverage of the WDC is being expanded; please check back for details.
All applied transformation sets are documented in the Jupyter notebooks in the `notebooks/` folder.
The original data flow was designed by Allan Walker for Mapbox in Alteryx.
Use of this data source is subject to your implied acceptance of the following terms.
Data and transformations are provided 'as is', without any warranty or representation, express or implied, of correctness, usefulness or fitness to purpose. Starschema Inc. and its contributors disclaim all representations and warranties of any kind with respect to the data or code in this repository to the fullest extent permitted under applicable law.
The following data sets are subject to restrictions of use:
- JHU data sets: academic/research use only
- KFF data sets: non-commercial use only
- NYT data sets: non-commercial use only
The 2019 novel coronavirus (2019-nCoV)/COVID-19 outbreak is a rapidly evolving situation. Data may be out of date or incorrect due to reporting constraints. Before making healthcare or other personal decisions, please consult a physician licensed to practice in your jurisdiction and/or the website of the public health authorities in your jurisdiction, such as the CDC, Public Health England or Public Health Canada. Nothing in this repository is to be construed as medical advice.
To cite this work:
Foldi, T. and Csefalvay, K. 2019 Novel Coronavirus (2019-nCoV) and COVID-19 Unpivoted Data. Available on: https://github.com/starschema/COVID-19-data.