Simple Python analysis tools for COVID-19 / SARS-CoV-2 data

These scripts are intended to help analyze data from several sources.

This repository contains:

  • a small library for pulling and updating data, and managing a cache of the most recent data,
  • some data analysis tools using Python's pandas, matplotlib and seaborn packages,
  • additional tools based on scikit-learn,
  • Jupyter notebook(s).

The included Jupyter notebooks work with COVID-19 / SARS-CoV-2 data provided by data.gouv.fr and data.europa.eu.

Gallery

See README-gallery.md for more, and README-Vaccine.md concerning vaccination data.

  • Comparison between largest 'départements' (from 'data.gouv.fr')
  • Vaccination: impact of vaccination on hospitalizations and deaths
  • Comparison between some European countries (from 'data.europa.eu')

Functionality

  • Jupyter notebook(s): display data. They automatically use the latest version of the data provided, which is cached locally and synchronized with the remote site automatically after a prescribed time interval.
  • Python modules :
    • manage a local repository of files, handling the file version/timestamp in the file name:

      1. The repository is managed as a cache, where only a specified number of versions of each file are kept.
      2. Cache size is limited by erasing older versions.
    • automate the transfer of files with information/directories located on the remote site:

      • identified by badge or tag on doc.data.gouv.fr. This uses the HTTP-based API documented at https://doc.data.gouv.fr/api/reference/#/datasets/list_datasets .
      • identified by a SPARQL filtering regular expression on the https://data.europa.eu/ SPARQL entry point, using the

    • support some queries on the downloaded/cached metadata describing the data loaded from the remote site

    • figureHelpers.py module:

      • some convenience tools to facilitate/automate making matplotlib figures. (Support for seaborn is also envisioned, at a later stage.)
    • perform some data analyses (model parameter fitting), see: ./JupySessions/FIT-Data-FromGouv.ipynb
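
The cache behaviour listed above (timestamp carried in the file name, a bounded number of versions per file, refresh after a prescribed interval) could be sketched as follows. This is a minimal illustration and not the library's actual code: the naming pattern `<stem>-<YYYYmmddHHMMSS><ext>` and the helper names `versioned_name`, `list_versions`, `prune_versions` and `needs_refresh` are assumptions made for the example.

```python
import re
from datetime import datetime, timedelta
from pathlib import Path

# Hypothetical naming scheme (an assumption): <stem>-<YYYYmmddHHMMSS><ext>
STAMP_FORMAT = "%Y%m%d%H%M%S"
STAMP_RE = re.compile(r"^(?P<stem>.+)-(?P<stamp>\d{14})(?P<ext>\.[^.]+)$")


def versioned_name(filename: str, when: datetime) -> str:
    """Return the file name with a timestamp inserted before the extension."""
    path = Path(filename)
    return f"{path.stem}-{when.strftime(STAMP_FORMAT)}{path.suffix}"


def list_versions(cache_dir: Path, filename: str) -> list:
    """Return (timestamp, path) pairs for cached versions of filename, newest first."""
    target = Path(filename)
    found = []
    for entry in cache_dir.iterdir():
        match = STAMP_RE.match(entry.name)
        if match and match["stem"] == target.stem and match["ext"] == target.suffix:
            found.append((datetime.strptime(match["stamp"], STAMP_FORMAT), entry))
    return sorted(found, reverse=True)


def prune_versions(cache_dir: Path, filename: str, keep: int = 3) -> list:
    """Delete all but the `keep` newest versions; return the removed paths."""
    removed = []
    for _, path in list_versions(cache_dir, filename)[keep:]:
        path.unlink()
        removed.append(path)
    return removed


def needs_refresh(cache_dir: Path, filename: str, max_age: timedelta, now: datetime) -> bool:
    """True when no cached version exists or the newest one is older than max_age."""
    versions = list_versions(cache_dir, filename)
    return not versions or now - versions[0][0] > max_age
```

A caller would typically test `needs_refresh(...)` first, download a new copy under `versioned_name(...)` when it returns true, and then call `prune_versions(...)` to keep the cache bounded.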

Bugs and changes

  • For more information on changes (and bugs), see the git log.

  • Concerning the French site data.gouv.fr, we are moving towards support of a larger subset of the API

  • Concerning the European site data.europa.eu:

    1. an issue has been corrected, see README-Bug-X-EuRDF.md
    2. A change in the format of the collected data files (introducing weekly data) is in progress. More in README-Data.md.
  • TBD: reduce redundancy (same files with different extensions) in cached data; for now, caches take more space than really needed
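
As a starting point for the redundancy item above, cached files that differ only by extension can be listed by grouping on the file stem. This is a sketch only: the function names `find_redundant_groups` and `redundant_bytes` and the flat cache directory layout are assumptions, and nothing here checks that the grouped files actually hold the same data.

```python
from collections import defaultdict
from pathlib import Path


def find_redundant_groups(cache_dir: Path) -> dict:
    """Map each file stem to its sorted file names when it exists with several extensions."""
    by_stem = defaultdict(list)
    for entry in cache_dir.iterdir():
        if entry.is_file():
            by_stem[entry.stem].append(entry.name)
    return {stem: sorted(names) for stem, names in by_stem.items() if len(names) > 1}


def redundant_bytes(cache_dir: Path) -> int:
    """Rough estimate: bytes used by each group's files, except the largest one."""
    total = 0
    for names in find_redundant_groups(cache_dir).values():
        sizes = sorted((cache_dir / name).stat().st_size for name in names)
        total += sum(sizes[:-1])
    return total
```

The byte count is only an upper bound on what could be reclaimed, since it assumes one file per group is enough to recover the others.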

Installation requirements

Python

  • This requires Python 3, and has been tested on:
    • Python 3.6.5 on Ubuntu 18.04 LTS (before 3Q2020)
    • Python 3.8.2 on Ubuntu 20.04 LTS (before 1Q2021)
    • Python 3.8.6 on Ubuntu 20.10 (current)
  • In the current version, the library depends on some features of the IPython package, which comes with Jupyter. This constraint may be removed in the future.

Jupyter

The notebooks are used with Jupyter as natively integrated / installed in the Ubuntu 20.04 LTS distribution.

Libraries

pip install -U -r requirements.txt
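
After installation, it can help to confirm that the interpreter and the packages named in this README are available. The sketch below reflects an assumption about what is worth checking (Python 3.6 or later, matching the oldest version listed above, plus the packages mentioned in this document). The helper names are illustrative, and requirements.txt remains the reference for the actual dependency list.

```python
import importlib.util
import sys

# Import names of packages mentioned in this README (scikit-learn imports as "sklearn").
# This list is an assumption; requirements.txt is the reference.
PACKAGES = ["IPython", "pandas", "matplotlib", "seaborn", "sklearn"]
MIN_PYTHON = (3, 6)


def missing_packages(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(version=None, minimum=MIN_PYTHON):
    """True when the (major, minor) interpreter version is at least `minimum`."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing packages:", missing_packages(PACKAGES) or "none")
```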

Warning(s)

This is provided as is, see the LICENSE file. Development is ongoing.
