COVID-19 Data Analytics using MapReduce paradigm on Apache Hadoop and Spark

kedarbartake98/Covid19_hadoop_spark

Covid19_hadoop_spark

COVID-19 Data Analytics using MapReduce paradigm on Apache Hadoop and Spark

Description of Input files:

  • covid19_full_data.csv: Daily record of new COVID-19 cases and new deaths for each country
  • populations.csv: Population of each country

Description of Tasks:

  • Task 1: Count the total number of cases per country up to a given date, using the daily per-country case records in the CSV file

  • Task 2: Count the total number of deaths per country within a given date range, using the daily per-country new-death records in the CSV file

  • Task 3: Compute the number of cases per million people for each country

  • Spark Task 1: Count the total number of cases per country within a given date range, using the daily per-country case records in the CSV file

  • Spark Task 2: Compute the number of cases per million people for each country
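
The tasks above share one pattern: filter daily rows by date, emit (country, count) pairs, and sum per country; the per-million tasks then divide by population. The following is a minimal pure-Python sketch of that pattern, not the repository's actual Hadoop or Spark code. The row layout (date, country, new cases, new deaths), the ISO date format, the function names, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows in an ASSUMED layout: (date, country, new_cases, new_deaths).
# The values are made up for illustration and are not real COVID-19 figures.
ROWS = [
    ("2020-03-01", "Country A", "5", "0"),
    ("2020-03-02", "Country A", "7", "1"),
    ("2020-03-03", "Country A", "3", "2"),
    ("2020-03-01", "Country B", "10", "1"),
    ("2020-03-03", "Country B", "20", "4"),
]
# Hypothetical populations, standing in for populations.csv.
POPULATIONS = {"Country A": 1_000_000, "Country B": 3_000_000}


def map_row(row, field, start, end):
    """Map phase: emit (country, count) when the row's date lies in [start, end]."""
    day, country, new_cases, new_deaths = row
    value = new_cases if field == "cases" else new_deaths
    if value != "" and start <= date.fromisoformat(day) <= end:
        # int(float(...)) also accepts counts written as "7.0".
        yield country, int(float(value))


def reduce_counts(pairs):
    """Reduce phase: sum the emitted counts for each country key."""
    totals = defaultdict(int)
    for country, count in pairs:
        totals[country] += count
    return dict(totals)


def count_per_country(rows, field, start, end):
    """Run the map phase over every row, then reduce by country."""
    pairs = (kv for row in rows for kv in map_row(row, field, start, end))
    return reduce_counts(pairs)


def cases_per_million(totals, populations):
    """Join totals with populations; skip countries with no population entry."""
    return {
        country: count * 1_000_000 / populations[country]
        for country, count in totals.items()
        if populations.get(country)
    }


# Task 1 style: cases up to a given date (no lower bound).
cases_until = count_per_country(ROWS, "cases", date.min, date(2020, 3, 2))
# Task 2 style: deaths within a date range.
deaths_in_range = count_per_country(ROWS, "deaths", date(2020, 3, 2), date(2020, 3, 3))
# Task 3 style: total cases per million people.
all_cases = count_per_country(ROWS, "cases", date.min, date.max)
per_million = cases_per_million(all_cases, POPULATIONS)
```

In a real Hadoop or Spark job the framework would handle the grouping between the map and reduce phases; here a dictionary plays that role so the sketch runs on its own.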
