This Python project focuses on analyzing the Gapminder dataset, which includes information about countries and various indicators over time. The dataset contains the following columns: ['country', 'continent', 'year', 'lifeExp', 'pop', 'gdpPercap', 'iso_alpha', 'iso_num']. The analysis involves data wrangling, graphical visualization, and comparisons of countries based on their GDPs. Additionally, time series analysis is performed to examine the evolution of certain indicators, and population comparisons among continents are conducted.
Ensure you have the necessary Python libraries installed by running:
pip install pandas matplotlib seaborn
Files:
main.ipynb: Jupyter Notebook containing the main code for data analysis and visualization. data/gapminder.csv: The Gapminder dataset in CSV format.
Clone the repository:
git clone https://github.com/your-username/gapminder-analysis.git cd gapminder-analysis Run the Jupyter Notebook:
jupyter notebook main.ipynb Execute the cells in the notebook to reproduce the analysis and visualize the results.
- Data Wrangling: The dataset is loaded and cleaned to handle missing or inconsistent values.
- Graphical Visualization: Various graphs are generated to visualize the trends and patterns in the data. Comparison of countries based on GDP. Time series analysis of life expectancy, population, and GDP.
- Continent-wise Population Analysis: Evaluation of population distribution across continents. Identification of continents with the highest population.
The analysis code is documented with comments for clarity. Feel free to explore and modify the notebook for further analysis or to suit your specific needs.
The dataset used in this project is sourced from Gapminder (https://www.gapminder.org/).