-
Notifications
You must be signed in to change notification settings - Fork 30
/
Copy pathindex.qmd
193 lines (129 loc) · 5.5 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
---
title: "Workshop Exercise: Emissions Data - Table, Map, and Chart"
format: html
---
# Introduction
In this exercise, you will work with a CO2 emissions dataset downloaded from Gapminder and produce a report with three tabs: a data table, a line chart, and a choropleth map.
The goal is to roughly replicate the [Our World in Data visualization page on consumption-based CO<sub>2</sub> emissions](https://ourworldindata.org/grapher/consumption-co2-emissions?tab=table&time=2000..2022).
Be sure to view that page to get an idea of the final product.
# Setup
- You should have forked and cloned this repository to your local machine.
- Now, create and select a virtual environment in VSCode.
- Install the following packages:
- pandas
- plotly
- itables
- ipykernel
- jupyter
- country_converter
- Download the data from [Gapminder](https://www.gapminder.org/data/) by selecting: *Environment > Emissions > CO2 Total emissions*, then downloading the CSV file into a `data` folder in your repository.
# Data Import
Run the following code to import the necessary libraries:
```{python}
import pandas as pd
import numpy as np
import plotly.express as px
from itables import show
import country_converter as coco
```
Load in your dataset from gapminder below. View it in your data viewer to get an idea of the structure.
```{python}
# Load the data
emissions = pd.read_csv("DATA PATH HERE")
```
# Initial Cleaning
In this dataset, some values are given in thousands, with a "k" used to represent the thousands. This will cause problems when we try to make these columns numeric. So we need to clean this. We'll do this for you, but pay close attention as you might need it for your final project.
First, let's see the issue:
```{python}
emissions.query("country == 'China'")[["country", "2020", "2021", "2022"]]
```
Notice the letter "k" at the end of "10.6k" as an example.
We can remove the "k" and multiply those values by 1000 with the following code:
```{python}
for col in ["2021", "2022"]:
has_k = emissions[col].str.contains("k")
values = emissions[col].str.replace("k", "")
emissions[col] = np.where(has_k, values.astype(float) * 1000, values.astype(float))
```
And check that it worked:
```{python}
emissions.query("country == 'China'")[["country", "2020", "2021", "2022"]]
```
# Table Section
Our goal is to create a table showing emissions for a few selected years and calculate absolute and relative changes.
1. Subset the data to include `Country`, `2000`, and `2022` columns only.
2. Calculate an "Absolute Change" column as the difference between 2022 and 2000.
3. Calculate a "Relative Change" column as the absolute change divided by the 2000 emissions, then multiplied by 100.
```{python}
# Subset the data to include `country`, `2000`, and `2022` columns only.
table_df =
# Calculate absolute change as the difference between 2022 and 2000
table_df["Absolute Change"] =
# Calculate relative change as the absolute change divided by the 2000 emissions, then multiplied by 100
table_df["Relative Change"] =
# Round to 0 decimal places, and add a % sign to the relative change
table_df["Relative Change"] = table_df["Relative Change"].round(0).astype(str) + "%"
```
Now we can display this as an interactive table with itables:
```{python}
show(table_df)
```
# Chart Section
Our goal is to create a line chart from 1990 to 2022 for a few selected countries.
1. Melt the original `emissions` dataset so that years become rows.
2. Filter from 1990 to 2022 only.
3. Choose 5 countries of your choice.
4. Create a line chart showing emissions over time for the selected countries with Plotly Express.
```{python}
# Melt the original `emissions` dataset. Your id_vars should be "country", your var_name should be "year" and your value_name should be "emissions".
emissions_long =
# Convert year to numeric using pd.to_numeric
emissions_long["year"] =
# Convert emissions to numeric using pd.to_numeric. Here, we also convert dashes to the minus sign
emissions_long["emissions"] = pd.to_numeric(emissions_long["emissions"].astype(str).str.replace("−", "-"))
# Query for years between 1990 and 2022 (that is 1990, 1991, ..., 2021, 2022)
emissions_long_1990_2022 =
# Query for 5 countries (adjust these to any countries you like)
emissions_long_subset =
# Create line chart. Year should be on the x-axis, emissions on the y-axis, and color should be by country.
fig_chart =
```
# Mapping Section
This part is done for you.
**Goal:** Create a choropleth map showing global emissions from 1990 to 2022.
This will be animated by year.
1. Ensure each country has a 3-letter ISO code. We'll use `country_converter` for that.
2. Create a map with `px.choropleth` and use `animation_frame` to show changes over time.
```{python}
# Convert country names to ISO3 codes
emissions_long_1990_2022["country_code"] = coco.convert(
emissions_long_1990_2022["country"], to="ISO3"
)
fig_map = px.choropleth(
emissions_long_1990_2022,
locations="country_code",
color="emissions",
hover_name="country",
animation_frame="year",
title="Global CO2 Emissions (1990-2022)",
)
fig_map.show()
```
# Final Tabset
Below, we place our results into a tabbed interface.
::: {.panel-tabset}
## Table
```{python}
show(table_df)
```
## Chart
```{python}
fig_chart.show()
```
## Map
```{python}
fig_map.show()
```
:::
# Deploying to GitHub Pages
As a final step, you should follow the steps outlined in the prework to deplioy your report to GitHub Pages. You will be asked to share a link to your report in the course portal