Are you bored with using the R iris, mtcars, ... datasets?
Then use the "Women in Parliament" data from the World Bank instead! It is great for teaching, learning, presentations or a reprex. It has a hex sticker too!
Download: PNG (754x873) or SVG.
Download: PNG (3508x2480) or SVG.
Download: PNG (1280x640)
The raw data for "Proportion of seats held by women in national parliaments" ("single or lower parliamentary chambers only") can be downloaded directly from:
As part of its "open data" mission the World Bank kindly offers "free and open access to global development data" licensed under the "Creative Commons Attribution 4.0 (CC-BY 4.0)".
The data originates from the "Inter-Parliamentary Union" (IPU) which provides an "Archive of statistical data on the percentage of women in national parliaments" going back to 1997 on a monthly basis:
The World Bank data is for “single or lower parliamentary chambers only”, while
the IPU also presents data for “Upper Houses or Senates”. Moreover, the IPU provides
the actual numbers used to calculate the percentages (which the World Bank does not).
The IPU data has to be scraped from the IPU website (please check the robots.txt file first).
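Because the IPU publishes the underlying seat counts, a percentage can be recomputed from them. A minimal sketch, assuming one chamber with made-up numbers (the counts below are hypothetical, not real IPU figures):

```r
# Hypothetical seat counts for a single chamber (not real IPU figures)
seats <- 150   # total seats in the chamber
women <- 45    # seats held by women

# Percentage of seats held by women
pctWiP <- 100 * women / seats
pctWiP
```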
The data can be imported into R using the wbstats or WDI packages, or by reading in the CSV file available from the World Bank's website.
The R package wbstats provides access to the World Bank's indicator data. Use the following code to get the women in parliament data:
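library(wbstats)
# wb() is deprecated as of wbstats 1.0; there, wb_data(indicator = "SG.GEN.PARL.ZS") is the replacement
wip <- wb(indicator = "SG.GEN.PARL.ZS")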
The R package WDI also provides access to the World Bank's indicator data. Use the following code to get the women in parliament data:
library(WDI)
wip <- WDI(indicator = "SG.GEN.PARL.ZS")
First download the latest CSV file from:
Below I will refer to this file as "WiP-Data.csv" but please use the actual file name that you save it as.
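library(data.table)

# Skip the 4 metadata lines above the header; check.names = TRUE turns the
# year columns (e.g. 1997) into syntactic names (X1997)
wipdt <- fread("WiP-Data.csv",
               skip = 4, header = TRUE, check.names = TRUE)

# Reshape from wide (one column per year) to long, dropping missing values
WP <- melt(wipdt,
           id.vars = grep("Name|Code", names(wipdt), value = TRUE),
           measure = patterns("^X"),
           variable.name = "YearC",
           value.name = "pctWiP",
           na.rm = TRUE)

# Convert the year labels ("X1997") to numeric years and drop the helper column
WP[, Year := as.numeric(gsub("[^[:digit:].]", "", YearC))][
  , YearC := NULL]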
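To try the reshaping step without downloading anything, here is a minimal sketch on a small made-up table in the same wide layout. The country names, codes and values are invented for illustration and are not World Bank data; the last line (picking the most recent value per country) is just one possible follow-up, not part of the import itself.

```r
library(data.table)

# Toy table mimicking the wide layout: one column per year (values invented)
toy <- data.table(
  Country.Name = c("Freedonia", "Sylvania"),
  Country.Code = c("FRE", "SYL"),
  X1997 = c(10.0, NA),
  X1998 = c(12.5, 20.0)
)

# Wide to long, dropping the missing 1997 value for Sylvania
toyLong <- melt(toy,
                id.vars = c("Country.Name", "Country.Code"),
                measure.vars = patterns("^X"),
                variable.name = "YearC",
                value.name = "pctWiP",
                na.rm = TRUE)

# Turn "X1997" into the number 1997 and drop the helper column
toyLong[, Year := as.numeric(gsub("[^[:digit:].]", "", YearC))][
  , YearC := NULL]

# Most recent available value for each country
latest <- toyLong[order(Year), .SD[.N], by = Country.Name]
print(latest)
```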
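library(readr)
library(dplyr)
library(tidyr)

# Skip the 4 metadata lines above the header row
wiptv <- read_csv("WiP-Data.csv", skip = 4)
# Make column names syntactic: year columns such as 1997 become X1997
names(wiptv) <- make.names(names(wiptv))

# Reshape to long format, drop missing values and extract a numeric Year
wipTidy <- wiptv %>%
  gather(key = YearC, value = pctWiP, starts_with("X"), na.rm = TRUE) %>%
  mutate(Year = parse_number(YearC)) %>%
  select(-YearC)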
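gather() still works, but tidyr (version 1.0.0 and later) documents it as superseded by pivot_longer(). The sketch below shows what an equivalent reshape might look like with pivot_longer(), using a small made-up table so it runs without the CSV file. The country names, codes and values are invented for illustration and are not World Bank data.

```r
library(dplyr)
library(tidyr)
library(readr)

# Toy table mimicking the wide layout after make.names() (values invented)
toy <- tibble(
  Country.Name = c("Freedonia", "Sylvania"),
  Country.Code = c("FRE", "SYL"),
  X1997 = c(10.0, NA),
  X1998 = c(12.5, 20.0)
)

# Wide to long with pivot_longer(); values_drop_na plays the role of na.rm
toyTidy <- toy %>%
  pivot_longer(cols = starts_with("X"),
               names_to = "YearC",
               values_to = "pctWiP",
               values_drop_na = TRUE) %>%
  mutate(Year = parse_number(YearC)) %>%
  select(-YearC)

print(toyTidy)
```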
Use the following R guides to get ideas on how to teach using the women in parliament data:
- Women in Parliament -- data.table Edition (PDF)
- GitHub Repo: https://github.com/saghirb/WiP-rdatatable
The images were created by Marina Costa, guided by Andreia Carlos and myself.
You can view Marina's great portfolio at:
Thank you Marina and Andreia as it was really nice to work with you both.