Data and code for
Victor Kristof, Aswin Suresh, Matthias Grossglauser, Patrick Thiran, War of Words II: Enriched Models of Law-Making Processes, The Web Conference 2021, April 19-23, 2021, Ljubljana, Slovenia.
Note: The repo for War of Words: The Competitive Dynamics of Legislative Processes is here.
From the root of the repo, install the requirements and local library:
pip install -r requirements.txt
pip install -e lib
Download the (raw) data from link
Put the canonical datasets (war-of-words-2-ep{7,8}.txt) in a folder in the repo, for example:
mkdir -p data/canonical
If you don't want to generate the text embeddings from scratch, download text-embeddings.zip and put the unzipped files in
mkdir -p data/text-embeddings
Also download the helpers in helpers.zip (a mapping from dossier references to their titles and some metadata about MEPs) and put the files in
mkdir -p data/helpers
You should also download split-indices.zip, which contains the indices used to split the data into train and test sets, and put the files in
mkdir -p data/split-indices
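After these steps, the data folder should look roughly as follows (a sketch: only the files named elsewhere in this README are listed, and the exact contents of the zip archives may differ):

data/
├── canonical/
│   ├── war-of-words-2-ep7.txt
│   └── war-of-words-2-ep8.txt
├── text-embeddings/    # unzipped from text-embeddings.zip
├── helpers/
│   ├── dossier-titles.json
│   └── meps.json
└── split-indices/      # unzipped from split-indices.zip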
You can generate the text embeddings for the 7th and 8th legislatures by running, from the 0-text-embeddings folder:
python generate_embeddings.py --leg {7,8} --data_dir ../data/canonical --indices_dir ../data/split-indices --dossier2title_dir ../data/helpers --text_embeddings_dir ../data/text-embeddings
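If the {7,8} shorthand is expanded one legislature at a time, a small loop (run from the same 0-text-embeddings folder, with the same flags as above) covers both:

for leg in 7 8; do
    python generate_embeddings.py --leg $leg --data_dir ../data/canonical --indices_dir ../data/split-indices --dossier2title_dir ../data/helpers --text_embeddings_dir ../data/text-embeddings
done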
You can then generate "chronological" datasets, where edits are ordered according to the date of the dossiers:
cd 1-datasets
./1-split-chronologically.sh ../data/canonical/war-of-words-2-ep8.txt ../data/canonical/war-of-words-2-ep8-chronological.txt
To generate text embeddings for these chronological datasets, you can then run
cd 0-text-embeddings
python generate_embeddings.py --leg 8 --data_dir ../data/canonical --dossier2title_dir ../data/helpers --text_embeddings_dir ../data/text-embeddings --chronological
Map the text embeddings to the canonical datasets:
cd 1-datasets
./2-map-text-embeddings.sh ../data/canonical ../data/text-embeddings
Finally, process the datasets to create training sets and test sets:
mkdir pkl
./3-generate-datasets.sh ../data/canonical pkl
To train the models, you define an "experiment" in a JSON file (see examples in the train-def folder).
You then train all the models, as defined in the JSON files, by running:
mkdir trained-models
python train.py --definition train-def/ep8.json --data_dir path/to/processed/datasets --hyperparams_dir hyperparams --models trained-models
Run all three definitions (ep7.json, ep8.json, and ep8-chronological.json) to train all the models in the paper.
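For convenience, a loop over the three definitions does this in one go (a sketch, assuming the training scripts live in 2-training and the processed datasets in ../1-datasets/pkl, as referenced by the evaluation step below):

cd 2-training
mkdir -p trained-models
for def in ep7 ep8 ep8-chronological; do
    python train.py --definition train-def/$def.json --data_dir ../1-datasets/pkl --hyperparams_dir hyperparams --models trained-models
done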
As with training, you define "experiments" for evaluation in JSON files (see examples in the eval-def folder). You then evaluate all experiments by running:
mkdir results
python eval.py --definition eval-def/ep8.json --data_dir ../1-datasets/pkl --models_dir ../2-training/trained-models --save_results results
Run all four definitions (ep7.json, ep8.json, ep8-chronological.json, and ep8-conflict_size.json) to evaluate all experiments in the paper.
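As above, a loop over the four definitions runs them all (a sketch, assuming the evaluation scripts live in 3-evaluation, as referenced by the analysis step below):

cd 3-evaluation
mkdir -p results
for def in ep7 ep8 ep8-chronological ep8-conflict_size; do
    python eval.py --definition eval-def/$def.json --data_dir ../1-datasets/pkl --models_dir ../2-training/trained-models --save_results results
done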
You finally reproduce the analysis in the paper by running the scripts in the folder 4-analysis.
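The commands below are assumed to run from inside 4-analysis; creating the figures output folder first is an assumption, in case the scripts do not create it themselves:

cd 4-analysis
mkdir -p figures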
python results.py --results ../3-evaluation/results --save-as figures/results.pdf
python improvement.py --results ../3-evaluation/results --save-as figures/improvement.pdf
python explicit-features.py --model ../2-training/trained-models/ep8-all_features-latent-text.fit --dossier-titles ../data/helpers/dossier-titles.json --meps ../data/helpers/meps.json
python error-analysis.py --save-as figures/error-analysis.pdf
The interpretation of the explicit features is in 4-analysis/explicit-features.ipynb.
The interpretation of the latent features is in 4-analysis/notebooks/latent-features.ipynb.
The interpretation of the text features is in 4-analysis/text-features.ipynb.
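Assuming Jupyter is installed in your environment (it may not be part of requirements.txt), you can open the notebooks with, for example:

jupyter notebook 4-analysis/explicit-features.ipynb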
This project requires Python 3.6.
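For example, a minimal setup with a dedicated virtual environment (a sketch, assuming a python3.6 binary is on your PATH) is:

python3.6 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -e lib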
To cite this work, use:
@inproceedings{kristof2021war,
author = {Kristof, Victor and Suresh, Aswin and Grossglauser, Matthias and Thiran, Patrick},
title = {War of Words II: Enriched Models of Law-Making Processes},
year = {2021},
booktitle = {Proceedings of The Web Conference 2021},
pages = {2803--2809},
numpages = {7},
location = {Ljubljana, Slovenia},
series = {WWW '21}
}