A web crawler that pulls Lincoln Douglas tournament round results from Tabroom, summarizes them as raw counts, and automates running statistics on the data. It was used to create the "Statistical Analysis of Side Bias in Lincoln Douglas Debate" article series published by the National Symposium for Debate.
For Java: jsoup
For Python:
pip install numpy
pip install pandas
pip install statsmodels
Create a "url.csv" file in the format:
Date | Season | Topic | Tournament | LD Tabroom Results URL |
---|---|---|---|---|
Compile with jsoup jar file: javac -cp ./jsoup-1.11.3.jar ./*.java
Running on Mac OS X: java -cp .:jsoup-1.11.3.jar Main
Running on Windows: java -cp .;jsoup-1.11.3.jar Main
On every run, the data pulled from every URL in "url.csv" is parsed and appended to "_raw.csv" and "_numeric.csv", in raw and summary formats respectively. In addition, "_topicBias.csv" and "_tournament.csv" store summary counts for each topic and each tournament.
Run: python statistics.py
The script reads "_topicBias.csv", conducts a two-sided one-proportion z-test on the data, and writes the resulting statistics to a new "_stats.csv". The format will be:
Topic | Neg Round Win % | p-value | Neg Ballot Win % | p-value |
---|---|---|---|---|
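
For readers unfamiliar with the test, here is a minimal sketch of a two-sided one-proportion z-test using only the Python standard library. It is not the code in statistics.py: the function name `one_proportion_ztest`, the null proportion of 0.5 (no side bias), the choice to compute the standard error from the null proportion, and the example counts are all assumptions made for illustration.

```python
import math


def one_proportion_ztest(wins, total, p0=0.5):
    """Two-sided one-proportion z-test of wins/total against p0.

    Hypothetical helper for illustration; p0=0.5 (no side bias) is an assumption.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = wins / total
    # Standard error computed under the null hypothesis (an assumption;
    # some libraries use the sample proportion here instead).
    se = math.sqrt(p0 * (1 - p0) / total)
    z = (p_hat - p0) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts, for illustration only: 530 negative wins in 1000 rounds.
z, p = one_proportion_ztest(530, 1000)
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

The `proportions_ztest` function in statsmodels performs the same kind of test, but by default it estimates the variance from the sample proportion rather than the null value, so its p-values can differ slightly from this sketch.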
- Java 7
- Python 3.6